Forums > Basics > Machine Learning with Multivariate Data vs Multivariate Time Series Data

 Page 1 of 1
 Jurassic Total Posts: 367 Joined: Mar 2018
 Posted: 2020-01-28 19:32 What are the differences when training models on multivariate data (no time component) vs multivariate time series data? What I mean is that we have features X_1, ..., X_N where N is large, and for each of these we have a Y. Assume we have 10000 samples of (X_1, ..., X_N, Y). Does it make a difference when you think about statistical methods vs machine learning methods vs time series analysis methods?
 Nonius (Founding Member, Nonius Unbound) Total Posts: 12799 Joined: Mar 2004
 Posted: 2020-02-05 18:10 The main difference is that if the data is temporal in nature, you want to be careful not to randomize/shuffle it, so that out-of-sample (and/or validation) sets occur after training sets.

More minor differences are a) there are lots of natural ways of downsampling temporal data, and b) time (or some transformation thereof) can itself be a feature in time series data.

On your second question, I personally don't make much distinction between those three viewpoints, although ML people slice datasets into three subsets (training, validation, testing) whereas the old skool stats way of doing things is sort of binary, i.e. in-sample/out-of-sample.

Chiral is Tyler Durden
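A minimal sketch of the two points above (chronological holdout without shuffling, and time as a feature), using synthetic data; the split ratios and the normalized time index are illustrative choices, not anything prescribed in the thread:

```python
import numpy as np

# Toy multivariate time series: 10000 samples, 5 features, one target.
# Row order is time order, so we must NOT shuffle before splitting.
rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 5))
y = rng.normal(size=10000)

# Chronological holdout: train on the past, validate and test on the future.
n = len(X)
train_end = int(0.7 * n)
val_end = int(0.85 * n)

X_train, y_train = X[:train_end], y[:train_end]
X_val, y_val = X[train_end:val_end], y[train_end:val_end]
X_test, y_test = X[val_end:], y[val_end:]

# Time itself (or a transform of it) can be appended as an extra feature.
t = np.arange(n)
X_with_time = np.column_stack([X, t / n])  # normalized time index
```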
 Maggette Total Posts: 1251 Joined: Jun 2007
 Posted: 2020-02-10 12:45 I do lots of time series analysis, and I don't make a distinction between "classical techniques" (SARIMAX, ARCH, Exponential Smoothing, State Space/Kalman Filter, Empirical Mode Decomposition, Fourier Transform) and more ML-driven stuff. More often than not a stats/machine-learning hybrid approach is a good way to handle things. Have a look at everything signal processing has to offer.

The classical train/validation/test split is often dangerous. You should in addition work with some adapted form of cross-validation that takes into account the autocorrelation, or even the mutual information at the respective lags.

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, let your fists do the talking...
 gaj Total Posts: 112 Joined: Apr 2018
 Posted: 2020-02-10 14:31 @Maggette: what are the benefits of using classical techniques compared to machine learning? Correct me if I'm wrong, but it seems that most classical techniques fall under the category of generative models. This means you start with a parameterized model of the system, then estimate the parameters using MLE on the observed data, and finally make a prediction using the estimated parameters. In contrast, machine learning methods make the prediction directly by minimizing some objective function. So I feel like classical methods are often weaker than ML methods.
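The distinction gaj draws can be made concrete on a toy AR(1) series. This is only an illustrative sketch (the model, learning rate, and iteration count are arbitrary choices): the "generative" route posits a model and estimates its parameter by conditional MLE, while the "discriminative" route minimizes a prediction loss directly by gradient descent. For Gaussian AR(1) the two objectives coincide, which shows the difference is one of framing rather than arithmetic:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a toy AR(1) series: x[t+1] = 0.8 * x[t] + noise.
phi_true = 0.8
x = np.zeros(500)
for t in range(499):
    x[t + 1] = phi_true * x[t] + rng.normal()

# "Generative" route: posit AR(1), estimate phi by conditional MLE.
# For Gaussian noise this reduces to least squares on lagged pairs.
phi_mle = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
pred_generative = phi_mle * x[-1]

# "Discriminative" route: directly minimize mean squared one-step
# prediction error over a parameter w by gradient descent.
w = 0.0
lr = 0.1
for _ in range(500):
    grad = -2 * np.mean((x[1:] - w * x[:-1]) * x[:-1])
    w -= lr * grad
pred_direct = w * x[-1]

# Here both estimates agree: the MLE and the loss minimizer are the
# same point, so "generative vs direct" is a modeling stance.
```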
 Jurassic Total Posts: 367 Joined: Mar 2018
 Posted: 2020-02-10 17:55 @Maggette you can't use cross-validation as easily with time structure, as you need to avoid training on points from the future and testing them on the past.
 Maggette Total Posts: 1251 Joined: Jun 2007
 Posted: 2020-02-10 18:22 @Jurassic That's why I wrote you should use some variant of nested cross-validation. The normal nested CV does not take autocorrelation into account. Something along the lines of:

https://robjhyndman.com/papers/cv-wp.pdf
https://www.sciencedirect.com/science/article/abs/pii/S0304407600000300

And by the way: IMO it is important to "retrofit" your final model. Not to use it for forecasting, but for model selection and checking your hyperparameter set. If you train your model on "future data" and
A) its hyperparameters look vastly different than on your normal train/test split, or
B) it does not fit past data when trained on future data and tested on past data,
you more probably than not have overfitted garbage from the start.

@gaj: I don't think so. IMHO Boltzmann Machines are generative models. Also, in my book regularization methods for linear regression like ridge or lasso tend to be from the ML tribe. But I do think I get what you are saying. And a Random Forest comes with a strict set of preset parameters as well (depth, leaves etc.). That's in general a very strong assumption about how the world you are trying to model works. And I don't feel that classical methods are weaker; they can be more complex. It's all rather pointless semantics to me. I use what I think makes sense... and check it by using stuff that I think doesn't make sense. Computation time is cheap these days.

I recently had success in time series with deep ANNs where at first sight I had the opinion that classical models would be hard to beat. And of course I've had the reverse case more than once. At the end of the day, almost all time series models I have in production are some kind of hybrid. Pure LSTMs have sucked for me so far; DeepAR or a hybrid method of CNNs and other stuff has worked fine so far.

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, let your fists do the talking...
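One common way to adapt CV to autocorrelated data, in the spirit of the linked Hyndman paper, is rolling-origin evaluation with a gap ("embargo") between train and test blocks. The helper below is a hypothetical sketch, not code from the thread; the fold count, minimum training size, and gap width are arbitrary parameters you would tune to your data's autocorrelation length:

```python
import numpy as np

def rolling_origin_splits(n, n_folds=5, min_train=100, gap=10):
    """Yield (train_idx, test_idx) pairs that respect time order.

    Each fold trains on an expanding window of the past and tests on
    the next block of the future; `gap` rows are dropped between the
    two so autocorrelation can't leak information across the boundary.
    """
    test_size = (n - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * test_size
        test_start = train_end + gap
        test_end = min(test_start + test_size, n)
        if test_start >= n:
            break
        yield np.arange(train_end), np.arange(test_start, test_end)

# Every test index lies strictly after every train index, gap included.
for tr, te in rolling_origin_splits(1000):
    assert te.min() - tr.max() > 10
```

Wrapping model fitting inside this loop (instead of sklearn-style shuffled K-fold) gives the autocorrelation-aware evaluation Maggette describes; the "retrofit" check amounts to running the same loop with the roles of past and future reversed.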
 ronin Total Posts: 594 Joined: May 2006
 Posted: 2020-02-11 20:46 > Does it make a difference when you think about statistical methods vs machine learning methods vs time series analysis methods?

Well. When you are doing things like option pricing or portfolio optimization, your worst-case scenario is that your time series is completely random. So that's what people looked at, and that is how we ended up with theories for option pricing and portfolio optimization for lognormal random variables. But then, you can't arbitrage a lognormal process, and we also employ people whose job is to arbitrage things. If that is you, you have to start with the assumption that things are slightly less random and slightly more time-series-y. Otherwise you are out of a job.

> In contrast, machine learning methods make the prediction directly by minimizing some objective function.

Yeah. That's a bit older than machine learning. It is called non-parametric statistics. It existed long before there was machine learning, or even machines. Ed Thorp did a fair amount of work on that. But it was always niche. Why? Because it is really difficult to judge when you are generalizing and when you are overtraining, especially if your data set isn't unlimited. Does that sound familiar?

"There is a SIX am?" -- Arthur
 nikol Total Posts: 1172 Joined: Jun 2005
 Posted: 2020-02-17 12:41 @Maggette
> Pure LSTMs sucked for me so far.

Same impression with LSTM. But I used something like diff(prices) at input and positive PnL at output. Had to rethink the training strategy.

> DeepAR or a hybrid method of CNNs and other stuff worked fine so far.

Hm, I came to a similar conclusion trying to solve the LSTM problem. Thanks for the links.
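For concreteness, building a supervised dataset from diff(prices), the kind of input nikol describes feeding an LSTM, can be sketched as below. The helper `make_supervised` and the lag count are hypothetical illustrations, not nikol's actual pipeline:

```python
import numpy as np

def make_supervised(prices, n_lags=5):
    """Turn a price series into (X, y) pairs of lagged differences.

    X[t] holds the previous `n_lags` price changes; y[t] is the next
    change. Each row only looks backward in time.
    """
    d = np.diff(prices)
    X = np.column_stack([d[i : len(d) - n_lags + i] for i in range(n_lags)])
    y = d[n_lags:]
    return X, y

# Toy random-walk price series.
prices = np.cumsum(np.random.default_rng(2).normal(size=200)) + 100
X, y = make_supervised(prices)
```

Note that differencing removes the level of the series, so a model trained this way predicts changes, not prices; recovering a price forecast means cumulating the predicted changes back onto the last observed price.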
 Jurassic Total Posts: 367 Joined: Mar 2018
 Posted: 2020-03-15 16:51 "The main difference is that if the data is temporal in nature, you want to be careful not to randomize/shuffle it, so that out-of-sample (and/or validation) sets occur after training sets."

@Nonius that's the answer I was hoping for. The problems regarding slicing of datasets are usually just a matter of applying common sense.