
We have found something like that too: a Sharpe ratio of ~2 before transaction costs, pretty much 0 after. I am not giving up on the idea though; one day we might make that work. 
Inflatable trader 



chiral3

Founding Member

Total Posts: 4983 
Joined: Mar 2004 


First time it happened to me was many years ago. I was using a modified Bayesian-esque, Kalman-like filter to get hedge ratios calculated for this dynamic basket. In-sample and out-of-sample the rsq was through the roof. Residuals were nonexistent. Then I looked at the stability of the coefficients and it was clear that it was not worth trying to reap the benefit of the explanatory power. 
Nonius is Satoshi Nakamoto. 物の哀れ 


ronin


Total Posts: 183 
Joined: May 2006 


Nonius,
This sounds more like a perturbation problem, like a slow envelope on a fast wave.
From the fluid dynamics analogy, there is no reason why the weights would be mean reverting. But perturbation analysis can probably still help.

"People say nothing's impossible, but I do nothing every day" Winnie The Pooh 



Nonius

Founding Member Nonius Unbound

Total Posts: 12661 
Joined: Mar 2004 


yes, that's true Ronin, I guess there's no reason to suspect that.
if I had (Cos(X), Sin(X)), where X is some random variable derived from estimating a PCA, that could be all over the map (on the circle). if you updated it every second based on a moving 4-hour window of 1-minute observations, that vector is going to be wildly volatile. if you updated it once at the end of the day based on a 20-day window of 1-minute observations, it's not very volatile and fairly stable. tried autoregression on X. no go.
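A rough way to see the window-length effect on X is to estimate the first PCA direction of a 2x2 covariance on short and on long blocks and compare how much the angle moves. This is only a sketch on synthetic data: the correlation of 0.5, the block lengths, and the helper names `pca_angle` and `angle_std` are all assumptions, not anything from the posts above.

```python
import numpy as np

def pca_angle(block):
    """Angle X of the first principal axis (cos X, sin X) of a 2-column block.

    Closed form for a 2x2 covariance matrix; the axis is only defined up
    to sign, so X lives in (-pi/2, pi/2].
    """
    c = np.cov(block, rowvar=False)
    return 0.5 * np.arctan2(2.0 * c[0, 1], c[0, 0] - c[1, 1])

def angle_std(returns, window):
    """Std of X across consecutive non-overlapping windows of given length.

    Plain std, ignoring wrap-around at +/- pi/2 (fine while X stays near pi/4).
    """
    n_blocks = len(returns) // window
    angles = [pca_angle(returns[i * window:(i + 1) * window])
              for i in range(n_blocks)]
    return float(np.std(angles))

# Synthetic stand-in for two correlated return series (assumed rho = 0.5).
rng = np.random.default_rng(0)
rho = 0.5
cov = np.array([[1.0, rho], [rho, 1.0]])
returns = rng.multivariate_normal([0.0, 0.0], cov, size=12_000)

short_std = angle_std(returns, window=30)    # short estimation window
long_std = angle_std(returns, window=600)    # long estimation window
print(f"std of X, window=30:  {short_std:.3f}")
print(f"std of X, window=600: {long_std:.3f}")
```

Even with a perfectly stationary covariance, the short-window angle is much noisier, so some of the "wild volatility" of the vector is pure estimation error rather than dynamics.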
Chiral: never had any luck whatsoever with Kalman, but I guess this is sort of a quasi-Kalman concept. 
Chiral is Tyler Durden 


ronin


Total Posts: 183 
Joined: May 2006 


Right, but that is more about different correlations coming out of different window/frequency combinations than about the dynamics of the PCA vectors.
What I meant was assume your PCA vector depends on epsilon*time for some small epsilon and expand the PCA vector in powers of epsilon. At leading order it is just your static PCA vector, but at higher orders you may get something interesting.
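For reference, the generic first-order version of such an expansion looks like this. It is the textbook non-degenerate perturbation result, and it assumes the covariance itself is written as C = C_0 + εC_1, which the thread does not spell out; the symbols C_0, C_1, λ_i and v_i are notation introduced here.

```latex
C = C_0 + \epsilon\, C_1, \qquad
C_0\, v_i^{(0)} = \lambda_i^{(0)} v_i^{(0)}

\lambda_i = \lambda_i^{(0)} + \epsilon\, {v_i^{(0)}}^{\top} C_1\, v_i^{(0)} + O(\epsilon^2)

v_i = v_i^{(0)} + \epsilon \sum_{j \neq i}
      \frac{{v_j^{(0)}}^{\top} C_1\, v_i^{(0)}}{\lambda_i^{(0)} - \lambda_j^{(0)}}\, v_j^{(0)}
      + O(\epsilon^2)
```

The leading term is the static PCA vector. The correction is weighted by 1/(λ_i − λ_j), so it is largest for eigenvalues that sit close together.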

"People say nothing's impossible, but I do nothing every day" Winnie The Pooh 



Nonius

Founding Member Nonius Unbound

Total Posts: 12661 
Joined: Mar 2004 


In fact it's the higher orders I was looking at!

Chiral is Tyler Durden 


ronin


Total Posts: 183 
Joined: May 2006 


Fair enough. You have done the calcs and I haven't, so I'll shut up now... 
"People say nothing's impossible, but I do nothing every day" Winnie The Pooh 



FatChoi


Total Posts: 125 
Joined: Feb 2008 


Looking at this as a way of determining an APT-style factor decomposition: if the model is stationary, the best estimator will use the longest sample available. If the model is not stationary, what is happening? A simple case would be stationary weightings on GARCH-y factors. I think this would give more salient factors than might seem reasonable from a typical PCA analysis, and the best estimators would need a long sample to estimate GARCH parameters that only become visible occasionally. It would of course also require a state estimator. This could be interpreted as a stable long-term structure requiring a high-frequency current-state estimator. I found nice discussions of GARCH-type covariance estimators here and here. When things get this complicated, is the covariance matrix still an effective summary of the joint distribution of returns? 
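One possible way to write down "stationary weightings on GARCH-y factors" is sketched below. GARCH(1,1) is picked purely for illustration, and the symbols B, D, ω_k, α_k, β_k are notation assumed here rather than taken from the post.

```latex
r_t = B f_t + \varepsilon_t, \qquad \operatorname{Cov}(\varepsilon_t) = D \ \text{(diagonal)}

f_{k,t} = \sigma_{k,t}\, z_{k,t}, \qquad z_{k,t} \sim \text{i.i.d. } (0, 1)

\sigma_{k,t}^2 = \omega_k + \alpha_k f_{k,t-1}^2 + \beta_k \sigma_{k,t-1}^2

\Sigma_t = B \operatorname{diag}\!\left(\sigma_{1,t}^2, \dots, \sigma_{K,t}^2\right) B^{\top} + D
```

In this reading the loadings B are the stable long-term structure, and the factor variances σ²_{k,t} are the fast-moving state that has to be estimated at high frequency.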



ronin


Total Posts: 183 
Joined: May 2006 


Nonius, you have got me intrigued with this, enough to do the perturbation calc myself.
Bottom line is:
- the eigenvalues and eigenvectors come out as nice and smooth functions of slow time, without finite-time singularities.
- the problem is that when the eigenvalues and eigenvectors evolve, the eigenvalue that started as the smallest may grow over time, and another eigenvalue may shrink. When two such eigenvalues cross, your PCA vector shifts discontinuously from the eigenvector of one eigenvalue to the eigenvector of another eigenvalue.
Does that make sense in terms of what you are seeing?
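A toy illustration of the crossing effect: a 2x2 covariance whose diagonal entries drift smoothly past each other. The matrix, the drift, and the small coupling of 0.01 are all made-up values; `smallest_eigvec` is a hypothetical helper.

```python
import numpy as np

def smallest_eigvec(t, coupling=0.01):
    """Eigenvector of the smallest eigenvalue of a toy 2x2 covariance C(t).

    The diagonal entries 1 + t and 2 - t cross at t = 0.5; the small
    off-diagonal coupling is an assumed value.
    """
    c = np.array([[1.0 + t, coupling],
                  [coupling, 2.0 - t]])
    vals, vecs = np.linalg.eigh(c)   # eigenvalues come back in ascending order
    return vecs[:, 0]

# Track how the "smallest" PCA direction loads on each axis as t evolves.
for t in np.linspace(0.0, 1.0, 11):
    v = smallest_eigvec(t)
    print(f"t={t:.1f}  |v.e1|={abs(v[0]):.3f}  |v.e2|={abs(v[1]):.3f}")
```

Each entry of C(t) is perfectly smooth in t, yet the vector labelled "smallest" flips from essentially e1 to essentially e2 within a narrow band around t = 0.5.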
r 
"People say nothing's impossible, but I do nothing every day" Winnie The Pooh 



Nonius

Founding Member Nonius Unbound

Total Posts: 12661 
Joined: Mar 2004 


ronin, for the eigenvectors associated with "low" eigenvalues, there is definitely a risk that things could cross, because I do not believe you can "choose" the "kth" PCA in a consistent manner. for example, suppose we are talking about a 10x10 matrix and we want to look at the eigenvectors of the 5 smallest eigenvalues; call them, at time t, v6(t)...v10(t). then what does it mean for us to specify v8(t)? at time s, do we assign a "random path" from v8(t) to v8(s), or is it more natural to associate v9(t) with v8(s)? maybe this is what you're talking about? in the 2x2 case this doesn't really happen, as you could think of the first PCA as a sort of "momentum" direction and the second as a sort of mean-reverting direction, and those don't really cross for obvious reasons.
one way to avoid not knowing how those low eigenvectors are ordered would be to concentrate on the high PCA, then just say you're interested in the orthogonal complement of the high PCA in one fell swoop. that's basically like looking at the "residual risk" component. I looked into that, but other complexities arise doing that. 
Chiral is Tyler Durden 



@Nonius
Quick and dirty approach. Roll an N-window PCA over [[T, T+N], [T+1, T+N+1], ...]. For a large enough N relative to sample time, the period-to-period change in PCA eigenvectors should be sufficiently smooth. It should be trivial to map each eigenvector at time T to its counterpart at time T+1 simply by looking at the cross-correlation of projections in the common dataset [T+1, T+N]. Even if eigenvector ranks change, just follow the map. You could even get a little fancier by setting a correlation threshold below which you assume an eigenvector has "dropped out". 
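A minimal sketch of that mapping step, on synthetic data. The function names, the 0.5 threshold, and the simple per-row argmax (which does not enforce a one-to-one match) are assumptions made for illustration.

```python
import numpy as np

def pca_eigvecs(X):
    """Eigenvectors of the sample covariance of X (rows = observations),
    as columns sorted by descending eigenvalue."""
    vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    return vecs[:, np.argsort(vals)[::-1]]

def match_eigvecs(X_common, V_prev, V_next, threshold=0.5):
    """Map each column of V_prev to the column of V_next whose projection
    of X_common is most correlated with it (absolute value, so sign flips
    are ignored). Entries are -1 where the best correlation is below
    `threshold`, i.e. the eigenvector is treated as having dropped out."""
    k = V_prev.shape[1]
    P = X_common @ V_prev                       # projections, window T
    Q = X_common @ V_next                       # projections, window T+1
    corr = np.abs(np.corrcoef(P, Q, rowvar=False)[:k, k:])
    best = corr.argmax(axis=1)
    best[corr.max(axis=1) < threshold] = -1
    return best, corr

# Assumed toy data: 4 independent series with distinct volatilities.
rng = np.random.default_rng(0)
data = rng.standard_normal((501, 4)) * np.array([3.0, 2.0, 1.0, 0.5])
N = 500
V_prev = pca_eigvecs(data[0:N])        # window starting at T (half-open slice)
V_next = pca_eigvecs(data[1:N + 1])    # window starting at T+1
mapping, corr = match_eigvecs(data[1:N], V_prev, V_next)
print(mapping)
```

If ranks swap between windows, `mapping` simply points to the new column index, so a tracked eigenvector can be followed by chaining the maps from one window to the next.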





> Insample and outofsample the rsq was through the roof. Residuals were nonexistent. Then I looked at the stability of the coefficients and it was clear that it was not worth trying to reap the benefit of the explanatory power.
Best guess as to what was happening here: as sampling frequency increases, correlations start to fall to 0. Price discreteness and microstructure effects dominate over smooth stochastic diffusion. At sub-second intervals, you're mostly modeling which stocks are ticking together, particularly thick-book, low-priced stocks. The dynamic eigenvectors are proxying large-scale portfolios that are currently rebalancing.
E.g. say [IBM, MSFT and AAPL] are highly correlated on a low-frequency basis. But currently there's a large rebalancing portfolio that's concentrated in IBM and MSFT, but not AAPL. The execution algo is probably sending orders for IBM and MSFT at the same time. AAPL will reconverge, but it will take time for price discovery to disperse across symbols. Sampled at high enough frequency, you'll see a high cross-correlation at that time for IBM-MSFT, but low for AAPL-IBM/MSFT.
A simple way to make money off this might be to check for eigenvectors that persistently form at certain times of day or at common rebalance times (e.g. on the hour, end of month, etc.). Large portfolios often trade on fairly predictable schedules. By the time the dynamic eigenvector appears in your Kalman filter, it's probably too late to monetize it. But if you can reliably predict a similar eigenvector at the same time tomorrow, then you can get in front of it, particularly if you can predict its directional bias. 




The low eigenvalues/eigenvectors are very unstable because of estimation error (see Marchenko-Pastur) and non-stationarity. Only a handful of eigenvalues are significant (1? 3? 5?), with their corresponding eigenvectors drifting slowly enough in time to be useful. In my tests, correl(v1[t], v1[t+1]) was 90%+, but correl(v10[t], v10[t+1]) was ~0. 
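A sketch of that kind of stability check on synthetic data: one common factor plus noise, with the eigenvectors re-estimated on consecutive non-overlapping blocks. The data-generating model, the block sizes, and the use of the absolute dot product as the "correl" measure are assumptions; the Marchenko-Pastur upper edge σ²(1 + sqrt(N/T))² is the standard pure-noise bound.

```python
import numpy as np

def sorted_eig(block):
    """Eigenvalues (descending) and matching eigenvector columns of the
    sample covariance of `block` (rows = observations)."""
    vals, vecs = np.linalg.eigh(np.cov(block, rowvar=False))
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

def mp_upper_edge(n_assets, n_obs, sigma2=1.0):
    """Marchenko-Pastur upper edge for eigenvalues of pure noise."""
    return sigma2 * (1.0 + np.sqrt(n_assets / n_obs)) ** 2

def mean_overlap(blocks, k):
    """Average |v_k[t] . v_k[t+1]| over consecutive blocks (sign-invariant)."""
    vecs = [sorted_eig(b)[1][:, k] for b in blocks]
    return float(np.mean([abs(a @ b) for a, b in zip(vecs[:-1], vecs[1:])]))

# Assumed toy data: one common factor plus unit-variance idiosyncratic noise.
rng = np.random.default_rng(1)
n_assets, n_obs, n_blocks = 10, 250, 20
factor = rng.standard_normal((n_blocks * n_obs, 1))
noise = rng.standard_normal((n_blocks * n_obs, n_assets))
returns = factor @ np.ones((1, n_assets)) + noise
blocks = np.split(returns, n_blocks)

edge = mp_upper_edge(n_assets, n_obs)
top_eigval = sorted_eig(blocks[0])[0][0]
top_overlap = mean_overlap(blocks, 0)
bottom_overlap = mean_overlap(blocks, n_assets - 1)
print(f"MP upper edge: {edge:.2f}, top eigenvalue: {top_eigval:.2f}")
print(f"overlap v1: {top_overlap:.2f}, overlap v10: {bottom_overlap:.2f}")
```

In this toy setup only the first eigenvalue clears the noise edge by a wide margin, and only its eigenvector is reproducible from block to block; the last eigenvector is close to a random direction inside the noise subspace.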
Inflatable trader 



ronin


Total Posts: 183 
Joined: May 2006 


Nonius,
In some respects this is now analogous to worst-of options or first-to-defaults.
The question really shouldn't be "what is my lowest eigenvalue", it should be "what is the probability that this eigenvalue will be the lowest over the period I am interested in".
In the textbook stationary example it is (100%, 0, 0 ....), but in reality it is (delta1, delta2, ....)
I.e. you have a delta wrt each eigenvalue, with the associated cross gammas, you are tracking them all, and you adjust your trading to account for the deltas and the cross gammas.
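One crude empirical stand-in for those "deltas" is to fit the eigenvectors once, then count how often each one ends up with the lowest realized variance over later periods. This is only a frequency count on synthetic data, not a pricing-style calculation; the function name, the volatilities, and the block lengths are assumptions.

```python
import numpy as np

def lowest_variance_frequencies(fit_block, future_blocks):
    """Fit eigenvectors once, then count how often each one has the lowest
    realized variance in later blocks. Returns frequencies summing to 1,
    indexed by eigenvalue rank in the fit window (0 = largest)."""
    vals, vecs = np.linalg.eigh(np.cov(fit_block, rowvar=False))
    vecs = vecs[:, np.argsort(vals)[::-1]]
    counts = np.zeros(vecs.shape[1])
    for block in future_blocks:
        realized = np.var(block @ vecs, axis=0)   # variance along each vector
        counts[np.argmin(realized)] += 1
    return counts / counts.sum()

rng = np.random.default_rng(2)
# Assumed toy volatilities: the two smallest are deliberately close.
vols = np.array([3.0, 2.0, 1.0, 0.95])
data = rng.standard_normal((2500, 4)) * vols
fit_block, rest = data[:500], data[500:]
future_blocks = np.split(rest, 20)          # 20 blocks of 100 observations
deltas = lowest_variance_frequencies(fit_block, future_blocks)
print(deltas)
```

With well-separated eigenvalues the output collapses towards the textbook (0, ..., 0, 1); with close ones the weight is spread over several directions even though the data here are stationary.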
Your last point: I don't think that will lead anywhere. There is no difference between looking at 5 lowest eigenvalues and the complement of 5 highest eigenvalues.

"People say nothing's impossible, but I do nothing every day" Winnie The Pooh 


afekz


Total Posts: 24 
Joined: Jun 2010 


ronin wrote: "Your last point: I don't think that will lead anywhere. There is no difference between looking at 5 lowest eigenvalues and the complement of 5 highest eigenvalues." I think (suspect?) that the point was more that one should look at them collectively ("one fell swoop"), rather than doing anything that assumes that there's structure outside the top 5: "5 lowest" becomes "a residual".





Nonius

Founding Member Nonius Unbound

Total Posts: 12661 
Joined: Mar 2004 


@afekz...yes, bingo. it's to look at the complement as a residual in a regression against the higher, more stable stuff; of necessity, that residual is a weighted sum of all that lower crap, but you don't necessarily explicitly compute those weights. 
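A small sketch of that residual construction: project returns onto the top-k PCs and keep what is left, without ever singling out an individual low eigenvector. The toy data and the name `residual_after_top_k` are assumptions; `np.linalg.eigh` still computes the full decomposition here, and a partial eigensolver could be swapped in.

```python
import numpy as np

def residual_after_top_k(returns, k):
    """Remove the span of the top-k principal components from `returns`.

    Returns the residual series and the top-k eigenvector columns. The low
    eigenvectors are never used individually; the residual is implicitly
    a weighted sum of all of them.
    """
    centered = returns - returns.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    top = vecs[:, np.argsort(vals)[::-1][:k]]
    factors = centered @ top                 # scores on the top-k PCs
    residual = centered - factors @ top.T    # same as the OLS residual on the scores
    return residual, top

rng = np.random.default_rng(4)
# Assumed toy data: 2 common factors driving 10 assets, plus noise.
loadings = rng.standard_normal((2, 10))
returns = (rng.standard_normal((1000, 2)) @ loadings
           + 0.3 * rng.standard_normal((1000, 10)))
residual, top = residual_after_top_k(returns, k=2)
print(np.abs(residual @ top).max())   # ~0: nothing left along the top PCs
```

The total variance of `residual` equals the sum of the discarded eigenvalues, so the residual carries all of the "lower" risk in one object.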
Chiral is Tyler Durden 


ronin


Total Posts: 183 
Joined: May 2006 


Oh I agree with that: compute what you need, cut off the rest, no argument.
The point I was making was about snapshots of non-stationary systems. Intraday vols especially are seasonal and random. In my mind the biggest problem is not taking that into account, much more than whatever time you can save by not calculating residual eigenvalues.

"People say nothing's impossible, but I do nothing every day" Winnie The Pooh 




saw this: https://jwindle.github.io/doc/JSMpresentation.pdf seems relevant. closed form, which is nice. never implemented it though 



Nonius

Founding Member Nonius Unbound

Total Posts: 12661 
Joined: Mar 2004 


thanks. I just thumbed through it but it looks kind of interesting. 
Chiral is Tyler Durden 



akimon


Total Posts: 566 
Joined: Dec 2004 


I found the concept of using Variational Bayes and autoencoders quite fascinating, and I view it as an extension to PCA.
Basically, rather than assuming the data is representable as a linear combination of eigenvectors, we get a neural network to represent the data as a highly nonlinear function of simple Gaussian "latent vectors" (which work like PCA components).
Link to the original paper: Auto-Encoding Variational Bayes
Link to something I did: Generating Large Images from Latent Vectors
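For context, the objective that a variational autoencoder maximizes is the evidence lower bound from the Auto-Encoding Variational Bayes paper, with encoder q_φ(z|x), decoder p_θ(x|z), and a standard Gaussian prior on the latent vector z:

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
  - D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p(z)\right),
\qquad p(z) = \mathcal{N}(0, I)
```

The first term rewards reconstruction, much like the explained variance in PCA, and the KL term keeps the latent vectors close to the simple Gaussian prior.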



