window similarity search
     

procrastinatus


Total Posts: 9
Joined: Oct 2012
 
Posted: 2020-07-10 05:15
Not sure if this is widely known or even practically useful to anyone here, but I hadn't encountered this fun gizmo until the other day:
https://www.cs.ucr.edu/~eamonn/PID4481997_extend_Matrix%20Profile_I.pdf

Basically, you take a short window from a time series, take the convolution of that window with another time series (or itself), turn that into a z-normalized distance at each offset, take the minimum distance, and plot the minimums as you slide the window along. Essentially you're plotting how close each window is to its nearest match elsewhere in the time series. Since the convolution can be computed efficiently in Fourier space, with FFT each window's pass ends up being pretty fast (apparently O(n log n) per window, so O(n^2 log n) for the full profile).
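For anyone who wants to poke at it, here is a minimal pure-Python sketch of the idea as I read it: an FFT-based sliding dot product, converted to a z-normalized distance, with the minimum kept per window. The function names, the toy series, the half-window exclusion zone and the hand-rolled radix-2 FFT are all my own assumptions for illustration, not the paper's reference code.

```python
import cmath
import math

def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2], invert), fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + w, even[k] - w
    return out

def sliding_dot(query, series):
    """Dot product of query with every window of series, via FFT convolution."""
    m, n = len(query), len(series)
    size = 1
    while size < n + m:
        size *= 2
    fs = fft(list(series) + [0.0] * (size - n))
    fq = fft(list(reversed(query)) + [0.0] * (size - m))
    conv = fft([x * y for x, y in zip(fs, fq)], invert=True)
    # Entries m-1 .. n-1 of the convolution are the fully overlapping windows.
    return [conv[k].real / size for k in range(m - 1, n)]

def window_stats(series, m):
    """Mean and population std of every length-m window (running sums)."""
    means, stds = [], []
    s = s2 = 0.0
    for i, x in enumerate(series):
        s += x
        s2 += x * x
        if i >= m:
            old = series[i - m]
            s -= old
            s2 -= old * old
        if i >= m - 1:
            mu = s / m
            means.append(mu)
            stds.append(math.sqrt(max(s2 / m - mu * mu, 0.0)))
    return means, stds

def distance_profile(query, series, means, stds):
    """z-normalised Euclidean distance from query to every window of series."""
    m = len(query)
    mu_q = sum(query) / m
    sd_q = math.sqrt(max(sum(x * x for x in query) / m - mu_q ** 2, 0.0))
    out = []
    for j, dot in enumerate(sliding_dot(query, series)):
        if sd_q < 1e-12 or stds[j] < 1e-12:
            out.append(math.inf)  # flat windows have no z-normalised shape
            continue
        corr = (dot - m * mu_q * means[j]) / (m * sd_q * stds[j])
        out.append(math.sqrt(max(2.0 * m * (1.0 - corr), 0.0)))
    return out

def matrix_profile(series, m):
    """Self-join: nearest-neighbour distance and its index for each window."""
    means, stds = window_stats(series, m)
    excl = m // 2  # assumed exclusion zone, so a window cannot match itself
    profile, index = [], []
    for i in range(len(series) - m + 1):
        dp = distance_profile(series[i:i + m], series, means, stds)
        best_j, best_d = -1, math.inf
        for j, d in enumerate(dp):
            if abs(i - j) > excl and d < best_d:
                best_j, best_d = j, d
        profile.append(best_d)
        index.append(best_j)
    return profile, index

# Toy data: a period-20 sine with a flat stretch pasted in at t = 100..109.
series = [math.sin(2 * math.pi * t / 20) for t in range(200)]
series[100:110] = [0.5] * 10
profile, index = matrix_profile(series, m=20)
worst = profile.index(max(profile))  # start of the window least like any other
```

On this toy series the profile sits near zero wherever a window has an exact periodic twin and jumps for windows that overlap the pasted-in stretch, which is the anomaly-detection use. A real implementation would reuse the FFT of the series across queries rather than recomputing it each time.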

Seems useful in general for signal anomaly detection. Also, it's efficient to index where (at what time point) the convolution min for each window is, so you can use this to detect changes in regimes (indexes will tend not to point across the change boundary). Seems like there might be other fun variations of this to play with.
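A rough sketch of the regime-change idea, under my own assumptions: a brute-force nearest-neighbour index (no FFT, to keep it short), a toy series that switches from a period-20 sine to a period-8 sine at t = 120, and a plain count of how many nearest-neighbour "arcs" pass over each time point. All names are hypothetical.

```python
import math

def znorm(w):
    """z-normalise a window (assumes the window is not flat)."""
    mu = sum(w) / len(w)
    sd = math.sqrt(sum((x - mu) ** 2 for x in w) / len(w))
    return [(x - mu) / sd for x in w]

def nn_index(series, m):
    """Brute-force nearest-neighbour index for each length-m window."""
    wins = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    idx = []
    for i, a in enumerate(wins):
        best_j, best_d = -1, math.inf
        for j, b in enumerate(wins):
            if abs(i - j) <= m // 2:
                continue  # skip trivial matches next to the window itself
            d = sum((x - y) ** 2 for x, y in zip(a, b))
            if d < best_d:
                best_j, best_d = j, d
        idx.append(best_j)
    return idx

def arc_crossings(idx):
    """For each position, count arcs i -> idx[i] that pass strictly over it."""
    n = len(idx)
    diff = [0] * (n + 1)
    for i, j in enumerate(idx):
        lo, hi = min(i, j), max(i, j)
        diff[lo + 1] += 1
        diff[hi] -= 1
    counts, running = [], 0
    for k in range(n):
        running += diff[k]
        counts.append(running)
    return counts

# Toy data: the regime switches at t = 120.
n, m, boundary = 240, 20, 120
series = [math.sin(2 * math.pi * t / 20) if t < boundary
          else math.sin(2 * math.pi * t / 8 + 1.0) for t in range(n)]
idx = nn_index(series, m)
counts = arc_crossings(idx)
```

Here every window lying wholly inside one regime finds its neighbour inside the same regime, so the only arcs that can cross the boundary come from the handful of windows straddling it. Raw counts also drop towards the two ends of the series simply because fewer arcs can reach there, so in practice you would want to normalise against some expected arc count before reading off a boundary; that step is left out of this sketch.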

tbretagn


Total Posts: 293
Joined: Oct 2004
 
Posted: 2020-07-10 08:46
Yes, you should look at that post too:

https://nuclearphynance.com/Show%20Post.aspx?PostIDKey=183329

And even if it isn't true, you have to believe in ancient history

Alfa


Total Posts: 15
Joined: Jun 2018
 
Posted: 2020-07-11 02:59
Very nice find. For various reasons my thesis had to be about graphs, and I ended up extending this exact work to discord discovery (anomaly detection) in temporal graphs. As data, I calculated correlated implied volatilities over time for the options of a dozen randomly selected stocks.

Prof. Keogh is very well known in time series circles and, from what I have heard, reviews papers from major research institutions.

I struggle to find a use for it, given that you are operating without a prediction part.

It seems it may be useful for finding two related time series where one precedes the other, for example order book structure changing significantly while the price hasn't absorbed the change yet.

Or for labeling data and building a prediction engine based on (dis)similarity.

impedance


Total Posts: 2
Joined: Aug 2020
 
Posted: 2020-12-03 03:25
Just to check...when you say "take the convolution of that window with another time series (or itself)..." you are suggesting that you convolve the subsection with the _entire_ time series?

ronin


Total Posts: 630
Joined: May 2006
 
Posted: 2020-12-03 11:59
Yes, poor man's wavelet transform. Been there, done that.

It's not particularly useful in financial time series. There ain't no bot stupid enough to get caught out by something like that any more.

"There is a SIX am?" -- Arthur

doomanx


Total Posts: 94
Joined: Jul 2018
 
Posted: 2020-12-03 13:56
@ronin is on the money. There's also some work that I think is quite similar on 'Signatures' which is essentially a continuous-time version of this by Terry Lyons (https://arxiv.org/abs/1405.4537 is an example, but not sure it was the paper I was thinking of).

The problem with this and Wavelets/spectral theory in general for financial time series in general is simply the noise. The two time series might look similar in two portions, but did the noise just look the same in two places, or is there actually some signal there? To answer this question you would need some sort of null distribution for the joint probability that both time series looked the way they did in the interval, which is not easy to construct. Furthermore, does this actually matter? Trading opportunities do not require that the time series behaves the same as before, just that the general pattern is the same. Condensing the time series into a set of values that matter for your strategy and comparing those is much more likely to be relevant.
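To make the null-distribution point concrete, here is one crude way it could be set up, as a sketch only. The null I assume is "the series is a random walk with exchangeable increments", which is a much simpler null than the joint one described above, and the function names and parameters are made up for illustration. The idea: shuffle the increments, rebuild surrogate series, and ask how often a surrogate contains a match to the query window at least as close as the one observed.

```python
import math
import random

def znorm(w):
    """z-normalise a window; returns None for (numerically) flat windows."""
    mu = sum(w) / len(w)
    sd = math.sqrt(sum((x - mu) ** 2 for x in w) / len(w))
    return None if sd < 1e-12 else [(x - mu) / sd for x in w]

def best_match(query, series, skip=None):
    """Smallest z-normalised distance from query to any window of series.
    Windows starting within len(query)//2 of `skip` are ignored."""
    q, m = znorm(query), len(query)
    if q is None:
        return math.inf
    best = math.inf
    for j in range(len(series) - m + 1):
        if skip is not None and abs(j - skip) <= m // 2:
            continue
        w = znorm(series[j:j + m])
        if w is not None:
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, w)))
            best = min(best, d)
    return best

def match_pvalue(series, start, m, n_surrogates=200, seed=0):
    """Monte Carlo p-value of the observed best match under the shuffle null."""
    rng = random.Random(seed)
    query = series[start:start + m]
    observed = best_match(query, series, skip=start)
    steps = [b - a for a, b in zip(series, series[1:])]
    hits = 0
    for _ in range(n_surrogates):
        rng.shuffle(steps)  # keeps the increment distribution, kills ordering
        surrogate = [series[0]]
        for s in steps:
            surrogate.append(surrogate[-1] + s)
        # Same exclusion zone as the observed search, so both see equally many windows.
        if best_match(query, surrogate, skip=start) <= observed:
            hits += 1
    return (hits + 1) / (n_surrogates + 1)

# Toy usage on a simulated random walk (no real structure to find).
rng = random.Random(1)
walk = [0.0]
for _ in range(149):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))
p = match_pvalue(walk, start=40, m=10, n_surrogates=50)
```

A small p would say the observed match is closer than shuffled noise tends to produce. The shuffle null ignores volatility clustering and everything else that makes real financial series awkward, which is a large part of why this is not easy to do properly.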

That being said the ideas are not useless, but you need to use them to answer the right question. The work by some very famous authors I have mentioned in previous posts is the most useful 'version' of this approach.

did you use VWAP or triple-reinforced GAN execution?

ronin


Total Posts: 630
Joined: May 2006
 
Posted: 2020-12-03 15:19
> The problem with this and Wavelets/spectral theory in general for financial time series in general is simply the noise.

It's not really about noise. Wavelets are pretty good at filtering noise - that's what they were made for.

It's the interactivity and nonlinearity that beats them.

If the market was just a bunch of twap bots, wavelets would be at home. But it isn't. The number of twap bots, rounded to the nearest integer, is zero.


doomanx


Total Posts: 94
Joined: Jul 2018
 
Posted: 2020-12-03 16:04
Was more focusing on specifically the Matrix profile and signature approaches when I wrote that comment, but some thoughts:

> It's not really about noise

If your definition of noise is just the variance, I totally agree. But generally finding a robust basis for an arbitrary signal is non-trivial. You start by projecting onto your wavelet basis and get a matrix of coefficients (one for each basis element on each observation of the signal). Usually you then perform dimensionality reduction on this matrix before some kind of thresholding/shrinkage and then reconstruction. But this dimensionality reduction suffers from the same issues as in other noisy contexts (i.e. Marcenko-Pastur/RMT type concerns about the covariance matrix and such).

Thresholding also requires some understanding of the 'usual' noise/covariance in either the signal or the coefficients, which is again a whole problem in and of itself across financial economics (due to nonstationarities and all that).
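For reference, the project / threshold / reconstruct pipeline in its most basic univariate form might look like the sketch below. I have assumed a Haar basis, soft thresholding, the universal threshold sigma * sqrt(2 ln n), and a noise scale estimated from the median absolute finest-level coefficient divided by 0.6745. The dimensionality-reduction step is left out entirely, and the single global noise estimate is exactly the part that becomes questionable under stochastic vol.

```python
import math
import random
import statistics

def haar_forward(x):
    """Full Haar decomposition; len(x) must be a power of two.
    Returns (approximation, list of detail levels, finest level first)."""
    s = math.sqrt(2.0)
    approx, details = list(x), []
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        details.append([(approx[i] - approx[i + 1]) / s for i in pairs])
        approx = [(approx[i] + approx[i + 1]) / s for i in pairs]
    return approx, details

def haar_inverse(approx, details):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    a = list(approx)
    for level in reversed(details):  # coarsest level first
        nxt = []
        for ai, di in zip(a, level):
            nxt.append((ai + di) / s)
            nxt.append((ai - di) / s)
        a = nxt
    return a

def soft(c, t):
    """Soft thresholding: shrink towards zero by t, zeroing small values."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def denoise(x):
    """Haar soft-threshold denoising with one global noise estimate."""
    approx, details = haar_forward(x)
    sigma = statistics.median([abs(c) for c in details[0]]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(x)))
    shrunk = [[soft(c, t) for c in level] for level in details]
    return haar_inverse(approx, shrunk)

# Toy usage: a step function plus Gaussian noise.
random.seed(0)
clean = [0.0] * 128 + [1.0] * 128
noisy = [c + random.gauss(0.0, 0.2) for c in clean]
smooth = denoise(noisy)
```

A step aligned to a dyadic boundary is about the friendliest possible case for Haar, so this says nothing about how the same recipe behaves on returns data.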

> It's the interactivity and nonlinearity that beats them
Given that wavelet bases are an orthonormal function basis I don't see nonlinearity of a signal being a problem, but I assume you're not talking about 'that' non-linearity. Non-stationarity is a key issue though, both in the cross-section (which is I assume what you mean by interactivity) and time-series directions.

Denoising ideas are not useless in practice, but as always the devil is in the details.


ronin


Total Posts: 630
Joined: May 2006
 
Posted: 2020-12-03 21:52
> Non-stationarity is a key issue though

I'm not sure that it is. Compactly supported or rapidly decaying bases like Haar, Morlet etc. do a decent job of localising the signal in time. So the fact that the amplitudes change with time is OK.

Speech and EEG are both non-stationary, and wavelets work fine with them. But then, they are highly periodic.


> I assume you're not talking about 'that' non-linearity.

I sort-of am.

If the system was linear, the appropriate wavelet transform would diagonalise it and each wavelet amplitude would exist in isolation from the other amplitudes. When it's nonlinear, each amplitude is affected by all other amplitudes, and the wavelet transform does nothing to simplify it.

At least, that was the end conclusion I reached when I played with this.
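A toy illustration of that point, with the Fourier basis swapped in for wavelets (my substitution, because a plain DFT exactly diagonalises a circular shift-invariant linear filter and keeps the example short). Two input modes go through a linear filter and through a mildly nonlinear map; the names and the quadratic term are assumptions for illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def active_modes(x, tol=1e-9):
    """Frequencies 0..n/2 that carry non-negligible amplitude."""
    n = len(x)
    return [k for k, c in enumerate(dft(x)[: n // 2 + 1]) if abs(c) / n > tol]

n = 32
# Input with exactly two modes, frequencies 3 and 5.
x = [math.cos(2 * math.pi * 3 * t / n) + math.cos(2 * math.pi * 5 * t / n)
     for t in range(n)]

# Linear, shift-invariant system: circular 3-point moving average.
linear = [(x[t - 1] + x[t] + x[(t + 1) % n]) / 3 for t in range(n)]

# Nonlinear system: the same input plus a quadratic term.
nonlinear = [v + 0.5 * v * v for v in x]

modes_in = active_modes(x)           # [3, 5]
modes_lin = active_modes(linear)     # [3, 5]
modes_non = active_modes(nonlinear)  # [0, 2, 3, 5, 6, 8, 10]
```

The linear filter only rescales each mode. The quadratic term produces sums and differences of the input frequencies (2 = 5 - 3, 8 = 5 + 3, plus the doubled 6 and 10 and a constant), so the output amplitude at any one mode depends on the others.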


doomanx


Total Posts: 94
Joined: Jul 2018
 
Posted: 2020-12-04 00:54
> When it's nonlinear, each amplitude is affected by all other amplitudes, and the wavelet transform does nothing to simplify it
That's why you do dimensionality reduction on the coefficients - you de-correlate the representation and use this for reconstruction. See section 3 in https://arxiv.org/pdf/0901.4392.pdf for example. But obviously if you use PCA for this it only decorrelates them, doesn't mean all dependence is gone.
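The standard toy case for "decorrelated is not independent", as a quick sketch (the variable names are mine): take x on a symmetric grid and y = x^2. Their correlation is zero, so PCA would leave the pair untouched, yet y is a deterministic function of x.

```python
import math

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

x = [t / 10 for t in range(-50, 51)]  # symmetric grid on [-5, 5]
y = [v * v for v in x]                # fully determined by x

linear = corr(x, y)                    # essentially zero: nothing for PCA to remove
hidden = corr([abs(v) for v in x], y)  # strong once you look at |x| instead of x
```

Nothing here is specific to wavelet coefficients; it is only meant to show what a second-moment method cannot see.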

> Discussion of Non-Stationarity
Think I phrased my point poorly here again, as you're of course right that wavelets work on a number of non-stationary signals (that's kind of the point), but it depends on exactly which non-stationarity that you're considering. Stochastic vol type issues make it harder to do denoising via thresholding/shrinkage as you need an (at least local) estimate of noise. Correlations shifting make it harder to use a multivariate basis etc. You can get it to 'work' with a few smarts and a very small amount of the more advanced stuff, but doesn't mean it's the best tool to use.


ronin


Total Posts: 630
Joined: May 2006
 
Posted: 2020-12-04 15:03
> obviously if you use PCA for this it only decorrelates them, doesn't mean all dependence is gone.

Well, yeah. You project stuff to wavelets, then locally approximate the amplitudes with a multivariate Gaussian. If they happen to really be globally multivariate Gaussian, you're doing great. Otherwise, you're not.

Does it really help with index arb? It's a pretty crowded trade, and it's pretty much down to speed. I don't see that being really smart with your wavelets has a non-negative expectation.

But I may be wrong, it's been years since I did anything remotely like that.


doomanx


Total Posts: 94
Joined: Jul 2018
 
Posted: 2020-12-05 20:29
> Does it really help with index arb?

I didn't use it for index arb so I couldn't tell you ;) But I would imagine not, for the reasons you stated.


ronin


Total Posts: 630
Joined: May 2006
 
Posted: 2020-12-07 20:00
> I didn't use it for index arb so I couldn't tell you ;)

Fair enough. It just sounded very index arbitrage-y...
