nikol


Total Posts: 987 
Joined: Jun 2005 


I trade on X and monitor Y, which leads X but with variable latency. Apparently this latency impacts performance.
Sornette's paper "Optimal thermal causal path" is nice but seems too "expensive" for real time. Along similar lines, I thought of scanning the cross-correlation over variable intervals, but that is also expensive.
Can you suggest any other methods?




goldorak


Total Posts: 1090 
Joined: Nov 2004 


The day Sornette makes actual money with his research... other than from selling signals... His latest venture with CS looked like a disaster the last time I cared to have a look.
Nice guy though.

If you are not living on the edge you are taking up too much space. 


nikol


Total Posts: 987 
Joined: Jun 2005 


That paper was not related to trading; it was an attempt to give a solution to the Granger criteria. Or do you measure success through the lens of profit? I guess he is doing well publishing his articles and books and lecturing.
That idea is cool, but still slow. It led me to a couple of other ideas, but they will take a lot of implementation. :(
PS. EMA is probably used in zillions of trading algos. Only 5% of the algos using EMA are making profits. Is it still worth employing EMA?





doomanx


Total Posts: 62 
Joined: Jul 2018 


@nikol I can't give you the full method, but there's a way to do this with a directed graph and spectral clustering. 



nikol


Total Posts: 987 
Joined: Jun 2005 


@doomanx
Thank you for this direction.
My other thoughts were:
1) The "Optimal Energy" idea fits well into FPGA combinatorial techniques (I proposed such an idea in my PhD, exploiting the geometry of decays; it is now used at LHC/CERN to measure luminosity). The main problem is that I am not an FPGA programmer and it will take time to become one (with assembler coding experience I am not afraid, but it takes time...).
2) Make an MC for X and Y (= X + time shift + noise) and use this sample to train an ANN. Then use "transfer learning" to adapt the trained ANN to the real sample. I am aware that it takes quite some time, with no guarantee of reaching the wanted level of accuracy ("guaranteed" ~ well controlled, as with pure numerical methods).
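For the MC idea, a minimal sketch of how a labelled training sample could be generated. Everything here is an assumption made for illustration: the random-walk price model, the integer lag in samples, and the hypothetical helper names `make_pair` and `make_dataset`. The ANN and the transfer-learning step are not shown.

```python
import numpy as np

def make_pair(n_obs, lag, noise_sd, rng):
    """One synthetic pair where y leads x by `lag` steps: y[t] = x[t + lag] + noise."""
    base = np.cumsum(rng.standard_normal(n_obs + lag))  # assumed random-walk price
    x = base[:n_obs]
    y = base[lag:] + noise_sd * rng.standard_normal(n_obs)
    return x, y

def make_dataset(n_samples, n_obs, max_lag, noise_sd, seed=0):
    """Features: first differences of (x, y); labels: the lag used for each pair."""
    rng = np.random.default_rng(seed)
    lags = rng.integers(1, max_lag + 1, size=n_samples)  # lags in 1..max_lag
    feats = np.empty((n_samples, 2, n_obs - 1))
    for i, lag in enumerate(lags):
        x, y = make_pair(n_obs, int(lag), noise_sd, rng)
        feats[i, 0] = np.diff(x)
        feats[i, 1] = np.diff(y)
    return feats, lags

# Illustrative sizes only
features, labels = make_dataset(n_samples=100, n_obs=256, max_lag=10, noise_sd=0.05)
```

The labels are known exactly by construction, which is the point of training on MC first; how well that transfers to real ticks is a separate question.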




Maggette


Total Posts: 1212 
Joined: Jun 2007 


Hi nikol, I am a bit confused by the term "lead&lag latency" here. That's obviously my problem, since doomanx seems to understand it.
So please help me to understand the problem.
Assuming the relationship between x and y were linear: are we talking about x leading y with a varying correlation?
Or a simple model where y follows x exactly, with a fluctuating lag?
y[t] = x[t - lag[t]] ?
To be clearer, here is a quick (and ugly!!!) python thingy... but I have to get off the train in a minute, so no time to make it nice (sorry):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(seed=10)
p_of_switch_lag = 0.15
min_lag = 1
max_lag = 10
length_of_series = 60

# assuming the lag is "sticky" and doesn't switch too often
switch_lag_state = np.random.binomial(1,p_of_switch_lag,(length_of_series))

# picking lags: keep the old one or randomly pick a new one
lags = []
lag = np.random.randint(min_lag,max_lag,(1))[0]
for state in switch_lag_state:
    if(state != 0):
        lag = np.random.randint(min_lag,max_lag,(1))[0]
    lags.append(lag)

# x and y: y exactly follows x by a random lag
x = np.random.standard_normal(length_of_series)
y = []
for t in np.arange(0,length_of_series):
    idx = min(max(t - lags[t],0),t)  # lagged index, clamped to [0, t]
    y.append(x[idx])

df = pd.DataFrame(switch_lag_state,columns=["switching_lag"])
df["lag"] = lags
df["x"] = x
df["y"] = y

df.head()
df[["lag","x","y"]].plot()
plt.show()
edit: damn it is losing tabs...
I came here and saw you and your people smiling,
and said to myself: Maggette, screw the small talk,
better let your fists do the talking...



nikol


Total Posts: 987 
Joined: Jun 2005 


> lag = np.random.randint(min_lag,max_lag,(1))[0]
In my case the lag is more or less grouped around a single value. But yes, it can move wildly at some events, so I want to avoid them (maybe later I will want to exploit them).
I measure xcorr like this:
import numpy as np

def xcorr(x, y, rng, nodiff=False):
    # work on raw values or on first differences
    if nodiff:
        dx = x
        dy = y
    else:
        dx = np.diff(x, 1)
        dy = np.diff(y, 1)
    # keep only points where both series moved
    sel = (np.abs(dx) > 0) & (np.abs(dy) > 0)
    # correlation of dx with dy circularly shifted by i, for i in [-rng, rng)
    xx = [(i, np.corrcoef(dx[sel], np.roll(dy[sel], i))[0][1]) for i in range(-rng, rng)]
    xxr = list(zip(*xx))
    return xxr[0], xxr[1]
Here is a picture of the cross-correlations found from synchronized tick series sampled at 100 ms (resolution of 1 lag): pd.DataFrame(...synchronized tick data...).resample('100L').ohlc()
If I sample at 200, 250, 500 or 1000 ms, then the correlations become stronger but the lag is less "pronounced", from which I conclude that I have a ~100-300 ms advantage.
Still, this result is obtained in backtest (post-production) mode and is completely unusable in real time. Besides, I cannot claim that my advantage is permanent. I anticipate that this lag can change at some events. Therefore, I want to know it right at the moment of the trade (+/- 100 us).
PS. I know there is a "wealth" of underlying structure behind these series which might impact the lead-lag pattern, but anyway.
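As a rough illustration of tracking the lag over time rather than over the whole backtest, here is a minimal sketch that re-estimates the lag in a trailing window as the argmax of the cross-correlation of first differences. The function names, the window and step sizes, and the synthetic random-walk data are all assumptions for illustration; it still scans every candidate lag, so it does not solve the cost problem, it only localizes the estimate in time.

```python
import numpy as np

def lag_at_max_xcorr(dx, dy, max_lag):
    """Shift k in [-max_lag, max_lag] maximizing corr(dx[t], dy[t - k]).

    A positive k means dy leads dx by k samples.
    """
    n = len(dx)
    best_lag, best_corr = 0, -np.inf
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            a, b = dx[k:], dy[:n - k]   # pairs dx[t] with dy[t - k]
        else:
            a, b = dx[:n + k], dy[-k:]  # pairs dx[t] with dy[t + |k|]
        c = np.corrcoef(a, b)[0, 1]
        if c > best_corr:
            best_lag, best_corr = k, c
    return best_lag, best_corr

def rolling_lag(x, y, window, max_lag, step=1):
    """Lag estimate on trailing windows of the differenced series."""
    dx, dy = np.diff(x), np.diff(y)
    out = []
    for end in range(window, len(dx) + 1, step):
        k, c = lag_at_max_xcorr(dx[end - window:end], dy[end - window:end], max_lag)
        out.append((end, k, c))
    return out

# Synthetic check (assumed data): y leads x by 3 samples, plus small noise on x
rng = np.random.default_rng(0)
true_lag, n = 3, 1000
base = np.cumsum(rng.standard_normal(n + true_lag))
y = base[true_lag:]
x = base[:n] + 0.1 * rng.standard_normal(n)
estimates = rolling_lag(x, y, window=200, max_lag=10, step=50)
```

Unlike np.roll, the slicing here does not wrap the end of the window around to the start, at the price of using slightly fewer points for larger shifts.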




nikol


Total Posts: 987 
Joined: Jun 2005 


The model behind my problem is this:
X(t) is a continuous price process, with Px - point-like "realizations" of X with Poisson rate Rx: X(t_i)
Y(t) = X(t+tau) is also a continuous process, with Py - point-like "realizations" of Y with rate Ry: Y(t_j)
Let's say, typically Ry/10 ~ Rx. More rigorously, Y updates if the change of X is "large".
I want to know tau.
The point-like process emulates asynchronous tick data. It can be trade prices, but in my case it is some kind of mid-price, which lies inside the bid-ask interval.
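A minimal simulation sketch of this model, under loud assumptions: the latent X is taken to be a Brownian motion on a fine grid, tau is constant, both tick processes are plain homogeneous Poisson (the "Y updates if the change of X is large" rule is not modelled), and all names and numbers (`poisson_times`, `previous_tick`, `latent`, the rates, tau = 0.2 s) are illustrative. It produces asynchronous ticks X(t_i), Y(t_j) and aligns them on a 100 ms grid by previous-tick sampling.

```python
import numpy as np

def poisson_times(rate, horizon, rng):
    """Event times of a homogeneous Poisson process on [0, horizon)."""
    n = rng.poisson(rate * horizon)
    return np.sort(rng.uniform(0.0, horizon, n))

def previous_tick(times, values, grid):
    """Last observed value at or before each grid time (NaN before the first tick)."""
    idx = np.searchsorted(times, grid, side="right") - 1
    out = np.full(len(grid), np.nan)
    ok = idx >= 0
    out[ok] = values[idx[ok]]
    return out

rng = np.random.default_rng(1)
dt, horizon, tau = 0.01, 600.0, 0.2   # seconds; illustrative values

# Latent continuous price, approximated by a Brownian path on a fine grid
fine_t = np.arange(0.0, horizon + tau + dt, dt)
path = np.cumsum(rng.standard_normal(len(fine_t))) * np.sqrt(dt)

def latent(t):
    """Latent price at time(s) t, read off the fine grid."""
    i = (np.asarray(t) / dt).astype(int)
    return path[np.minimum(i, len(path) - 1)]

tx = poisson_times(rate=5.0, horizon=horizon, rng=rng)    # Rx
ty = poisson_times(rate=50.0, horizon=horizon, rng=rng)   # Ry ~ 10 * Rx
x_ticks = latent(tx)          # X(t_i)
y_ticks = latent(ty + tau)    # Y(t_j) = X(t_j + tau)

grid = np.arange(1.0, horizon, 0.1)   # 100 ms grid
xg = previous_tick(tx, x_ticks, grid)
yg = previous_tick(ty, y_ticks, grid)
```

Because tau is known here, any estimator can be checked against it before being pointed at real ticks.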



Maggette


Total Posts: 1212 
Joined: Jun 2007 


Ahh. OK. Thx. And suddenly the posts by you and doomanx make a lot of sense :). I should have been able to figure it out myself. Sorry.
Thx again. Will think about it.
I came here and saw you and your people smiling,
and said to myself: Maggette, screw the small talk,
better let your fists do the talking...




doomanx


Total Posts: 62 
Joined: Jul 2018 


@nikol something I can talk about that may or may not be relevant: do you actually need to estimate tau, or is this propagated into some higher-level decision making? If you're just looking for an optimal forecast based on some variable lags, try searching around for 'mixed delay filter' in the signal processing literature.



nikol


Total Posts: 987 
Joined: Jun 2005 


@doomanx
The lead value (tau) enters into risk estimation, hence, yes, it has an impact on the market order submission policy and on the limit order price. In a sense, tau may change such that my signal is lagging prices. In this case I have to be aware of it and switch the quoting machine from the signal to the 'actual market'.
Thank you for the directions, I'm looking into it (spectral clustering is a thing).





I was also looking into this a while back, it didn't go anywhere on account of other things but I remember stumbling across https://github.com/philipperemy/leadlag. I've not actually looked at it so no idea if it's suitable for purpose but who knows, maybe it'll help :) 


