Machine Learning Trading Strategy

ahgt_123


Total Posts: 24
Joined: May 2020
 
Posted: 2021-01-15 09:09
I have been toying with ML models for a couple of months now. I now have a model with a Sharpe of 1.3. The paper trading performance has been flat for 20 days (though that's not uncommon for this Sharpe and daily frequency).
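
As a rough check on the "not uncommon" remark, here is a back-of-the-envelope sketch. It assumes i.i.d. normal daily returns and 252 trading days per year (neither is stated in the thread), and it treats "flat" as "cumulative PnL at or below zero". The function name is made up for illustration.

```python
import math

def prob_nonpositive_pnl(annual_sharpe: float, days: int,
                         periods_per_year: int = 252) -> float:
    """P(cumulative PnL <= 0 over `days`), assuming i.i.d. normal daily returns."""
    daily_sharpe = annual_sharpe / math.sqrt(periods_per_year)
    # Sharpe of the whole window: the mean scales with days, the stdev with sqrt(days)
    z = daily_sharpe * math.sqrt(days)
    # Standard normal CDF evaluated at -z, written with the error function
    return 0.5 * (1.0 + math.erf(-z / math.sqrt(2.0)))

p = prob_nonpositive_pnl(1.3, 20)
print(f"P(20-day PnL <= 0 | annual Sharpe 1.3) = {p:.2f}")
```

Under those assumptions the probability comes out at roughly one in three, so 20 flat days say very little either way.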

I wanted to allocate some money to the model, though I am a bit skeptical as it would be uncharted territory. Mostly I am worried about curve fitting, as it took me about 200 simulations to get there. It also uses ~80 features.

Can anyone guide me on how to proceed?

ronin


Total Posts: 647
Joined: May 2006
 
Posted: 2021-01-15 09:26
80 features?

So say you try to not go overboard. You limit your grid to 10 possible values for each feature. Which gives you 10^80 configurations.

And your best path out of these 10^80 has Sharpe 1.3 on paper?

I wouldn't.

"There is a SIX am?" -- Arthur

ahgt_123


Total Posts: 24
Joined: May 2020
 
Posted: 2021-01-15 09:38
The Sharpe of 1.3 is on "untouched" test data.

I did have doubts about using 80 features, but it seemed pretty common in the few papers I read:

https://www.hillsdaleinv.com/uploads/Machine_Learning_for_Stock_Selection.pdf

and also this,

https://www.kaggle.com/c/jane-street-market-prediction

ronin


Total Posts: 647
Joined: May 2006
 
Posted: 2021-01-15 10:44
Yes...

I assume you've put in realistic transaction costs, slippage, interest, short interest etc? If it's intraday, realistic fill rates, queue positioning etc?

What is the actual distribution of paths? Do you have like one path with Sharpe 1.3, and 10^80 paths with Sharpe -10?

At the minimum, get really, really comfortable with stability of your path wrt the features. What happens if you bump the features - how stable is the pnl, how stable is the max drawdown?

Can you reduce it to a couple of features, and can you make sense of them? What is the signal actually saying? Does what it's saying even make sense?

I wouldn't even think about putting money in it until you get comfortable with all of those.

Chances are that all you are seeing is some randomness due to the large feature space. Nothing about this so far says predictive value.
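
To put a number on "randomness", here is a minimal sketch of the selection effect. It assumes 200 independent strategies with zero true edge, each scored on one year (252 days) of returns. The thread does not say how long the test set is or how correlated the 200 simulations were, so both are illustrative assumptions.

```python
import math
import random
import statistics

def annualized_sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of a list of daily returns (zero risk-free rate)."""
    return (statistics.fmean(returns) / statistics.stdev(returns)
            * math.sqrt(periods_per_year))

def best_null_sharpe(n_trials=200, n_days=252, seed=0):
    """Best Sharpe among `n_trials` pure-noise strategies with zero true edge."""
    rng = random.Random(seed)
    best = -math.inf
    for _ in range(n_trials):
        # Zero-mean Gaussian daily returns: no predictive content by construction
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        best = max(best, annualized_sharpe(returns))
    return best

print(f"Best Sharpe out of 200 noise strategies: {best_null_sharpe():.2f}")
```

With one year of data the standard error of an annualized Sharpe estimate is about 1, so the best of 200 noise strategies typically lands well above 1.3. Real trials are correlated, which shrinks the effect, but the direction of the bias is the same.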

"There is a SIX am?" -- Arthur

ahgt_123


Total Posts: 24
Joined: May 2020
 
Posted: 2021-01-15 11:04
Transaction costs and slippage are all accounted for. It's a daily rebalance, so no microstructure worries. I am using AdaBoost with decision stumps and max tree depth 2, so it's more like 10*80 paths and 80^2*10^2.

Can you please explain "bump the features"?

Half of the features are technical indicators and some are common macro factors, but I still wouldn't say it makes sense as a whole.

ronin


Total Posts: 647
Joined: May 2006
 
Posted: 2021-01-15 11:36
> Can you please explain "bump the features"?

Say your optimal path has feature 29 = -8.2763

What happens when feature 29 = -7 or -9? Bump sizes based on the confidence interval for feature 29.
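
A minimal sketch of such a bump test, one parameter at a time. The `evaluate` callable stands in for a full backtest, and `toy_pnl`, the parameter names and the bump sizes are all hypothetical placeholders, not anything from the actual model.

```python
def bump_sensitivity(evaluate, params, bumps):
    """Re-evaluate with each parameter bumped down and up, one at a time.

    Returns the base result and, per parameter, the change in the result
    for the (down, up) bump.
    """
    base = evaluate(params)
    report = {}
    for name, bump in bumps.items():
        down = dict(params, **{name: params[name] - bump})
        up = dict(params, **{name: params[name] + bump})
        report[name] = (evaluate(down) - base, evaluate(up) - base)
    return base, report

def toy_pnl(p):
    """Hypothetical stand-in for a backtest: PnL peaks at the 'optimal' values."""
    return 1.0 - 0.5 * (p["feature_29"] + 8.2763) ** 2 - 0.01 * (p["feature_7"] - 2.0) ** 2

optimal = {"feature_29": -8.2763, "feature_7": 2.0}
# Bump sizes would come from each parameter's confidence interval; these are made up
base, report = bump_sensitivity(toy_pnl, optimal, {"feature_29": 1.0, "feature_7": 0.5})
for name, (d_down, d_up) in report.items():
    print(f"{name}: down {d_down:+.4f}, up {d_up:+.4f}")
```

In this toy case the PnL is far more sensitive to feature_29 than to feature_7. The same loop can return max drawdown instead of PnL by swapping the `evaluate` function.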

"There is a SIX am?" -- Arthur

Maggette


Total Posts: 1288
Joined: Jun 2007
 
Posted: 2021-01-15 14:19
My suggestions:

IMHO the 80 features are weird and I would guess they are not necessary.

But what you should worry about is the 200 simulations, even though I do not know exactly what you simulated, what you optimized while simulating, or how exactly you simulated. What exactly were you optimizing in the 200 runs?

1) Forget trading strategies and Sharpe ratios at first. At the end of the day, every trading strategy makes a prediction. Can you phrase (to yourself if you don't want to post here): what are you actually predicting? If you can, check whether your model has predictive power when compared to a simple benchmark. I think that is what ronin meant with "What is the signal actually saying?"

2) If you either don't know what you are predicting because your model is that black-boxy, or your model doesn't have predictive power, you are in the realm of "I might be wrong a lot, but when I am right, I am right big time".

That means your strategy probably relies a lot on the optimal bet size and the "cut your losses / exit timing" parameters. And that is prone to over-optimizing. I would suggest having a good look at the histograms/distributions of your holding periods and PnL.

EDIT: and pump random Gaussian noise into your whole pipeline and check your PnL and Sharpe, just to be sure :).

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

ahgt_123


Total Posts: 24
Joined: May 2020
 
Posted: 2021-01-15 14:36
Yes, not all of them are necessary. Only around 20 are "important".
The 200 simulations came from optimizing hyperparameters (I tried other models too) and adding more features (I started with 10).

The strategy is pretty simple: predict the direction of the next day and trade. I haven't tried optimizing bet size and other things.

At its core, the model (AdaBoost) is assigning weights to individual trees and giving me a net direction.

Should I add Gaussian noise to individual features?
Also, will optimizing bet size add value?



Maggette


Total Posts: 1288
Joined: Jun 2007
 
Posted: 2021-01-15 15:13
If you have information about the distribution (which I guess you won't have from a plain vanilla AdaBoosted tree ensemble), it will have an impact. Betting less money when you are not sure is kind of a good thing :).

By "adding gaussian noise" I meant something more stupid: generate an artificial asset from a random process and put it through your whole pipeline (training, optimization and evaluation). If your pipeline tells you that you found something, your pipeline is buggy. Happened to me more than once. This is more an integration test of your pipeline than something that gives insight.
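
A minimal sketch of that integration test. The "pipeline" here is a deliberately simple stand-in (a momentum sign rule whose lookback is picked in-sample), not the AdaBoost setup from the thread, and all function names are invented for the example. The idea is to swap in the real train/optimize/evaluate steps and keep the final check.

```python
import math
import random

def random_returns(n_days, seed):
    """Artificial asset: i.i.d. Gaussian daily returns, unpredictable by construction."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n_days)]

def hit_rate(returns, lookback):
    """Share of days where the sign of the trailing sum matches the next return."""
    hits = 0
    for t in range(lookback, len(returns)):
        pred_up = sum(returns[t - lookback:t]) > 0
        hits += pred_up == (returns[t] > 0)
    return hits / (len(returns) - lookback)

def pipeline_on_noise(n_days=4000, seed=1):
    returns = random_returns(n_days, seed)
    split = n_days // 2
    train, test = returns[:split], returns[split:]
    # "Optimization" step: pick the lookback with the best in-sample hit rate
    best_lookback = max(range(1, 21), key=lambda lb: hit_rate(train, lb))
    oos = hit_rate(test, best_lookback)
    n = len(test) - best_lookback
    # z-score of the out-of-sample hit rate against a fair coin
    z = (oos - 0.5) / math.sqrt(0.25 / n)
    return best_lookback, oos, z

best_lookback, oos, z = pipeline_on_noise()
print(f"lookback={best_lookback}, out-of-sample hit rate={oos:.3f}, z={z:+.2f}")
```

On pure noise the out-of-sample hit rate should sit near 50% with a small z-score. A large z-score here would point to a bug such as look-ahead leakage, not to a discovery.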

"The strategy is pretty simple : predict the direction of next day"
OK. That's good to know. Hence I guess you phrased it as a classification problem, not as a regression problem? Compare that to these simple baselines:

naive 1: "long only" => always up.
naive 2: same as yesterday
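
The two naive rules can be scored in a few lines. This is a sketch with a made-up direction series; the model's accuracy on the same days would be the number to compare against.

```python
def accuracy(preds, actual):
    """Fraction of predictions that match the realized direction."""
    return sum(p == a for p, a in zip(preds, actual)) / len(actual)

def naive_baselines(directions):
    """directions: list of +1/-1 daily moves. Scores both naive rules from day 2 on."""
    actual = directions[1:]
    always_up = [1] * len(actual)        # naive 1: long only
    same_as_yesterday = directions[:-1]  # naive 2: repeat the previous day's move
    return {
        "always_up": accuracy(always_up, actual),
        "same_as_yesterday": accuracy(same_as_yesterday, actual),
    }

# Made-up series of daily directions, for illustration only
directions = [1, 1, -1, 1, 1, 1, -1, -1, 1, 1]
print(naive_baselines(directions))
```

If the asset drifts upward, "always up" alone can reach a hit rate noticeably above 50%, which is why a direction classifier should be judged against it and not against a coin flip.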



I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

ronin


Total Posts: 647
Joined: May 2006
 
Posted: 2021-01-15 16:26
> Should i add gaussian noise to individual features?

You should add noise to *everything*. Literally, everything.

> Also , will optimizing bet size add value?

No. It will just give you extra rope to overtrain.

Look, the best you can say at this point is that your data mining may have flagged something.

Next steps, in this order:
1. Has it *really* found something?
2. What has it found?
3. Now that I know what it is, is it really interesting?
4. Now that it is interesting, what's the best way to turn it into a strategy?

Probability of passing each stage is maybe 1-2%.

"There is a SIX am?" -- Arthur

doomanx


Total Posts: 103
Joined: Jul 2018
 
Posted: 2021-01-15 16:40
I think the advice given in this thread is very solid and don't have much more to add before you have taken these steps, but nonetheless:

'will optimizing bet size add value' - if you have some signal, then yes. Otherwise, if anything, it will make things worse.

'Its daily rebal so no microstructure worries' - how many assets are you trading and how liquid are they? This might be a poor assumption.

When you're trying to figure out what you've actually found, do check your factor exposure. Not necessarily a problem if you have a good implementation of some factor-based strategy, but you need to know.
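
One simple way to check exposure to a single factor is an ordinary least squares fit of strategy returns on factor returns. The sketch below assumes one factor (for instance a market return series) and uses synthetic numbers; a real check would typically use several factors at once.

```python
import statistics

def single_factor_exposure(strategy_returns, factor_returns):
    """OLS fit of strategy = alpha + beta * factor. Returns (alpha, beta)."""
    mean_s = statistics.fmean(strategy_returns)
    mean_f = statistics.fmean(factor_returns)
    cov = sum((s - mean_s) * (f - mean_f)
              for s, f in zip(strategy_returns, factor_returns))
    var = sum((f - mean_f) ** 2 for f in factor_returns)
    beta = cov / var
    alpha = mean_s - beta * mean_f
    return alpha, beta

# Synthetic example: a strategy that is half market exposure plus a small constant
market = [0.010, -0.020, 0.015, 0.000, -0.005, 0.020]
strategy = [0.001 + 0.5 * m for m in market]
alpha, beta = single_factor_exposure(strategy, market)
print(f"alpha={alpha:.4f} per day, beta={beta:.2f}")
```

A beta far from zero means a chunk of the PnL is just the factor. What is left in alpha is the part the model would have to justify.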

EDIT: you mention that you're using data up to time t to predict time t+1. Which timepoints within the day are you using to predict time t+1? You may be capturing some end of day effect that is difficult to implement.

did you use VWAP or triple-reinforced GAN execution?