
What did you get for IBM? 




 


What period did you fit the model to? What was the holdout period? 





Here is a course on ML and RL:
https://www.coursera.org/specializations/machinelearningreinforcementfinance
Arrogant traders welcome :) 




I'm not a trader, and I'm not particularly arrogant.
In what way does it answer my question? Are you saying that RL doesn't need holdout data? That's not true. 





It’s not related to your question. Calibration was done on yearly periods separately for each year. All details and numbers are in the paper. Let me know if it looks meaningful or not. 




Can you give the link again? 





https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3174498 




Some questions:
"To show detailed results, we use the DJI index instead of the S&P500 index that is more commonly used as a market portfolio." Why? Any comment on the possible problems due to how DJI is constructed?
"While the results are only weakly dependent on the value of regularization parameter in the range λ ∼ 10^-3 – 10^-2, we report the results for the value of λ = 10^-2." Why this range? Why did you settle on 1e-2?
"The first set of predictors includes two predictors for each stock: a perfect signal and a random signal." Nobody trades on a random signal... why don't you use a set of a perfect signal + a very simple momentum signal?
Did you do any out-of-sample tests? It's not mentioned anywhere in the paper as far as I can tell.
My comment: you have some interesting ideas, but you didn't test them rigorously. It will look more convincing if you choose one concrete model (the paper seems to contain several versions) and test the shit out of it. 
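To make concrete what I mean by an out-of-sample test: below is a minimal sketch, not the paper's model. It assumes a plain ridge regression as a stand-in, synthetic data instead of real returns, and hypothetical helper names (`ridge_fit`, `walk_forward`). It calibrates on one year, scores on the next year only, and repeats this for both ends of the λ range.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: solve (X'X + lam*I) w = X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def walk_forward(X, y, years, lam):
    """Fit on each year, score on the following year.

    Returns {test_year: out-of-sample mean squared error}.
    """
    scores = {}
    unique_years = np.unique(years)  # sorted ascending
    for train_year, test_year in zip(unique_years[:-1], unique_years[1:]):
        train, test = years == train_year, years == test_year
        w = ridge_fit(X[train], y[train], lam)
        scores[int(test_year)] = float(np.mean((X[test] @ w - y[test]) ** 2))
    return scores

# Synthetic stand-in data (an assumption, not market data):
# 5 "years" of 250 observations, 3 predictors, one of which is pure noise.
rng = np.random.default_rng(0)
years = np.repeat(np.arange(2010, 2015), 250)
X = rng.normal(size=(years.size, 3))
y = X @ np.array([0.5, -0.2, 0.0]) + 0.1 * rng.normal(size=years.size)

# Compare the two ends of the regularization range on held-out years only.
for lam in (1e-3, 1e-2):
    print(lam, walk_forward(X, y, years, lam))
```

If the held-out errors barely move between the two λ values, that would support the "weakly dependent" claim on data the fit never saw.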





That’s correct, we did not show out-of-sample results in the paper. They behave well in general. The paper was too long without detailed analysis of data.
This is planned indeed for the next paper.
Another point is that it is not about the quality of our signals: we picked the dumbest ones to illustrate the structure of the model with a nonlinear mean reversion, which I believe is a better model for stock prices. The paper is not at all about the choice of signals, and it states so in the third or fourth paragraph of the introduction. 




Maybe I'm thick, but for me it's not clear what the paper is about. I think it's too long. 





You see, you cannot be pleased. Don’t read the whole paper, start with eq.(107). The paper is about what is stated in the abstract. 




Abstract:
1. What is the agent's goal? The abstract doesn't clarify this. 2. The model "can be implemented" using TensorFlow: did you implement it in TF? If yes, write "is implemented", it will be clearer and stronger.
If you have to tell an interested reader to skip ~100 equations before getting to the core matter, then the paper has ~100 equations too many :) or at least you should move a lot of stuff to the appendix.
BTW, "which makes an intuitive sense" should be "which makes intuitive sense". 





The goal of the agent is explained in the main text. There is no rule prescribing that everything should be explained in the abstract. Thanks for correcting my grammar though.





Well, it would help to explain major points in the abstract though :)
"the objective of a bounded-rational agent can be viewed as the problem of rebalancing its own fictitious "shadow" portfolio, such that it is kept as close as possible to the market portfolio in such continuous self-play."
Is that it? 





What do you mean, "is that it?" This is a mathematical model that leads to some observable consequences. I do not understand your question, sorry. 




Is that the goal of the agent? 





This is one way to phrase it. The goal is to maximize the objective function that is defined in the paper. 




Aw come on, man. You should be able to express it clearly. E.g. the goal of the AlphaZero agent is to win the game. The goal of a trading agent should be to make a ton of money. If you want to model reality, set a realistic goal. 





It is a reality: it produces observable consequences. An alternative and simpler (less mysterious) meaning of this agent will be given in the next paper (in progress), along with some improvements of the resulting model :)
Also, if you are familiar with biology models based on the free energy, you should know that exact separation of a 'self' from 'non-self' is not trivial in these models. The same thing here. Any model is judged by its consequences. An insistence on 'purity' is quite subjective. 




Right... so you're modelling an agent coupled to an external environment? 



