Model-free option pricing by Reinforcement Learning

 Page 3 of 4
 katastrofa Total Posts: 458 Joined: Jul 2008
 Posted: 2018-05-12 23:50 Most DeepMind papers are sent to conferences, where they are peer-reviewed. You can even read the reviews: https://openreview.net/group?id=ICLR.cc/2018/Conference etc.
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-12 23:50 What do you mean to say by that?
 katastrofa Total Posts: 458 Joined: Jul 2008
 Posted: 2018-05-12 23:53 "As you saw, words "model-free" have some marketing appeal :)" Is that why you called your approach "model-free"? ;-) "My current view is that there is no strict separation between 'model-free' and 'model-based' approaches." I agree, to some extent. Implicitly, there's always a model of reality. There's no learning without bias. Maybe we should talk about "(explicit model)-free" methods. "Whenever you add a regularization and its role is important, you admix a 'model-based' component to a 'model-free' component." Ditto with bootstrapping ;-) "It's been a while since I looked into Google's Atari paper, so do not recall details of what exactly they did with regularization. But in general, my sense is that deep learning is not a good approach for trading applications, so I largely stopped paying much attention lately to this. I think RL is more promising." Why the opposition of RL vs DL? The synthesis is "statistical learning".
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-13 00:43 Yeah man, now you are getting it! Smart marketing is important in science too :) Do you know the history of the deep learning revolution? Re RL vs DL: RL is a paradigm, DL is a method. RL focuses on the *main* task - to act/trade optimally. DL focuses on an *auxiliary* task (prediction of returns or something). It is a huge difference in paradigm. In practice, these two things can be blended of course - see Deep RL, model-based RL, etc.
 katastrofa Total Posts: 458 Joined: Jul 2008
 Posted: 2018-05-13 00:51 "What do you mean to say by that?" You seemed to imply that peer review is finished in ML. It isn't.
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-13 01:20 No, that's not what I meant to say. What I meant is that even by the high standards of peer-reviewed journals, statistically most of the published work in the branches of science I am most familiar with (physics, statistics, ML) is either crap or 'quiet pathology' (using the words of Landau). If you think that everything that comes out of Google is genius and smart, you might find it interesting to read a series of blog posts on RL by Ben Recht. One example he mentioned - a recent Google paper claimed that evolutionary algorithms work as well as policy gradients in RL. As Ben points out, it actually means that policy gradients are as inefficient as evolutionary algorithms, because they essentially amount to a pure random search. And so on :) Enjoy: http://www.argmin.net/2018/02/20/reinforce/
 katastrofa Total Posts: 458 Joined: Jul 2008
 Posted: 2018-05-13 02:27 "you might find it interesting to read a series of blogs on RL by Ben Recht" Yeah, I went to his talk once. He is a very eloquent guy, but he set himself up a bit of a strawman - he showed that for a simple problem, a model-based approach kicks the shit out of a model-free one. I don't think anyone at Google (or Facebook, or Microsoft) disagrees with that. And you can take the model-based approach much further, to the level of a nuclear plant or further (but let's not mention Chernobyl, OK?). But the problem is that the problems which e.g. DeepMind is interested in (building a real AI, basically) have no models. If you can come up with a solvable model of intelligence, pack your suitcase and hop on a plane to Stockholm ;-)
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-13 02:42 Yes, but I thought here we are not as ambitious as Google, and not really after models of general intelligence? Ben's blog talks about basic RL algorithms, not "AGI". I was very enthusiastic about DL initially, but the more I understood about it, the less I actually liked it. I am very skeptical about learning in a completely 'model-free' way. At least in finance, I think a good way is first to take the simplest semi-solvable case, and solve it from start to end using only RL, but in a completely controllable setting, where the solution just reduces to a bunch of linear regressions. This is what I just did in my papers.
 katastrofa Total Posts: 458 Joined: Jul 2008
 Posted: 2018-05-13 02:45 I think basic RL was always meant to be a toy model for doing the big thing.
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-13 03:18 You mean the objective of RL in general? RL is definitely on a path from Supervised Learning to AGI, but it is a tiny step. I don't think we should confuse the current state of the art in RL with AGI - they have very little in common IMHO.
 katastrofa Total Posts: 458 Joined: Jul 2008
 Posted: 2018-05-13 03:27 You're saying that as if you knew where the path ends ;-)
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-13 03:38 Well, the precious little I understand about AGI tells me it is quite different from modern RL. I had some fun playing an arrogant ass among arrogant traders, but I did not think about playing a prophet yet :)
 NeroTulip Total Posts: 1017 Joined: May 2004
 Posted: 2018-05-15 04:30 Hi Nudnik, Interesting approach, I thought of doing something along these lines but never got to it. Good to see someone putting in the effort! One note: if you are using resampling, you are destroying serial correlations and therefore are not completely model-free. Still, this is close, and the best one can do. The interesting part about using artificial data is that you know what the "correct" answer is: your agent should, over time, rediscover Black-Scholes. The question is how much data you need for that: 1 year, 10 years, 100 years? I also suspect that if you use a stochastic vol model to generate your artificial data, learning will be slower than in the GBM case, as your agent will need to observe rare events to rediscover the vol smile. How much slower the learning is, is an interesting empirical question. The real world process is more complex than the GBM or stochastic vol cases (and unknown to us), so learning would be even slower. I suspect that the answer would be that you would need orders of magnitude more data than is available for your agent to be able to learn in the real world, but it would be great to have some numbers to back that up. "Earth: some bacteria and basic life forms, no sign of intelligent life" (Message from a type III civilization probe sent to the solar system circa 2016)
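 The "how much data" question above can be made concrete in the easiest possible setting. The sketch below is not the RL agent from the paper: it only compares a plain Monte Carlo estimate of a call price on simulated GBM terminal prices against the Black-Scholes closed form, for a few sample sizes. All parameter values (spot 100, strike 100, zero rate, 20% vol, one year) and the function names are assumptions chosen for illustration.

```python
import math
import random

def bs_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call on a non-dividend-paying asset."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2)

def mc_call(s0, k, r, sigma, t, n_paths, seed=0):
    """Monte Carlo call price from simulated risk-neutral GBM terminal prices."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp((r - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
        total += max(s_t - k, 0.0)
    return math.exp(-r * t) * total / n_paths

# Illustrative (assumed) parameters: at-the-money call, zero rate, 20% vol, 1 year.
exact = bs_call(100.0, 100.0, 0.0, 0.2, 1.0)
errors = {n: abs(mc_call(100.0, 100.0, 0.0, 0.2, 1.0, n) - exact)
          for n in (100, 10_000, 200_000)}
for n, err in errors.items():
    print(f"{n:>7} paths: |MC - BS| = {err:.4f}")
```

 This is a lower bound on the difficulty: the simulator already knows the dynamics and only has to average payoffs, whereas an agent learning from data also has to discover the hedge. The sampling error here shrinks roughly like one over the square root of the number of paths.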
 Strange Total Posts: 1501 Joined: Jun 2004
 Posted: 2018-05-15 06:42 > I also suspect that if you use a stochastic vol model to generate your artificial data, learning will be slower than in the GBM case, as your agent will need to observe rare events to rediscover the vol smile. The agent will also have to know about the upcoming events - NFPs, FOMCs etc. Unless, of course, you back these things out of the current market prices. I don't interest myself in 'why?'. I think more often in terms of 'when?'...sometimes 'where?'. And always 'how much?'
 NeroTulip Total Posts: 1017 Joined: May 2004
 Posted: 2018-05-15 07:31 That would be true with real data, but if you are playing with artificial data generated by a stochastic vol model, these things do not exist. All the agent needs to figure out is the generating process and its parameters, e.g. vol, vol of vol, mean reversion, etc. Hope what I am saying is clear. "Earth: some bacteria and basic life forms, no sign of intelligent life" (Message from a type III civilization probe sent to the solar system circa 2016)
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-15 08:06 Hi NeroTulip, thanks, these are all questions that I wanted to explore more when I have a bit more time, maybe in the summer. Yes, it will learn BS itself, I mentioned it in the paper. So far I did very few experiments with training on noisy hedges (randomly perturbing the optimal hedges by +/- 50%, and then using them for training). It showed only a little slowdown in learning in comparison to learning with optimal hedges. This gives some optimism about how it will perform with real data.
 katastrofa Total Posts: 458 Joined: Jul 2008
 Posted: 2018-05-15 09:27 "One note: if you are using resampling, you are destroying serial correlations and therefore are not completely model-free." My point exactly! What Nudnik could do is to phone up a few hedge funds and ask them for some very old trade histories, train his agent on them and see if he can recover the pricing model used by the hedge fund.
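 The resampling point quoted above is easy to check numerically. The sketch below builds a serially correlated series (an AR(1) process with coefficient 0.5, an assumed stand-in for returns with memory), draws an i.i.d. bootstrap resample from it, and compares the lag-1 autocorrelation before and after. The function names and parameters are illustrative, not taken from the paper.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def ar1_series(n, phi, seed=0):
    """AR(1) series x_t = phi * x_{t-1} + eps_t with standard normal noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

returns = ar1_series(20_000, phi=0.5)
# i.i.d. bootstrap: sample with replacement, ignoring time order
resampled = random.Random(1).choices(returns, k=len(returns))

print("original  lag-1 autocorr:", round(lag1_autocorr(returns), 3))
print("resampled lag-1 autocorr:", round(lag1_autocorr(resampled), 3))
```

 The marginal distribution of the resampled series matches the original, but its time ordering is random, so the resample behaves like independent draws. Training on such data implicitly assumes i.i.d. returns, which is the modelling assumption being pointed out.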
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-15 14:53 Katastrofa, I did not disagree with you about resampling. And on hedge funds - yes, that was the plan, but most of my contacts in industry work in equities not options, so I did not find any real data for this. If anyone has option data that you could share, please PM me.
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-15 15:28 Gents, here is another question for you: In the third paper I mentioned on this forum, I tried to build a multi-period version of Black-Litterman using methods of RL. One of the proposed outcomes of this model is that the effective asset price dynamics is given by a Geometric Mean Reverting (GMR) process with signals: [equation not preserved] where the mean reversion $\kappa$ is proportional to a linear impact model parameter $\mu$, and the time-dependent mean level $\theta_t$ is a linear function of signals. The conventional log-normal dynamics is recovered from the above in the limit $\kappa \rightarrow 0$, but this is the wrong limit to take, because it describes different "physics". The GMR process has been used for commodities and real options. I was wondering if such a process has been used by anyone in the equities space, and also if in general such a process might make sense for equities. Any thoughts?
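 The equation for the GMR process did not survive in the post above, so the sketch below does not reproduce it. It assumes one form of geometric mean reversion found in the real-options literature, $dX_t = \kappa X_t (\theta - X_t)\,dt + \sigma X_t\,dW_t$, with a constant mean level $\theta$ instead of the signal-dependent $\theta_t$ described in the post. The function name, the log-Euler discretisation and all parameter values are assumptions for illustration only.

```python
import math
import random

def simulate_gmr(x0, kappa, theta, sigma, dt, n_steps, seed=0):
    """Simulate dX = kappa*X*(theta - X) dt + sigma*X dW (assumed GMR form).

    The Euler step is taken on log X, which keeps the path strictly positive:
    d ln X = (kappa*(theta - X) - sigma^2/2) dt + sigma dW.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x * math.exp((kappa * (theta - x) - 0.5 * sigma ** 2) * dt + sigma * dw)
        path.append(x)
    return path

# Assumed parameters: price normalised so that theta = 1, daily steps, 20 years.
n_steps = 252 * 20
path = simulate_gmr(x0=1.5, kappa=2.0, theta=1.0, sigma=0.2, dt=1.0 / 252, n_steps=n_steps)
# kappa = 0 switches the mean-reverting drift off, leaving a driftless log-normal path.
gbm_like = simulate_gmr(x0=1.5, kappa=0.0, theta=1.0, sigma=0.2, dt=1.0 / 252, n_steps=n_steps)

second_half = path[n_steps // 2:]
print("long-run average of GMR path:", round(sum(second_half) / len(second_half), 3))
```

 With $\kappa > 0$ the simulated price is pulled back towards the neighbourhood of $\theta$; with $\kappa = 0$ the same code produces a driftless geometric Brownian motion, which matches the $\kappa \rightarrow 0$ limit mentioned in the post.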
 ronin Total Posts: 401 Joined: May 2006
 Posted: 2018-05-15 17:34 @nudnik, I guess I don't understand why you need hedge positions to fit to. Conceptually you can reconstruct the optimal delta hedge just from the underlying paths. How do the hedge positions help? Your mean reverting process doesn't sound like anything from equities. This sort of thing is used to model volatile forward curves, and equities don't have volatile forward curves. "There is a SIX am?" -- Arthur
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-15 18:44 Ronin - it is called Q-learning: learning from *actions*, i.e. hedges, when applied to trading... My question on the GMR process for equities is whether it *outright contradicts* any known facts about equities.
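 To make "learning from actions" concrete, here is the generic textbook tabular Q-learning update. It is only an illustration of why recorded actions are an input: each update consumes a (state, action, reward, next state) tuple. It is not the method of the paper, which according to the thread reduces to a set of linear regressions. The state labels and the discretised hedge ratios are hypothetical.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step on a recorded transition.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

q = defaultdict(float)            # unseen (state, action) pairs start at 0
actions = [0.0, 0.5, 1.0]         # hypothetical discretised hedge ratios
q[("s1", 0.5)] = 2.0              # pretend this value was learned earlier

# One recorded transition: in state "s0" the hedge 1.0 was held, reward 1.0 was
# observed, and the system moved to state "s1".
new_value = q_update(q, "s0", 1.0, reward=1.0, next_state="s1",
                     actions=actions, alpha=0.5, gamma=0.9)
print(new_value)
```

 Because the target uses the maximum over next actions and not the action actually taken next, the update is off-policy: the recorded hedges do not have to be optimal for the learned values to improve, which is the setting of the noisy-hedge experiments mentioned earlier in the thread.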
 katastrofa Total Posts: 458 Joined: Jul 2008
 Posted: 2018-05-16 00:23 Download a price series for IBM and find out ;-)
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-16 00:48 You probably mean that any auto-correlations implied by such dynamics would be exploited to non-existence? If yes, I am not sure it is a complete answer. Actually, I did it for the particular case of the IBM stock. Can you be more specific?
 ronin Total Posts: 401 Joined: May 2006
 Posted: 2018-05-16 09:46 > it is called Q-learning: learning from *actions*, i.e. hedges, when applied to trading... The fact that it has a name doesn't really answer the question. Why do you need hedge positions? Does it converge faster when you are learning using hedge positions? How much faster? Do hedge positions introduce a bias? How much bias? How much error in hedge positions can you tolerate before you introduce a bias? > My question on the GMR process for equities is whether it *outright contradicts* any known facts about equities. Yes, it does. In the large price limit it compounds linearly, not exponentially. Which would imply something pretty strange about the cost of funding when stock prices are high. Can't even be bothered to think it through. "There is a SIX am?" -- Arthur
 Nudnik Shpilkes Total Posts: 47 Joined: Jan 2009
 Posted: 2018-05-16 14:14 So don't, then; save your brain cells. Any other opinions?