 quantie
|
|
Total Posts: 913 |
Joined: Jun 2004 |
|
|
https://www.risk.net/awards/7926411/quant-of-the-year-hans-buehler
Has anyone seen any published research on what this actually is?
Peter Carr did some work on using regression models for pricing; is this similar to that? |
|
|
|
 |
|
I take it that you are referring to "Using Machine Learning to Predict Realized Variance" (2020). The setup of the paper is that the canonical approach to building a volatility index is to weight options in the portfolio according to a variance swap pricing formula. Carr instead uses regression models of option volatility (as opposed to pricing formulas) to determine the portfolio weights, i.e. he hedges by weighting according to ML models.
Deep hedging goes a step further by bypassing the "weighting" step and having the model itself determine what the portfolio should look like, i.e. the NN doesn't print out option volatility numbers; it tells you the trades to place to update your book.
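To make that concrete, here is a bare-bones sketch of the idea as I understand it (my own toy construction, not Buehler's actual architecture or risk measure): a small policy net maps the current state to a hedge position and is trained directly on simulated hedging P&L.

import torch, torch.nn as nn

torch.manual_seed(0)
n_paths, n_steps, dt = 5_000, 30, 1 / 365
S0, sigma, K = 100.0, 0.2, 100.0

# simulate GBM paths (zero rates) as the training environment
z = torch.randn(n_paths, n_steps)
logret = -0.5 * sigma ** 2 * dt + sigma * dt ** 0.5 * z
S = torch.cat([torch.full((n_paths, 1), S0),
               S0 * torch.exp(torch.cumsum(logret, dim=1))], dim=1)

policy = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for epoch in range(50):
    pnl = torch.zeros(n_paths)
    for t in range(n_steps):
        state = torch.stack([torch.log(S[:, t] / K),
                             torch.full((n_paths,), (n_steps - t) * dt)], dim=1)
        delta = policy(state).squeeze(-1)            # the net outputs the position, not a vol
        pnl = pnl + delta * (S[:, t + 1] - S[:, t])  # self-financing hedge P&L
    payoff = torch.clamp(S[:, -1] - K, min=0.0)      # short call being hedged
    loss = ((pnl - payoff) ** 2).mean()              # quadratic error here; the paper uses a convex risk measure
    opt.zero_grad(); loss.backward(); opt.step()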
Key take-away from the article: Buehler's deep hedging uses variational autoencoders to create training data. This is not his idea, and it has been done for other ML applications. IMO tuning these VAEs is the secret sauce. For example, a VAE trained with an $$L_2$$ distortion measure generates blurrier images than one trained with other distortion measures. There is probably an analog when using VAEs to generate "fake" pricing data.
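For anyone who hasn't met a VAE, a minimal sketch of the mechanics (generic textbook VAE, nothing specific to JPM's setup); the MSE term in the loss is exactly the $$L_2$$ distortion choice I mean:

import torch, torch.nn as nn

class VAE(nn.Module):
    def __init__(self, dim=30, latent=4):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 64), nn.ReLU())
        self.mu, self.logvar = nn.Linear(64, latent), nn.Linear(64, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterisation trick
        return self.dec(z), mu, logvar

def elbo_loss(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum(dim=1).mean()                  # L2 distortion term
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    return recon + kl

x = torch.randn(2048, 30) * 0.01       # stand-in for historical 30-day return paths
model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    x_hat, mu, logvar = model(x)
    loss = elbo_loss(x, x_hat, mu, logvar)
    opt.zero_grad(); loss.backward(); opt.step()

fake_paths = model.dec(torch.randn(1000, 4))   # the "fake" training data: decode draws from the prior

Swap the reconstruction term for something else and the generated samples look visibly different, which is exactly the tuning question above.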
Curious to know if anyone has better insight to this, especially the choice of training a vanilla LSTM on the VAE data, as opposed to something more modern like a transformer, or incorporating the VAE into some kind of GAN/adversarial DL setup. |
Silence Dogood No. 4 |
|
 |
 ronin
|
|
Total Posts: 708 |
Joined: May 2006 |
|
|
The quant-of-the-year thingy seems to focus on pretty esoteric stuff these days. Last year they loved rough volatility, this year they love deep hedging.
In both cases, I am not sure what actual problem is being solved. It's just intellectual masturbation.
I mean, what option trader goes "I am losing two million a year because my volatility is not fractal enough"? Or, "I am losing three million a year because I have to look at all these sensitivities and decide how to hedge them?"
And, by the way, I am serious. If there is an option trader who ever said any of that, I take it all back. |
"There is a SIX am?" -- Arthur |
|
|
 |
 rftx713
|
|
Total Posts: 144 |
Joined: May 2016 |
|
|
ronin - for the uninitiated and simply curious (my focus has been on physical commodities), what questions would you expect an options trader in that position to be asking? |
|
|
 |
|
@ronin
> what option trader goes "I am losing three million a year because I have to look at all these sensitivities and hedge them?"
You've never hedged a portfolio? How do you manage risk? |
Silence Dogood No. 4 |
|
|
 |
 ronin
|
|
Total Posts: 708 |
Joined: May 2006 |
|
|
> what questions would you expect an options trader in that position to be asking?
Well, that is the thing - option trading is a bit on the mature side these days, and the questions tend to be a lot more engineering-ish. This stuff, meanwhile, just screams 'underemployed quants looking for relevance'.
> You've never hedged a portfolio? How do you manage risk?
Oh, I am really glad you asked! Why, I look at exposures, and then I decide how to hedge them. Is there another way?
No, in all seriousness. This stuff is, in the grand scheme of things, comparable to self-driving cars, or autopilot on planes. Yes, it can keep it going in a straight line, on a motorway, once you get it there and provide it with the vectors, if nothing wildly out of the ordinary happens along the way. No, it can't (and shouldn't) do anything more than that. And it definitely shouldn't do anything critical.
But then, what's left? A bot that rebalances to keep deltas, gammas and vegas within limits? We've had those for over a decade. I wrote one of them more than a decade ago, for my own use. It was a few dozen lines of code.
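For flavour, the whole thing is roughly this (limits, greeks and instruments all made up; the real one was not much longer):

# toy greek-limit rebalancer; every number here is invented
LIMITS = {"delta": 1_000, "gamma": 50, "vega": 5_000}

def rebalance(book_greeks, hedge_instruments):
    """book_greeks: net greeks of the book; hedge_instruments: list of (name, greeks per unit)."""
    orders = []
    for greek, limit in LIMITS.items():
        exposure = book_greeks.get(greek, 0.0)
        if abs(exposure) <= limit:
            continue
        for name, per_unit in hedge_instruments:   # first instrument with exposure to this greek
            if per_unit.get(greek):
                orders.append((name, round(-exposure / per_unit[greek])))
                break
    return orders

print(rebalance({"delta": 4_200, "gamma": 12, "vega": 9_000},
                [("index future", {"delta": 50}), ("ATM straddle", {"gamma": 2, "vega": 120})]))

(It ignores the greeks the hedges themselves add back, but you get the idea.)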
"Ah, but now we do it without telling you what the numbers are! No, better than that - we do it without even knowing what the numbers are! Nah, it's ok - we trained it. On data. That we made up."
Yeah. Great. Thanks for that. Brilliant stuff. |
"There is a SIX am?" -- Arthur |
|
 |
|
I don't know very much about ML, but I am curious how these "variational autoencoders" work and how the vegas/deltas can be estimated without a pricing model (or whether it is all left to the machine, black-box style).
Is there any explanation with an intermediate level of detail (somewhere between "academic paper" and "newspaper article")? ;) |
|
|
|
 |
 chiral3
|
Founding Member |
Total Posts: 5233 |
Joined: Mar 2004 |
|
|
When the earlier paper came out, which I think was a couple of years ago, maybe more, the guy running my hedge book sent it to me. At this point I am biased. I have been doing this shit for too long. The teams aren't that big, or that expensive, ... and the incumbent code base is long, and plugged in everywhere, ... etc. But my takeaway after scanning the paper was "how do you prevent GIGO trading?" Philosophically, the existing frameworks are anchored. We can argue about vol and sentiment and other tweaks and tactical shifts to express a view, but it's almost axiomatic. It's also doing the same thing, just differently. How many times has this happened? A paper that does the same thing, just differently.
Two things about JPM: first, they like these types and they have a history of PR on these fronts. Second, this is in the wake of what happened, for example, in equities at GS - the trend towards a bunch of computers and one dude making sure they're all plugged in. In a large org, when you are running a large group, with a big budget, and you have to present your three-year plan, you had better have some element that hits the buzzwords du jour - "automation", "robotics", "AI", "machine learning".
So I don't know how to disambiguate these two things. I do find it uncanny that these awards always trail the state-of-the-art for a given zeitgeist and never lead. |
Русский военный корабль, иди на хуй! |
|
 |
 ronin
|
|
Total Posts: 708 |
Joined: May 2006 |
|
|
> I am curious how these "variational autoencoders" work
It's a neural network dimensionality reduction - basically nonlinear PCA. Data passes through a low-dimensional layer, and it is then reconstructed back. Depending on how many layers you put in, you get the nonlinearity fitted to a better or worse extent.
So in this context, the factors would be say spot, ATM straddle, 25d risk reversal and 25d fly. As spot moves around, the options gain and lose delta/gamma/vega, and your nonlinear fit is supposed to have learned that.
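(For anyone who doesn't live in vol-surface land: the straddle/RR/fly triplet is just a reparametrisation of three smile points. Under the usual simple convention,
$$\sigma_{25\Delta C} = \sigma_{ATM} + BF_{25} + \tfrac{1}{2}RR_{25}, \qquad \sigma_{25\Delta P} = \sigma_{ATM} + BF_{25} - \tfrac{1}{2}RR_{25},$$
so the four factors pin down spot plus the level, skew and curvature of the smile.)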
The problem is this is massively non-stationary. Due to theta, due to vol surface dynamics, and just generally due to all kinds of noise. There is no way you can learn the entire dynamics based on historical data. So you generate training data using Black-Scholes, and you are hoping you are teaching it the Black-Scholes formula.
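And the "generate training data using Black-Scholes" step really is as dumb as it sounds - something like this (toy version, parameter ranges picked out of thin air):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 100_000
S   = rng.uniform(50, 150, n)      # spot
K   = rng.uniform(50, 150, n)      # strike
T   = rng.uniform(0.02, 2.0, n)    # time to expiry, years
vol = rng.uniform(0.05, 0.8, n)    # flat BS vol

d1 = (np.log(S / K) + 0.5 * vol ** 2 * T) / (vol * np.sqrt(T))
d2 = d1 - vol * np.sqrt(T)
call  = S * norm.cdf(d1) - K * norm.cdf(d2)   # zero rates, no dividends
delta = norm.cdf(d1)

X = np.column_stack([S / K, T, vol])          # features the net sees
y = np.column_stack([call / K, delta])        # labels it is asked to reproduce

At which point the best the network can ever do is rediscover the formula that generated its own labels.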
To be fair to the JPM guys, they seem to have realised that that was too much, and have since moved on to using this to fit the parameters of a 3-4 parameter SLV model. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3808555
And now they are testing it on cliquets. I can just see the conversation: "You can't manage this sh*t anyway. So that sh*t can't do any worse than any other sh*t." I paraphrase, but that seems to be the general idea. https://www.risk.net/derivatives/equity-derivatives/7921526/jp-morgan-testing-deep-hedging-of-exotics
> I do find it uncanny that these awards always trail the state-of-the-art for a given zeitgeist and never lead
That's probably the least surprising bit. The awards happen when their marketing guys call the risk.net sales guys. That doesn't happen until there is a product to sell. |
"There is a SIX am?" -- Arthur |
|
|
 |
 nikol
|
|
Total Posts: 1483 |
Joined: Jun 2005 |
|
|
@ronin  |
... What is a man
If his chief good and market of his time
Be but to sleep and feed? (c) |
|
 |
 Maggette
|
|
Total Posts: 1350 |
Joined: Jun 2007 |
|
|
I think the broad and general direction is interesting. Not, as ronin already elaborated, because it solves some real world problem. Sure as fuck it does not.
But because it, at least IMHO, highlights a general problem that is somehow finally gaining some traction: constraining DL models, or encoding existing knowledge into them. Making the thing learn Black-Scholes from scratch feels kind of strange.
I mean, some strategies seem to be straightforward. If you have a predictive model, stack your DL model on top of it: the DL model gets the current state and the prediction of your model, and then learns to predict (and correct) the error to the best of its ability. So what can deep RL actually add on top of your Black-Scholes hedges? No?
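In sketch form, that stacking idea is just this (all names and the residual target are invented for illustration):

import numpy as np
from scipy.stats import norm
from sklearn.ensemble import GradientBoostingRegressor

def bs_delta(spot, strike, t, vol):
    d1 = (np.log(spot / strike) + 0.5 * vol ** 2 * t) / (vol * np.sqrt(t))
    return norm.cdf(d1)

def fit_residual_hedger(features, spot, strike, t, vol, realized_best_hedge):
    """Keep the Black-Scholes delta as the baseline; the ML model only learns the correction."""
    baseline = bs_delta(spot, strike, t, vol)
    residual_model = GradientBoostingRegressor().fit(features, realized_best_hedge - baseline)
    return lambda f, s, k, tt, v: bs_delta(s, k, tt, v) + residual_model.predict(f)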
I am quite interested in hybrid models. Not trying to learn F = m*a from data seems to be a smart thing to do. Not really related, but I kind of love almost everything Steve Brunton and his team do. https://www.youtube.com/watch?v=8e3OT2K99Kw |
Ich kam hierher und sah dich und deine Leute lächeln,
und sagte mir: Maggette, scheiss auf den small talk,
lass lieber deine Fäuste sprechen...
|
|
|
 |
 chiral3
|
Founding Member |
Total Posts: 5233 |
Joined: Mar 2004 |
|
|
So same thing, but different, for maybe one institutional client. Instead of teaching vol and pumping learned vol into BS, which I assume we’ve all done forever, and pumping that into trading, it’s pumping learned BS into trading.
Realizing that this is all PR, I still imagine something like The Simpsons episode where Homer gets morbidly obese so he can work from home... there's a guy at JPM and some routine that kicks out a string when moneyness or # contracts is outside some band, and he needs to manually review aspects of the trade and press "Y" or "N". But, like Homer, he realizes that the plastic drinking bird can press the "Y" key for him while he is talking to the girl up in rates. BUT the damage is gradual and slow. After thousands of small, biased trades some risk manager starts sperging about some massive long SPX exposure and the SPX tanks at close.
 |
Русский военный корабль, иди на хуй! |
|
 |
 nikol
|
|
Total Posts: 1483 |
Joined: Jun 2005 |
|
|
@chiral3 > BUT the damage is gradual and slow. After thousands of small, biased trades
Sounds like a long pump filling an algo's memory with patterns which seem regular and stationary, but then dumping it at the end. |
... What is a man
If his chief good and market of his time
Be but to sleep and feed? (c) |
|
|
 |
|
> It's a neural network dimensionality reduction - basically nonlinear PCA. Data passes through a low-dimensional layer, and it is then reconstructed back. Depending on how many layers you put in, you get the nonlinearity fitted to a better or worse extent.
It's not even that good. The original work boils the outputs of the NN down to an n-dimensional Gaussian before reconstruction, arguing (paraphrased) that "it can be shown any distribution passed through a nonlinear function is Gaussian". Worse, the entire game is played using a lower bound. You are describing a traditional autoencoder, which (I recognize the irony here) "can be shown to be non-linear PCA".
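The lower bound in question is the standard ELBO; maximising the right-hand side is all the training ever does:
$$\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z|x)}\!\left[\log p_\theta(x|z)\right] - \mathrm{KL}\!\left(q_\phi(z|x)\,\|\,p(z)\right)$$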
VAEs have had some nice improvements and theoretical connections with GANs (notably the Wasserstein distance) but are piss and a half to train. Poor man's Bayesian estimation for things NNs excel at thanks to feature learning, like NLP and images. There is *one* paper I have found on evaluation: http://proceedings.mlr.press/v80/yao18a/yao18a.pdf which I have never tested (because I, like sensible people, use NUTS).
Don't even get me started on generating data to train models. Thesis: you're into "AI" so your models are data hungry, best make more data with the same statistical properties as your sample to appease the gods. Startups are based on this idea; I think they forgot OLS. |
|
|
 |
 willis
|
|
Total Posts: 29 |
Joined: Feb 2005 |
|
|
>Second, this is in the wake of what happened, for example, in equities at GS
I did not know this was a thing; what happened in equities at GS? Humans replaced by large optimizer? |
|
|
|
 |
|
Not as dramatic as that, but yes.
From the mid-00s onwards, an increasing number of hires from tech/coding backgrounds and a decreasing payout ratio (salaries/bonuses as % of profits), accompanied by grumbling from support staff. |
|
|
 |
|
@ willis's last post Goldman has used Chiral's many-computers-one-person paradigm since the 90s. Emanuel Derman has some blog posts about his time as head of the Quantitative Strategies group at Goldman Sachs; he says that no more than 5 of his 30 quants researched or coded models for Goldman's equity derivatives traders.
@ silverside's last post The fundamental theorem of Nuclear Phynance: for all threads T, as time goes to infinity, Pr(T has a post about changes in the job market since 2007) converges to 1. My theory is that 2008 wrecked first-year analyst recruiting at IBs. After noticing that copula formulas, asset pricing, TARP, etc. were in the news, universities that depended on IB jobs for post-grad employment numbers compensated by creating financial engineering programs (e.g. the MS Financial Engineering at Columbia that Derman now runs), and by convincing the industry that PhD physicists were expensive and unnecessary compared to the new generation of MFEs. |
Silence Dogood No. 4 |
|
|
 |