Forums > Pricing & Modelling > simulation of correlated random variables

 Page 1 of 2 Go to page: [1], 2 Next
 filthy Total Posts: 1257 Joined: Jun 2004
 Posted: 2006-04-20 00:09 given one normally distributed random number, i can generate another with a given correlation. but can this be easily extended so i can generate N series with a given correlation matrix? if only i could remember where i learned the first trick i might be able to generalize it... "Game's the same, just got more fierce"
 IAmEric Phorgy Phynance (Banned) Total Posts: 2961 Joined: Oct 2004
 Posted: 2006-04-20 00:25 If you want to generate N series consisting of M samples each with a given correlation, generate an M x N matrix X of uncorrelated normally distributed random variables. Take your N x N correlation matrix C and perform a Cholesky (or some other) decomposition so that you have C = A^t A. Set Y = X A. Y is an M x N matrix whose columns have the desired correlation. (Pretty sure I got that right.) Good luck
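For concreteness, here is a sketch of the recipe above in Python/NumPy (the language and function name are my choice, not from the thread). Note that np.linalg.cholesky returns the lower-triangular factor L with C = L L^T, so taking A = L^T gives C = A^T A exactly as in the post, and Y = X A has the desired column correlations.

```python
import numpy as np

def correlated_normals(nsamp, corrmat, rng=None):
    """Generate nsamp draws of correlated standard normals.

    corrmat: N x N positive-definite correlation matrix.
    Returns an nsamp x N array whose columns have (in sample,
    approximately) the desired correlation.
    """
    rng = np.random.default_rng() if rng is None else rng
    corrmat = np.asarray(corrmat, dtype=float)
    n = corrmat.shape[0]
    # np.linalg.cholesky returns lower-triangular L with C = L L^T,
    # so C = A^T A with A = L^T; set Y = X A as in the post above.
    A = np.linalg.cholesky(corrmat).T
    X = rng.standard_normal((nsamp, n))  # uncorrelated draws
    return X @ A

# Usage: two series with correlation 0.8
C = np.array([[1.0, 0.8], [0.8, 1.0]])
Y = correlated_normals(100_000, C, rng=np.random.default_rng(0))
print(np.corrcoef(Y, rowvar=False))  # sample correlation close to C
```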
 Patrik Founding Member Total Posts: 1330 Joined: Mar 2004
 Posted: 2006-04-20 00:28 I think this was brought up a couple of days ago. If you search the forum for "cholesky" I'm sure you'll find that thread (and a couple of other threads on the subject), and using google you'll find more than a couple of papers. Good luck. edit: crossed posts with IAE Capital Structure Demolition LLC
 mj Total Posts: 1045 Joined: Jun 2004
 Posted: 2006-04-20 01:42 most people do C = AA^t and set Y = AX
 IAmEric Phorgy Phynance (Banned) Total Posts: 2961 Joined: Oct 2004
 Posted: 2006-04-20 02:17 The built-in Matlab function chol.m factors C according to C = A^t A (it returns the upper-triangular factor). The Matlab code is:
function series = nfactor(nsamp, corrmat)
ncorr = length(corrmat);
R = chol(corrmat);
x = randn(nsamp, ncorr);
series = x*R;
 filthy Total Posts: 1257 Joined: Jun 2004
 Posted: 2006-04-20 17:05 thanks guys. next time i will search harder. "Game's the same, just got more fierce"
 SARS Total Posts: 69 Joined: Feb 2006
 Posted: 2006-04-20 17:45 Just to clarify, and a quick google seems to confirm, presumably Cholesky decomposition works for any elliptical distributions and not just the multivariate Normal (as seems to generally get asked here)....
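A sketch supporting this point, assuming NumPy and with function names of my own choosing: a multivariate Student-t (an elliptical distribution) can be simulated by applying the same Cholesky factor to independent normals and then dividing by an independent chi-square scale, so the Cholesky step carries over unchanged.

```python
import numpy as np

def multivariate_t(nsamp, corrmat, df, rng=None):
    """Draw from a multivariate Student-t with the given correlation
    (scale) matrix: an elliptical distribution built by dividing
    Cholesky-correlated normals by an independent chi-square scale."""
    rng = np.random.default_rng() if rng is None else rng
    A = np.linalg.cholesky(np.asarray(corrmat, dtype=float)).T
    Z = rng.standard_normal((nsamp, A.shape[0])) @ A  # correlated normals
    w = np.sqrt(df / rng.chisquare(df, size=(nsamp, 1)))
    return Z * w  # heavier tails, same elliptical dependence

C = np.array([[1.0, 0.5], [0.5, 1.0]])
T = multivariate_t(200_000, C, df=8, rng=np.random.default_rng(1))
```

For df > 2 the Pearson correlation of the resulting t variates matches the scale correlation, which is why the same decomposition works.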
 aaron Total Posts: 746 Joined: Mar 2006
 Posted: 2006-04-27 00:26 Although this is the common approach, it's easy to see that it makes little sense. Cholesky gives you a series of coefficients C11, C21, C22, C31, C32, C33, ... You have independent Normal variates X1, X2, X3, ... and you want correlated variates Y1, Y2, Y3, ...
Y1 = C11*X1
Y2 = C21*X1 + C22*X2
Y3 = C31*X1 + C32*X2 + C33*X3
and so on. The instability of this process should be obvious. Multiplying and adding long series of numbers exacerbates rounding error. If you get a deviant value in X1, it affects every Y. If your Normal generating routine is even slightly bad, you'll get wrong correlations. When I also mention that the process for computing the C's is subject to exactly the same instabilities, you'll see that squares the problem. On top of all this, the process is computation-intensive and destroyed by small errors in the covariance matrix, and we know covariance matrix estimation is not robust in the first place. Cholesky works on the blackboard for symbol manipulation, but any time someone shows it to you in code, you know they've done the problem wrong. Block diagonal decompositions, with correlations between blocks but not directly between individual members of different blocks, make a lot more sense. These give formulae in which each Y is expressed as a weighted sum of, say, 10 quantities (some of which are themselves sums of 10 quantities, and so forth), rather than one Y expressed as a sum of one quantity and another Y expressed as a sum of N. The whole process is much more robust and easier to code and maintain.
 Henrik Total Posts: 803 Joined: Nov 2004
 Posted: 2006-04-27 00:41 Block diagonal decompositions..... Aaron, do you have any reference to this? Cheers, Henrik Friendly ghost
 IAmEric Phorgy Phynance (Banned) Total Posts: 2961 Joined: Oct 2004
 Posted: 2006-04-27 00:41 True. I'd also add that anyone who wants to do this for large matrices has more problems than just rounding error, because the whole concept is flawed to begin with (I have no faith in large correlation matrices). If you do need to perform a high-dimensional Monte Carlo, it makes sense to decompose the problem hierarchically as you suggested. At the end of the day, you will still need to simulate the smaller blocks, though. Doing the smaller blocks via Cholesky (or similarity transforms and eigenanalysis) should be fine, I would think. Cheers
 sanyasi Total Posts: 52 Joined: Jun 2005
 Posted: 2006-04-27 05:30 Attached File: NALXLW.zip I'm in the process of converting some libraries for use in Excel. One of the functions available is a Cholesky decomposition. See the attached spreadsheet and add-in; perhaps it will be of use to somebody. In addition to the problems already pointed out about using this decomposition to generate correlated random numbers, there is also the issue of the method not being all that suitable for terminal correlation.
 mj Total Posts: 1045 Joined: Jun 2004
 Posted: 2006-04-27 08:37 you can use it for terminal correlation provided you use the terminal correlation matrix
 aaron Total Posts: 746 Joined: Mar 2006
 Posted: 2006-04-27 14:52 I don't have a reference handy. People who publish about this generally assume you have to identify the blocks statistically. That is the hard problem. In finance, we always have a lot of structure to begin with. If you're simulating security price changes, you know you have sectors like equities, interest rates, currencies, softs, hards and energy. You can subdivide these further as appropriate. Once you get to a granular level, typically with six to twenty prices, you can rotate into principal components. It's usually enough to take the first principal component of each of these blocks (although you can take two or more of some if they're important), and put them in a new covariance matrix with 1/6 to 1/20 the number of rows and columns of the original. If that's still too big, you do it again for another level of aggregation. When you're done, to reconstruct a specific equity price you'll have something like:
P = C0*Global_Equity_Factor + C1*US_Equity_Factor + C2*Manufacturing_Industry_Factor + C3*Electrical_Supplies_Factor + C4*Idiosyncratic_Factor
This reconstruction will not match all N*(N+1)/2 covariances of the overall covariance matrix, but you can't estimate those reliably anyway. It will produce simulated prices that are statistically indistinguishable from the historical series used for covariance estimation, in a robust, meaningful and numerically stable way. The correlation between, say, the sixth principal component of electrical supplies stocks and the eighth principal component of South Asian currency inflation rates is safe to ignore; it's not going to be stable anyway.
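A minimal one-level sketch of this scheme in Python/NumPy (the structure and names are illustrative, not aaron's actual code): take the first principal component of each named block as its factor, estimate a small covariance matrix among the factors, simulate those with a small Cholesky, and rebuild each series as loading times factor plus idiosyncratic noise.

```python
import numpy as np

def block_factor_sim(returns, blocks, nsamp, rng=None):
    """One-level sketch of the block scheme: within each named block,
    take the first principal component as the block factor, correlate
    the factors via a small covariance matrix, and rebuild each series
    as loading * factor + idiosyncratic noise.

    returns: T x N matrix of historical returns.
    blocks:  dict mapping block name -> list of column indices.
    """
    rng = np.random.default_rng() if rng is None else rng
    X = returns - returns.mean(axis=0)
    factors, loadings, resid_sd = [], {}, {}
    for name, cols in blocks.items():
        B = X[:, cols]
        # first principal component of the block via SVD
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        f = U[:, 0] * s[0]        # factor scores
        beta = Vt[0]              # loadings of block members on the factor
        factors.append(f)
        loadings[name] = (cols, beta)
        resid_sd[name] = (B - np.outer(f, beta)).std(axis=0)
    F = np.column_stack(factors)
    # small covariance among block factors, simulated with a small Cholesky
    A = np.linalg.cholesky(np.cov(F, rowvar=False)).T
    Fsim = rng.standard_normal((nsamp, F.shape[1])) @ A
    out = np.empty((nsamp, X.shape[1]))
    for k, (name, (cols, beta)) in enumerate(loadings.items()):
        eps = rng.standard_normal((nsamp, len(cols))) * resid_sd[name]
        out[:, cols] = np.outer(Fsim[:, k], beta) + eps
    return out

# Toy usage: 5 series in two named blocks
rng = np.random.default_rng(0)
hist = rng.standard_normal((500, 5))
sim = block_factor_sim(hist, {"equities": [0, 1, 2], "fx": [3, 4]}, 1000, rng=rng)
```

Only the small factor covariance goes through Cholesky; everything else is a short weighted sum, which is the robustness argument made above.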
 SARS Total Posts: 69 Joined: Feb 2006
 Posted: 2006-04-27 17:03 I'm with Eric. Although I'm no expert on data and methods within the banking world, I can't believe that some of the data issues that face insurers are completely unique. Rounding errors will be the least of your problems with a large matrix, so I personally wouldn't get too hung up about them.
 aaron Total Posts: 746 Joined: Mar 2006
 Posted: 2006-04-28 15:34 I would say rounding errors underlie many of the problems with large matrices. High dimensions have a way of exaggerating tiny problems, but rounding error is often the grain of sand needed for the pearl. Also, I have an engineer's sense that there's a right way and a wrong way to do things. When you see code that exaggerates rounding errors, it's usually wrong for other reasons as well. When someone writes:
Y = C0 + C1*X + C2*X^2 + ...
instead of:
Y = C0 + X*(C1 + X*(C2 + ...))
I know the answer will be wrong (and probably the question as well), just as I know that emails filled with misspellings and bad grammar are probably not sound business proposals, or that a device that makes a lot of noise and heat and bad smells is not the right technology for the job.
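The two forms are algebraically identical, but the nested (Horner) form uses n multiply-adds instead of building large intermediate powers, which keeps rounding error down. A toy Python illustration (function names are mine):

```python
def poly_naive(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... by raising x to each power:
    ~2n multiplications and large intermediate values."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def poly_horner(coeffs, x):
    """Evaluate the same polynomial in nested (Horner) form
    c0 + x*(c1 + x*(c2 + ...)): n multiply-adds, less rounding error."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

# 1 + 2x + 3x^2 at x = 2 gives 17 either way
print(poly_naive([1.0, 2.0, 3.0], 2.0))   # -> 17.0
print(poly_horner([1.0, 2.0, 3.0], 2.0))  # -> 17.0
```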
 sfca Total Posts: 904 Joined: May 2004
 Posted: 2006-05-04 17:49 So how do you do these correlated RVs across time? For example, I have 50 equities and there is a payoff in the future based upon how many times they individually hit a barrier. To get a first approximation I want to model 50 correlated series of lognormal equity prices. First, I estimate the correlation of each to the S&P500 and use that correlation at each weekly timestep to link them in a one-factor type model like
RV(equity_i) = sqrt(rho)*sys_e + sqrt(1-rho)*non_sys_e
where sys_e is the systematic risk and non_sys_e is the nonsystematic risk. These RVs feed into the growth rates, and each period's equity price is a function of the previous one through
S(t) = S(t-1)*exp((u - vol^2/2)*dt + vol*sqrt(dt)*RV).
Two issues. One is that while the input correlation is fine for each time period, the output is a non-linear transformation, so the output correlation (at each single time period taken by itself) may be significantly different. What does one do about the output correlation differing from the input correlation due to the transformation? Second, when calculating the correlations of the resulting series across time, the correlations may diverge from the inputs because the series are cumulative. Does one do a huge Cholesky for this with each week a new column, or what?
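A sketch of the scheme described here, in Python/NumPy with function and parameter names of my own: each weekly step draws one systematic shock shared by all equities plus an idiosyncratic shock per equity, and feeds the mix into a lognormal step. Note that no huge week-by-week Cholesky is needed; the cross-time behaviour of the cumulative log-prices follows automatically from summing the independent per-step shocks.

```python
import numpy as np

def one_factor_gbm(nsims, nsteps, s0, mu, vol, rho, dt, rng=None):
    """Simulate lognormal price paths for several equities tied together
    by a single market factor, as in the post above: per step,
    z_i = sqrt(rho_i)*z_mkt + sqrt(1 - rho_i)*eps_i feeds a GBM step.

    s0, mu, vol, rho: length-N arrays (per equity).
    Returns an (nsims, nsteps + 1, N) array of price paths.
    """
    rng = np.random.default_rng() if rng is None else rng
    s0, mu, vol, rho = map(np.atleast_1d, (s0, mu, vol, rho))
    n = s0.size
    S = np.empty((nsims, nsteps + 1, n))
    S[:, 0, :] = s0
    for t in range(nsteps):
        z_mkt = rng.standard_normal((nsims, 1))   # shared systematic shock
        eps = rng.standard_normal((nsims, n))     # idiosyncratic shocks
        z = np.sqrt(rho) * z_mkt + np.sqrt(1.0 - rho) * eps
        S[:, t + 1, :] = S[:, t, :] * np.exp((mu - 0.5 * vol**2) * dt
                                             + vol * np.sqrt(dt) * z)
    return S

# Usage: 2 equities, 52 weekly steps, both with rho = 0.6 to the factor
paths = one_factor_gbm(10_000, 52, [100.0, 100.0], [0.05, 0.05],
                       [0.2, 0.3], [0.6, 0.6], 1 / 52)
```

On the first issue raised above: the input correlation applies exactly to the per-step log-returns; the correlation of the price levels themselves will differ because exp is a non-linear transformation.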
 IAmEric Phorgy Phynance (Banned) Total Posts: 2961 Joined: Oct 2004
 Posted: 2006-05-04 22:18 Hi Aaron, Thanks a lot for your post. This is great stuff. What you say here makes a lot of sense:
"If you report a confidence interval for VaR, or a number with one significant digit, you get a lot of arguments. If you report essentially a random number drawn from the confidence interval, you can make a yes/no decision about whether a desk is over limit. That means a desk close to limit might get ruled under or over depending on the luck of the draw, which isn't a bad outcome since it's impossible to tell whether it's a little under or a little over. There's nothing magic about the limit choice."
If I understand correctly, you are saying that when all is said and done, there is a declaration made regarding whether a particular desk is over or under its VaR limit. It is pretty much binary. What happens when a desk is over its VaR limit? Something must happen. Things probably start out with a warning, but eventually assets are going to be reallocated. The thing that worries me (and it doesn't seem to worry you as much, which is actually reassuring) is the situation where EVERY desk is suddenly SIGNIFICANTLY over its VaR limit. It's not a matter of gradation of some fine numbers. What if absolutely every single desk suddenly comes up in a report as being over its VaR limit? Am I wrong to think that some high-level managers will start to freak out? Is it inconceivable that if (when?) vols snap up and all VaR numbers suddenly quadruple, this would cause some serious reallocation of assets, the very process of which would exacerbate volatilities? For the record, I'm not caught up with precision. On the contrary, the thing that worries me is that a lot of people I talk to have a lot of faith in people like you. It might not be as obvious to them as it is to you that those numbers are really accurate to no more than 1 digit even though 3 or 4 are reported.
They think that since there are so many smart people working in risk management, the risk is probably actually managed pretty well. With over $17 trillion in CDS out there, and no one with a clue how to model the risk, you tell me how confident they should be in risk management. Anyway, kind of the point (and to try to tie it back to the original subject of this thread) is that if you told some non-mathematically inclined manager that you had a group of 10 PhDs constructing a huge correlation matrix, then you and I know that the whole exercise is pretty much pointless and the resulting empirical correlation matrix contains almost no information, but it is easy to imagine how that non-mathematically inclined manager may be impressed and think things are under control since so much firepower has been thrown at the problem. I think a large part of the finance industry does not know that a 1000 x 1000 or 10000 x 10000 correlation matrix is pretty useless. A lot of those people are in control of large sums of money. Those are the people who worry me: the ones who will see spikes in VaR numbers across the board (which I view as an almost certainty) and not know how to respond. Not everyone in a decision-making position is as smart as you, unfortunately. We'll see. Like I said, I have only been working in finance for less than 1.5 years and haven't seen a market crisis yet. When LTCM blew up, I was more interested in hanging out with John Baez and learning about quantum gravity and representations of Lie algebras. I have a lot of history to live before I'll even begin to have a clue (if ever). Cheers, Eric
 tristanreid Total Posts: 1677 Joined: Aug 2005
 Posted: 2006-05-05 00:00 I think the most powerful thing constraining this type of model feedback is the existence of other constraints. Many managers can only turn over so much of their portfolio in a limited time without going out of compliance. I think they generally take those limits a lot more seriously than VaR. If all managers were constrained by VaR, and there were no other constraints besides VaR, there would still be different reactions to spiking volatility. If one manager is underweight some asset class and another is overweight in it, conditions of excess volatility will cause both of them to provide each other liquidity as they revert to the index. I don't think we can make too many assumptions about that kind of negative correlation, and it only applies to managers tied to benchmarks, but I think it's safe to say they won't all act 100% in the same direction. Also, volatility is measured over some time period; there would have to be a pretty big spike to cause a panic over a short period, even if all models used the same resources to measure volatility. I'll bet there are some systems out there that only update market volatilities once every year or two, or even less. Are they wrong? IAE, I know you used to work on VaR systems at a previous job, so I'm not asking that rhetorically: is there a guideline or rule about how often vols get updated? -t. the only reason it would be easier to program in C is that you can't easily express complex problems in C, so you don't. -comp.lang.lisp
 Graeme Total Posts: 1629 Joined: Jun 2004
 Posted: 2006-05-05 09:27 According to the BIS/Basle 1996 VaR rules, one must have strict evaluation procedures with parameter updates at least quarterly, parameter estimation based on a minimum of a year of historical data, and a sufficiently rich set of risk factors; these factors in particular must capture the volatility risk of all positions at all maturities. That may or may not be verbatim; I'm taking it from some old notes of mine. I haven't had the enviable joy of going through much of the Basle II documentation, but given that market VaR is basically unchanged there, I would guess that this rule still stands. Bit of a joke really. Risk can and should update their parameters every day. It then all depends who risk de facto report to: to management, or to some big report vault in the sky. If the former, daily; if the latter, then quarterly. A lot of politics gets attached to risk management. Graeme West
 Graeme Total Posts: 1629 Joined: Jun 2004
 Posted: 2006-05-05 09:33 I'm coming in quite late on this correlated normal random variable stuff but I would venture to suggest that any matrix bigger than you can print and read on a page (with my shitty eyesight) is too big to put any value on individual entries, and you need to go to some kind of factor model, or PCA. Graeme West
 doctorwes Total Posts: 576 Joined: May 2005
 Posted: 2006-05-05 11:04 "We'll see. Like I said, I have only been working in finance for less than 1.5 years and haven't seen a market crisis yet. When LTCM blew up, I was more interested in hanging out with John Baez and learning about quantum gravity and representations of Lie algebras. I have a lot of history to live before I'll even begin to have a clue (if ever)." What's nice about fixed income is that something horrible happens, somewhere, about once every 2-3 years at most, so everyone gets to see a crisis fairly early in their career.
 IAmEric Phorgy Phynance (Banned) Total Posts: 2961 Joined: Oct 2004
 Posted: 2006-05-05 18:59 I love this topic, but there is already a thread dedicated to it, so I suggest moving the discussion back there, where I plan to (attempt to) address some of tristanreid's questions re parameter updates and compliance.
 sfca Total Posts: 904 Joined: May 2004
 Posted: 2006-05-05 19:11 So I guess my question got ignored.