Risk System in R for a small Asset Management company

zee4


Total Posts: 59
Joined: May 2010
 
Posted: 2017-03-18 06:25
Has anybody implemented a Risk System purely in R for identifying, measuring, managing, and reporting risks? Do you think it is feasible to create such a system/platform from scratch? What are the pros and cons of developing it in R? What packages could be used?

70-80% of the traded instruments are ETFs. There are some single name stocks and bonds as well.

Third party products like RiskMetrics/BarraOne, Axioma, etc. are not being considered at this point. And we think that having our own system would give us more flexibility as compared to BBG risk tools.

Any ideas/resources appreciated. Thanks.

Бухарский

Maggette


Total Posts: 943
Joined: Jun 2007
 
Posted: 2017-03-18 20:29
That depends on the scope of your risk system. If doing all the ETL, data cleaning and data management is part of your risk system, R is without a doubt capable... but I would not recommend doing everything in R for a production system. If I had to pick a scripting language with some scientific packages, I would pick Python.

I did implement a reporting pipeline for an energy commodities company in R and Oracle. It worked and was in production for some time. But I still wouldn't recommend creating a complete Risk Engine in R.

Just my humble opinion

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

finanzmaster


Total Posts: 119
Joined: Feb 2011
 
Posted: 2017-03-19 12:31
Agree with Maggette!
R is great for rapid prototyping and research, but if a system needs to be robust and fast (and it likely does), then a pure R solution is suboptimal.

When I need speed, I first develop a model in R and then rewrite the critical parts in C++ or even CUDA.

As to my experience with "industrial level" software (where I was just one of many developers), the calculation kernel was developed in C++ and the front-end in Java.
Probably, one can do everything in Java, like e.g. OpenGamma did.

As to Python, I cannot say anything due to lack of experience.
But anyway, I don't like dynamically typed languages for big projects.

www.yetanotherquant.com - Knowledge rather than Hope: A Book for Retail Investors and Mathematical Finance Students

Maggette


Total Posts: 943
Joined: Jun 2007
 
Posted: 2017-03-19 14:15
On a more constructive note (if you insist on R), I used Shiny to create some reporting dashboards and some technical dashboards for myself (checksums, counts, ranges and moments of the input data vs. their historical development).

Sweave for PDF generation (which was in a horrible state back then).
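
To make the input-data checks mentioned above (counts, ranges and moments compared with their own history) a bit more concrete, here is a minimal sketch. It is written in Python with the standard library only, not the R/Shiny setup described in this post, and the function names, the 3-standard-deviation tolerance and the sample numbers are all assumptions made for illustration.

```python
import math
import statistics


def summarize(values):
    """Basic sanity statistics for one batch of numeric inputs (e.g. daily returns)."""
    clean = [v for v in values if v is not None and not math.isnan(v)]
    return {
        "count": len(clean),
        "missing": len(values) - len(clean),
        "min": min(clean),
        "max": max(clean),
        "mean": statistics.mean(clean),
        "stdev": statistics.stdev(clean),
    }


def flag_anomalies(today, history, tolerance=3.0):
    """Return the statistics of today's batch that deviate from their history.

    `history` is a list of earlier summaries. `tolerance` is a hypothetical
    threshold, measured in standard deviations of the historical values.
    """
    flags = []
    for key in ("count", "mean", "stdev"):
        past = [h[key] for h in history]
        centre = statistics.mean(past)
        spread = statistics.stdev(past)
        if spread == 0:
            # History never varied, so any change at all is suspicious.
            deviates = today[key] != centre
        else:
            deviates = abs(today[key] - centre) > tolerance * spread
        if deviates:
            flags.append(key)
    return flags


# Made-up batches of daily returns for four instruments.
history = [
    summarize([0.010, -0.020, 0.005, 0.000]),
    summarize([0.012, -0.018, 0.004, 0.001]),
    summarize([0.009, -0.021, 0.006, -0.001]),
]
good_batch = summarize([0.011, -0.019, 0.005, 0.000])
bad_batch = summarize([0.010, None, 5.0, 0.000])  # one missing value, one absurd return

good_flags = flag_anomalies(good_batch, history)
bad_flags = flag_anomalies(bad_batch, history)
print(good_flags, bad_flags)
```

In a dashboard, the flagged keys would simply be highlighted next to the plotted history of each statistic; the thresholding rule here is deliberately crude.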

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

HitmanH


Total Posts: 424
Joined: Apr 2005
 
Posted: 2017-03-19 14:19
With Maggette here - you CAN create a simple enough risk / reporting system in R - it's very easy to do - although the term "Risk System" can be taken to different levels.

If you want to automate some analysis, consolidate some reporting, some conditional stuff - then R is more than fine.

If you want something really interactive (what-if analysis, proposing different positions, adding multiple asset classes), then while a lot of the stats work is good in R, you might want to consider other options.

Hansi


Total Posts: 296
Joined: Mar 2010
 
Posted: 2017-03-19 17:08
We built an R-based equity risk model over the last few years, with the bulk done in the first six months followed by constant refinements. We'd previously built screening, backtesting and various analytics in R, so we were quite familiar with it and have built quite a bit on top of the vanilla language to escape from its shortcomings as a language and ecosystem.

With the risk model we expanded to risk reporting, portfolio optimization based on risk factors, attributions etc. This was in order to offer an internal alternative to Barra and BBG PORT with something more bespoke and flexible to our needs.

We then exposed the functionality to other systems via C# based WebAPI services that connect to web application user interfaces but most of the use is directly in R for research, bespoke analytics and reporting. We generally do ETL first in R and then refine in C# once we understand the dataset and cleaning required better.

My view in hindsight.

Pros:
* R allows you to express complex logic quickly and elegantly
* R packages allow you to solve common problems quickly
* You can easily go into C++ via Rcpp for the bottlenecks

Cons:
* R is by default single threaded and splitting things up is doable but creates many problems
* Incorrectly written base R is crazy slow
* R packages can be very low quality
* You need to solve a lot of tooling, interoperability and ecosystem problems yourself if you want things running in production.

Some questions you should ask yourself beforehand:
* How many people will be the main contributors for this and how experienced are they with R?
* How confident are you that the people who will develop this have enough experience to build it themselves (on the finance, math and dev sides)?
* Are those people comfortable with building R packages? Can they write clean testable R code? Can they set up their own package repository? Can they write to a style guide with comments etc such that if they go away someone else can continue the work?
* How confident are you that you can easily onboard another person with similar skills to work on the system in the future?
* How do you currently handle connectivity with other systems and how do you want to handle it in the future?
* Does this need to scale to run everything, everywhere?

Could give a few extra pointers but would depend on:
* Size of team working on this.
* Structure of team (quants only vs devs + quants + BA + IT stuff etc)
* Number of instruments and portfolio groupings (e.g. we built ours for 15K equity line universe, 10 portfolios. Now supports 180K equity lines and 150 portfolios, scaling that in R was hard).

If I were building this thing again today, I'd maybe consider C#, Java or maybe C++, but only with good developers to support the flow around it. The reason we did it in R only was that it had to be built by quants only, in something that could easily be understood, run on the desk, and expanded and changed as it evolved. Going with a mainstream primary language will solve a lot of tooling, scalability and maintainability problems, and then going to R only for the stats and math stuff is an option, but it will most likely take more time to get to a minimum viable product.

As noted, Python is an option because it has slightly better tooling, interoperability and maintainability options, but at the end of the day it's not much better and there might well be other issues on that side.

As with everything when it comes to this stuff the devil is in the details.

ryankennedyio


Total Posts: 12
Joined: Nov 2016
 
Posted: 2017-03-19 23:08
As quite a few have mentioned, Python is probably a good balance. If development time, skill and costs aren't an issue, likely a compiled language will give a better outcome.

A few libraries in Python (e.g. Pandas) give you a large number of nice, R-like dataframe manipulation methods, plus I believe the extensions and "glue" libraries (database connections, tooling, etc.) are either of higher quality or better maintained. Multithreading is not impossible, but you must be careful about the Global Interpreter Lock. FWIW, the Pandas author built it at AQR and now works at Two Sigma.

If you have a small team, Python is probably a safe option. If you come across an unavoidable bottleneck, it's often possible to write that portion in C, Fortran, or CUDA if you need to run calculations on a GPU.

I think a rough rule of thumb is that R > Python for raw 'data science' or stats, but Python > R for a long term, critical piece of production software.

zee4


Total Posts: 59
Joined: May 2010
 
Posted: 2017-03-20 04:49
Thank you all for the replies. Appreciate them.

Some details:
- currently everything is done based on BBG PORT/MARS functionality. Plus Excel for other ad-hoc stuff;
- risk team is 3 people with some experience in R and no expertise in other programming languages; we are not supported by developers or IT; relationship with the company is hopefully long term and don't think there will be new hires in the foreseeable future;
- the number of traded instruments is <100; so there shouldn't be any issues with R in terms of robustness and speed (I guess?).

As you can see, considering C++/Java/C# is not an option and whether it is worth spending time on learning Python is a big dilemma for us. And obviously, having some sort of automated system would be a lot better than nothing. That is why we were leaning towards R.

What we would like to have is (both at individual instruments and portfolio level):
1) portfolio breakdown - positions, weights, sectors, risk contribution/attribution, performance attribution;
2) a comprehensive set of risk measures - st.dev., VaR, ES, betas, drawdown measures, limits, correlation/covariance matrices;
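
For the risk contribution item in 1), one common approach (an assumption here, not something anyone in this thread prescribes) is the Euler decomposition of portfolio volatility, where instrument i contributes w_i * (C w)_i / sigma_p and the contributions add up to the portfolio volatility sigma_p. Below is a minimal sketch in plain Python, since several replies suggest that language; the same arithmetic carries over to R matrix operations directly. The weights and covariance matrix are made-up numbers.

```python
import math


def portfolio_volatility(weights, cov):
    """Portfolio standard deviation, sqrt(w' C w)."""
    n = len(weights)
    variance = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return math.sqrt(variance)


def risk_contributions(weights, cov):
    """Per-instrument contribution to volatility: w_i * (C w)_i / sigma_p."""
    n = len(weights)
    sigma = portfolio_volatility(weights, cov)
    # Marginal risk of each instrument: derivative of sigma_p w.r.t. its weight.
    marginal = [
        sum(cov[i][j] * weights[j] for j in range(n)) / sigma for i in range(n)
    ]
    return [weights[i] * marginal[i] for i in range(n)]


# Hypothetical portfolio: two equity ETFs and one bond fund.
weights = [0.6, 0.3, 0.1]
cov = [
    [0.0400, 0.0180, 0.0020],
    [0.0180, 0.0900, 0.0010],
    [0.0020, 0.0010, 0.0025],
]

sigma_p = portfolio_volatility(weights, cov)
contributions = risk_contributions(weights, cov)
shares = [c / sigma_p for c in contributions]  # fractions of total risk
print(sigma_p, contributions, shares)
```

The `shares` list is what usually ends up in a report: it shows how much of total portfolio risk each position accounts for, which can differ a lot from its weight.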

As soon as we get these, we may try to include other fancy stuff - factor analysis, portfolio optimization, etc. What do you think? I would really appreciate if you could comment on what else could be included here.

Also, what resources would you recommend to get things started? I would appreciate if you could share names of the books and websites that might be of help.

Бухарский

ryankennedyio


Total Posts: 12
Joined: Nov 2016
 
Posted: 2017-03-29 02:53
I would suggest that each member of your risk team spend a full day learning some Python. If it seems enjoyable, follow up with some tutorials on Pandas and NumPy.

You won't have a problem with speed in R (as long as you vectorise calculations correctly, don't write for loops, etc), but maintainability and code-spaghetti might be a higher risk, particularly without "software engineers" available.

Goodrich, Tamassia and Goldwasser have a great book, "Data Structures and Algorithms in Python". It goes into a bit of detail on class structure, OOP and designing larger programs. There might be better resources for non-programmers though.

Personal anecdote (and as someone still reasonably new to R, so take with a grain of salt), I find my projects and code are a lot more maintainable and higher quality in Python than in R. I find it too easy to fall into writing "read-only" code with R, then forget to comment it, and then nobody else can understand what I just wrote (even myself a week later).

I find it quite a bit harder to make that mistake in Python.

It does sound simple, but there is a real compounding effect to what I mention -- a large project with a few dozen moving parts is too much to keep in one's head, so being able to load code into your brain easily is important.

The general consensus I've seen, though, is that R is definitely the better tool for 'academic' or exploratory kinds of work.



Here's a link to a project I'm contributing to that might be useful inspiration for how something similar might be designed with Python. There's handling in there for positions, weights and some reasonable risk measures in the tearsheets as shown on the front page. Given the requirements you stated, you might even be able to use that project as a base (ignoring the strategy/real-time aspect & simply feeding positions into the system daily).
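
As a rough idea of what "reasonable risk measures" computed from a daily return series can look like, here is a minimal standalone sketch. It is not code from the linked project. It uses plain historical simulation for VaR and expected shortfall, which is only one of several conventions, and the function names and sample returns are made up for illustration.

```python
import math
import statistics


def historical_var(returns, level=0.95):
    """Historical-simulation VaR, reported as a positive loss."""
    losses = sorted(-r for r in returns)
    # Empirical quantile: smallest loss with at least `level` of the sample at or below it.
    index = math.ceil(level * len(losses)) - 1
    return losses[index]


def expected_shortfall(returns, level=0.95):
    """Average of the losses at or beyond the VaR threshold."""
    var = historical_var(returns, level)
    tail = [-r for r in returns if -r >= var]
    return statistics.mean(tail)


def max_drawdown(returns):
    """Largest peak-to-trough decline of cumulative wealth, as a positive fraction."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, 1.0 - wealth / peak)
    return worst


# Made-up daily portfolio returns.
daily_returns = [0.01, -0.02, 0.005, -0.03, 0.015, -0.01, 0.02, -0.05, 0.0, 0.01]

volatility = statistics.stdev(daily_returns)
var_90 = historical_var(daily_returns, level=0.9)
es_90 = expected_shortfall(daily_returns, level=0.9)
mdd = max_drawdown(daily_returns)
print(volatility, var_90, es_90, mdd)
```

With fewer than 100 instruments and daily data, plain loops like these are fast enough; the harder part is agreeing on the conventions (quantile definition, lookback window, simple vs. log returns) and documenting them.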

Hansi


Total Posts: 296
Joined: Mar 2010
 
Posted: 2017-03-29 14:23
@ryankennedyio: The issues you mention apply to Python too when starting fresh. It's just a question of experience on either side. I'm sure that if you spend the same amount of time with R as you have with Python, you'll be writing similar code in the end. You start using roxygen, writing packages, breaking things up, unit testing, integration testing, etc. It doesn't happen if you use it as a throwaway analysis tool, but just like with Python, if you stick with it there is ample opportunity to use it in the same way.

@zee4: Given your team experience, team size and instrument universe I recommend just giving it a go in R and possibly Python side by side and just pick the one that feels more intuitive.

rickyvic


Total Posts: 117
Joined: Jul 2013
 
Posted: 2017-03-30 00:50
I have written a lot of stuff in R and used it in production for more mission-critical applications.

It works but I would use commercial stable numerical libraries or some third party component for the heavy lifting.

Parallel code is fine as long as it is simple stuff; I use lists to store objects and lapply to loop over them.

Apart from xts (test it well first) and some basic time and dates infra, I would write my own code since many finance packages are buggy.

Reach out to me if you need help

"amicus Plato sed magis amica Veritas"

finanzmaster


Total Posts: 119
Joined: Feb 2011
 
Posted: 2017-04-14 09:58
A concrete case study:
I use R in my blog so that people can not just read but also immediately try out what I claim.

I call R code from PHP scripts. Since I pass the arguments via the command line, I have to use Rscript... and the code that worked perfectly with R didn't work with Rscript.

(here I explain why and how to solve this problem)

More recently I found that code which generates an axis and works perfectly in interactive R
>labelz = seq(1, length(datesToTake), 242)
>axis(side=1, at=labelz, labels=datesToTake[labelz])
causes a strange error when run with Rscript (and I still haven't managed to get it right).

To sum up, R is great for ad-hoc analysis and rapid prototyping, but if you have to automate a routine task, I would not recommend it.

www.yetanotherquant.com - Knowledge rather than Hope: A Book for Retail Investors and Mathematical Finance Students

bluelou


Total Posts: 68
Joined: Jan 2009
 
Posted: 2017-04-18 02:13
You'll probably want to look at the R packages PortfolioAnalytics, blotter, and PerformanceAnalytics. I think they're all on CRAN, but the most recent stuff will be on GitHub.

I am what I am, and that's all that I am -Popeye

bluelou


Total Posts: 68
Joined: Jan 2009
 
Posted: 2017-04-18 02:15
There are additional packages for factor analytics (e.g. FactorAnalytics) by the same package authors.

I am what I am, and that's all that I am -Popeye