Python vs R
     

berkalp


Total Posts: 7
Joined: Jan 2019
 
Posted: 2019-01-21 15:09
Hello

For a beginner analyst,

Which one would you prefer for time series analysis? Which one is easier to learn? Would financial analysts at banks/funds prefer one over another?

What if the topic is machine learning or neural networks? Is one language clearly better than the other?

I'm about to begin writing my thesis and trying to decide.

Thanks for the help!

Maggette


Total Posts: 1129
Joined: Jun 2007
 
Posted: 2019-01-21 16:40
There is absolutely no doubt that Python is the better general-purpose language. Anybody claiming otherwise has never worked in a real software project. Period. For many, many use cases Python is more performant, easier to read, easier to interface with, and comes with the better general-purpose packages (web scraping, database connections, etc.). Lots and lots of bare-metal stuff is easier to do in Python. You can easily speed things up with Numba (JIT-compiles Python functions to machine code) or Cython (compiles Python to C). Python also has great parallel execution frameworks (for example, Dask).

But that is not what you asked. You asked about specific use cases. Using a different language for a specific purpose is justified if that language has so many great features that the hassle of adding another language/technology to your stack is worth it.

So let's have a look:
For ANNs it is also without a doubt Python: the Python APIs are IMHO the better-documented ones for Keras, PyTorch, TensorFlow and Pyro.

Time Series:
The R forecast package is a wonderful thing. Python's statsmodels is still lacking there. SciPy is decent, but somehow (design, data model) it does not interface that well with the rest of the Python ML world. It's a general problem of most machine learning tribes that they ignore awesome work from signal processing, control theory, statistical time series analysis and similar disciplines that work in the time domain. Here R is the clear winner to me.

In general, R has the more freaky packages from many fields of science (like biology, actuarial science, etc.): you find anything, from extreme value theory, artificial immune systems and differential evolution to survival models.

But the quality of these obscure packages is rather dubious at times. I had horrible experiences there.

I love R. I started with R after Matlab. I love vector-based languages. And I still quite often call R functionality from Python.

I would recommend: go with Python, find some packages you like in R, and learn enough R to call R from Python.

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

day1pnl


Total Posts: 54
Joined: Jun 2017
 
Posted: 2019-01-21 18:30
General purposes -> Python
Statistics -> R

quiet1


Total Posts: 2
Joined: May 2014
 
Posted: 2019-01-22 10:03
Not exactly your questions but I think Maggette covered your main issues well:

R (R markdown in particular) is excellent for stats + charts/report/presentation production.

Python is better all-round for production code, and PyCharm is a great IDE.

Python is also very intuitive. After a little while you find you can more or less guess what the Python syntax for doing something is and it just works. R seems more cobbled together and can be a bit frustrating for people from a programming background (sample of 1 for that generalisation).

FWIW when I did my master's dissertation I used R Markdown for the report and Python with the scikit-learn library for the heavy lifting.

billWalker


Total Posts: 174
Joined: Feb 2005
 
Posted: 2019-01-22 14:05
Worth noting that Python has an equivalent of R markdown / notebooks in Jupyter notebooks. In fact, I think they were there first.

My own $0.02 is that Python used to have an advantage on exploratory time series analysis in that pandas handled dates much more smoothly than zoo/xts. However, handling of multi-dimensional time series data in pandas is not what it used to be now that Panels have been deprecated. MultiIndex doesn't work the way you intuitively think it would (e.g., you can't just do log-differences on data with, say, date and ticker in the index). xarray is promising but not a drop-in replacement for Panels.
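
To make the log-difference point concrete, here is a minimal sketch, not something from the post itself. It assumes a price Series indexed by (date, ticker); the tickers "AAA"/"BBB" and the prices are made up for illustration. A plain diff() runs straight across ticker boundaries, so you need an explicit group-by on the ticker level (or a pivot to wide format) to get per-ticker log returns.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for two made-up tickers over three dates (illustrative only).
dates = pd.to_datetime(["2019-01-02", "2019-01-03", "2019-01-04"])
index = pd.MultiIndex.from_product([dates, ["AAA", "BBB"]], names=["date", "ticker"])
prices = pd.Series([100.0, 50.0, 110.0, 55.0, 121.0, 60.5], index=index, name="price")

# Naive attempt: diff() works row by row, so it subtracts AAA from BBB and
# vice versa because the rows alternate between tickers.
naive = np.log(prices).diff()

# Grouping on the ticker level keeps each difference within a single ticker.
log_returns = np.log(prices).groupby(level="ticker").diff()

# Alternative: move tickers into columns, then difference down the date axis.
wide_returns = np.log(prices.unstack("ticker")).diff()

print(log_returns)
print(wide_returns)
```

Whether the grouped form or the wide form is more convenient probably depends on how ragged the data is; the wide form introduces NaNs for tickers that are missing on some dates.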

R has gotten a lot more intuitive as the tidyverse packages develop. If you are beginning in R, without question just go and learn the tidyverse packages.

R stats packages are still far and away superior. Python is better for "machine learning". The R TensorFlow interface piggy-backs on the Python one. Can't speak for PyTorch.

While there are many efforts to improve charting in Python, IMO nothing holds a candle to ggplot yet.

"Plausible regularities may be present but swamped by changes in attendant circumstances." Ole Peters

nikol


Total Posts: 729
Joined: Jun 2005
 
Posted: 2019-01-24 00:36
I am not a big fan of Python, but it being free, plus PyCharm, plus some projects along the way got me stuck with it. Notebooks are also a good way to generate reports etc.
The large number of libraries in Python makes it very attractive. I specifically like the interface of the functions in SciPy/NumPy. They resemble Matlab's and behave the same way.

I tried R for a couple of days and left it with a negative feeling about the syntax and the interface of the stats functions. Sometimes I had to do additional steps which in Matlab and Python are done at once. I do not remember specifics now, sorry.

However, since I'm heavily influenced by the analysis framework ROOT (from CERN) and Matlab, I miss in-line commands and popping up multiple windows (figures) with plots. I find it very efficient when I can get the previous command(s) with just the up-arrow.

Is it possible to have the same environment in Python/Notebook? Cell execution drives me mad, especially when a cell becomes very long. Also, calling external functions/libraries is easier in Matlab.

jslade


Total Posts: 1177
Joined: Feb 2007
 
Posted: 2019-01-24 19:45
nikol, have you looked at the apply family of functions in R? There is an array language embedded in R, but you have to know it is there. I've always found R to be better at one-liners.

The momentum is definitely in the Python direction for general data science work, especially if you're on the dweeb learning chuckwagon. Even if you just do the classics, scikit-learn and xgboost cover things really well, and the rest of the language and ecosystem are "grown up programming language" in a way R never will be. For plots, stats, reporting, small-scale app deployment (aka Shiny) and most classic time series work, R is vastly better. Wes has done yeoman work on pandas, but it's really just data.table (as opposed to xts/zoo).

R should have a reputation score for its package system; there are too many really low-quality packages on CRAN. Python probably should too, but at least there is Anaconda for curation.

It's funny thinking about why R is a thing at all. Cloning a big existing platform was a good idea, and S-PLUS was reasonably well thought out for interactive work, but IMO it is the packaging system which made it work. It's not good by modern lights; too many clowns think it is Node.js and use sub-packages with versioning problems. But in its day it was extremely good. You can't make an R package without unit tests and documentation (which autogenerates unit tests unless you tell it not to).

"Learning, n. The kind of ignorance distinguishing the studious."

nikol


Total Posts: 729
Joined: Jun 2005
 
Posted: 2019-01-24 21:03
@jslade

Thank you for the effort, but I don't have much energy to try it, since I focus on the content. In R I really did not like the syntax. It does not build on my past experience: Pascal, Modula, Fortran, C/C++, Java, PL/SQL, VBA, all those shell scripting languages. I even tried Forth and Lisp one day. Of course, if it pays off immediately I am happy to learn. I learned Python because there was a 3-month project to rebuild legacy Python code, so after 2 weeks of my own nightmare I started to produce running code and delivered a capital reduction. With R I see no such immediate motivation.



I found a way to turn PyCharm into an analysis workstation like Matlab!
It works great!

How to run a file in IPython console as default instead of terminal?