Monorepos vs Multirepos
     

EspressoLover


Total Posts: 315
Joined: Jan 2015
 
Posted: 2018-03-03 23:22
Curious to hear what the phorum has to say on the topic of monorepos vs multirepos. I.e. do you put your org's entire codebase into a single VCS repository? Or do you subdivide into modular projects and libraries that are separated into independent repositories? Maybe you split the difference and use some sort of federation of multiple repos. There seems to be a growing field of tooling to allow for tighter coupling between multirepos.

I've always been a subscriber to the multirepo camp. I think it's a natural extension of the Unix philosophy of self-contained systems with small surface area that do one thing and do it well. I think it also promotes modularity and good software design hygiene. You're forced to think about projects as standalone products, rather than just bespoke sub-systems. It also just seems to fit in better with git and its workflow. Why have a tool that allows for light-weight, flexible repos if you're just going to throw everything in a single behemoth?

However, these days it seems like a lot of smart people and companies are on board the monorepo train. Google, Facebook, Twitter and Netflix all seem to use monorepos. The arguments are fairly compelling. Easy dependency management across the entire org's codebase. Less overhead when making changes to an external-facing API. Single view to access all code at once. One-button builds and environment setups. A lot of this you can get with smart tooling around multirepos, but at that point are you just wasting your time?

There's a lot already written on this topic all over the internet. But I brought it up for discussion here, because I'm curious to get a finance-specific viewpoint. There are a lot of considerations that may be specific to our industry, or at least more pronounced:
- Proprietary and confidential code. If I have a contractor writing a market data parser, I don't necessarily want to give him access to source code for all the alphas.
- A lot of code that gets written for some tentative purpose, used a few times then thrown away. The traditional software industry benefits from more focused development. Here's the product specs, now write something filling that. Finance people want to quickly test and iterate through strategy ideas that start nebulous and morph into something completely different.
- A lot of code that gets written in weird languages, niche frameworks and bespoke stacks. There's way more freely available tooling and dependency managers for your typical webdev.
- Fast release cycles. A lot of patches that get done when things are on fire.
- Dozens of other considerations I'm sure I'm missing.

So, what are people's experiences and opinions on this topic? Even if it's just half-baked, I'm definitely interested in hearing any perspective.

Good questions outrank easy answers. -Paul Samuelson

ThomasJ02


Total Posts: 37
Joined: Feb 2009
 
Posted: 2018-03-04 02:13
I don't think any of the finance-specific considerations you mention preclude monorepos, except maybe the "external contractor" issue. The "fix the fire" and "test and iterate" issues can both be solved with branching. The "niche tooling" isn't a big deal (you can put it in its own directory structure in your repo).

At the end of the day though, I don't think multirepo vs monorepo is that big of a deal. Our group has two large repos, for example, and I've never felt a strong urge to join them into one or split them further.

Patrik
Founding Member

Total Posts: 1354
Joined: Mar 2004
 
Posted: 2018-03-06 12:23
I tend to prefer monorepos for purely internal code bases mainly because it makes dependency, release and deployment management simpler. In the setting of "internal" code where you do not have to care about external unknown use it's been an overall productivity gain for me to use monorepos.

The contractor scenario is real, but I don't tend to subcontract much so it hasn't been a consideration for me. If there's a real permissions and secrecy issue you obviously need a solution. That solution doesn't have to be multiple repos. For example, at GS there's a monorepo with a layer on top of it that enforces some permissions. So a hybrid kind of solution.
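To make the "layer on top" idea concrete, here is a minimal sketch of a path-based read check. It is not a description of any firm's actual system: the rule table, the group names and the `can_read` helper are all hypothetical, and a real layer would have to be enforced server-side rather than in client code.

```python
from pathlib import PurePosixPath
from typing import AbstractSet

# Hypothetical rules: repo directory -> groups allowed to read it.
# The deepest directory that has a rule decides the outcome.
READ_RULES = {
    ".": {"staff", "contractors"},              # repo root default
    "alphas": {"quant_team"},                   # restricted subtree
    "marketdata/parsers": {"staff", "contractors"},
}


def can_read(path: str, groups: AbstractSet[str]) -> bool:
    """Return True if any of the user's groups may read `path`."""
    # Walk from the file's own directory up to the repo root (".").
    for parent in PurePosixPath(path).parents:
        allowed = READ_RULES.get(str(parent))
        if allowed is not None:
            return bool(allowed & groups)
    return False
```

With these assumed rules, a contractor can read `marketdata/parsers/itch.py` but not `alphas/momentum/signal.py`, even though both live in the same repository.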

The other bullet points I don't see as hugely affected by multi vs mono.

Capital Structure Demolition LLC Radiation

Maggette


Total Posts: 1037
Joined: Jun 2007
 
Posted: 2018-03-06 14:30
Even in a mono repo I consider branch restrictions a best practice. You do not want every developer and his mother to be able to pull from and push to all branches.

Hence, like Patrik said, things like git submodules come to mind. We practice that (outside of finance).

I came here and saw you and your people smiling, and said to myself: Maggette, screw the small talk, better let your fists do the talking...

Hansi


Total Posts: 299
Joined: Mar 2010
 
Posted: 2018-03-06 21:28
We used to do the monorepo thing back in the SVN days because it lends itself more easily to that, but for Git I prefer multirepo.

We break it down so that a repo is something that can be run independently and deployed independently. Our team has one large repo that is a few libraries and 3 apps, but then we have 40+ repos with more constrained items.

I find it also makes it easier to enforce checks via continuous integration of tests and code reviews before things from branches hit the main production-level branch. On the flip side, if you have cross-repo dependencies this can be a bit of a hindrance, but it's solvable by using an artefact repository where you can pull in built versions from other repos by version, and/or by using git submodules.
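As a rough illustration of pulling in a built version "by version", here is a minimal sketch of resolving a pinned major version against an artefact index. The `ARTEFACTS` table, the package names and the `resolve` helper are made up for the example; a real artefact repository would expose this through its own API or package manager.

```python
# Hypothetical index of built artefacts: package name -> published versions.
ARTEFACTS = {
    "marketdata-parser": ["1.2.0", "1.10.1", "2.0.0"],
    "risk-lib": ["0.9.3"],
}


def parse(version: str) -> tuple:
    """Turn '1.10.1' into (1, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(name: str, major: int) -> str:
    """Pick the newest published build of `name` within one major version."""
    candidates = [v for v in ARTEFACTS.get(name, []) if parse(v)[0] == major]
    if not candidates:
        raise LookupError(f"no build of {name} with major version {major}")
    return max(candidates, key=parse)
```

Pinning on the major version lets a downstream repo pick up fixes automatically while shielding it from breaking changes, which is one way to soften the cross-repo dependency problem.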

- Proprietary and confidential code. If I have a contractor writing a market data parser, I don't necessarily want to give him access to source code for all the alphas.

Focus on hiring good people and treating them well; then this most likely won't be an issue. Having said that, we do have most repos fully open to the whole firm but the alpha ones limited to our team, which includes contractors. Easy with multi, possible with mono.

- A lot of code that gets written for some tentative purpose, used a few times then thrown away. The traditional software industry benefits from more focused development. Here's the product specs, now write something filling that. Finance people want to quickly test and iterate through strategy ideas that start nebulous and morph into something completely different.

For those kinds of things it's often good to either keep a bunch of prototype repos or one mega research repo where everyone just works on one branch and segregates by folder. We often operate on the principle that writing disposable prototypes is the first step; then you consider the code lost and write it again in a maintainable way when you want to run money with it. Of course in practice there is often not enough time for it, but at least we know what we want to do even if sometimes code that's not perfect is used. Easy with multi, possible with mono.

- A lot of code that gets written in weird languages, niche frameworks and bespoke stacks. There's way more freely available tooling and dependency managers for your typical webdev.

We try to aim for open source stuff, and anything that can be executed from a command line can usually be structured into a sensible development, validation and deployment process. Not really related to repo.

- Fast release cycles. A lot of patches that get done when things are on fire.

As long as you focus on the up-front automation for all stages there is no reason this can't be done in a sound way from a technical point of view (policies and politics are another thing). Not really related to repo, provided you use best practices on the mono repo front.