The Benefits of Upstream Compatibility and Upstreaming

So you've read plenty of articles about why Open Source Software (OSS) is good for business. Perhaps you've also read Tom Preston-Werner's superb article about why your company should open source as much software as possible (but probably not everything). Maybe you've even read The Software Paradox, which has basically become required reading in the software business. So, now you're sold on using OSS in your business, and your R&D team starts hacking away on new products leveraging the awesome might of open source projects.

At some point, you can be guaranteed that someone will suggest creating a permanent customized fork of an existing OSS project. Time to stop, collaborate, and listen:

The Upstream

The important concepts here are those of upstreaming and upstream compatibility. Being upstream compatible means that your fork is compatible with the official version of the open source project. A common engineering term for the "official version of the project" is the mainline. Contributing changes or improvements you made to a project back to the mainline is called upstreaming them.

You lose upstream compatibility if you fork a project and make changes that break compatibility between your fork and the mainline. This concept is important, even if you don't intend to upstream your changes to the mainline.

Reasons to be Upstream Compatible

As long as your fork is compatible with the upstream version of a project, you can pull updates from the mainline into your fork. This has two major benefits:

1) New Features - As new features get merged into the upstream version of the project, you can easily pull them into your fork and make use of them.

2) Bug Fixes & Security Updates - As bugs are fixed and security flaws / CVEs are closed, you can immediately pull the fixes into your fork and benefit from them.

By maintaining upstream compatibility, even if you do not plan to push your changes upstream, you benefit from the development happening in the mainline and only need to test for regressions against your private changes. If you had broken upstream compatibility, however, you would need to individually backport new features, bug fixes, and security updates into your fork - a painful and expensive process.
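To make the idea of "what can I still pull from the mainline?" concrete, here is a toy sketch in Python. It models a project history as a mapping from each commit to its parents (the commit names and the `history` layout are made up for illustration, not taken from any real project) and works out which mainline commits the fork has not picked up yet and which commits are private to the fork. Real tooling such as git does this for you; this is only meant to show the bookkeeping.

```python
def ancestors(history, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(history.get(commit, []))
    return seen


def divergence(history, fork_tip, mainline_tip):
    """Split the difference between a fork and the mainline.

    Returns (to_pull, private):
      to_pull - mainline commits the fork does not have yet
      private - fork commits that were never upstreamed
    """
    fork = ancestors(history, fork_tip)
    mainline = ancestors(history, mainline_tip)
    return mainline - fork, fork - mainline


# Hypothetical history: commit -> list of parent commits.
history = {
    "m1": [],
    "m2": ["m1"],
    "f1": ["m2"],  # private change on the fork
    "m3": ["m2"],  # upstream bug fix
    "m4": ["m3"],  # upstream feature
}

to_pull, private = divergence(history, "f1", "m4")
print(sorted(to_pull))  # ['m3', 'm4']
print(sorted(private))  # ['f1']
```

As long as the fork stays compatible, everything in `to_pull` can be merged as-is and only the changes in `private` need regression testing. Once compatibility is broken, each entry in `to_pull` becomes a separate backporting job.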

Reasons to Upstream Your Changes

Actually contributing your changes back to the mainline has advantages beyond those you get from being upstream compatible. Here's why upstreaming your changes can be very beneficial to your engineering process and product:

1) Code Quality - When you go through the process of upstreaming your contributions to a particular library, your code will get reviewed by the developer community that maintains that library. Thus, skilled developers who are likely more knowledgeable about the library than you are will be reviewing your code for possible bugs or design flaws. Review also helps ensure that you are using the code in the way they intended, that your changes are stable and aligned with the direction of the library's development, and that your code doesn't have any unintended side effects.

2) Code Maintenance - Once your code is merged into the mainline, the official project is responsible for maintaining it. Or, put differently, the burden is no longer on you to continuously maintain your code because it is being maintained for you as part of the mainline.

The most obvious benefit of this is that you no longer have to continuously merge the latest version of the mainline into your fork to keep it up-to-date. A related and equally important benefit is that, once your code (ideally along with tests for it) lives in the mainline, future changes are expected not to break it before they can be merged. If you had not upstreamed your changes, it is entirely possible for a change in the upstream to break your code, leaving you with the burden of fixing your independent fork.

In short, upstreaming your changes dramatically reduces the cost of code maintenance.
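In practice, this protection usually comes from tests that ship with your contribution and run as part of the mainline's test suite. Here is a minimal sketch in Python, assuming a hypothetical contributed function called `clamp_gain` (the name, its parameters, and its limits are invented for illustration and do not come from any real project). If a later upstream change altered its behavior, these tests would fail before that change could be merged.

```python
import unittest


def clamp_gain(requested_db, min_db=0.0, max_db=30.0):
    """Hypothetical contributed feature: limit a gain setting to a valid range."""
    return max(min_db, min(max_db, requested_db))


class ClampGainRegressionTest(unittest.TestCase):
    """Tests submitted upstream together with the feature itself."""

    def test_value_in_range_is_unchanged(self):
        self.assertEqual(clamp_gain(10.0), 10.0)

    def test_out_of_range_values_are_clamped(self):
        self.assertEqual(clamp_gain(-5.0), 0.0)
        self.assertEqual(clamp_gain(45.0), 30.0)


# Run the tests the way a project's automated checks might.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClampGainRegressionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("safe to merge" if result.wasSuccessful() else "change breaks contributed code")
```

How strictly this is enforced depends on the project, which is one more reason to include tests whenever you upstream a change.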

3) Open Source Participation - By being active in the developer community for a particular project, you are better able to shape the project's direction to meet your needs. By making contributions, you gain influence within the project, which gives you authority when discussing changes. As solely a downstream consumer of a project, you will only be able to react to what the upstream does rather than proactively shaping it.

As described in Tom Preston-Werner's "Open Source (Almost) Everything", mentioned at the top of this post, it's also great advertising for your company, helps you attract and retain talent, and reduces duplicated effort.

4) Ease of Use - Upstreaming your changes, and thus enabling your customers to use the mainline version of a project, makes your product much easier to install and use.

By using the mainline version of a library, your customers can install the dependency through their normal package management workflow. This also means if other applications depend on that library, they only need one version of the library on their system.

While dependency hell has largely been solved by better package management in modern OSes, you can easily put your customers back in that hell by requiring the use of forked libraries. This is rather a jerk move.

In Closing

Upstreaming your changes should be the rule, not the exception. There are great benefits to upstreaming, and passing those up ought to be an active, justified decision rather than an accident.

If you do choose not to upstream, though, I have yet to see a case where it makes sense to break upstream compatibility. This is a road to frustration, unnecessary expense, and insanity. Abandon hope, all ye who break upstream compatibility.

[Credit to Moritz Fischer (@fischmz) for providing input months ago that is reflected in this post.]

Ben Hilburn
