Manage the Risks of Software Reuse

Whether or not your organization develops software, it’s likely exposed to the risks of vulnerabilities buried deep within code.

Gregory Vial

May 23, 2022

Source: MIT Sloan Management Review

One of the key ways software development organizations drive efficiency is by drawing on libraries of existing, reusable software components when creating their own software products and services. This helps accelerate digital innovation, but the advantages come with a trade-off: Organizations accept, sometimes unknowingly, a degree of risk that can lead to serious cybersecurity issues.

That risk was highlighted in December 2021, when it came to light that a widely used open-source software framework called Log4j contained a critical vulnerability.1 The news made headlines because countless pieces of software deployed in organizations, government agencies, and people’s homes depend on this logging framework for the Java programming language. Security experts found that exploits built on the Log4Shell vulnerability, as it came to be known, could have devastating consequences for companies and individuals. And exposure to that vulnerability was found to be stunningly broad: The code had become embedded in software systems on a grand scale, introducing a serious vulnerability into many critical systems around the world. The Log4j exposure should be a wake-up call to executives to better understand software reuse and how to mitigate the risk of using it in their organizations.

Software reuse originated as an efficiency measure within large software companies and was mostly an internal undertaking involving home-built proprietary components. The advent of the internet and the explosion of open-source software transformed the practice. Today, most software is at least partially built on functionality acquired through external software components. These components are often shared for everyone’s benefit on open-source repositories, such as PyPI for Python, NPM for Node.js, and Maven Central Repository for Java, to name a few.

The main advantage for developers importing components from these repositories is that they do not have to assume ownership of the code or take responsibility for bug fixes or feature enhancements. Rather, they can concentrate on writing their own software while benefiting from the work of other teams of software developers. In addition, it is easier to import an entire package than to cherry-pick specific lines of code that will usually then need to be modified to fit into one’s own source code. A package is self-contained and built as a turnkey solution for reuse that can be treated like a black box by developers. Manually cherry-picking specific lines of code means having to wade through a package’s source code to identify, copy, or reproduce parts of it, which is more time-consuming.

Taking Stock of Reuse at Scale

It can be difficult for business leaders to comprehend the extent to which reuse has become ingrained in software development practices. To illustrate its pervasiveness, consider that Lodash, one of the most popular open-source JavaScript packages available on the NPM repository, was downloaded more than 2 billion times in 2021 (that’s more than 40 million times per week on average) and that more than 149,000 other packages published on NPM depend on it to function. Chalk, another popular package, was downloaded more than 5 billion times in 2021, and more than 77,000 packages published on NPM depend on it.

Some researchers argue that writing software is now often more about writing “glue code” to tie in pieces of existing software components than writing entirely new sets of instructions or algorithms.2 The authors of a recent study of the practice observed that “the software industry is undergoing a paradigm shift. Unlike in the past, when software reuse was just an anomaly, reuse is now becoming the norm for any significant software-development projects” — a sentiment shared by many.3

To grasp the scale of this phenomenon, consider that for every software component a team chooses to reuse, there is a high chance that this component itself depends on other software components that also have their own dependencies. (See “Software Reuse Dependencies in Principle — and in Practice.”) What this means is that when a developer imports a single component — also known as a dependency, in this context — dozens of other dependencies might be brought in at the same time, contained within that single component like Russian matryoshka dolls. As a result, a large software project might indirectly depend on thousands of other components created and maintained by as many teams of developers, each with its own interests, objectives, and agendas.
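The nesting described above can be made concrete with a short Python sketch. It assumes a small, made-up dependency graph (the package names are hypothetical and do not come from any real repository) and walks it to separate the components a project imports directly from those it acquires indirectly.

```python
from collections import deque

# Hypothetical dependency graph: each key is a package, each value lists
# the packages it imports directly. The names are made up for illustration.
DEPENDENCIES = {
    "my-app": ["web-framework", "logger"],
    "web-framework": ["http-parser", "router", "logger"],
    "router": ["path-matcher"],
    "logger": ["formatter"],
    "http-parser": [],
    "path-matcher": [],
    "formatter": [],
}


def all_dependencies(package, graph):
    """Return every package reachable from `package`, direct or indirect."""
    seen = set()
    queue = deque(graph.get(package, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already counted via another path
        seen.add(current)
        queue.extend(graph.get(current, []))
    return seen


direct = set(DEPENDENCIES["my-app"])
total = all_dependencies("my-app", DEPENDENCIES)
indirect = total - direct
print(f"direct: {len(direct)}, indirect: {len(indirect)}")
```

Even in this toy graph, two direct imports bring in four more components that the team never explicitly chose. Real package managers resolve versions as well as names, which this sketch leaves out.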

Software Reuse Dependencies in Principle — and in Practice

The diagram on the left illustrates the direct and indirect dependencies created by software reuse; the diagram on the right shows a real-world example of the extent of such a network of dependencies for the Express web development framework. Express depends directly on 48 other packages, which in turn depend on a total of 250 additional packages.

In the case of Log4j, its developers reacted swiftly to the vulnerability disclosure, and a new version of the framework was made available for download within days. This is a testament to the open-source community’s ability to move quickly when problems arise. However, the real issue for many thousands of organizations was that it then became the responsibility of all those who used a vulnerable version of Log4j in their software to upgrade with a patched version and manage incident resolution with their customers. While Log4Shell is no doubt an extreme illustration of the phenomenon, reports frequently crop up on the discovery of issues in popular packages, or problems stemming from organizations’ use of low-quality or outdated software packages to produce their own software.4 In some instances, these issues are found to affect millions of connected devices that cannot be easily upgraded.5

Despite these undesirable outcomes, organizations have so much to gain from reuse that it is bound to play a growing role in software development practice. At the same time, business leaders should be aware of the implications of software reuse, even if they are not in the business of developing software. Here are four key insights for leaders as they consider ways for their organizations to manage the risks:

Consider the consequences. Software reuse may boost productivity in the short term, but it can have long-term consequences. For example, if there is a bug in an external component, your organization will have to wait until that bug is fixed to redeploy its own software. Some open-source projects benefit from an engaged, active community of developers, as is the case with Log4j. But it is also possible for the community supporting the development of a component to lose interest. Key developers may move on to other projects. And if development of a component stagnates, your developers might be left to deal with an outdated package.

When evaluating the fitness of a component for reuse, your development team should be looking beyond immediate functional needs. They should do some due diligence on the community or the organization (including your own) supporting the project, its responsiveness to bugs or feature requests, and its overall track record, in order to determine whether your organization will still be able to count on that component well into the future.

Look beyond direct dependencies. If the default practice in your organization is to systematically look for reusable software components to implement functionality, it is likely that your software is dependent on hundreds, if not thousands, of external components. It’s important to look beyond direct dependencies to evaluate the indirect dependencies that are the consequence of a particular reuse decision.

In some instances, it will be worth taking on the risks of acquiring indirect dependencies, because the functionality gained will help achieve project goals. But sometimes a single piece of functionality depends on dozens of other external components. In those cases, it may be best to reimplement the portion of the functionality you want to acquire and, if needed, cherry-pick a few select external components to help you along the way. While this can involve more upfront work and can feel like reinventing the wheel, sometimes it is the most sensible decision, from both a business and a technical standpoint. Your organization’s reuse decision process should include a careful assessment of the need for specific functionality in light of the indirect dependencies it brings with it.
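One way to support that assessment is to estimate how many components a candidate would add to a project beyond what is already installed. The Python sketch below is a minimal illustration under stated assumptions: the registry data, the package names, and the `new_dependencies` helper are all hypothetical, and a real review would also weigh quality, licensing, and maintenance history.

```python
def transitive_closure(package, graph):
    """Return the package itself plus everything it pulls in."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, []))
    return seen


def new_dependencies(candidate, graph, already_installed):
    """Packages that adopting `candidate` would add to the project."""
    return transitive_closure(candidate, graph) - already_installed


# Hypothetical registry data and project state, for illustration only.
REGISTRY = {
    "date-toolkit": ["locale-data", "tz-database", "string-pad"],
    "locale-data": ["json-loader"],
    "tz-database": ["json-loader"],
    "tiny-date-format": [],
}
INSTALLED = {"string-pad"}

for candidate in ("date-toolkit", "tiny-date-format"):
    added = new_dependencies(candidate, REGISTRY, INSTALLED)
    print(candidate, "adds", len(added), "packages:", sorted(added))
```

In this made-up case, the feature-rich candidate adds four components while the narrower one adds only itself. A gap like that is a prompt for discussion, not a verdict: the larger package may still be the better choice if the project needs most of what it offers.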

Periodically review components in use. Decisions to include external components in software should be revisited regularly as part of the good practice of paying down technical debt (defined as the cost of past decisions to choose an expedient solution over the best solution). Software teams are often encouraged to engage in refactoring, a practice wherein source code is reorganized to reduce technical debt accumulated in the form of earlier coding shortcuts and workarounds, so as to improve future developer productivity and make the code easier to maintain. Unfortunately, code refactoring usually considers only internal source code, because that is what teams can control and alter easily.

When your team writes code, they sometimes upgrade external components. Over time, it can become a reflex: If a component has served the organization well for months, by default the team trusts that the latest version can be safely used, and they continue to build around that component, even if that means having to implement or keep workarounds in order to do so. Unfortunately, this can contribute to increased technical debt over time. Refactoring offers an opportunity for teams to reconsider whether they still want to depend on an external component. It’s important to keep abreast of a component’s development and its road map, including monitoring it for potential issues and vulnerabilities over time, to ensure that it continues to be a good fit for your needs. During refactoring initiatives, careful consideration of external components can help reduce technical debt and minimize the buildup of software bloat that can hinder software quality and developer productivity.6

Know what software your organization relies on. It may sound simplistic, but knowing what software your organization runs, especially operationally critical software, is important beyond the requirements of IT audits. The discovery of Log4Shell led to the shutdown of the Canada Revenue Agency’s services; in the province of Quebec alone, more than 4,000 government websites were shut down as a preventive measure.7 Agencies and organizations were forced to suspend access to systems while they inventoried their software to assess whether they were actually affected by the vulnerability. Log4j is so pervasive that organizations that had Java applications running could probably assume that Log4j was being used somewhere and might require patching.
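An up-to-date inventory turns the question “Are we affected?” into a lookup, so systems need not be suspended while the answer is worked out. The Python sketch below assumes such an inventory already exists in a simple form. The system names, the components listed, and the `affected_systems` helper are hypothetical, and the fixed version is passed in as a parameter that should be confirmed against the maintainers’ advisory.

```python
def parse_version(text):
    """Turn '2.14.1' into (2, 14, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def affected_systems(inventory, component, first_fixed_version):
    """List systems that run `component` at a version older than the fix."""
    threshold = parse_version(first_fixed_version)
    hits = []
    for system, components in inventory.items():
        version = components.get(component)
        if version is not None and parse_version(version) < threshold:
            hits.append((system, version))
    return sorted(hits)


# Hypothetical inventory: system name -> {component: version in use}.
INVENTORY = {
    "billing-portal": {"log4j-core": "2.14.1", "jackson-databind": "2.13.0"},
    "hr-intranet": {"log4j-core": "2.17.1"},
    "public-website": {"express": "4.17.1"},
}

# The fixed version is an input, not a fact asserted by this example;
# check the real value in the maintainers' security advisory.
for system, version in affected_systems(INVENTORY, "log4j-core", "2.15.0"):
    print(f"{system} runs log4j-core {version} and needs review")
```

The sketch handles only plain numeric versions such as 2.14.1. Pre-release labels and indirect dependencies hidden inside vendor products would need additional handling, which is one reason a maintained inventory matters.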

When it comes to commercial software, such an issue should be handled by your software providers. However, it’s important to ensure that when you select vendors, you are confident that they will be responsive to issues that occur in their products, whether linked to their own source code or to software components they reuse. Ask vendors to demonstrate that they have good governance over reuse decisions, including properly vetting software for quality and meeting licensing requirements to reuse open-source software. Your organization needs to trust not only the software provider but also the many other teams of far-flung developers whose components the provider’s code relies on.

Senior executives increasingly recognize that digital resilience is an existential issue for their organizations. It’s incumbent upon leaders to understand the risks inherent in both modern software development practices and packaged software acquisition decisions, and to ensure that their technology function has good processes in place to manage those risks. While software reuse is critical to the accelerated pace of digital innovation, it has the potential to cause widespread, unanticipated, and damaging consequences. Ensuring that the practice is managed carefully and considers both short- and long-term implications can mitigate the risk that is inextricable from the benefits.
