People & Process

Can You Measure Software Developer Productivity? Tips and Insights

Yes, you can measure software developer productivity … but it won’t always lead to good outcomes. What matters most is how you do it.

In today's data-driven world, the question of measuring software developer productivity has become increasingly contentious. But let's first take a step back and ask ourselves: Should we measure it, and is it actually beneficial?

In this article, we will explore how measuring software developer productivity is pointless if you don’t create a trust-based environment. We’ll also discuss effective metrics, the challenges involved, and strategies to avoid common pitfalls.

1. Why even measure Developer Productivity?

Before we dive into the how of measuring developer productivity, let's tackle a more fundamental question: why would we want to measure developer productivity in the first place?

There are typically three scenarios:

  1. To improve team performance: Engineering managers want to build high-performing teams. They aim to celebrate when people are doing great work, and to know who might need extra support and coaching.
  2. To make delivery more predictable: Engineering leaders often get asked how long it's going to take to build something. Estimation is notoriously tricky, but having some data can help leaders and teams sanity-check their estimations, ensure they have enough work to keep the team busy, and figure out what might be blocking or slowing down delivery.
  3. To make a case for engineering budget: Another big question for engineering leaders is what value they're creating with their resources. They need to show that they've put their teams to good use, so data can be a powerful ally in showing progress and making the business case to expand on successful experiments (e.g., our clients have used Multitudes to show why a four-day workweek won’t impact productivity, or how hiring juniors makes teams better off).

With this context, let's dive into what developer productivity truly is, how it can be measured, and, most importantly, how the metrics should be used.

2. What is Developer Productivity?

When we talk about developer productivity, what are we really measuring? It's crucial to recognize that productivity in software development isn't just about churning out lines of code or closing tickets. It's about building high-performing teams and creating real value for the business.

Developer productivity refers to how efficiently and effectively software developers can complete their tasks and deliver high-quality software products within a given timeframe. It's a concept that includes many factors like:

  • The ability to deliver features that meet user needs
  • Efficiency in problem-solving and debugging
  • Collaboration and knowledge sharing within teams
  • Ability to adapt to new technologies and methodologies
  • Code quality

Measuring developer productivity can help teams get more predictability in their project timelines, support the development of their people, and make the case for investment in teams.

But here's the tricky part: sometimes leaders use overly simplistic metrics that can cause harm. We've all heard stories about the CEO who stack-ranked engineers based on lines of code and fired the bottom quartile. That's why at Multitudes, we believe it's crucial to put guardrails on what we measure, in line with our data ethics principles.

Outcomes vs. Outputs: an important distinction

In the world of software development, it's super helpful to distinguish between outputs and outcomes when gauging team productivity:

  • Outputs are quantifiable products such as lines of code, number of tickets completed, or features created. Relying solely on these can be misleading: a high volume of buggy code could actually slow you down, compared to fewer lines of carefully considered code.
  • Outcomes, on the other hand, reflect the real benefit from those outputs—like improvements in performance or user satisfaction.

A classic pitfall is to measure only outputs, which can be easily gamed. It also discourages positive behaviors, like learning a new language, sharing knowledge, keeping documentation up-to-date, or doing other glue work for the team (Multitudes is one of the only products that can measure glue work!). In these examples, your output could drop, even though you're helping achieve better long-term outcomes for the business!

Let's not forget that software development is fundamentally a creative process, and creative work is notoriously hard to measure by outputs alone.

3. Key Metrics for Measuring Developer Productivity

Even once we've decided to measure developer productivity, there are endless debates about what metrics to use.

At Multitudes, our approach is refreshingly simple:

  1. Use research-backed metrics because they're more likely to measure the right things
  2. Look at a multitude of metrics – one or two alone won't give you the full picture of the trade-offs we have to make
  3. Don't stress over choosing perfect metrics – focus instead on building trust, pick a few solid key metrics, and leave plenty of room for team feedback and changes along the way

Well-known frameworks for measuring engineering team performance include DORA metrics, the SPACE framework (Satisfaction and well-being, Performance, Activity, Communication and collaboration, Efficiency and flow), and the Developer Experience (DevEx) framework.

We’ve written about our perspective on metrics in depth here.

Each framework presents its own lens through which engineering team performance can be viewed:

  • DORA metrics place emphasis on quality and velocity, which are in tension with each other. Optimizing DORA requires balance: teams that laser-focus on velocity lose quality, and vice versa.
  • SPACE offers a more holistic approach including well-being measures of engineers on top of performance metrics.
  • DevEx defines the dimensions of the engineering lived experience and the points of friction encountered in everyday work.

DORA Metrics

DORA metrics, first introduced by Google's DevOps Research and Assessment team, focused on four key metrics ("the four keys") that are strong indicators of software delivery performance. The set has evolved over time, with updates to one metric and the introduction of a fifth:

  • Change Lead Time: The time it takes to go from code committed to code successfully running in production.
  • Deployment Frequency: How often an organization deploys code to production or releases it to end users.
  • Failed Deployment Recovery Time (Formerly Mean Time to Recovery): The average time it takes to restore service when a software change causes an outage or service failure in production.
  • Change Failure Rate: The percentage of changes that result in degraded service or require remediation (e.g., lead to service impairment or outage, require a hotfix, rollback, fix forward, or patch).
  • Reliability: This fifth metric was introduced later, in 2021, and assesses operational performance, including availability, latency, performance, and scalability. Because it is newer and lacks quantified performance benchmarks, it tends to attract less attention.
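
To make the first four definitions concrete, here is a minimal sketch of how they could be computed from a log of deployment records. This is purely illustrative: the `Deployment` fields and the simple event model are assumptions for the example, not part of the DORA framework itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional

@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did this change degrade service?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed

def dora_metrics(deploys, window_days):
    """Compute the four key DORA metrics over a window of `window_days` days."""
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "change_lead_time": median(lead_times),
        "deployment_frequency": len(deploys) / window_days,  # deploys per day
        "change_failure_rate": len(failures) / len(deploys),
        "failed_deployment_recovery_time": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }
```

The point of the sketch is just that each of the four keys is a simple aggregate once the underlying events are captured; in practice, tooling derives these from your version control and deployment data automatically.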

At Multitudes, we like DORA metrics because they’re correlated with both financial performance and with psychological safety of the team. We need to deliver value to customers for our companies to survive, and that’s best achieved if we create a good environment for our people — and DORA balances both.

DORA metrics with reliability

SPACE Framework

The SPACE framework, proposed by Nicole Forsgren, Margaret-Anne Storey, and Chandra Maddila, takes a holistic view of developer productivity by considering 5 key dimensions:

  1. Satisfaction and Well-being: Measures satisfaction, fulfillment, and well-being, both at work and off work.
  2. Performance: The outcome of the work, i.e., whether it creates value for customers and stakeholders.
  3. Activity: A count of the discrete actions or outputs completed in the course of work, and the time they take.
  4. Communication and Collaboration: Represents the interactions, discussions, and other acts of collaboration that take place within teams.
  5. Efficiency and Flow: Focuses on the ability of a developer to complete work or make progress.
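
One practical implication is that a SPACE-aligned dashboard should cover every dimension, not just the easy-to-count ones. As a small illustrative sketch (the framework prescribes no schema, and these metric names are made up), you could check a proposed set of metrics for uncovered dimensions:

```python
SPACE_DIMENSIONS = {
    "satisfaction_wellbeing",
    "performance",
    "activity",
    "communication_collaboration",
    "efficiency_flow",
}

def missing_dimensions(metrics):
    """Given a mapping of metric name -> SPACE dimension,
    return the dimensions that have no metric yet."""
    return sorted(SPACE_DIMENSIONS - set(metrics.values()))

dashboard = {
    "quarterly_satisfaction_survey": "satisfaction_wellbeing",
    "change_lead_time": "efficiency_flow",
    "pr_review_participation": "communication_collaboration",
}
# missing_dimensions(dashboard) -> ["activity", "performance"]
```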

By taking these dimensions into consideration, the SPACE framework gives managers a holistic view of developer productivity that enables them to make better decisions.

DORA and SPACE aren’t necessarily alternatives. In fact, key author Nicole Forsgren has said it is most helpful to consider DORA an implementation of the SPACE framework. While DORA is based on metrics that are good for the psychological safety of teams, it doesn’t directly measure how people are doing. SPACE fills that gap by bringing in more human-centric metrics, such as the well-being of developers, alongside their efforts.

DevEx Framework

The DevEx framework focuses on measuring the lived experience of developers and the points of friction they encounter in their everyday work. It aims to provide a holistic view of the developer experience by considering 3 key dimensions:

  1. Feedback Loops: The speed and quality of responses to developer actions.
  2. Cognitive Load: The mental effort required to perform tasks.
  3. Flow State: The ability to work with focus and enjoyment.

By tuning into both what developers say and how they actually work, you can unlock some pretty amazing insights into how to improve developer productivity, satisfaction, and retention.

Individual contributor vs. Team Metrics

While individual contributions are important, we recommend emphasizing team-level metrics to encourage collaboration and shared responsibility. The reality is that the best software is built by cohesive, high-performing teams. PRs are a team sport.

When scrutinizing the output of single developers, it becomes difficult to assign credit or blame because software development inherently depends on teamwork. In fact, excessive focus on individual performance can actually make the overall performance worse – imagine the brilliant jerk who gets their work done quickly, but at a cost to the rest of the team.

That’s why it’s so important to look at team performance – because that’s what we want to optimize for. In fact, a two-year study by Google called Project Aristotle found that the number one thing for improving performance is building strong team factors, particularly psychological safety.

For all these reasons, we focus on team performance over individual performance.

4. Pitfalls when trying to measure developer productivity

Software development is a type of knowledge work, so it’s not as simple to measure as inputs leading to widgets out. On top of that, the rise in remote and hybrid work has further complicated this scenario. We must be cautious not to fall into the trap of oversimplification or of creating perverse incentives.

Lack of psychological safety

Fostering a high-trust environment with strong psychological safety is crucial for the effective use of productivity metrics. Google's Project Aristotle research into team effectiveness revealed that psychological safety, more than anything else, was critical to making a team work. The researchers found that individuals on teams with higher psychological safety were less likely to leave Google, they were more likely to harness the power of diverse ideas from their teammates, they brought in more revenue, and they were rated as effective twice as often by executives.

In such environments, teams have the space to experiment, learn, and refine their metrics over time, knowing that they will support each other in the process of continuous improvement. This is in contrast to environments where a lack of trust creates fear that metrics will be used to blame individuals, leading to burnout and behaviors that do not contribute genuine business value.

Mis-use of metrics

Reductive metrics can lead to reductive outcomes: focusing on the number of commits or lines of code won’t give the full picture of how a team is performing and what they’ve contributed. Goodhart’s Law warns us that once a metric becomes a goal, its effectiveness as an indicator declines. Concentrating only on these superficial figures might inadvertently encourage behaviors that do not contribute genuine business value, for instance, writing lots of code without spending enough time on quality, or simply gaming the metrics.

That’s why we recommend putting in place guardrails for what you won’t measure as much as what you will. Just because something can be measured doesn’t mean it should be, or that it will help anyone to measure it. We’ve written a separate article about our data ethics guardrails here.

Overemphasis on the metrics alone

Depending too much on metrics can give a skewed perspective of developer productivity. Numerical data may yield useful information, but it usually doesn’t account for the entirety of a developer’s work quality. For instance, while high volumes of code commits might appear impressive based on quantitative analysis, without getting the team’s context, one cannot discern if these changes are substantive or merely cosmetic.

To gain a full grasp of productivity levels, it is crucial to pair metrics with the human context. As we always like to say at Multitudes, even if we had all the integrations, surveys, and data sources in the world, it would never be the sum of a human – because humans do things offline and they change. The best way to get the complete picture is to pair the data with conversations with people. That’s the best way for organizations to make informed choices and enhance overall team performance significantly.

Looking at performance metrics without human metrics

It can be easy to think of performance only as what work gets done. But since software development is knowledge work, it can only be done well if the people doing it are well. This is why human factors like Wellbeing and Collaboration are important for rounding out the performance metrics.

By examining communication patterns and interactions, organizations can identify factors that boost or hinder team performance. These measures highlight important elements like team morale and workplace stress, which significantly impact productivity but aren't captured in pure performance measures.

Combining performance and human metrics gives companies a more comprehensive view of their development teams. This approach helps avoid misinterpretation and leads to more informed decisions about improving productivity. For instance, analyzing how team members communicate can reveal bottlenecks or effective practices that aren't apparent from performance data alone. To provide a real-world example, this is how Multitudes helped Octopus Deploy ship 47% more PRs.

5. Strategies to effectively measure developer productivity

Given these challenges, how can we approach measuring developer productivity in a way that's both meaningful and beneficial?

Focus on Outcomes and Impact

Instead of fixating on effort or output, prioritize metrics that reflect the real value delivered to customers and the business. While these are highly specific to each team's use case, some examples include:

  • Customer satisfaction scores for new features
  • Revenue generated or costs saved by engineering initiatives
  • Improvements in key performance indicators (e.g., system reliability, performance)

We recommend deciding on these metrics as a group with relevant stakeholders (e.g., ICs, Team Leads and PMs).

Embrace Team-Level Metrics

While individual contributions are important, emphasize team-level metrics to encourage collaboration and shared responsibility. This approach aligns with the reality of how software is built and can lead to more cohesive, high-performing teams.

Incorporate Context

Numbers don't tell the whole story. What they can do is give you a great starting point for conversations. Regular check-ins, retrospectives, and peer feedback can provide valuable insights into team dynamics, individual growth, and areas for improvement that metrics alone might miss.

We also recommend leaving room for annotations and context notes alongside the data, so the human context stays next to the numbers (which you can easily do inside the Multitudes platform).

Use Metrics for Exploration, Not Accusations

Instead of using productivity metrics as a tool for reward or punishment, use them as a starting point for conversations about process improvements, resource allocation, and professional development.

Ultimately, improving developer productivity requires a holistic approach that goes beyond what can be measured alone. Clear communication, effective prioritization, knowledge sharing, targeted training, and maintaining a healthy work-life balance all cultivate high-performing teams. The value isn't in the measurement of developer productivity, but instead the actions you take based on the insight you learn to improve it.

6. How Multitudes can help you measure developer productivity

To effectively measure developer productivity, teams can use Multitudes, which is an engineering insights platform for sustainable delivery. Multitudes integrates with your existing development tools, such as GitHub and Jira, to provide insights into your team's productivity and collaboration patterns.

With Multitudes, you can:

  • Automatically track all key DORA metrics like Change Lead Time and Deployment Frequency
  • Get visibility into work patterns and where the team’s time went, e.g. into feature development vs. bug fixing
  • Identify collaboration patterns and potential silos within your team
  • Understand individual and team well-being through metrics like out-of-hours work, incidents, and meetings
  • Integrate with Slack to give you automated nudges and strategies on exactly what you should do next to improve team performance

By leveraging Multitudes, teams can spend more time acting on insights to improve their productivity and satisfaction.

Ready to unlock happier, higher-performing teams?

Try our product today!

Contributor
Multitudes
Support your developers with ethical team analytics.

Start your free trial

Get a demo