How to Measure Engineering Team Performance

Wondering how to measure or improve your team’s performance?

In this article, we explore our measurement approach and the evidence-based metrics you can use to improve your team's performance.

But before diving into our approach, it’s important to highlight that the bedrock of a high-performing engineering team is a trust-based environment. Without that, measuring engineering team performance is pointless: metrics may be misused or gamed, and the team won't trust them (or their leaders) enough to act on them.

It’s Goodhart’s Law in action: “When a measure becomes a target, it ceases to be a good measure.”

Sections:

1. Our Approach to Measuring Engineering Team Performance
2. Key Metrics for Measuring Engineering Team Performance
3. Strategies to effectively measure engineering team performance
4. Software to measure Engineering Team Performance


1. Our Approach to Measuring Engineering Team Performance

Here is our 6-step approach to measuring your engineering team's performance:

Be Clear about the Business Goal: Why do we want to bring in metrics? What are the bigger outcomes we're hoping to support? Make sure your chosen metrics align with the value you are trying to create (e.g., shipping a new release, retaining key talent). And don't keep it a secret – let everyone know why these metrics matter.
Choose a Couple Metrics to Start: You don't need perfect metrics, just a few to get started. You can always gradually expand as your team becomes more comfortable. What are a few key metrics that will help your team and align you with the business objectives? An easy way to start is with the 4 key DORA metrics on the performance side combined with some of the human metrics from the Satisfaction or Collaboration dimensions in SPACE.
Establish a Baseline: Take a snapshot of your current performance against the selected metrics. This gives you a reference point for seeing how you compare to benchmarks and tracking progress as you run experiments.
Integrate with Existing Tools and Processes: Getting metrics should help you, not slow you down. This can look like:
1. Incorporating tracking into your Git tooling, CI/CD pipelines, issue tracking, and incident management system (IMS)
2. Weaving metrics into your team practices like stand-ups, retros and 1:1s to stay across bottlenecks and issues early. Most dashboards are built and then get forgotten; you don't want that to happen here!
3. Supporting team members on the metrics and encouraging open communication and feedback during the integration process.
Run an experiment: By now, hopefully your team has live access to key metrics, has woven them into your team practices, and is having rich discussions about what's going well and where you can improve. So it's time to run an experiment! Pick one metric you want to improve, come up with a hypothesis as a team on how to do it, and then give it a go!
Track Progress and Celebrate Successes: Ideally, your tooling helps you check in on how things are going (see how our platform can help you with this). Are you seeing the improvements you hoped for? Or not quite there yet? Whether the experiment worked or it’s time to try a new approach, the most important part is to celebrate every win or learning, no matter how small. And remember, spread the knowledge! Your wins could inspire other teams, and making failed experiments a common and celebrated part of work contributes to a culture of continuous improvement.

From there, it's rinse and repeat for steps 4-6. Choose one area to improve, then run an experiment, track progress and iterate.
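
As a rough illustration of the baseline and experiment steps, here is a minimal Python sketch that compares a metric before and after an experiment. The numbers, the function names (`summarize`, `compare_to_baseline`), and the choice of the median as the summary statistic are all assumptions made for the example, not a prescribed method.

```python
from statistics import median


def summarize(lead_times_hours):
    """Snapshot of one metric: sample size and median, in hours."""
    return {"count": len(lead_times_hours), "median_hours": median(lead_times_hours)}


def compare_to_baseline(baseline, current):
    """Percentage change in the median relative to the baseline.

    A negative value means the metric went down (faster, for lead time).
    Assumes the baseline median is not zero.
    """
    change = (current["median_hours"] - baseline["median_hours"]) / baseline["median_hours"]
    return round(change * 100, 1)


# Hypothetical Change Lead Time samples (hours), before and after an experiment
baseline = summarize([30, 42, 26, 55, 38])
after_experiment = summarize([22, 31, 28, 40, 25])

print(baseline["median_hours"], after_experiment["median_hours"])
print(compare_to_baseline(baseline, after_experiment))
```

With only a handful of data points, a change like this is a conversation starter rather than proof, so it helps to keep the experiment running long enough to see whether the shift holds.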

This approach will help build a culture of continuous improvement and blameless experimentation. Building the habit of experimentation lowers the barriers for teams to change their ways of working, which really helps when tackling new challenges and constantly shifting priorities.

2. Key Metrics for Measuring Engineering Team Performance

Once you’ve decided to measure engineering team performance, you'll find endless debates across the internet about which metrics to use.

At Multitudes, our philosophy towards metrics is:

Focus on building trust first
Start with a small, manageable selection of metrics. One or two alone won’t be a true reflection of the trade-offs we have to make
Don’t agonize about choosing perfect metrics. Metrics will evolve over time based on priorities — what’s important is leaving plenty of opportunity for team feedback and changes along the way
Use research-backed metrics because they bring more evidence of measuring the right things

Among the recognized frameworks used for measuring engineering team performance are the DORA metrics, the SPACE (Satisfaction, Performance, Activity, Communication, and Efficiency) framework, and the Developer Experience (DevEx) framework.

You can read more about what we measure and why here.

Each framework presents its own lens through which engineering team performance can be viewed:

DORA places emphasis on quality and velocity, which are in tension with each other. Optimizing DORA requires balance: teams that laser-focus on velocity lose quality, and vice versa.
SPACE offers a more holistic approach including well-being measures of engineers on top of performance metrics.
DevEx defines the dimensions of the lived experience of developers and the points of friction encountered in their everyday work.

DORA Metrics

DORA metrics, first introduced by the DevOps Research and Assessment (DORA) team, which is now part of Google, focused on 4 key metrics (“the four keys”) that are strong indicators of software delivery performance. The set has evolved over time, with updates to the metrics and the introduction of a 5th:

Change Lead Time: The time it takes to go from code committed to code successfully running in production.
Deployment Frequency: How often an organization deploys code to production or releases it to end users.
Failed Deployment Recovery Time (Formerly Mean Time to Recovery): The average time it takes to restore service when a software change causes an outage or service failure in production.
Change Failure Rate: The percentage of changes that result in degraded service or require remediation (e.g., lead to service impairment or outage, require a hotfix, rollback, fix forward, or patch).
Reliability: This fifth metric was introduced later, in 2021, and assesses operational performance, including availability, latency, performance, and scalability. Given that it is newer and doesn’t have a quantifiable metric for performance benchmarks, it tends to attract less attention.

DORA metrics focus on performance aspects, which have correlations with customer value and financial performance of companies.
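
To make the four key definitions concrete, here is a minimal Python sketch that computes them from a list of deployment records. The `Deployment` record shape, the `dora_summary` function, and the sample data are hypothetical. Using the median for Change Lead Time is an assumption, while the mean for recovery time follows the "average time" wording above. Real tooling would pull these timestamps from your Git, CI/CD, and incident data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it was running in production
    failed: bool = False                    # degraded service or needed remediation
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta):
    """Convert a timedelta to hours."""
    return delta.total_seconds() / 3600


def dora_summary(deployments, period_days):
    """Summarize the four keys for deployments made in a period of period_days days."""
    if not deployments:
        return None
    lead_hours = [hours(d.deployed_at - d.committed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    recovery_hours = [
        hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "change_lead_time_hours": median(lead_hours),
        "deployments_per_day": len(deployments) / period_days,
        "change_failure_rate_pct": 100 * len(failures) / len(deployments),
        "failed_deployment_recovery_hours": mean(recovery_hours) if recovery_hours else None,
    }


# Hypothetical week of deployments
start = datetime(2024, 1, 1, 9, 0)
deployments = [
    Deployment(start, start + timedelta(hours=24)),
    Deployment(start + timedelta(days=1), start + timedelta(days=1, hours=12),
               failed=True, restored_at=start + timedelta(days=1, hours=14)),
    Deployment(start + timedelta(days=2), start + timedelta(days=2, hours=36)),
    Deployment(start + timedelta(days=3), start + timedelta(days=3, hours=48)),
]
summary = dora_summary(deployments, period_days=7)
print(summary)
```

Reliability is left out of the sketch because, as noted above, it has no single quantifiable metric to compute.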

SPACE Framework

The SPACE framework takes a big picture view of developer productivity by considering 5 key dimensions:

Satisfaction and Well-being: Measures satisfaction, fulfillment, and well-being, both at work and off work.
Performance: Outcomes that the organization aims to reach or create value for customers and stakeholders.
Activity: Combines outputs that are countable, discrete tasks and the time it takes to complete them.
Communication and Collaboration: Represents the interactions, discussions, and other acts of collaboration that take place within teams.
Efficiency and Flow: Focuses on the ability of an engineer to complete work or make progress.

By taking these dimensions into consideration, the SPACE framework gives managers a holistic view of engineering team performance that enables them to make better decisions. At Multitudes, we like the SPACE framework because it encapsulates the performance aspects of the DORA framework while also acknowledging the importance of a psychologically-safe and trust-based working environment.

DevEx Metrics

The Developer Experience (DevEx) framework focuses on the lived experience of developers and the points of friction they encounter in their everyday work. The DevEx framework aims to provide a holistic view of the developer experience by considering 3 key dimensions:

Feedback Loops: The speed and quality of responses to engineer actions.
Cognitive Load: The mental effort required to perform tasks.
Flow State: The ability to work with focus and enjoyment.

In addition to these dimensions, the DevEx framework also emphasizes the importance of measuring overall KPIs, such as developer satisfaction and engagement, to gain a comprehensive view of the developer experience and guide improvement efforts.

3. Strategies to effectively measure engineering team performance

Software development is a type of knowledge work, so measuring it is not as simple as counting inputs in and widgets out. On top of that, the rise in remote and hybrid work has further complicated the picture. Given these challenges, how can we approach measuring engineering teams in a way that's both meaningful and beneficial?

Focus on Outcomes and Impact

Instead of fixating on effort or output, prioritize the engineering metrics that reflect the real value delivered to customers and the business. While these are highly specific to each team's use case, some examples include:

Customer satisfaction scores for new features
Revenue generated or costs saved by engineering initiatives
Improvements in key performance indicators (e.g., system reliability, performance)

We recommend deciding on these metrics as a group with relevant stakeholders (e.g., ICs, Team Leads and Product Managers).

Combine human metrics with engineering metrics

It can be easy to think of performance only as what work gets done. But since software development is knowledge work, it can only be done well if the people doing it are well. This is why human factors like Wellbeing and Collaboration are important for rounding out the performance metrics.

By examining communication patterns and interactions, organizations can identify factors that boost or hinder team performance. For example, Multitudes can help you measure:

Out-of-hours work — how often people are working outside of their own preferred working hours
PR Participation Gaps — the absolute difference between the most and least frequent commenters on the team
Feedback Flows — How much feedback each person gave on other people’s PRs and how feedback flows between people
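
To show what these definitions look like as calculations, here is a minimal Python sketch based only on the descriptions above. It is not how Multitudes implements these metrics. The data, the names, and the choice to express out-of-hours work as a share of activity timestamps are assumptions for illustration, and time zones are ignored.

```python
from collections import Counter
from datetime import datetime, time

# Hypothetical review comments as (commenter, PR author) pairs
comments = [
    ("ana", "ben"), ("ana", "cy"), ("ana", "ben"),
    ("ben", "ana"), ("cy", "ana"), ("ana", "cy"),
]
team = ["ana", "ben", "cy", "dee"]


def feedback_flows(comments):
    """Count comments for each (commenter, author) pair, skipping self-comments."""
    return Counter((c, a) for c, a in comments if c != a)


def participation_gap(comments, team):
    """Absolute difference between the most and least frequent commenters."""
    given = Counter(c for c, a in comments if c != a)
    counts = [given.get(member, 0) for member in team]  # no comments counts as 0
    return max(counts) - min(counts)


def out_of_hours_share(timestamps, start, end):
    """Fraction of one person's activity outside their preferred hours."""
    if not timestamps:
        return 0.0
    outside = [t for t in timestamps if not (start <= t.time() < end)]
    return len(outside) / len(timestamps)


# Hypothetical activity for one person who prefers to work 09:00 to 17:30
activity = [
    datetime(2024, 3, 4, 10, 0),
    datetime(2024, 3, 4, 14, 30),
    datetime(2024, 3, 4, 21, 15),
    datetime(2024, 3, 5, 7, 45),
]

print(feedback_flows(comments)[("ana", "ben")])
print(participation_gap(comments, team))
print(out_of_hours_share(activity, time(9, 0), time(17, 30)))
```

In this made-up data, one person gives most of the feedback and one gives none, which is the kind of pattern worth raising in a retro rather than treating as a score.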

Combining performance and human metrics gives companies a more comprehensive view of their development teams including insights into workplace stress, psychological safety, and team cohesion.

For instance, analyzing how team members communicate can reveal bottlenecks or effective practices that aren't apparent from performance data alone. To provide a real-world example, this is how Multitudes helped Octopus Deploy ship 47% more PRs by reducing the feedback burden on a principal engineer by 89%.

Incorporate Context

Numbers don't tell the whole story. What they can do is give you a great starting point for conversations. Regular check-ins, retrospectives, and peer feedback can provide valuable insights into team dynamics, individual growth, and areas for improvement that metrics alone might miss.

We also recommend leaving room for annotations and context notes alongside the data, so the human context stays next to the numbers (which you can easily do inside the Multitudes platform).

Individual contributor vs. Team Metrics

While individual contributions are important, we recommend emphasizing team-level metrics to encourage collaboration and shared responsibility. The reality is that the best software is built by cohesive, high-performing teams. PRs are a team sport.

When scrutinizing the output of single developers, it becomes difficult to assign credit or blame because software development inherently depends on teamwork. In fact, excessive focus on individual performance can actually make the overall performance worse – imagine the brilliant jerk who gets their work done quickly, but at a cost to the rest of the team.

That’s why it’s so important to look at team performance – because that’s what we want to optimize for. In fact, a two-year study by Google called Project Aristotle found that the number one thing for improving performance is building strong team factors, particularly psychological safety.

For all these reasons, we focus on team performance over individual performance.

Use Metrics for Exploration, Not Accusation

Instead of using productivity metrics as a tool for reward or punishment, use them as a starting point for conversations about process improvements, resource allocation, and professional development.

However, it is important to note that metrics only capture a portion of the big picture. A big problem we see in the industry is when leaders use reductive measures to evaluate engineering performance, which tends to cause harm. We’ve all read about the CEO who stack-ranked engineers on lines of code and fired the bottom quartile.

That’s why Multitudes puts limits on what we measure in line with our data ethics principles. Ultimately, improving engineering team performance requires a holistic approach that goes beyond what can be measured alone. Clear communication, effective prioritization, knowledge sharing, targeted training, and maintaining psychological safety all cultivate high-performing teams.

4. Software to measure Engineering Team Performance

To effectively measure engineering team performance, teams can use Multitudes, an engineering insights platform for sustainable delivery. Multitudes integrates with your existing development tools, such as GitHub and Jira, to provide insights into your team's productivity and collaboration patterns.

With Multitudes, you can:

Automatically track all key engineering performance metrics like Change Lead Time and Deployment Frequency
Get visibility into work patterns and where the team’s time went, e.g. into feature work vs. maintenance work and bug fixes
Identify collaboration patterns and potential knowledge silos within your team
Understand individual and team health through metrics like out-of-hours work, incidents, and meetings
Get nudges via Slack about blocked work and who might need more support, sent just in time for your next stand-up, retro, or 1:1

By leveraging Multitudes, teams can spend more time acting on insights to improve their productivity and satisfaction.

Ready to unlock happier, higher-performing teams?