
Lines of Code (LOC) - how NOT to measure developer productivity


“Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”

—Bill Gates

Lines of code come in many forms — but regardless of how you measure it, focusing on LOC can create misaligned incentives and hide what really matters: delivering value to users.

In this guide, we'll explore why LOC falls short as a productivity metric, examine its limitations, and share more effective ways to measure developer productivity.

Whether you're an engineering leader looking to improve team performance or a developer interested in productivity metrics, you'll learn why moving beyond LOC is crucial for modern software development.

Sections:

  1. What are Lines of Code (LOC)?
  2. History of Lines of Code and measuring developer productivity
  3. Lines of Code as a productivity metric: a detailed breakdown
  4. Alternative approaches to measuring developer productivity
  5. Tools for holistically measuring developer productivity

1. What are Lines of Code (LOC)?

Lines of Code (LOC) measures the size of software projects by counting lines in a program's source code. This widely-used engineering metric encompasses all text within source files, including executable statements, declarations, comments, and blank lines – though specific counting methods may vary.

Engineers typically work with two distinct LOC measurements:

  1. Logical LOC (often called logical SLOC, for Source Lines of Code) focuses specifically on functionality. This measurement excludes comments, blank lines, and non-executable statements to capture only code that contributes to the software's actual operation.
  2. Physical LOC provides a literal measurement by counting every line in the source code, including comments and blank lines. This gives teams a straightforward view of total codebase size.

Let's look at a Python example that demonstrates the difference between logical and physical LOC:

# Calculate time between code review request and response
def get_review_time(start, end):
    if not start or not end:
        return 0
    return end - start

result = get_review_time(9, 10)


This code contains:

  • 5 logical LOC (function definition, if statement, first return, second return, and assignment)
  • 7 physical LOC (includes the comment and blank line)

The example shows why logical and physical LOC can tell different stories about code size. While physical LOC counts every line, logical LOC focuses only on the lines that execute functionality.
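
To make the distinction concrete, here is a minimal Python sketch of a naive LOC counter (written for this guide, not taken from any particular tool). It counts every line as physical LOC and, as a rough proxy for logical LOC, skips blank lines and full-line comments; real counters use language-aware parsing, so treat this as an approximation:

# Naive LOC counter: physical LOC counts every line; as a rough proxy
# for logical LOC, skip blank lines and full-line comments.
def count_loc(source):
    lines = source.splitlines()
    physical = len(lines)
    logical = sum(
        1 for line in lines
        if line.strip() and not line.strip().startswith("#")
    )
    return physical, logical

example = """# Calculate time between code review request and response
def get_review_time(start, end):
    if not start or not end:
        return 0
    return end - start

result = get_review_time(9, 10)"""

print(count_loc(example))  # prints (7, 5), matching the counts above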

2. History of Lines of Code and measuring developer productivity

A classic example from computing history illustrates the limitations of LOC. In 1982, Apple Lisa's engineering managers began tracking LOC contributions per developer. During one notable week, Bill Atkinson, the lead user interface designer, optimized QuickDraw's region calculation system by removing 2,000 lines of code. His LOC was -2000!

After this demonstration of how code reduction could represent significant improvement, management abandoned their LOC tracking approach.

However, the urge to measure developer productivity remains. Understanding developer productivity isn't about surveillance or individual performance tracking — it's about enabling teams to do their best work. Organizations measure productivity to uncover opportunities for improvement and ensure they're supporting their developers effectively.

Key motivations include:

  • Identifying where teams need better tools or resources
  • Understanding the impact of process changes
  • Discovering and removing delivery bottlenecks
  • Learning from successful practices across teams
  • Making informed decisions about technical investments

Different stakeholders have different needs when it comes to productivity data. A team lead might want to spot collaboration patterns, while engineering leaders might need to evaluate the impact of new development practices. The key is choosing metrics that align with your goals and provide actionable insights at the right level - whether that's team, department, or organization-wide.

3. Lines of Code as a productivity metric: a detailed breakdown

It's clear why one might consider using LOC. It's simple: easy to measure, easy to understand, and easy to compare across people and projects.

The problem?

It's not measuring the right thing, and even worse, it drives the completely wrong set of behaviors. A big problem we see in the industry is leaders using reductive measures to evaluate engineering performance, which tends to cause harm. We've all read about the CEO who stack-ranked engineers on lines of code and fired the bottom quartile.

Below is a synthesis of the key ideas from the book Rethinking Productivity in Software Engineering, written by Google and Microsoft researchers.

LOC is a single, narrow, and reductive metric, but productivity is broad

The challenge with measuring developer productivity lies in its inherent complexity and multifaceted nature. No single metric can adequately capture the full spectrum of developer work, which spans far beyond code contributions to include system design, code review, mentoring, and cross-team collaboration.

Even when focusing solely on code contributions, crucial aspects like code quality, maintainability, and long-term impact resist simple quantification. Attempting to flatten these diverse elements into a single measurement creates misleading results. For instance, how do you compare a developer who writes less code but maintains high quality against one who produces more code that requires later refinement?

Additionally, combined metrics often become abstract and less actionable. When multiple factors like complexity, completion time, and test coverage are compressed into a single score, the resulting number loses practical meaning and fails to provide clear guidance for improvement. Effective measurement requires acknowledging this complexity and adopting a more nuanced, multi-dimensional approach to understanding developer productivity.

LOC is gameable and incentivizes poor quality code

Steve Jobs said, "Quality is more important than quantity. One home run is much better than two doubles." While true in many other areas of life, this is especially true in software. Consider the following simple example:

A pretty decent solution:

function isDayOfWeek(day) {
  const days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"];
  return days.includes(day);
}

But if LOC is your end goal, an engineer might instead code up the below. Does this make them a better engineer?

function isDayOfWeek(day) {
    if (day === "Monday") {
        return true;
    }
    if (day === "Tuesday") {
        return true;
    }
    if (day === "Wednesday") {
        return true;
    }
    if (day === "Thursday") {
        return true;
    }
    if (day === "Friday") {
        return true;
    }
    if (day === "Saturday") {
        return true;
    }
    if (day === "Sunday") {
        return true;
    }
    return false;
}

Long and complex code introduces business risk

Code complexity and refactoring needs demonstrate another key limitation of LOC as a productivity metric. A larger codebase doesn't necessarily mean better software; in fact, it often increases cognitive complexity, making the code harder to understand and maintain.

This complexity introduces significant business risks. When engineers leave your organization, they take their understanding of complex code with them. New team members then face steeper learning curves and spend more time deciphering intricate code structures, leading to slower delivery and increased likelihood of errors.

Refactoring - the practice of restructuring code to improve its clarity without changing functionality - often reduces total lines while enhancing code quality and maintainability.

However, teams measured by LOC might avoid this practice, as it would appear to "reduce productivity" even though it's strengthening the codebase's long-term health.
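
To see how this plays out in code, here is a hypothetical micro-example (invented for illustration, not drawn from any real codebase). Both versions behave identically, yet a team graded on LOC is rewarded for writing the first and penalized for refactoring it into the second:

# Before the refactor: 5 physical lines
def is_even(n):
    if n % 2 == 0:
        return True
    else:
        return False

# After the refactor: 2 physical lines, identical behavior
def is_even(n):
    return n % 2 == 0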

4. Alternative approaches to measuring developer productivity

We believe in using a multitude of different metrics to capture a nuanced idea like developer productivity. While research shows there are now 700-800 different metrics in use across the industry, this is our philosophy:

  • Focus on building trust first
  • Start with a small, manageable selection of metrics. One or two alone won’t be a true reflection of the trade-offs we have to make
  • Don’t agonize about choosing perfect metrics. Metrics will evolve over time based on priorities — what’s important is leaving plenty of opportunity for team feedback and changes along the way
  • Use research-backed metrics (such as the DORA metrics) because they bring more evidence of measuring the right things

Among the recognized frameworks used for measuring engineering team performance are the DORA metrics, the SPACE (Satisfaction, Performance, Activity, Communication, and Efficiency) framework, and the Developer Experience (DevEx) framework. You can read more about what we measure and why here.

Each framework presents its own lens through which engineering team performance can be viewed:

  • DORA places emphasis on quality and velocity, which are in tension with each other. Optimizing DORA requires balance: teams that laser-focus on velocity lose quality, and vice versa.
  • SPACE offers a more holistic approach including well-being measures of engineers on top of performance metrics.
  • DevEx defines the dimensions of the lived experience of developers and the points of friction encountered in their everyday work.

DORA Metrics

DORA metrics have become an accepted industry benchmark for developer productivity, as they are based on research spanning over 39,000 professionals across organizations of all sizes and industries. These metrics have proven to be reliable predictors of organizational performance. They have also evolved over time, with definitional updates and the introduction of a fifth metric in 2024 (a rough sketch of how some of these can be computed follows the list):

  • Change Lead Time: The time it takes to go from first commit to code successfully running in production.
  • Deployment Frequency: How often an organization deploys code to production or releases it to end users.
  • Failed Deployment Recovery Time (Formerly Mean Time to Recovery): The time it takes to restore service when a deployment causes an outage or service failure in production (whereas previously Mean Time to Recovery also included uncontrollable failure events such as an earthquake disrupting service).
  • Change Failure Rate: The percentage of changes that result in degraded service or require remediation (e.g., that lead to service impairment or outage, and require a hotfix, rollback, fix forward, or patch).
  • Rework Rate: This fifth metric was introduced in 2024 and, together with Change Failure Rate, provides an indicator of software delivery stability. Since it's a newer addition, there aren't yet established quantifiable benchmarks, so this metric tends to receive less focus.
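
To make the first metrics concrete, here is a rough Python sketch over a handful of hypothetical deployment records. The data and tuple layout are invented for illustration; in practice these values come from CI/CD and incident tooling, and recovery time would additionally require incident timestamps:

from datetime import datetime
from statistics import median

# Hypothetical records: (first commit time, production deploy time,
# whether the change degraded service). Invented data, for illustration only.
deployments = [
    (datetime(2024, 6, 1, 9, 0), datetime(2024, 6, 2, 15, 0), False),
    (datetime(2024, 6, 3, 10, 0), datetime(2024, 6, 3, 17, 0), True),
    (datetime(2024, 6, 4, 8, 0), datetime(2024, 6, 5, 12, 0), False),
    (datetime(2024, 6, 5, 14, 0), datetime(2024, 6, 6, 9, 0), False),
]

# Change Lead Time: first commit to code running in production (median)
lead_time = median(deploy - commit for commit, deploy, _ in deployments)

# Deployment Frequency: deploys per day over the observed window
days = max((deployments[-1][1] - deployments[0][1]).days, 1)
frequency = len(deployments) / days

# Change Failure Rate: share of deployments that degraded service
failure_rate = sum(1 for _, _, failed in deployments if failed) / len(deployments)

print(f"Median lead time: {lead_time}")
print(f"Deployment frequency: {frequency:.2f} per day")
print(f"Change failure rate: {failure_rate:.0%}")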

SPACE Framework

The SPACE framework takes a big-picture view of developer productivity by considering five key dimensions:

  • Satisfaction and Well-being: Measures satisfaction, fulfillment, and well-being, both at work and off work.
  • Performance: The outcomes of work, such as whether it achieves organizational goals and creates value for customers and stakeholders.
  • Activity: A count of the discrete, countable outputs of work, along with the time it takes to complete them.
  • Communication and Collaboration: Represents the interactions, discussions, and other acts of collaboration that take place within teams.
  • Efficiency and Flow: Focuses on the ability of an engineer to complete work or make progress.

By taking these dimensions into consideration, the SPACE framework gives managers a holistic view of engineering team performance that enables them to make better decisions. At Multitudes, we like the SPACE framework because it encapsulates the performance aspects of the DORA framework while also acknowledging the importance of a psychologically safe and trust-based working environment.

DevEx Metrics

The Developer Experience (DevEx) framework focuses on the lived experience of developers and the points of friction they encounter in their everyday work. The DevEx framework aims to provide a holistic view of the developer experience by considering 3 key dimensions:

  • Feedback Loops: The speed and quality of responses to engineer actions.
  • Cognitive Load: The mental effort required to perform tasks.
  • Flow State: The ability to work with focus and enjoyment.

In addition to these dimensions, the DevEx framework also emphasizes the importance of measuring overall KPIs, such as developer satisfaction and engagement, to gain a comprehensive view of the developer experience and guide improvement efforts.

5. Tools for holistically measuring developer productivity

To holistically measure developer productivity, instead of relying on reductive measures like LOC, teams can rely on Multitudes, an engineering insights platform designed for sustainable delivery.

Multitudes integrates seamlessly with tools like GitHub and Jira, offering a comprehensive view of your platform’s technical performance, operational health, and team collaboration.

With Multitudes, you can:

  • Monitor key developer productivity metrics across DORA, SPACE, and DevEx
  • Uncover patterns in your metrics that influence delivery speed and quality
  • Measure the impact of collaboration on platform performance and developer productivity

By leveraging Multitudes, platform teams can spend less time on data collection and more time using actionable insights to improve platform performance.

Our clients ship 25% faster while maintaining code quality and team wellbeing.

Ready to improve your platform engineering metrics?

Try Multitudes today!

Contributor
Multitudes
Support your developers with ethical team analytics.