This is the second post in a series about using NDepend for code analysis. Below are the links to all posts in the series (the list will be updated as more posts are published):

In the first blog post I tried to describe how NDepend helps developers write better code and where it belongs in the code analysis tools ecosystem. Now it is time to dive a bit deeper and see what some of the built-in code quality rules can tell us. I will be analyzing the same 29-project .NET solution from the previous post and will keep using the standalone Visual NDepend application.

Dashboard

Let’s continue from where we stopped last time - that is, looking at the VS solution analysis report. The first thing to look at is the dashboard, which displays an overview of the code quality.

You immediately see the overall technical debt expressed as a Rating (a single letter in the A-E range) and also as the number of days required to fix all the detected issues. As the NDepend documentation states, this rating is calculated using the SQALE methodology, which stands for Software Quality Assessment based on Lifecycle Expectations. The total number of days is the sum of the costs of all detected code issues (as we will see later, every NDepend rule can define its own time-to-fix estimate). I think of this section as an “executive summary”, something you can show to your non-technical manager to quickly communicate the overall code quality status.

Let’s see what other insights we can get from all this.

Lines of code

It gets interesting right from the start: as it turns out, the # Lines of Code metric relies on information from .pdb files instead of sources, allowing it to count logical lines instead of physical ones. This is neatly explained in the documentation:

Computing the number of lines of code from PDB’s sequence points allows to obtain a logical LOC of code instead of a physical LOC (i.e. directly computed from source files). 2 significant advantages of logical LOC over physical LOC are:

  • Coding style doesn’t interfere with logical LOC. For example the LOC won’t change because a method call is spread over several lines because of a high number of arguments.

  • Logical LOC is independent from the language. Values obtained from assemblies written with different languages are comparable and can be summed.
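To make the first point more tangible, here is a hypothetical C# fragment (the method and variable names are made up): both calls below produce the same sequence points, so they contribute the same logical LOC despite very different formatting:

// Hypothetical fragment: formatting changes the physical LOC, not the logical LOC
var order = CreateOrder(customerId, productId, quantity, shippingAddress);

var sameOrder = CreateOrder(
    customerId,
    productId,
    quantity,
    shippingAddress);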

(As a side note, I really like reading these docs on the NDepend site - not only to learn the tool itself, but also to better understand the methodology and the details of the .NET build process. For instance, I learned that a typical ratio of logical to physical LOC for .NET languages is about 1:7. It lets you look at your source code from a slightly different perspective: we spread the actual meaningful code vertically on the screen and make it about 7 times longer just for the sake of readability. Perhaps the ratio varies per language, but it will surely always be greater than 1 because of fundamental typography principles: humans really need space on pages and screens to be able to read text efficiently.)

Also, it shows that there are 243 lines which are “not my code”. What does that mean? Luckily, everything in Visual NDepend is clickable and interactive, so I can easily see what is included in this metric. Since the tool counts logical lines of code, it includes both the code written by a developer and the code generated by the compiler. The most common examples are default public constructors for classes and backing fields for auto-properties:
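For example, a class as trivial as this hypothetical one still makes the compiler generate members on our behalf:

// Hypothetical example: no constructors or fields are written by hand here,
// yet the compiler emits a default public constructor and a backing field
// for the auto-property - these are the lines NDepend counts as "not my code"
public class Customer
{
    public string Name { get; set; }
}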

Obviously, this code exists in the resulting assembly, but it isn’t written or maintained by a developer and is therefore excluded from the maintenance cost calculation.

Types

Some statistics about the number of types/assemblies/methods are also available, which gives a rough idea of how big the projects are. For this solution, there are on average 109 types per assembly. Comparing the number of types (2518) with the number of source files (2494) tells us something about the coding style of the team: since the numbers are pretty close, it looks like there was an agreement to almost always put every type in a separate file (a generally accepted exception is keeping both generic and non-generic versions of a type together in a single file, as in the sketch below).
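As a made-up illustration of that exception, a single file can legitimately contain two “versions” of the same type:

// Maybe.cs (hypothetical): the non-generic helper and the generic type
// conventionally live together in one file
public static class Maybe
{
    public static Maybe<T> From<T>(T value) => new Maybe<T>(value);
}

public class Maybe<T>
{
    public Maybe(T value) => Value = value;
    public T Value { get; }
}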

Cyclomatic complexity

We also see that the average Method complexity is 1.49, while the maximum is 36. In this case NDepend has computed the so-called “cyclomatic complexity” across the methods (all 9483 of them, as the dashboard helpfully shows). Cyclomatic complexity is an interesting and non-trivial topic, which I won’t cover here. Here’s the easiest way to think about this metric: it is the number of different execution paths through a method. No conditionals at all? Then the complexity is 1. One if statement? Now there are two possible paths, so the complexity jumps to 2. A case statement or a null-coalescing operator? Even more possible paths. This documentation page explains how exactly NDepend calculates these numbers.
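Here is a small made-up method to illustrate the counting (the Order type and its properties are hypothetical):

// Hypothetical example: every decision point adds an execution path
public string DescribeOrder(Order order)
{
    if (order == null)                  // second path
        return "no order";

    return order.IsRush                 // third path
        ? "rush order"
        : order.Items.Count > 0         // fourth path
            ? "standard order"
            : "empty order";
}
// straight-line code scores 1; this method scores roughly 4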

In a way, if unit tests are present and are properly written, the cyclomatic complexity of a method will correlate with the number of tests covering it, because every possible execution path requires a separate unit test. This is why this metric can be such a good indication of code maintainability and cost of change. (For a more detailed discussion of cyclomatic complexity I recommend an excellent article by Erik Dietrich published on NDepend’s blog.)
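To stay with the hypothetical DescribeOrder method sketched above (assuming it lives on a made-up OrderFormatter class), full coverage means one test per execution path - for example, in xUnit:

using Xunit;

public class OrderFormatterTests
{
    private readonly OrderFormatter _sut = new OrderFormatter();

    [Fact]
    public void Null_order_is_reported_as_missing()
        => Assert.Equal("no order", _sut.DescribeOrder(null));

    [Fact]
    public void Rush_orders_are_labelled_as_rush()
        => Assert.Equal("rush order", _sut.DescribeOrder(new Order { IsRush = true }));

    // ...plus two more tests for the "standard" and "empty" paths
}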

Now, out of curiosity I had a closer look at the method with the complexity of 36. It’s kind of funny: at first glance, it contains only a single statement, which returns a dictionary:

private Dictionary<string, string> BuildPropertyMapper(SomeType someType, SomeOtherType someOtherType)
{
    return new Dictionary<string, string>
    {
        {"Weird.Nested.PropertyName", someType == null ? SomeConstant : GetNamePart(someType.StringProperty)},
        // 39 more lines similar to the one above
    };
}

Huh? Isn’t it just one “logical line of code” then? No, of course not. This is an overly creative use of the collection initializer syntax combined with the ternary conditional operator, which only makes the method look simple. In reality, all this syntactic sugar is converted by the C# compiler into a series of Dictionary<TKey,TValue>.Add(TKey, TValue) method calls and ordinary IL branch instructions for the null checks (have a look at this answer on StackOverflow if you are curious what it compiles into). This is one of those examples of deceptively compact code, which only hides complexity but doesn’t really deal with it.
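Roughly (and in simplified form), this is what the compiler lowers the method body into:

// Simplified sketch of what the compiler generates for the initializer above
var mapper = new Dictionary<string, string>();

// every entry becomes an explicit Add call,
// and every ternary becomes a branch at the IL level
string value;
if (someType == null)
    value = SomeConstant;
else
    value = GetNamePart(someType.StringProperty);
mapper.Add("Weird.Nested.PropertyName", value);

// ...and so on for each of the remaining entries
return mapper;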

Now, before you say that it’s too easy to criticize someone else’s code without knowing the context, I have a confession to make…

I wrote this method.

¯\_(ツ)_/¯

Source control history never lies, and in this case it clearly shows that it was me trying to show off my C# skills about 3.5 years ago. Today I wouldn’t consider this code to be especially smart or maintainable and would prefer it simpler and more explicit. Well, you don’t learn without making lots of mistakes.

Rules and quality gates

There are also Quality Gates, Rules and Issues displayed on the dashboard. This may look confusing, so once again I peek into the documentation:

  • Quality Gates represent a synthesized way to know if the team can release to production, through a simple PASS / FAIL approach.

  • Rules represent a detailed way to assess the quality of a code base through an issue / technical-debt approach.

The way I understand it, rules are the actual code quality checks which, when violated in specific pieces of code, are registered as issues. On top of that, quality gates define higher-level aggregated checks that can be used to quickly determine the overall status of the code and make a “go/no-go” decision. Rules have different severities, so I would always start the investigation with the most important violations. You can review the “critical” rules and their violations in the Queries and Rules explorer window, which displays them nicely grouped by rule type:

The picture above has the word “query” all over the place, but what does it actually mean? What is being queried? This is a perfect moment to take a deeper look at the underlying code-querying infrastructure of NDepend, which powers its rich reporting capabilities.

CQLinq

Under the hood, every rule and quality gate in NDepend is a query executed against the project codebase. The actual sources are the .NET assemblies (for analyzing the compiled IL code), the source code itself (e.g. for finding possible issues with file paths or excluding certain files from analysis) and the *.pdb files (for making the connection between the compiled IL code and the source files). NDepend exposes a lot of this data through several IQueryable implementations, allowing you to query it with LINQ in an intuitive and developer-friendly manner.

Let’s pick a simple example, a very common mistake made in .NET code: public constants. The corresponding built-in rule in NDepend is called Avoid publicly visible constant fields and, when double-clicked, will show the underlying query in the editor:
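Approximately (the exact debt and severity expressions of the built-in rule may differ slightly from this reconstruction), the query reads:

// <Name>Avoid publicly visible constant fields</Name>
warnif count > 0
from f in Application.Fields
where f.IsLiteral &&                 // declared as const
      f.IsPubliclyVisible &&         // visible outside its assembly
      !f.ParentType.IsEnumeration    // enum members are const fields too - skip them
select new
{
  f,
  f.ParentType
}
// the built-in version also estimates the technical debt (about 30 seconds
// per field) and marks each match as an issue of Medium severity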

In my opinion, this is so readable that you can restate it as instructions in almost natural language:

Warn me if there are violations of this rule.

For every class field in my code,

Which has been declared as constant, is publicly visible, and is not an enumeration,

Show me the field itself, a technical debt estimation (assuming the fix takes 30 seconds per field), and treat it as an issue of “medium” severity.

Simple and beautiful. What I also like is that every built-in rule contains a Description and a How to fix section: you are informed about a potential issue and then guided towards a suggested solution.

To complete the picture, I would add that this public const issue is well-known, so SonarQube has a rule for it and Microsoft offers similar advice:

Use caution when you refer to constant values defined in other code such as DLLs. If a new version of the DLL defines a new value for the constant, your program will still hold the old literal value until it is recompiled against the new version.
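A hypothetical example of that pitfall:

// In LibraryA, version 1.0 (hypothetical):
public static class ApiDefaults
{
    public const int TimeoutSeconds = 30;        // baked into consumers at compile time
    public static readonly int RetryCount = 3;   // read from LibraryA at runtime
}

// In a consuming assembly:
var timeout = ApiDefaults.TimeoutSeconds;   // compiled as the literal 30
var retries = ApiDefaults.RetryCount;       // compiled as a field load

// If LibraryA 2.0 changes TimeoutSeconds to 60 and only the DLL is swapped,
// the consumer keeps using 30 until it is recompiled, while RetryCount picks
// up the new value immediately - hence the preference for static readonly.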

Custom rules

The minute I realized that all built-in rules are just CQLinq queries, I knew I wanted to write my own. Let’s try it out!

Even though NDepend is a very advanced tool and can help detect really complex code issues, nothing prevents us from using it for much simpler checks - for example, naming conventions. For the sake of demonstration, I decided to write a rule verifying that all method names start with a capital letter (a widely accepted convention for C#). It took me about 5 minutes to finish, from clicking the Rules > New > Code Rule menu item until I was able to see all 171 violations in the solution under analysis. Most of that time was spent using IntelliSense in the query editor window to filter out special methods: constructors, auto-property accessors, indexer accessors, and operator overloads are all turned into methods by the C# compiler, but we don’t really care about their runtime names. So, here is the query I ended up with:

// <Name>Method names should start with a capital letter</Name>
warnif count > 0
from m in JustMyCode.Methods
// skip constructors, property/indexer accessors and operator overloads:
// the compiler turns these into methods with special runtime names
where !m.IsClassConstructor &&
      !m.IsConstructor &&
      !m.IsPropertyGetter &&
      !m.IsPropertySetter &&
      !m.IsIndexerGetter &&
      !m.IsIndexerSetter &&
      !m.IsOperator &&
      !Char.IsUpper(m.Name[0])
select new
{
  m,
  m.Name
}

That was easy and, once again, turned out to be very readable and intuitive.

Conclusion

At this point I realized that I hadn’t made it further than the Dashboard view of NDepend and had barely scratched the surface of all the things you can do with it. Exposing the codebase metadata via LINQ provides almost unlimited power for analyzing code quality. The more I think about it, the more I see NDepend not just as a tool, but as a framework for building a tailored code analysis experience that makes sense for your project and your team.