Question about detected function length #8

jdhitsolutions · 2017-10-10T15:55:56Z

I also have a question on how you determine what a proper function length is. From your blog post I appreciate the notion that sometimes you need to refactor. But sometimes a function is long by design. For example, in a function I tested I get 78 lines of code. But if you don't count commented lines or my lines of Write-Verbose (or someone might use Write-Debug), then I'd get a line count of 60, which your analyzer would report as "better". For that matter, if I don't count white space, my function gets down to 46 lines of actual code that does something.

I completely understand where you are going with this metric but I'm trying to figure out how it is being determined and if that is the "best" way.

jdhitsolutions · 2017-10-10T16:08:37Z

I read through your documentation on the why. I guess my question is whether your how is the best way.

MathieuBuisson · 2017-10-12T11:00:07Z

Hi Jeffery @jdhitsolutions ,

Here is how Get-FunctionLinesOfCode currently works :

It includes lines with a single Write-* command
It excludes all lines of comment(s)
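
For illustration, the rule described above (comment-only lines excluded, Write-* lines and blank lines counted) could be sketched roughly as follows. This is a minimal sketch and not the actual Get-FunctionLinesOfCode implementation: the name Measure-SketchLinesOfCode and the sample function are hypothetical, and it assumes PowerShell 5 or later.

```powershell
# Hypothetical helper, NOT the actual Get-FunctionLinesOfCode implementation.
function Measure-SketchLinesOfCode {
    param(
        [Parameter(Mandatory)]
        [string]$Code
    )

    # Tokenize the text with the built-in PowerShell parser.
    $tokens = $null
    $parseErrors = $null
    $null = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref]$tokens, [ref]$parseErrors)

    $commentLines = [System.Collections.Generic.HashSet[int]]::new()
    $codeLines    = [System.Collections.Generic.HashSet[int]]::new()
    $ignoredKinds = 'NewLine', 'EndOfInput', 'LineContinuation'

    foreach ($token in $tokens) {
        if ($token.Kind -in $ignoredKinds) { continue }

        # A token (block comment, here-string) may span several lines.
        $first = $token.Extent.StartLineNumber
        $last  = $token.Extent.EndLineNumber
        foreach ($n in $first..$last) {
            if ($token.Kind -eq 'Comment') { $null = $commentLines.Add($n) }
            else { $null = $codeLines.Add($n) }
        }
    }

    # Lines holding only comments are excluded; blank lines and Write-* lines stay.
    $commentOnly = @($commentLines | Where-Object { -not $codeLines.Contains($_) }).Count
    $totalLines  = ($Code -split '\r?\n').Count
    $totalLines - $commentOnly
}

# Hypothetical sample: 6 physical lines, 1 of them comment-only.
$sample = @'
function Get-Thing {
    # A comment line
    Write-Verbose 'Starting'

    Get-Date   # trailing comment
}
'@

Measure-SketchLinesOfCode -Code $sample
```

Under these assumptions the sample should count as 5 of its 6 physical lines: only the comment-only line is dropped, while the line with a trailing comment still counts because it also contains code.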

The reasoning behind this comes from how other code metrics tools work, for example NDepend.
It tries to follow this statement :

"Only concrete code that is effectively executed is considered when computing LOC"

But this is arguable for many reasons :

Get-FunctionLinesOfCode is counting blank lines
Tools like NDepend look at compiled code, so maybe it doesn't make sense for PowerShell. Also, the main purpose of PSCodeHealth is to assess the human effort required to maintain the code, so the source code matters more than what is compiled/executed.
To decide which lines should be included/excluded in this metric, we should first decide what we are trying to measure with this metric : to me, Lines of Code is mainly about readability and complexity.
Comments (especially inline comments) can actually degrade readability.
Comments (especially inline comments) can be considered as a "code smell" : a sign that the code is lacking in clarity or expressiveness.
If we acknowledge that, we should probably include comments into this metric, as a kind of "inline comments penalty".
On the other hand, blank lines tend to increase readability (unless there are way too many of them), so they should probably be excluded from the line count.
This metric can be gamed very easily : we can "improve" it by squishing multiple lines into 1, for example. Does it improve readability ? Most likely not. Does it improve maintainability ? Most likely not. But any metric can be gamed, so developers should keep in mind what the metric attempts to measure, not exclusively the metric itself.
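
To make the include/exclude question concrete, here is a rough sketch of how much the reported number moves depending on the policy chosen. It is deliberately naive and is not PSCodeHealth code: the function name and its switches are hypothetical, it only recognizes single-line `#` comments (block comments are ignored), and it only treats Write-Verbose and Write-Debug as "Write-* lines".

```powershell
# Hypothetical, line-based sketch; the switches are illustrative policies only.
function Measure-SketchLineCount {
    param(
        [Parameter(Mandatory)]
        [string]$Code,
        [switch]$ExcludeComments,
        [switch]$ExcludeBlankLines,
        [switch]$ExcludeWriteCommands
    )

    $lines = @($Code -split '\r?\n')

    if ($ExcludeComments) {
        # Assumption: only whole-line "#" comments are detected.
        $lines = @($lines | Where-Object { $_ -notmatch '^\s*#' })
    }
    if ($ExcludeBlankLines) {
        $lines = @($lines | Where-Object { $_ -notmatch '^\s*$' })
    }
    if ($ExcludeWriteCommands) {
        # Assumption: only lines starting with Write-Verbose or Write-Debug.
        $lines = @($lines | Where-Object { $_ -notmatch '^\s*Write-(Verbose|Debug)\b' })
    }

    $lines.Count
}

# Hypothetical sample function with 6 physical lines.
$sample = @'
function Get-Thing {
    # Explain the next step
    Write-Verbose 'Starting'

    Get-Date
}
'@

Measure-SketchLineCount -Code $sample
Measure-SketchLineCount -Code $sample -ExcludeComments
Measure-SketchLineCount -Code $sample -ExcludeComments -ExcludeBlankLines
Measure-SketchLineCount -Code $sample -ExcludeComments -ExcludeBlankLines -ExcludeWriteCommands
```

The same function shrinks at each step without a single line of it changing, which is the same effect as the 78 / 60 / 46 figures from the original question.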

For all these reasons, I'm happy with NOT deciding alone and arbitrarily what is included/excluded in the Lines of Code metric. I think it belongs to the PowerShell community to decide that, and once we have some kind of consensus, I would be very happy to change the code in PSCodeHealth to reflect that.

That's why I'm glad you raised this subject and your opinion/suggestions are very welcome.

Does this answer your question(s) ?
Do you have suggestions on what should be included/excluded in the line count ?

jdhitsolutions · 2017-10-12T13:31:13Z

That helps. I might have to disagree about comments. Properly inserted code comments are valuable and something the PowerShell community recommends. Remember, many people writing or using PowerShell modules aren't developer oriented and using comments is considered a best practice. I would never assume a lot of comments means the code is poor. I would not penalize for it.

The whole "a shorter function is better" metric seems completely arbitrary. If I need 200 lines, excluding any comments and blank lines, to properly achieve a result, and my code is well formed, then that's what it needs. My function is completely healthy.

Or here's another illustration. Suppose I need to include a hashtable in my code. It could be written like this (using simple values to save some typing):

$h = @{a=1;b=2;c=3;d=4;e=5}

With real values this could be a very long line and something that I find hard to read. I prefer and recommend this:

$h = @{
  a=1
  b=2
  c=3
  d=4
  e=5
}

Obviously I've increased the number of lines of code but I would say this is "healthier" because it is easier to read and troubleshoot.
As you pointed out, what we are really talking about is source code, which has to be treated differently.
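
As a rough illustration of the hashtable example, the sketch below parses both layouts and reports the physical line count next to the number of key/value pairs the parser sees. The helper name Get-SketchHashtableShape is hypothetical and this is not something PSCodeHealth does; it only shows that the line count changes while the parsed content stays the same.

```powershell
# Hypothetical helper: compares physical lines with what the parser sees.
function Get-SketchHashtableShape {
    param(
        [Parameter(Mandatory)]
        [string]$Code
    )

    $tokens = $null
    $parseErrors = $null
    $ast = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref]$tokens, [ref]$parseErrors)

    # Find the first hashtable literal anywhere in the parsed text.
    $hashtable = $ast.FindAll(
        { param($node) $node -is [System.Management.Automation.Language.HashtableAst] },
        $true) | Select-Object -First 1

    [pscustomobject]@{
        PhysicalLines = ($Code -split '\r?\n').Count
        KeyValuePairs = $hashtable.KeyValuePairs.Count
    }
}

$oneLine = '$h = @{a=1;b=2;c=3;d=4;e=5}'

$multiLine = @'
$h = @{
  a=1
  b=2
  c=3
  d=4
  e=5
}
'@

Get-SketchHashtableShape -Code $oneLine
Get-SketchHashtableShape -Code $multiLine
```

Both layouts contain the same five key/value pairs, but the first occupies 1 physical line and the second 7, so a purely line-based metric scores the more readable version as "worse".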
