Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Static code analysis - naming conventions #86

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

sagatowski
Copy link
Contributor

This pull request serves dual purpose:

  1. Implement SLAC naming conventions into the lcls-twincat-general library
  2. Initiate a discussion of static code analysis for TwinCAT software at SLAC

The TL;DR of this PR is that static code analysis is easy to use once it’s in place, but hard to implement if it’s on an existing project.

The (slightly longer version:
What follows are some findings I did while trying to implement the static code analysis into this library.
First of all, when I’m talking about static code analysis in this context I’m referring to the static code analysis tool TE1200 that Beckhoff provides. If you don’t have a license for TE1200, it’s still possible to use the static code analysis light, which provides a small subset of the static code analysis rules. The light version does not provide any help with naming conventions.

In the current state of the library, static code analyser rules + naming conventions have been enabled.
These are however not consistent with the naming conventions that are documented. For example the library uses:
x as naming convention for starting Booleans, while the documentation says that b should be used
P_ as naming convention for start declaration of programs, while the documentation says nothing should be used

I’m assuming that whatever is in the library is incorrect and was simply a test to try out the TE1200 tool.

The TE1200 can be executed in two different ways. One is simply execute the static code analysis, which will work the very same way as building a project. This means that the tasks are the entry point, and then the static code analysis will execute on all objects referenced by tasks/programs. The other alternative is to execute the TE1200 on all objects, which works the very same way as to try to build all objects, independently of whether they are referenced by a task or not. I used the second option for everything, which I assume will always be the case for libraries.

First thing I did was to update the naming convention rules in the project so it’s in accordance with the documentation. Next, I simply tried to execute the static code analyser. I immediately reached the limit of the tests (which is set to ~500). After this number, the TE1200 simply doesn’t continue doing checks. Some of the errors were legitimate, while others were simply wrong and what I would simply call bugs with TE1200. I wrote about these type of bugs over 5 years ago on my blog. As always, Beckhoff tries to fix the bugs, but as there are no release notes (ugh), only the deep dungeons of Beckhoff-Verl know if/when things are fixed. As this library is developed using an slightly older version of the XAE (3.1.4024.35), this means that not all bug-fixes are there either. The TE1200 is integrated into the XAE, and thus it can’t be updated separately.
If I went through all these errors, I would never be able to finish with something in a sensible amount of time. Given the above, I’m not entirely sure it’s even worth it. In either way, in the current state the static code analyser is too buggy. At least to use it for all checks. Maybe it will be better in 4026, maybe not.

So I thought that the reasonable thing to do first, is to use the naming conventions part of the static code analyser. What I did next is to export all the existing static code analyser rules + naming conventions into the static-analysis-rules.csa-file (part of this PR). Next, I disabled ALL static code analyser checks, and only left the naming convention checks. This information is stored in the project. Now, we only do the naming convention checks of the project. If we ever want to do the static code analyser checks, we simply import the exported file into the project again. From that point, I ran the static code analyser and got the following result:

image

165 errors. That’s something we can work with.
Now I went on a journey of exploration and discovery trying to understand what you are doing. As I have limited knowledge about the context of where this library is used, I can’t foresee all the implications on all these changes. The only thing I can see is that there will be implications as there are multiple variables that are exposed through API’s of the function blocks (as var_input, var_output or through method-calls) that have been changed. As everything in TwinCAT is exposed through ADS (TcCOM), the implications can obviously be bigger and that’s one of the topics I think a discussion needs to be made.

Generally speaking there were a lot of places that were not consistent with the naming conventions. I updated those to the naming conventions defined.

Also, many naming conventions can’t automatically be detected. See this example:

image

The naming convention says that a method doesn’t need a M_ or anything like that. It should just for example be DoAThing(). But the above examples would be OK, as the naming convention in the TE1200 tool would simply have an empty string, and that means that the methods can start with anything you want.

In the documentation, you state the following about references:
image

There is no way in the tool to separate between these two. There are just references. So I simply configured the tool that there is nothing special with references, and that they should be treated like their respective primitives they are referring to (in other words, case 1 above). This means this would be valid syntax:
image
no matter whether it’s used as a method input or in other cases (1 & 2 above).

I enabled this setting (enabled by default):
image

What this means is that naming conventions will be “stacked” on top of each other. If we look at this example in the PR:

image

It’s ct because it’s a constant and it’s a TIME-type.
If you don’t use the c before you will get an error:
image

This is a more extreme case:
image

ast as it’s an array (a) of structures (st).

Another example:
image

As it’s a pointer (p) to an integer (n).

And the most extreme case:
image

Array (a) of pointers (p) to function blocks (fb).

Next, I didn’t only have to change VAR_INPUT and VAR_OUTPUT (which will definitely brake functionality in dependencies), but also %I* and %Q*-variables, like this one:

image
This will obviously break all instances of this variable that is pointing to EtherCAT in any place that is using an instance of this variable.

I also noticed you had a few of these:
image
I assume that you are using this to further process this variable through the TMC for some purpose? These (like all other variables) also had to be changed.

I noticed that you had a few pragmas {attribute ‘naming’ := ‘omit’} in your code. I didn’t change those as the static code analyser actually ignored the naming roles on those places. Like here for example:
image

Running the static code analyser now with only the naming conventions activated will make the static code analyser pass with 0 errors.

I want to finish by asking a question, and that is if we even need to use Hungarian notation at all, and whether it could make sense to reconsider them?
I wrote a little about it on my blog a few years ago, so I felt I need to raise the question.

…is-rules.csa".

Corrected naming conventions in the project.
Disabled static code analyser rules checks.
Adapted all code so that the static code analyser naming convention checks passed.
@sagatowski sagatowski changed the title Exported existing static code analyser rules into file "static-analys… Static code analysis - naming conventions Jan 17, 2024
@ZLLentz
Copy link
Member

ZLLentz commented Jan 19, 2024

I want to finish by asking a question, and that is if we even need to use Hungarian notation at all, and whether it could make sense to reconsider them?

I thought about this on and off since you made this PR and also asked for some opinions from the rest of the team (not a lot of responses, admittedly). I think the truth is that I (and the team) largely use Hungarian notation out of habit. It can be useful to contextualize typing information when reviewing a git diff, but otherwise most of the specifications are redundant. Googling around, it seems like this is only truly advocated for in the PLC sphere, which tends to lag behind standard practices, which implies that it has largely fallen out of favor.

The wikipedia page has some fun opinions on it: https://en.wikipedia.org/wiki/Hungarian_notation#Notable_opinions

I'll think about this some more, but my impression after a couple days of consideration is that most of the prefixes here shouldn't be rigerously checked. I think some of them have more value than others, for example it can be nice to sort and differentiate between enums, structs, etc. in the solution explorer, which otherwise have the same symbol and only carry this information in parentheses at the end of the line that tends to blend together.

This opinion is made prior to reviewing the specifics of this PR, I'll look at the diff now.

Copy link
Member

@ZLLentz ZLLentz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this wasn't worse

It wouldn't be very hard to adjust other code to the slightly modified inputs/outputs as needed

Maybe I need some more internal discussions about what people think about the naming conventions in general

Comment on lines -3 to +4
<DUT Name="DUT_EPS" Id="{0134451e-57c2-478a-9365-5d6570cca7b2}">
<Declaration><![CDATA[TYPE DUT_EPS :
<DUT Name="ST_EPS" Id="{0134451e-57c2-478a-9365-5d6570cca7b2}">
<Declaration><![CDATA[TYPE ST_EPS :
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we merge this, I might want to go back through and implement backwards compatibility where possible. We should be able to make a DUT_EPS alias for ST_EPS to ease the transition.

MAX_STATES: UINT := 15;
END_VAR
]]></Declaration>
cnMaxStates: UINT := 15;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can see a lot of places in here where other coding styles have bled into the library: some languages recommend ALL_CAPS for constants.

END_VAR
]]></Declaration>
cnSlaveAddrArrSize : UINT := 256;
cnEscMaxPorts : UINT := 3; // Maximum number of ports (4) on ESC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like declaring that something is a constant via the name has more value than declaring the types, that's a useful reminder for how the variable is used and that you cannot change it.

I'm surprised that global variables don't have a prefix assigned, I'd assume that knowing at a glance that a variable is a global and not a local could be useful when it comes to understanding the program flow, especially if the GVL file doesn't have the qualified_only pragma.

After some thought, maybe the correct convention to set is to always use the qualified_only pragma, rather than enforce some sort of globals prefix.

@@ -80,7 +80,7 @@ VAR_OUTPUT
bValid: BOOL;
END_VAR
VAR
rTrig: R_TRIG;
fbTrig: R_TRIG;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should be fbRTrig, I'm not sure r was a prefix to replace or if it's just to give it the name of a "rising" trigger

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants