Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new PE CV_CFL_LANG language code for ALIASOBJ and Rust #530

Merged
merged 25 commits into from
Mar 2, 2022

Conversation

shaopeng-gh
Copy link
Collaborator

@shaopeng-gh shaopeng-gh commented Dec 15, 2021

Description:

Add new PE CV_CFL_LANG language code for ALIASOBJ and RUST.

L"Objective-C",                    // CV_CFL_OBJC
L"Objective-C++",                  // CV_CFL_OBJCXX
L"Swift",                          // CV_CFL_SWIFT
L"ALIASOBJ",                       // CV_CFL_ALIASOBJ
L"Rust",                           // CV_CFL_RUST

"helpUri": "https://github.com/microsoft/binskim/blob/main/docs/BinSkimRules.md#rule-BA2002DoNotIncorporateVulnerableDependencies",
"help": {
"text": "Binaries should not take dependencies on code with known security vulnerabilities."
},
Copy link
Contributor

@eddynaka eddynaka Dec 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you verify what happened?
this should not happen. #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the re-baseline, will see why there is this change. The order of the json node changed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

found the reason,
not related to our change,
was the result of changing the AnalyzeCommand to use MultithreadedAnalyzeCommandBase. I did verified if change it back this will disappear.
f87857a

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have fixed the sdk, got the latest from main and re-ran the baseline, now these order changes are gone.

Copy link
Contributor

@eddynaka eddynaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@shaopeng-gh shaopeng-gh requested a review from marmegh January 25, 2022 03:53
@@ -92,5 +109,27 @@ public void PEBinary_TryLoadPdbFromSymbolFolder()
peBinary.PdbParseException.Should().BeNull();
}
}

private static bool IfContainsLanguagueCode(string fileName, Language language)
Copy link
Contributor

@marmegh marmegh Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IfContainsLanguagueCode

This name is confusing. I would expect either ContainsLanguageCode or IfContainsLanguageCode for a method name. #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, you can simply use 'Contains' for these, we generally don't burn 'If' into this sort of boolean test.

Look at the way the code reads, do you see the duplication of 'if'?

if (IfContainsLanguageCode...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (IfContainsLanguageCode...
---- thanks for the example. fixed.

HLSL = 0x10, // High Level Shader Language
ObjectiveC = 0x11,
ObjectiveCxx = 0x12,
Swift = 0x13,
ALIASOBJ = 0x14, // module generated by the aliasobj tool
Rust = 0x15,
Copy link
Contributor

@marmegh marmegh Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does adding these accomplish? Does it prevent an error if we come across one of these files while trying to analyze a repo? Does it unblock future work? What does the customer get out of this addition? #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only thing it is for is when the compilers Clang/GCC starts to emit the 0x15 value when the program is Rust program,
our tool will be able to parse it as "Rust" and show correctly.

@@ -1,5 +1,9 @@
# BinSkim Release History

## Unreleased

* FEATURE: Add new PE CV_CFL_LANG language code for ALIASOBJ and Rust. [530](https://github.com/microsoft/binskim/pull/530)
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FEATURE

I don't think FEATURE is buying us much in this document. This release note item doesn't describe the additional value the change is providing. #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, changed to the other one BUGFIX

@@ -1,5 +1,9 @@
# BinSkim Release History

## Unreleased

* FEATURE: Add new PE CV_CFL_LANG language code for ALIASOBJ and Rust. [530](https://github.com/microsoft/binskim/pull/530)
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CV_CFL_LANG

This file does not do a good job of using markdown conventions of properly denoting code items, for example, by encapsulating them in backticks. #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added backticks.

CVTRES = 0x08,
CVTPGD = 0x09,
LINK = 0x07, // linker-generated module
CVTRES = 0x08, // resource module converted with CVTRES tool
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resource

All comments need to start with an upper cased word.

Please prefer adding periods to the end of sentences. #Resolved

Java = 0x0D,
JScript = 0x0E,
MSIL = 0x0F,
MSIL = 0x0F, // Microsoft Intermediate Language (MSIL), possibly a result of using the /LTCG (Link-time Code Generation) switch
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LTCG

Huh? The native whole code optimization feature produces MSIL? Can you show me the source of your information? #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LINK = 0x07,
CVTRES = 0x08,
CVTPGD = 0x09,
LINK = 0x07, // linker-generated module
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linker

What's the additional value of this comment? #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree some of the comments does not add value at all, removed all to be consistent

HLSL = 0x10, // High Level Shader Language
ObjectiveC = 0x11,
ObjectiveCxx = 0x12,
Swift = 0x13,
ALIASOBJ = 0x14, // module generated by the aliasobj tool
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

module

What's the additional value of your comment? #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree some of the comments does not add value at all, removed all to be consistent

HLSL = 0x10, // High Level Shader Language
ObjectiveC = 0x11,
ObjectiveCxx = 0x12,
Swift = 0x13,
ALIASOBJ = 0x14, // module generated by the aliasobj tool
Rust = 0x15,
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rust

Note that for this language, you didn't say 'Module generated by Rust.' So, in general, please be consistent in your approach. I don't think this comment would provide much value, by the way. If your comment doesn't add useful additional information, we are just cluttering the code. #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so I think what you've done is to partly port the documentation for enum values over. My feedback to you is: don't do that in an inconsistent way.

You have two choices:

  1. Author useful comments in places where you think a comment provides additional information over the language name. Is anyone confused by what 'Pascal' means here, for example?
  2. OR: port all the comments from the documentation, so that every member is consistently documented precisely as it is online.

i.e., stay out of the middle, which is to inconsistently apply exact language in some cases, partial documentation in others, and to ignore others entirely.

V_CFL_C Application language is C.

CV_CFL_CXX Application language is C++.

CV_CFL_FORTRAN Application language is FORTRAN.

CV_CFL_MASM Application language is Microsoft Macro Assembler.

CV_CFL_PASCAL Application language is Pascal.

CV_CFL_BASIC Application language is BASIC.

CV_CFL_COBOL Application language is COBOL.

CV_CFL_LINK Application is a linker-generated module.

CV_CFL_CVTRES Application is a resource module converted with CVTRES tool.

CV_CFL_CVTPGD Application is a POGO optimized module generated with CVTPGD tool.

CV_CFL_CSHARP Application language is C#.

CV_CFL_VB Application language is Visual Basic.

CV_CFL_ILASM Application language is intermediate language assembly (that is, Common Language Runtime (CLR) assembly).

CV_CFL_JAVA Application language is Java.

CV_CFL_JSCRIPT Application language is Jscript.

CV_CFL_MSIL Application language is an unknown Microsoft Intermediate Language (MSIL), possibly a result of using the /LTCG (Link-time Code Generation) switch.

CV_CFL_HLSL Application language is High Level Shader Language.

CV_CFL_OBJC Application language is Objective-C.

CV_CFL_OBJCXX Application language is Objective-C++.

CV_CFL_SWIFT Application language is Swift.

CV_CFL_ALIASOBJ Application is a module generated by the aliasobj tool.

CV_CFL_RUST Application language is Rust.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stay out of the middle --- agree

and also agree some of the comments does not add value at all, so removed all to be consistent.

@michaelcfanning
Copy link
Member

michaelcfanning commented Jan 25, 2022

"$schema": "https://schemastore.azurewebsites.net/schemas/json/sarif-2.1.0-rtm.5.json",

How did you generate your test binaries? Why aren't we checking in build scripts?

It is not clear to me what compiler settings you are using. If someone needs to update your test binaries in the future, they will not be able to leverage your approach.

In general, please try to help engineers in the future who will update your changes.


In reply to: 1021451914


In reply to: 1021451914


In reply to: 1021451914


Refers to: src/Test.FunctionalTests.BinSkim.Driver/BaselineTestsData/Expected/clangcl.pe.c.codeview.exe.sarif:2 in 9b9d3a6. [](commit_id = 9b9d3a6, deletion_comment = False)

@@ -70,6 +74,19 @@ public void PEBinary_PdbIsStripped()
}
}

[Fact]
public void PEBinary_LanguagueCode()
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEBinary_LanguagueCode

Language is misspelled in this name.

This name doesn't describe what is being tested here. #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Language is misspelled in this name. --- found I have several places, searched and fixed all.

updated the name.

@@ -224,28 +224,35 @@ public string GetDialect(out string versionNumber)
}
}

/// <summary>
/// Information about the CV_CFL_LANG enumeration,
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Information

This class is an implementation of the enumeration not information about the enumeration (though you do have some code comments that won't be visible to anyone viewing your summary comment via intellisense. #Resolved

@michaelcfanning
Copy link
Member

michaelcfanning commented Jan 25, 2022

  "results": [

How did you review these data files?

In general, these files contain too much junk to be easily verified by simply viewing them as json. Look at all the not applicable warnings.


In reply to: 1021455602


In reply to: 1021455602


In reply to: 1021455602


Refers to: src/Test.FunctionalTests.BinSkim.Driver/BaselineTestsData/Expected/Native_x64_RustC_Rust_debuginfo2_v1.57.exe.sarif:6 in 9b9d3a6. [](commit_id = 9b9d3a6, deletion_comment = False)

Copy link
Member

@michaelcfanning michaelcfanning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

LINK = 0x07,
CVTRES = 0x08,
CVTPGD = 0x09,
LINK = 0x07, // linker-generated module
Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linker

Is this comment adding anything? #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree some of the comments does not add value at all, removed all to be consistent

@shaopeng-gh shaopeng-gh requested a review from marmegh February 3, 2022 07:18
@@ -2,6 +2,7 @@

## Unreleased

* BUGFIX: Add new PE `CV_CFL_LANG` language code for `ALIASOBJ` and `Rust`. [530](https://github.com/microsoft/binskim/pull/530)
Copy link
Contributor

@eddynaka eddynaka Feb 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BUGFIX

If this is a bugfix, what are we fixing and why? what was the behavior? #Resolved

Copy link
Collaborator Author

@shaopeng-gh shaopeng-gh Feb 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

originally I set as FEATURE,
but we think FEATURE is not buying us much in this document.
I don't think BREAKING is good here too,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelcfanning and @VCOLAWTON is there a better label we can use for this type of code change for documentation purposes? FEATURE and BUGFIX both don't fit this scenario.

Copy link
Contributor

@eddynaka eddynaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

@eddynaka
Copy link
Contributor

eddynaka commented Feb 3, 2022

We should stop pushing SARIF files to the code.
Every time we have a new rule, re-ordering, we need to re-baseline over and over.
This is hard to maintain.

Can you figure it out a way to prevent this?

How can we test without re-baseline over and over?


In reply to: 1029461711

@shaopeng-gh
Copy link
Collaborator Author

This is hard to maintain. --- agree.

but I think this do have value, remember we found out somehow our order of the json node changes,
if not for this baseline tests we will not notice that.


In reply to: 1029461711

@michaelcfanning
Copy link
Member

What we need to do is to stop depending entirely on file diffs to verify analysis. It is a useful technique but you can't use it for everything, it's a fragile approach and the diffs become overwhelming.

Notice that you haven't added a code-only test for this change: that's the primary issue: we have no code-based unit test mechanism for you to isolate/verify code changes. We need to address that.

Separately, my sense is that we should stop producing highly verbose log files for every test binary. i.e., our inventory of test binaries should generally be focused on specific rules (and their error-level output).

Now, what we lose here is exhaustive testing of verbose output (like 'not applicable' testing). We could retain verbose testing for a smaller number of binaries. Or we could author more unit-tests that report conditions in the IDE, rather than via code diffs.


In reply to: 1029481349

@shaopeng-gh
Copy link
Collaborator Author

got it, will change to add code-based unit tests.


In reply to: 1029484759

@shaopeng-gh shaopeng-gh marked this pull request as draft February 14, 2022 19:58
@shaopeng-gh
Copy link
Collaborator Author

I see Rust have a new version 1.58.1 will give it a try and update PR.


In reply to: 1029530224

@shaopeng-gh
Copy link
Collaborator Author

as of 1.58.1 it is still not using the new lang code. I have updated the test to use the new version.


In reply to: 1039566742

@shaopeng-gh shaopeng-gh marked this pull request as ready for review February 14, 2022 22:30
@@ -1,5 +1,9 @@
# BinSkim Release History

## Unreleased

* BUGFIX: Add new PE `CV_CFL_LANG` language code for `ALIASOBJ` and `Rust`. [530](https://github.com/microsoft/binskim/pull/530)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BUGFIX

Call this 'FEATURE:

@michaelcfanning michaelcfanning enabled auto-merge (squash) March 2, 2022 22:56
@michaelcfanning michaelcfanning disabled auto-merge March 2, 2022 23:43
@michaelcfanning michaelcfanning dismissed eddynaka’s stale review March 2, 2022 23:43

Outdated review.

@michaelcfanning michaelcfanning merged commit 95e5b35 into main Mar 2, 2022
@michaelcfanning michaelcfanning deleted the users/shaopeng-gh/addnewlangcode branch March 2, 2022 23:43
@marmegh marmegh mentioned this pull request Mar 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants