Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new PE CV_CFL_LANG language code for ALIASOBJ and Rust #530

Merged
merged 25 commits into from
Mar 2, 2022
Merged
Show file tree
Hide file tree
Changes from 23 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
cffed44
Add Language Code for PE
shaopeng-gh Oct 1, 2021
e497d5f
Merge branch 'main' into users/shaopeng-gh/addnewlangcode
shaopeng-gh Oct 4, 2021
484bf18
Add Objective C baseline
shaopeng-gh Oct 4, 2021
84728fa
Merge branch 'users/shaopeng-gh/addnewlangcode' of https://github.com…
shaopeng-gh Dec 13, 2021
1e03879
Add new PE CV_CFL_LANG language code for ALIASOBJ and RUST
shaopeng-gh Dec 15, 2021
21329eb
Update Rust string
shaopeng-gh Dec 15, 2021
ba2136f
Merge branch 'main' into users/shaopeng-gh/addnewlangcode
eddynaka Dec 15, 2021
914b14d
Merge branch 'main' into users/shaopeng-gh/addnewlangcode
shaopeng-gh Jan 11, 2022
9b9d3a6
re-baseline with latest Sarif SDK
shaopeng-gh Jan 11, 2022
a8c7f72
Merge branch 'main' into users/shaopeng-gh/addnewlangcode
shaopeng-gh Jan 25, 2022
4593f04
fix comments
shaopeng-gh Jan 25, 2022
3fabb3c
fix comments for md file
shaopeng-gh Jan 25, 2022
258fb73
add . to end of sentence in md file
shaopeng-gh Jan 25, 2022
fd58c56
FunctionalTestBuildScripts.md add spacing before code
shaopeng-gh Jan 26, 2022
cd21338
merge from main
shaopeng-gh Feb 2, 2022
03fc7fb
update FunctionalTestBuildScripts.md
shaopeng-gh Feb 3, 2022
50cfd14
Merge branch 'main' into users/shaopeng-gh/addnewlangcode
michaelcfanning Feb 3, 2022
43bcf65
update to Rust 1.5.8.1
shaopeng-gh Feb 14, 2022
9a60f7f
Merge branch 'main' into users/shaopeng-gh/addnewlangcode
shaopeng-gh Feb 14, 2022
268eae2
Fix merge issue for release history file
shaopeng-gh Feb 14, 2022
d2d41f0
remove extra baseline expected sarif file
shaopeng-gh Feb 14, 2022
af8906a
Merge branch 'main' into users/shaopeng-gh/addnewlangcode
shaopeng-gh Feb 16, 2022
719fbc1
merge from master
shaopeng-gh Feb 16, 2022
4c816b9
Merge branch 'main' into users/shaopeng-gh/addnewlangcode
michaelcfanning Mar 2, 2022
a8b922d
Update ReleaseHistory.md
michaelcfanning Mar 2, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 15 additions & 0 deletions docs/FunctionalTestBuildScripts.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# Build Scripts
Copy link
Contributor

@marmegh marmegh Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VCOLAWTON, if you have a chance to take a quick look to see if there is anything that should be considered form a documentation standard perspective that'd be awesome. #Resolved

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Mary, looks good! I looked at other GH content and think that adding a sentence or two would make the section look more like information and less like loose leaf data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document is definitely going to need some expansion, but I don't believe all of that is necessary at the moment. What we want to immediately focus on is, usability. Can someone come in, take a look at this document, and walk away knowing how to recreate your files? No. What will it take to get someone up and running?

Are these both pretty much basic hello world binaries? In which case, expand the paragraph beginning at line 3, to explain that the base scenario is going to be a simple hello world program.

As for where I see this document going, I think this would be a good location to begin describing our functional and baseline testing approach for binskim. But, again, that's going to be a longer process/discussion and should be a collaborative effort. I wouldn't include all of that as part of this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expand the paragraph beginning at line 3, to explain that the base scenario is going to be a simple hello world program. ---- done, thanks.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document is definitely going to need expansion, and it is absolutely necessary for usability. Not sure, what you mean by "all of that" but anything we add to this document will be tremendously helpful. I know people in your positions do not typically like to do this type of work and would rather be as simple as possible, so just let me know how you want me to help. Probably a good idea to loop Michael in, too!


This file records scripts used to compile the test files, in alphabetical order.
Base scenario is a simple hello world program built with different parameters for testing purpose.
Test files are located in [BaselineTestsData](https://github.com/microsoft/binskim/tree/main/src/Test.FunctionalTests.BinSkim.Driver/BaselineTestsData) and [FunctionalTestsData](https://github.com/microsoft/binskim/tree/main/src/Test.FunctionalTests.BinSkim.Rules/FunctionalTestsData).

## clangcl.pe.c.codeview.exe

A simple hello world program, compiled with `clang 13.0.0` that generates a .exe and associated .pdb file. Script to reproduce:
`clang-cl -o clangcl.pe.c.codeview.exe -fuse-ld=lld-link hello.c -m32 -Z7 -MTd`
Copy link
Contributor

@marmegh marmegh Feb 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"A simple hello world program, compiled with clang 13.0.0 that generates a .exe and associated .pdb file. Script to reproduce:
clang-cl -o clangcl.pe.c.codeview.exe -fuse-ld=lld-link hello.c -m32 -Z7 -MTd" #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great, will follow this style.


## Native_x64_RustC_Rust_debuginfo2_v1.58.1.exe

A simple hello world program, compiled with `rustc 1.58.1` that generates a .exe and associated .pdb file. Script to reproduce:
`rustc -g -Clink-arg=/DEBUG:FULL src\main.rs -o Native_x64_RustC_Rust_debuginfo2_v1.58.1.exe`
Original file line number Diff line number Diff line change
Expand Up @@ -224,6 +224,11 @@ public string GetDialect(out string versionNumber)
}
}

/// <summary>
/// The CV_CFL_LANG enumeration,
/// which specifies the code language of the application or linked module in the debug interface access SDK.
/// https://docs.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/cv-cfl-lang
/// </summary>
public enum Language : uint
{
C = 0x00,
Expand All @@ -242,11 +247,13 @@ public enum Language : uint
Java = 0x0D,
JScript = 0x0E,
MSIL = 0x0F,
HLSL = 0x10, // High Level Shader Language
HLSL = 0x10,
ObjectiveC = 0x11,
ObjectiveCxx = 0x12,
Swift = 0x13,
NASM = 0x4E, // The Netwide Assembler
ALIASOBJ = 0x14,
Rust = 0x15,
Copy link
Contributor

@marmegh marmegh Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does adding these accomplish? Does it prevent an error if we come across one of these files while trying to analyze a repo? Does it unblock future work? What does the customer get out of this addition? #Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only thing it is for is when the compilers Clang/GCC starts to emit the 0x15 value when the program is Rust program,
our tool will be able to parse it as "Rust" and show correctly.

Copy link
Member

@michaelcfanning michaelcfanning Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rust

Note that for this language, you didn't say 'Module generated by Rust.' So, in general, please be consistent in your approach. I don't think this comment would provide much value, by the way. If your comment doesn't add useful additional information, we are just cluttering the code. #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so I think what you've done is to partly port the documentation for enum values over. My feedback to you is: don't do that in an inconsistent way.

You have two choices:

  1. Author useful comments in places where you think a comment provides additional information over the language name. Is anyone confused by what 'Pascal' means here, for example?
  2. OR: port all the comments from the documentation, so that every member is consistently documented precisely as it is online.

i.e., stay out of the middle, which is to inconsistently apply exact language in some cases, partial documentation in others, and to ignore others entirely.

V_CFL_C Application language is C.

CV_CFL_CXX Application language is C++.

CV_CFL_FORTRAN Application language is FORTRAN.

CV_CFL_MASM Application language is Microsoft Macro Assembler.

CV_CFL_PASCAL Application language is Pascal.

CV_CFL_BASIC Application language is BASIC.

CV_CFL_COBOL Application language is COBOL.

CV_CFL_LINK Application is a linker-generated module.

CV_CFL_CVTRES Application is a resource module converted with CVTRES tool.

CV_CFL_CVTPGD Application is a POGO optimized module generated with CVTPGD tool.

CV_CFL_CSHARP Application language is C#.

CV_CFL_VB Application language is Visual Basic.

CV_CFL_ILASM Application language is intermediate language assembly (that is, Common Language Runtime (CLR) assembly).

CV_CFL_JAVA Application language is Java.

CV_CFL_JSCRIPT Application language is Jscript.

CV_CFL_MSIL Application language is an unknown Microsoft Intermediate Language (MSIL), possibly a result of using the /LTCG (Link-time Code Generation) switch.

CV_CFL_HLSL Application language is High Level Shader Language.

CV_CFL_OBJC Application language is Objective-C.

CV_CFL_OBJCXX Application language is Objective-C++.

CV_CFL_SWIFT Application language is Swift.

CV_CFL_ALIASOBJ Application is a module generated by the aliasobj tool.

CV_CFL_RUST Application language is Rust.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stay out of the middle --- agree

and also agree some of the comments does not add value at all, so removed all to be consistent.

NASM = 0x4E,
Unknown
}

Expand Down
4 changes: 4 additions & 0 deletions src/ReleaseHistory.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,9 @@
# BinSkim Release History

## Unreleased

* BUGFIX: Add new PE `CV_CFL_LANG` language code for `ALIASOBJ` and `Rust`. [530](https://github.com/microsoft/binskim/pull/530)
Copy link
Contributor

@eddynaka eddynaka Feb 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BUGFIX

If this is a bugfix, what are we fixing and why? what was the behavior? #Resolved

Copy link
Collaborator Author

@shaopeng-gh shaopeng-gh Feb 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

originally I set as FEATURE,
but we think FEATURE is not buying us much in this document.
I don't think BREAKING is good here too,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelcfanning and @VCOLAWTON is there a better label we can use for this type of code change for documentation purposes? FEATURE and BUGFIX both don't fit this scenario.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BUGFIX

Call this 'FEATURE:


## **v1.9.3** [NuGet Package](https://www.nuget.org/packages/Microsoft.CodeAnalysis.BinSkim/1.9.3)

* BUGFIX: Fix `KeyNotFoundException` exception raised by `BA2006.BuildWithSecureTools` when individual `MinimumToolVersions` properties are removed from XML configuration. [#565](https://github.com/microsoft/binskim/pull/565)
Expand Down
39 changes: 39 additions & 0 deletions src/Test.UnitTests.BinaryParsers/PEBinary/PEBinaryTests.cs
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,17 @@
// Licensed under the MIT license. See LICENSE file in the project root for full license information.

using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

using Dia2Lib;

using FluentAssertions;

using Microsoft.CodeAnalysis.BinaryParsers.ProgramDatabase;
using Microsoft.CodeAnalysis.Sarif.Driver;

using Xunit;

namespace Microsoft.CodeAnalysis.BinaryParsers
Expand Down Expand Up @@ -70,6 +74,19 @@ public void PEBinary_PdbIsStripped()
}
}

[Fact]
public void PEBinary_ContainsExpectedLanguageCode()
{
if (!PlatformSpecificHelpers.RunningOnWindows()) { return; }

ContainsLanguageCode("clangcl.pe.c.codeview.exe", Language.C).Should().BeTrue();
ContainsLanguageCode("clangcl.pe.cpp.codeview.exe", Language.Cxx).Should().BeTrue();
// As of v1.58.1 Rust official compiler RustC does not yet use the new CV_CFL_LANG code for Rust.
// We will have another test file when new version use it.
ContainsLanguageCode("Native_x64_RustC_Rust_debuginfo2_v1.58.1.exe", Language.Rust).Should().BeFalse();
ContainsLanguageCode("Native_x64_VS2019_CPlusPlus_DEBUG_DEFAULT.dll", Language.Cxx).Should().BeTrue();
}

[Fact]
public void PEBinary_CanCreateIDiaSourceFromMsdia()
{
Expand All @@ -92,5 +109,27 @@ public void PEBinary_TryLoadPdbFromSymbolFolder()
peBinary.PdbParseException.Should().BeNull();
}
}

private static bool ContainsLanguageCode(string fileName, Language language)
{
string fileFullPath = Path.Combine(TestData, "PE", fileName);
using (var peBinary = new PEBinary(new Uri(fileFullPath)))
{
peBinary.Valid.Should().BeTrue();
peBinary.Pdb.Should().NotBeNull();
peBinary.PdbParseException.Should().BeNull();

var languages = new HashSet<Language>();
foreach (DisposableEnumerableView<Symbol> omView in peBinary.Pdb.CreateObjectModuleIterator())
{
ObjectModuleDetails omDetails = omView.Value.GetObjectModuleDetails();
if (omDetails.Library == omDetails.Name)
{
languages.Add(omDetails.Language);
}
}
return languages.Contains(language);
}
}
}
}
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.