Updating SARIF2004 (#1995)
* Updating SARIF2004

* code review - 1

* code review - 2

* code review - 3

* Adding Extension and tests

* Updating tests and sarif files

* adding more cases to unit test

* code review - 4

* code review - 5

* updating order

* updating texts

* updating texts
eddynaka authored Jul 20, 2020
1 parent dc82504 commit d9f783d
Showing 18 changed files with 479 additions and 31 deletions.
16 changes: 14 additions & 2 deletions docs/Producing effective SARIF.md
@@ -393,13 +393,25 @@ Similarly, most 'result' objects contain at least one 'artifactLocation' object.

#### Messages

##### `AvoidDuplicativeAnalysisTarget`: warning

The 'analysisTarget' property '{1}' at '{0}' can be removed because it is the same as the result location. This unnecessarily increases log file size. The 'analysisTarget' property is used to distinguish cases when a tool detects a result in a file (such as an included header) that is different than the file that was scanned (such as a .cpp file that included the header).

##### `AvoidDuplicativeResultRuleInformation`: warning

'{0}' uses the 'rule' property to specify the violated rule, so it is not necessary also to specify 'ruleId' or 'ruleIndex'. This unnecessarily increases log file size. Remove the 'ruleId' and 'ruleIndex' properties.
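
A minimal sketch of this condition, written in Python over the raw SARIF JSON rather than taken from the Multitool: it lists the properties that become redundant once 'rule' is present. The function name is hypothetical, and treating a 'ruleIndex' of -1 as "not specified" is an assumption based on -1 being that property's default value.

```python
def redundant_rule_properties(result):
    """Return the names of 'ruleId'/'ruleIndex' that duplicate result['rule'].

    Hypothetical helper; a 'ruleIndex' of -1 is assumed to mean "absent".
    """
    if "rule" not in result:
        return []
    return [
        name
        for name in ("ruleId", "ruleIndex")
        if name in result and result[name] != -1
    ]


duplicated = {"rule": {"id": "EX0001", "index": 0}, "ruleId": "EX0001", "ruleIndex": 0}
compact = {"ruleId": "EX0001", "ruleIndex": 0}
```

Here `duplicated` would be reported for both properties, while `compact` would not be reported at all because it has no 'rule' object.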

##### `EliminateLocationOnlyArtifacts`: warning

{0}: The 'artifacts' array contains no information beyond the locations of the artifacts. Removing this array might reduce the log file size without losing information. In some scenarios (for example, when assessing compliance with policy), the 'artifacts' array might be used to record the full set of artifacts that were analyzed. In such a scenario, the 'artifacts' array should be retained even if it contains only location information.
The 'artifacts' array at '{0}' contains no information beyond the locations of the artifacts. Removing this array might reduce the log file size without losing information. In some scenarios (for example, when assessing compliance with policy), the 'artifacts' array might be used to record the full set of artifacts that were analyzed. In such a scenario, the 'artifacts' array should be retained even if it contains only location information.
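
One way to picture "no information beyond the locations" is the following Python sketch over the raw SARIF JSON. It is an illustration under stated assumptions, not the Multitool's logic: the function name is hypothetical, and an artifact is assumed to be location-only when 'location' is its sole property.

```python
def is_location_only(artifacts):
    """True if the array is non-empty and no artifact has a property besides 'location'."""
    return bool(artifacts) and all(set(artifact) <= {"location"} for artifact in artifacts)


location_only = [
    {"location": {"uri": "src/a.c"}},
    {"location": {"uri": "src/b.c"}},
]

# 'mimeType' is extra information, so this array is worth keeping.
with_details = [
    {"location": {"uri": "src/a.c"}, "mimeType": "text/x-c"},
    {"location": {"uri": "src/b.c"}},
]
```

Whether to act on a positive answer remains a policy decision, as the message notes: an array kept as a record of everything that was analyzed should stay.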

##### `EliminateIdOnlyRules`: warning

{0}: The 'rules' array contains no information beyond the ids of the rules. Removing this array might reduce the log file size without losing information. In some scenarios (for example, when assessing compliance with policy), the 'rules' array might be used to record the full set of rules that were evaluated. In such a scenario, the 'rules' array should be retained even if it contains only id information.
The 'rules' array at '{0}' contains no information beyond the ids of the rules. Removing this array might reduce the log file size without losing information. In some scenarios (for example, when assessing compliance with policy), the 'rules' array might be used to record the full set of rules that were evaluated. In such a scenario, the 'rules' array should be retained even if it contains only id information.
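
To make the size argument concrete, the Python sketch below drops an id-only 'rules' array from a copy of a run and measures the difference in serialized length. Everything here is illustrative: the function name, the sample run, and the tool name `ExampleTool` are made up for this example, and the sketch assumes results refer to rules by 'ruleId' only (a 'ruleIndex' would be left dangling once the array is gone).

```python
import copy
import json


def drop_id_only_rules(run):
    """Return a copy of `run` without 'tool.driver.rules' when every rule has only an 'id'.

    Hypothetical helper; returns `run` itself, unchanged, in every other case.
    """
    rules = run.get("tool", {}).get("driver", {}).get("rules", [])
    if not rules or any(set(rule) != {"id"} for rule in rules):
        return run
    slimmed = copy.deepcopy(run)
    del slimmed["tool"]["driver"]["rules"]
    return slimmed


run = {
    "tool": {"driver": {"name": "ExampleTool", "rules": [{"id": "EX0001"}, {"id": "EX0002"}]}},
    "results": [{"ruleId": "EX0001", "message": {"text": "Example finding."}}],
}

slimmed = drop_id_only_rules(run)
saved = len(json.dumps(run)) - len(json.dumps(slimmed))  # characters saved
```

The results are untouched, so nothing a consumer could use is lost in this case; only the redundant list of ids disappears.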

##### `PreferRuleId`: warning

The result at '{0}' uses the 'rule' property to specify the violated rule, but this is not necessary because the rule is defined by 'tool.driver'. Use the 'ruleId' and 'ruleIndex' properties instead, because they are shorter and just as clear.
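
The decision behind this message can be sketched in Python over the raw SARIF JSON. `refers_to_driver` follows the same branches as the `RefersToDriver` extension method added in this commit; `prefer_rule_id`, and the assumption that a missing 'toolComponent' also means the driver, are illustrative and not taken from the Multitool.

```python
def refers_to_driver(tool_component, driver_guid):
    """Python rendering of the RefersToDriver checks for a toolComponentReference dict."""
    # Any index other than the default -1 points somewhere other than the driver.
    if tool_component.get("index", -1) != -1:
        return False
    guid = tool_component.get("guid")
    # Neither index nor GUID: the reference defaults to the driver.
    if guid is None:
        return True
    # Otherwise the GUID must match the driver's, ignoring case.
    return driver_guid is not None and guid.lower() == driver_guid.lower()


def prefer_rule_id(result, driver_guid=None):
    """True if result['rule'] could be replaced by 'ruleId'/'ruleIndex' (hypothetical helper)."""
    rule = result.get("rule")
    if rule is None:
        return False
    # Assumption: an absent 'toolComponent' is treated as a reference to the driver.
    return refers_to_driver(rule.get("toolComponent", {}), driver_guid)


driver_rule = {"rule": {"id": "EX0001"}}
extension_rule = {"rule": {"id": "EX0001", "toolComponent": {"index": 0}}}
```

With these inputs, `driver_rule` would be reported, while `extension_rule` keeps its 'rule' object because only 'result.rule' can point at metadata outside the driver.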

---

24 changes: 24 additions & 0 deletions src/Sarif.Multitool/Extensions.cs
@@ -0,0 +1,24 @@
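using System;

namespace Microsoft.CodeAnalysis.Sarif.Multitool
{
    public static class Extensions
    {
        /// <summary>
        /// Determines whether a tool component reference points at the run's driver
        /// rather than at one of its extensions.
        /// </summary>
        public static bool RefersToDriver(this ToolComponentReference toolComponent, string driverGuid)
        {
            // An index of -1 means that no index was specified.
            if (toolComponent.Index == -1)
            {
                // With neither an index nor a GUID, the reference defaults to the driver.
                if (toolComponent.Guid == null)
                {
                    return true;
                }
                else
                {
                    // Otherwise the GUID must match the driver's GUID, ignoring case.
                    return toolComponent.Guid.Equals(driverGuid, StringComparison.OrdinalIgnoreCase);
                }
            }

            // Any other index does not refer to the driver.
            return false;
        }
    }
}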
31 changes: 29 additions & 2 deletions src/Sarif.Multitool/Rules/RuleResources.Designer.cs

Some generated files are not rendered by default.

21 changes: 18 additions & 3 deletions src/Sarif.Multitool/Rules/RuleResources.resx
@@ -272,10 +272,16 @@ Many tools follow a conventional format for the 'reportingDescriptor.id' propert

In several parts of a SARIF log file, a subset of information about an object appears in one place, and the full information describing all such objects appears in an array elsewhere in the log file. For example, each 'result' object has a 'ruleId' property that identifies the rule that was violated. Elsewhere in the log file, the array 'run.tool.driver.rules' contains additional information about the rules. But if the elements of the 'rules' array contained no information about the rules beyond their ids, then there might be no reason to include the 'rules' array at all, and the log file could be made smaller simply by omitting it. In some scenarios (for example, when assessing compliance with policy), the 'rules' array might be used to record the full set of rules that were evaluated. In such a scenario, the 'rules' array should be retained even if it contains only id information.

Similarly, most 'result' objects contain at least one 'artifactLocation' object. Elsewhere in the log file, the array 'run.artifacts' contains additional information about the artifacts that were analyzed. But if the elements of the 'artifacts' array contained no information about the artifacts beyond their locations, then there might be no reason to include the 'artifacts' array at all, and again the log file could be made smaller by omitting it. In some scenarios (for example, when assessing compliance with policy), the 'artifacts' array might be used to record the full set of artifacts that were analyzed. In such a scenario, the 'artifacts' array should be retained even if it contains only location information.</value>
Similarly, most 'result' objects contain at least one 'artifactLocation' object. Elsewhere in the log file, the array 'run.artifacts' contains additional information about the artifacts that were analyzed. But if the elements of the 'artifacts' array contained no information about the artifacts beyond their locations, then there might be no reason to include the 'artifacts' array at all, and again the log file could be made smaller by omitting it. In some scenarios (for example, when assessing compliance with policy), the 'artifacts' array might be used to record the full set of artifacts that were analyzed. In such a scenario, the 'artifacts' array should be retained even if it contains only location information.

In addition to avoiding unnecessary arrays, there are other ways to optimize the size of SARIF log files.

Prefer the result object properties 'ruleId' and 'ruleIndex' to the nested object-valued property 'result.rule', unless the rule comes from a tool component other than the driver (in which case only 'result.rule' can accurately point to the metadata for the rule). The 'ruleId' and 'ruleIndex' properties are shorter and just as clear.

Do not specify the result object's 'analysisTarget' property unless it differs from the result location. The canonical scenario for using 'result.analysisTarget' is a C/C++ language analyzer that is instructed to analyze example.c, and detects a result in the included file example.h. In this case, 'analysisTarget' is example.c, and the result location is in example.h.</value>
</data>
<data name="SARIF2004_OptimizeFileSize_Warning_EliminateLocationOnlyArtifacts_Text" xml:space="preserve">
<value>{0}: The 'artifacts' array contains no information beyond the locations of the artifacts. Removing this array might reduce the log file size without losing information. In some scenarios (for example, when assessing compliance with policy), the 'artifacts' array might be used to record the full set of artifacts that were analyzed. In such a scenario, the 'artifacts' array should be retained even if it contains only location information.</value>
<value>The 'artifacts' array at '{0}' contains no information beyond the locations of the artifacts. Removing this array might reduce the log file size without losing information. In some scenarios (for example, when assessing compliance with policy), the 'artifacts' array might be used to record the full set of artifacts that were analyzed. In such a scenario, the 'artifacts' array should be retained even if it contains only location information.</value>
</data>
<data name="SARIF2002_ProvideMessageArguments_FullDescription_Text" xml:space="preserve">
<value>In result messages, use the 'message.id' and 'message.arguments' properties rather than 'message.text'. This has several advantages. If 'text' is lengthy, using 'id' and 'arguments' makes the SARIF file smaller. If the rule metadata is stored externally to the SARIF log file, the message text can be improved (for example, by adding more text, clarifying the phrasing, or fixing typos), and the result messages will pick up the improvements the next time it is displayed. Finally, SARIF supports localizing messages into different languages, which is possible if the SARIF file contains 'message.id' and 'message.arguments', but not if it contains 'message.text' directly.</value>
@@ -290,7 +296,7 @@ Similarly, most 'result' objects contain at least one 'artifactLocation' object.
<value>{0}: This run does not provide 'versionControlProvenance'. As a result, it is not possible to determine which version of code was analyzed, nor to map relative paths to their locations within the repository.</value>
</data>
<data name="SARIF2004_OptimizeFileSize_Warning_EliminateIdOnlyRules_Text" xml:space="preserve">
<value>{0}: The 'rules' array contains no information beyond the ids of the rules. Removing this array might reduce the log file size without losing information. In some scenarios (for example, when assessing compliance with policy), the 'rules' array might be used to record the full set of rules that were evaluated. In such a scenario, the 'rules' array should be retained even if it contains only id information.</value>
<value>The 'rules' array at '{0}' contains no information beyond the ids of the rules. Removing this array might reduce the log file size without losing information. In some scenarios (for example, when assessing compliance with policy), the 'rules' array might be used to record the full set of rules that were evaluated. In such a scenario, the 'rules' array should be retained even if it contains only id information.</value>
</data>
<data name="SARIF2006_UrisShouldBeReachable_FullDescription_Text" xml:space="preserve">
<value>URIs that refer to locations such as rule help pages and result-related work items should be reachable via an HTTP GET request.</value>
@@ -360,4 +366,13 @@ This is part of a set of authoring practices that make your rule messages more r
<data name="SARIF2007_ExpressPathsRelativeToRepoRoot_Warning_ExpressResultLocationsRelativeToMappedTo_Text" xml:space="preserve">
<value>{0}: This result location does not provide any of the 'uriBaseId' values that specify repository locations: '{1}'. As a result, it will not be possible to determine the location of the file containing this result relative to the root of the repository that contains it.</value>
</data>
<data name="SARIF2004_OptimizeFileSize_Warning_AvoidDuplicativeAnalysisTarget_Text" xml:space="preserve">
<value>The 'analysisTarget' property '{1}' at '{0}' can be removed because it is the same as the result location. This unnecessarily increases log file size. The 'analysisTarget' property is used to distinguish cases when a tool detects a result in a file (such as an included header) that is different than the file that was scanned (such as a .cpp file that included the header).</value>
</data>
<data name="SARIF2004_OptimizeFileSize_Warning_AvoidDuplicativeResultRuleInformation_Text" xml:space="preserve">
<value>'{0}' uses the 'rule' property to specify the violated rule, so it is not necessary also to specify 'ruleId' or 'ruleIndex'. This unnecessarily increases log file size. Remove the 'ruleId' and 'ruleIndex' properties.</value>
</data>
<data name="SARIF2004_OptimizeFileSize_Warning_PreferRuleId_Text" xml:space="preserve">
<value>The result at '{0}' uses the 'rule' property to specify the violated rule, but this is not necessary because the rule is defined by 'tool.driver'. Use the 'ruleId' and 'ruleIndex' properties instead, because they are shorter and just as clear.</value>
</data>
</root>