-
Notifications
You must be signed in to change notification settings - Fork 29
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Create rule S7118 String methods should be used to query content inst…
…ead of C apis CPP-5790
- Loading branch information
1 parent
289e7cf
commit 2c08a31
Showing
3 changed files
with
237 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
{ | ||
"title": "String methods should be used to query content instead of C apis", | ||
"type": "CODE_SMELL", | ||
"status": "ready", | ||
"remediation": { | ||
"func": "Constant\/Issue", | ||
"constantCost": "5min" | ||
}, | ||
"tags": [ | ||
], | ||
"defaultSeverity": "Major", | ||
"ruleSpecification": "RSPEC-7118", | ||
"sqKey": "S7118", | ||
"scope": "All", | ||
"defaultQualityProfiles": ["Sonar way"], | ||
"quickfix": "unknown", | ||
"code": { | ||
"impacts": { | ||
"MAINTAINABILITY": "MEDIUM" | ||
}, | ||
"attribute": "CLEAR" | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,212 @@ | ||
`std::string` methods should be preferred over operations on the result of `c_str()` or `data()` when manipulating a standard string. | ||
|
||
== Why is this an issue? | ||
|
||
When manipulating standard strings, it is recommended to use their methods, rather than accessing the underlying char array representation with low-level C methods. The string methods are clearer in their purpose, less error-prone, and will sometimes be more performant. | ||
|
||
[source,cpp] | ||
---- | ||
void func(std::string const& param1, std::string const& param2) { | ||
int size = strlen(param1.data()); // Noncompliant | ||
bool same = (strcmp(param1.c_str(), param2.c_str()) == 0); // Noncompliant | ||
} | ||
---- | ||
|
||
== How to fix it | ||
|
||
=== size calculation | ||
|
||
Using `strlen` to calculate the size of a string will necessitate finding its null terminator. The string already stores its own size and can return it for free. | ||
|
||
==== Noncompliant code example | ||
|
||
[source,cpp,diff-id=1,diff-type=noncompliant] | ||
---- | ||
int size = strlen(string.c_str()); // Noncompliant | ||
---- | ||
|
||
==== Compliant solution | ||
|
||
[source,cpp,diff-id=1,diff-type=compliant] | ||
---- | ||
int size = string.size(); // Compliant | ||
---- | ||
|
||
=== Embedded null-terminator detection | ||
|
||
Note that as mentioned in the "Pitfalls" part, `size()` will return the full size, | ||
including embedded null characters. If the goal of the call to `c_str()` is to detect | ||
the presence of embedded null-terminators, more explicit methods like `std::string::find()` or `std::string::contains()` should be used. | ||
|
||
==== Noncompliant code example | ||
|
||
[source,cpp,diff-id=2,diff-type=noncompliant] | ||
---- | ||
if (string.size() != strlen(string.c_str())) // Noncompliant | ||
{ | ||
// ... | ||
} | ||
---- | ||
|
||
==== Compliant solution | ||
|
||
In C++23 `std::string::contains()` is the best solution. | ||
|
||
[source,cpp,diff-id=2,diff-type=compliant] | ||
---- | ||
if (string.contains('\0')) // Compliant | ||
{ | ||
// ... | ||
} | ||
---- | ||
|
||
Before that, `std::string::find()` is an alternative. | ||
|
||
[source,cpp,diff-id=2,diff-type=compliant] | ||
---- | ||
if (string.find('\0') != std::string::npos) // Compliant | ||
{ | ||
// ... | ||
} | ||
---- | ||
|
||
=== string comparison | ||
|
||
Testing the equality between two full strings can be directly performed using the equality operator. For other comparisons, `std::string::compare` is more readable. | ||
|
||
==== Noncompliant code example | ||
|
||
[source,cpp,diff-id=3,diff-type=noncompliant] | ||
---- | ||
void comparisons(std::string const& s1, std::string const& s2) { | ||
bool equal = strcmp(s1.c_str(), s2.c_str()) == 0; // Noncompliant | ||
int comparePart = strncmp(s1.c_str() + 2, s2.c_str() + 3, 10); // Noncompliant | ||
} | ||
---- | ||
|
||
==== Compliant solution | ||
|
||
[source,cpp,diff-id=3,diff-type=compliant] | ||
---- | ||
void comparisons(std::string const& s1, std::string const& s2) { | ||
bool equal = s1 == s2; // Compliant | ||
int comparePart = s1.compare(2, 10, s2, 3, 10); // Compliant | ||
} | ||
---- | ||
|
||
With C++17 and higher, substring comparison can also be done through `std::string_view`. | ||
|
||
[source,cpp] | ||
---- | ||
auto comparePart = std::string_view(s1).substr(2, 10) <=> std::string_view(s2).substr(3, 10); | ||
---- | ||
|
||
=== Searching a substring | ||
|
||
Either searching a substring or a single character inside a string can be done with `std::string::find()`. | ||
|
||
==== Noncompliant code example | ||
|
||
[source,cpp,diff-id=4,diff-type=noncompliant] | ||
---- | ||
void search(std::string const& string) { | ||
char const* sub = strstr(string.c_str(), "substring"); // Noncompliant | ||
char const* dot = strchr(string.c_str(), '.'); // Noncompliant | ||
// ... | ||
} | ||
---- | ||
|
||
==== Compliant solution | ||
|
||
[source,cpp,diff-id=4,diff-type=compliant] | ||
---- | ||
void search(std::string const& string) { | ||
size_t sub = string.find("substring"); // Compliant | ||
size_t dot = string.find('.'); // Compliant | ||
// ... | ||
} | ||
---- | ||
|
||
=== Searching one of many characters | ||
|
||
Four methods allow searching for any character from a list inside a string: `find_first_of`, `find_first_not_of`, `find_last_of`, and `find_last_not_of`. They are better alternatives than `strspn`, `strcspn`, and `strpbrk` when manipulating strings. | ||
|
||
==== Noncompliant code example | ||
|
||
[source,cpp,diff-id=5,diff-type=noncompliant] | ||
---- | ||
void search(std::string const& string) { | ||
size_t firstNonNumeric = strspn(string.c_str(), "0123456789"); // Noncompliant | ||
size_t firstSeparator = strcspn(string.c_str(), ",. ;"); // Noncompliant | ||
// ... | ||
} | ||
---- | ||
|
||
==== Compliant solution | ||
|
||
[source,cpp,diff-id=5,diff-type=compliant] | ||
---- | ||
void search(std::string const& string) { | ||
size_t firstNonNumeric = string.find_first_not_of("0123456789"); // Compliant | ||
size_t firstSeparator = string.find_first_of(",. ;"); // Compliant | ||
// ... | ||
} | ||
---- | ||
|
||
=== Pitfalls | ||
|
||
Using string methods instead of the result of `c_str()` (or `data()`) will have different results when the string contains embedded null terminators in its content. Operations on `c_str()` will stop at the first null-terminator while string methods will apply to the whole content. | ||
|
||
Most of the time, operating on the whole content is the intended behavior, but in cases where it isn't, using methods might be less straightforward. It is still recommended to fix the code, to make this pitfall clear to other maintainers. | ||
|
||
If the goal is to only keep the first part, the string can be cropped in place using `std::string::resize()` or `std::string::erase()`. | ||
Calling `strlen(string.c_str())` inside a call to resize or erase is compliant by exception because it is the simplest way to crop a string like that. | ||
[source,cpp] | ||
---- | ||
string.resize(strlen(string.c_str())); // Compliant by exception | ||
---- | ||
|
||
Otherwise, if working with {cpp}17 or higher and if the goal is to just query the content, a good approach is to create a `std::string_view` and use its methods, which mirror those of `std::string`. | ||
|
||
[source,cpp] | ||
---- | ||
std::string_view beginning{string.c_str()}; | ||
// Use string_view methods afterward | ||
// ... | ||
---- | ||
|
||
Otherwise, if working with older {cpp} versions or if the resulting string will be modified, doing a full copy also works. | ||
|
||
[source,cpp] | ||
---- | ||
std::string beginning{string.c_str()}; | ||
// Use beginning and its methods afterward | ||
// ... | ||
---- | ||
|
||
=== Going the extra mile | ||
|
||
Direct replacement of one C function with the corresponding string method may not lead to the best code overall. When replacing a C function call, you should consider what overall functionality this call is part of and whether your {cpp} version allows to implement that functionality more easily and effectively. | ||
|
||
For example using ranges (`std::ranges::views::split`, `std::ranges::views::chunk`), using algorithms, using `std::istringstream` or `std::ispanstream`, using `std::string_view`. | ||
|
||
An example of a function splitting words in a moddern way could be as follow. | ||
|
||
[source,cpp] | ||
---- | ||
void treatWord(std::string_view word); | ||
void treatAllWords(std::string_view text) { | ||
constexpr std::string_view delim{" "}; | ||
for (const auto word : std::views::split(text, delim)) | ||
treatWord(std::string_view{word}); | ||
} | ||
---- | ||
|
||
== Resources | ||
|
||
=== Documentation | ||
|
||
* {cpp} reference - https://en.cppreference.com/w/cpp/string/basic_string[Native functions for std::string] | ||
|
||
* {cpp} reference - https://en.cppreference.com/w/cpp/ranges#Range_adaptors[ranges documentation] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
{ | ||
} |