-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
edmWriteConfigs: don't use unsecure tmpnam(...) #44314
Conversation
please test for CMSSW_14_1_UBSAN_X |
cms-bot internal usage |
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44314/39332
Code check has found code style and quality issues which could be resolved by applying following patch(s)
|
please test for CMSSW_14_1_UBSAN_X |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44314/39333
|
A new Pull Request was created by @iarspider for master. It involves the following packages:
@Dr15Jones, @makortel, @smuzaffar can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
hold |
Pull request has been put on hold by @Dr15Jones |
-1 Failed Tests: Build The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here: BuildI found compilation error when building: >> Building shared library tmp/el8_amd64_gcc12/src/FWCore/ParameterSet/src/FWCoreParameterSet/libFWCoreParameterSet.so Copying tmp/el8_amd64_gcc12/src/FWCore/ParameterSet/src/FWCoreParameterSet/libFWCoreParameterSet.so to productstore area: Leaving library rule at FWCore/ParameterSet >> Compiling src/FWCore/ParameterSet/bin/edmWriteConfigs.cpp src/FWCore/ParameterSet/bin/edmWriteConfigs.cpp: In function 'void {anonymous}::writeModulesFile()': src/FWCore/ParameterSet/bin/edmWriteConfigs.cpp:166:29: error: 'buffer' was not declared in this scope; did you mean 'setbuffer'? 166 | std::filesystem::copy(buffer.data(), "modules.py"); | ^~~~~~ | setbuffer gmake: *** [tmp/el8_amd64_gcc12/src/FWCore/ParameterSet/bin/edmWriteConfigs/edmWriteConfigs.cpp.o] Error 1 >> Building binary edmWriteConfigs |
Pull request #44314 was updated. @makortel, @smuzaffar, @Dr15Jones, @cmsbuild can you please check and sign again. |
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44314/39342
Code check has found code style and quality issues which could be resolved by applying following patch(s)
|
please test |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44314/39343
|
Pull request #44314 was updated. @Dr15Jones, @smuzaffar, @makortel can you please check and sign again. |
bool open_temp(std::string& path, std::ofstream& f) { | ||
path += "/XXXXXX"; | ||
path += '\0'; // ensure that the string is null-terminated | ||
int fd = mkstemp(path.data()); | ||
if (fd != -1) { | ||
f.open(path.c_str(), std::ios_base::trunc | std::ios_base::out); | ||
close(fd); | ||
return true; | ||
} | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to separate the generation of a unique temporary file name and opening of the file. It would also be better to use the exceptions for error handling (since the caller of writeModulesFile()
already has an exception handler).
Also, from C++11 onwards the std::string::data()
is guaranteed to be null-terminated (cppreference)
Please change along
bool open_temp(std::string& path, std::ofstream& f) { | |
path += "/XXXXXX"; | |
path += '\0'; // ensure that the string is null-terminated | |
int fd = mkstemp(path.data()); | |
if (fd != -1) { | |
f.open(path.c_str(), std::ios_base::trunc | std::ios_base::out); | |
close(fd); | |
return true; | |
} | |
return false; | |
} | |
std::string makeUniqueFilename(std::string path) { | |
path += "/XXXXXX"; | |
int fd = mkstemp(path.data()); | |
if (fd == -1) { | |
throw cms::Exception("Failed unique file name") << "Unable to create a unique file name with the template " << path << ": " << std::strerror(errno); | |
} | |
close(fd); | |
return path; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach (using a call to generate the file name, and a separate call to open the file) is not safe: that's the whole point of not using tmpnam()
in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use low-level function write
instead of C++ streams?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach (using a call to generate the file name, and a separate call to open the file) is not safe
Why not? mkstemp()
documentation has
The characters shall be chosen such that the resulting pathname does not duplicate the name of an existing file at the time of the call to
mkstemp()
so the file name should be unique among other processes (which is better than the behavior of tmpnam()
it is possible that a file with that name is created by another process between the moment
std::tmpnam
returns and the moment this program attempts to use the returned name to create a file.
).
I agree it is possible that something else could tamper with the created file between the close(fd)
and the ofstream
construction. If this something else would be by us, we can change it (and currently there should be nothing else doing that). If this something else would be a malicious third party, I think we'd have bigger problems elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is exactly the behaviour of tmpnam
.
With std::tmpnam()
:
- create a unique filename that does not name a currently existing file, and stores it in the character string pointed to by
filename
- it is possible that a file with that name is created by another process between the moment
std::tmpnam
returns and the moment this program attempts to use the returned name to create a file.
With mkstemp()
closing the file and recreating
- the
mkstemp()
function generates a unique temporary filename from template, creates and opens the file, and returns an open file descriptor for the file. - if we close the file descriptor and open a new file, it's possible that another process creates a file with that same name in the meantime.
So replacing tmpnam
with mkstemp
, closing the file and re-opening, we do not gain absolutely anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there can be a difference between tmpnam()
and mkstemp()
. With two or more processes using tmpnam()
, I'd expect it to be possible two (or more) of the processes to get the same value from their tmpnam()
calls. When all the processes use mkstemp()
, this would not happen as all the processes would be guaranteed to get different names (as long as the temporary files were kept around), or the mkstemp()
call would fail.
Written that, given that the best solution would be to move the generation of modules.py
to scram (#44314 (comment)), which could hopefully be done at the timescale of a few weeks, it doesn't seem worth to spend too much time in improving the behavior, even with respect to tmpnam()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah... yes, I see what you mean.
return false; | ||
} | ||
|
||
bool writeModulesFile() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With exceptions the function signature can return to
bool writeModulesFile() { | |
void writeModulesFile() { |
std::ofstream file; | ||
std::string path(std::filesystem::current_path()); | ||
bool res = open_temp(path, file); | ||
if (res) { | ||
file << "from FWCore.ParameterSet.ModulesProxy import _setupProxies\n" | ||
"locals().update(_setupProxies(__file__))\n"; | ||
file.close(); | ||
std::filesystem::copy(path.data(), "modules.py"); | ||
std::filesystem::remove(path.data()); | ||
return true; | ||
} | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now this could become
std::ofstream file; | |
std::string path(std::filesystem::current_path()); | |
bool res = open_temp(path, file); | |
if (res) { | |
file << "from FWCore.ParameterSet.ModulesProxy import _setupProxies\n" | |
"locals().update(_setupProxies(__file__))\n"; | |
file.close(); | |
std::filesystem::copy(path.data(), "modules.py"); | |
std::filesystem::remove(path.data()); | |
return true; | |
} | |
return false; | |
auto const path = makeUniqueFilename(std::filesystem::current_path()); | |
std::ofstream file{path.c_str()); | |
file << "from FWCore.ParameterSet.ModulesProxy import _setupProxies\n" | |
"locals().update(_setupProxies(__file__))\n"; | |
file.close(); | |
std::filesystem::copy(path.c_str(), "modules.py"); | |
std::filesystem::remove(path.c_str()); |
bool res = writeModulesFile(); | ||
if (!res) { | ||
std::cerr << "writeModulesFile() failed" << std::endl; | ||
return 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now this would go back to
bool res = writeModulesFile(); | |
if (!res) { | |
std::cerr << "writeModulesFile() failed" << std::endl; | |
return 1; | |
} | |
writeModulesFile(); |
-1 Failed Tests: Build ClangBuild The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here: BuildI found compilation error when building: >> Building shared library tmp/el8_amd64_gcc12/src/FWCore/ParameterSet/src/FWCoreParameterSet/libFWCoreParameterSet.so Copying tmp/el8_amd64_gcc12/src/FWCore/ParameterSet/src/FWCoreParameterSet/libFWCoreParameterSet.so to productstore area: Leaving library rule at FWCore/ParameterSet >> Compiling src/FWCore/ParameterSet/bin/edmWriteConfigs.cpp src/FWCore/ParameterSet/bin/edmWriteConfigs.cpp: In function 'bool {anonymous}::writeModulesFile()': src/FWCore/ParameterSet/bin/edmWriteConfigs.cpp:155:7: error: return-statement with no value, in function returning 'bool' [-fpermissive] 155 | return; | ^~~~~~ src/FWCore/ParameterSet/bin/edmWriteConfigs.cpp: In lambda function: src/FWCore/ParameterSet/bin/edmWriteConfigs.cpp:327:18: error: inconsistent types 'void' and 'int' deduced for lambda return type 327 | return 1; Clang BuildI found compilation error while trying to compile with clang. Command used:
>> Entering Package Configuration/Skimming >> Entering Package FWCore/ParameterSet >> Entering Package L1Trigger/L1THGCal >> Entering Package PhysicsTools/NanoAOD >> Compile sequence completed for CMSSW CMSSW_14_1_X_2024-03-04-2300 gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 1 + eval scram build outputlog '&&' '(python3' /data/cmsbld/jenkins/workspace/ib-run-pr-tests/cms-bot/buildLogAnalyzer.py --logDir /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_14_1_X_2024-03-04-2300/tmp/el8_amd64_gcc12/cache/log/src '||' 'true)' ++ scram build outputlog >> Entering Package Configuration/Skimming Entering library rule at Configuration/Skimming >> Compiling edm plugin src/Configuration/Skimming/src/DisappearingMuonsSkimming.cc |
That said, can anybody explain why we don't do simply void writeModulesFile() {
std::ofstream file{"modules.py"};
file << "from FWCore.ParameterSet.ModulesProxy import _setupProxies\n"
"locals().update(_setupProxies(__file__))\n";
file.close();
} ? |
Multiple calls to edmWriteConfigs could be writing to the same directory. Writing a uniquely named file and then renaming it is the standard way to 'atomically' create a file of a given name. |
But that is not what the code is doing: std::filesystem::copy(buffer.data(), "modules.py");
std::filesystem::remove(buffer.data()); This is making a copy, not an atomic rename. |
Also, are you sure simply using the I've tried this example: #include <fstream>
int main(void) {
std::ofstream file("file.txt");
file << "This is a file being written concurrently\nby multiple processes.\n";
file.close();
return 0;
} running it 1000 times in parallel a few times, and the resulting |
What I would consider the 'real' solution to all of this is for scram to generate the modules.py file IFF something was written to the appropriate cfipython directory. |
PR description:
Silence the following warning in UBSAN builds:
PR validation:
Bot tests