Fix error code returned by cmsRun on a FallbackFileOpenError #42249

Merged: 1 commit, Jul 24, 2023
10 changes: 10 additions & 0 deletions FWCore/Catalog/src/FileLocator.cc
@@ -1,5 +1,6 @@
 #include "FWCore/Catalog/interface/FileLocator.h"
 #include "FWCore/ServiceRegistry/interface/Service.h"
+#include "FWCore/Utilities/interface/Exception.h"
 
 #include <boost/algorithm/string.hpp>
 #include <boost/algorithm/string/replace.hpp>
@@ -295,6 +296,15 @@ namespace edm {
       m_prefix = found_protocol->second.get("prefix", kEmptyString);
       if (m_prefix == kEmptyString) {
         //get rules
+        if (found_protocol->second.find("rules") == found_protocol->second.not_found()) {

Review comment (Contributor):
@stlammel are you OK with how this handles a misconfigured storage.json?

Reply (Contributor Author):
Currently, if you get here, it seg faults. With this change, an informative exception is thrown instead of letting the seg fault occur. The seg fault is fatal, which is why I made the exception fatal. I can imagine a few alternatives:

We could remove this error handling change from this PR and handle it in a separate PR.

Even if we approve this PR as is, we can modify this error handling again later in a separate PR.

The exception could be non-fatal and just result in a logWarning and a skip, like other exceptions in the FileLocator constructor. InputFileCatalog currently catches exceptions and swallows them after a logWarning, although if all FileLocator constructors fail it will throw a different exception anyway.

I am open to any other very simple suggestion here. If we want something more complex, or we need time for discussion and consideration, then we should break it out into a separate PR and separate it from this bug fix. I only included it here because I thought it was such a small issue that it didn't deserve a separate PR.

+          cms::Exception ex("FileCatalog");

[Conversation marked as resolved by rappoccio.]

+          ex << "protocol must contain either a prefix or rules, "
+             << "neither found for protocol \"" << aCatalog.protocol << "\" for the storage site \""
+             << aCatalog.storageSite << "\" and volume \"" << aCatalog.volume
+             << "\" in storage.json. Check site-local-config.xml <data-access> and storage.json";
+          ex.addContext("edm::FileLocator:init()");
+          throw ex;
+        }
         const pt::ptree& rules = found_protocol->second.find("rules")->second;
         //loop over rules
         for (pt::ptree::value_type const& storageRule : rules) {
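The third alternative in the author's reply above (warn and skip a bad catalog, and fail only if no locator could be built at all) can be sketched as a small Python model. This is not the CMSSW implementation: `CatalogError`, `build_locator` and `build_locators` are hypothetical names, the prefix-or-rules check is a simplified version of the one in the diff, and the standard `logging` module stands in for the framework's message logger.

```python
import logging

logger = logging.getLogger("FileCatalogSketch")


class CatalogError(Exception):
    """Hypothetical stand-in for the exception thrown by a locator constructor."""


def build_locator(catalog):
    # Simplified version of the check in the diff: a protocol entry must
    # provide either a non-empty prefix or a list of rules.
    if not catalog.get("prefix") and "rules" not in catalog:
        raise CatalogError(
            f'protocol "{catalog.get("protocol")}" has neither a prefix nor rules'
        )
    return dict(catalog)


def build_locators(catalogs):
    locators = []
    for catalog in catalogs:
        try:
            locators.append(build_locator(catalog))
        except CatalogError as err:
            # Non-fatal path: warn and skip this catalog.
            logger.warning("skipping catalog: %s", err)
    if not locators:
        # Nothing usable is left, so the failure becomes fatal after all.
        raise CatalogError("no usable data catalog found")
    return locators
```

In this model a single misconfigured entry only produces a warning as long as another catalog is usable, which is the trade-off the reply weighs against making the exception fatal.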
5 changes: 3 additions & 2 deletions IOPool/Input/src/RootInputFileSequence.cc
@@ -252,14 +252,15 @@ namespace edm {
     }
     for (std::vector<std::string>::const_iterator it = fNames.begin(); it != fNames.end(); ++it) {
       try {
+        usedFallback_ = (it != fNames.begin());

[Conversation marked as resolved by rappoccio.]

         std::unique_ptr<char[]> name(gSystem->ExpandPathName(it->c_str()));
         filePtr = std::make_shared<InputFile>(name.get(), " Initiating request to open file ", inputType);
-        usedFallback_ = (it != fNames.begin());
         break;
       } catch (cms::Exception const& e) {
         if (!skipBadFiles && std::next(it) == fNames.end()) {
           InputFile::reportSkippedFile((*it), logicalFileName());
-          Exception ex(errors::FileOpenError, "", e);
+          errors::ErrorCodes errorCode = usedFallback_ ? errors::FallbackFileOpenError : errors::FileOpenError;
+          Exception ex(errorCode, "", e);
           ex.addContext("Calling RootInputFileSequence::initTheFile()");
           std::ostringstream out;
           out << "Input file " << (*it) << " could not be opened.";
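Why the assignment to `usedFallback_` has to come before the call that opens the file can be seen in a small Python model of the loop. The names `open_first_available`, `OpenFailed` and the `open_file` callback are hypothetical, `skipBadFiles` is left out for brevity, and only the two exit codes (8020 and 8028) are taken from the test script in this PR.

```python
FILE_OPEN_ERROR = 8020           # expected by the test without a fallback
FALLBACK_FILE_OPEN_ERROR = 8028  # expected by the test after a fallback


class OpenFailed(Exception):
    """Hypothetical stand-in for the exception thrown when the last open fails."""


def open_first_available(names, open_file):
    """Try each name in order and return (handle, used_fallback).

    If the last name fails as well, raise OpenFailed carrying the error code.
    """
    used_fallback = False
    for index, name in enumerate(names):
        # Set the flag before the call that may raise, so the handler
        # below sees the value that belongs to the current attempt.
        used_fallback = index > 0
        try:
            return open_file(name), used_fallback
        except OSError as err:
            if index == len(names) - 1:
                code = FALLBACK_FILE_OPEN_ERROR if used_fallback else FILE_OPEN_ERROR
                raise OpenFailed(code) from err
    raise OpenFailed(FILE_OPEN_ERROR)  # no names were given
```

If the flag were assigned only after a successful open, as in the removed line, a failure on the second name would leave it at whatever value it held before, so the handler could not reliably tell a fallback failure from a primary one.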
1 change: 1 addition & 0 deletions IOPool/Input/test/BuildFile.xml
@@ -9,4 +9,5 @@
 
   <test name="TestIOPoolInputRepeating" command="testRepeatingCachedRootSource.sh"/>
   <test name="TestIOPoolInputNoParentDictionary" command="testNoParentDictionary.sh"/>
+  <test name="TestFileOpenErrorExitCode" command="testFileOpenErrorExitCode.sh"/>
 </environment>
@@ -0,0 +1,10 @@
[
{ "site": "DUMMY",
"volume": "DummyVolume",
"protocols": [
{ "protocol": "protocolThatDoesNotExist",
"prefix": "abc"
}
]
}
]
@@ -0,0 +1,7 @@
<site-local-config>
<site name="DUMMY">
<data-access>
<catalog volume="DummyVolume" protocol="protocolThatDoesNotExist"/>
</data-access>
</site>
</site-local-config>
@@ -0,0 +1,18 @@
[
{ "site": "DUMMY",
"volume": "DummyVolume1",
"protocols": [
{ "protocol": "protocolThatDoesNotExist1",
"prefix": "abc"
}
]
},
{ "site": "DUMMY",
"volume": "DummyVolume2",
"protocols": [
{ "protocol": "protocolThatDoesNotExist2",
"prefix": "abc"
}
]
}
]
@@ -0,0 +1,8 @@
<site-local-config>
<site name="DUMMY">
<data-access>
<catalog volume="DummyVolume1" protocol="protocolThatDoesNotExist1"/>
<catalog volume="DummyVolume2" protocol="protocolThatDoesNotExist2"/>
</data-access>
</site>
</site-local-config>
34 changes: 34 additions & 0 deletions IOPool/Input/test/testFileOpenErrorExitCode.sh
@@ -0,0 +1,34 @@
#!/bin/bash

# Pass in name and status
function die { echo $1: status $2 ; exit $2; }

mkdir -p SITECONF
mkdir -p SITECONF/local
mkdir -p SITECONF/local/JobConfig

export SITECONFIG_PATH=${PWD}/SITECONF/local
LOCAL_TEST_DIR=${SCRAM_TEST_PATH}

cp ${LOCAL_TEST_DIR}/sitelocalconfig/noFallbackFile/site-local-config.xml ${SITECONFIG_PATH}/JobConfig/
cp ${LOCAL_TEST_DIR}/sitelocalconfig/noFallbackFile/local/storage.json ${SITECONFIG_PATH}/
F1=${LOCAL_TEST_DIR}/test_fileOpenErrorExitCode_cfg.py
cmsRun -j NoFallbackFile_jobreport.xml $F1 -- --input FileThatDoesNotExist.root && die "$F1 should have failed but didn't, exit code was 0" 1

CMSRUN_EXIT_CODE=$(edmFjrDump --exitCode NoFallbackFile_jobreport.xml)
echo "Exit code after first run of test_fileOpenErrorExitCode_cfg.py is ${CMSRUN_EXIT_CODE}"
if [ "x${CMSRUN_EXIT_CODE}" != "x8020" ]; then
  echo "Unexpected cmsRun exit code after FileOpenError, exit code from jobReport ${CMSRUN_EXIT_CODE} which is different from the expected 8020"
  exit 1
fi

cp ${LOCAL_TEST_DIR}/sitelocalconfig/useFallbackFile/site-local-config.xml ${SITECONFIG_PATH}/JobConfig/
cp ${LOCAL_TEST_DIR}/sitelocalconfig/useFallbackFile/local/storage.json ${SITECONFIG_PATH}/
cmsRun -j UseFallbackFile_jobreport.xml $F1 -- --input FileThatDoesNotExist.root && die "$F1 should have failed after file fallback but didn't, exit code was 0" 1

CMSRUN_EXIT_CODE=$(edmFjrDump --exitCode UseFallbackFile_jobreport.xml)
echo "Exit code after second run of test_fileOpenErrorExitCode_cfg.py is ${CMSRUN_EXIT_CODE}"
if [ "x${CMSRUN_EXIT_CODE}" != "x8028" ]; then
  echo "Unexpected cmsRun exit code after FallbackFileOpenError, exit code from jobReport ${CMSRUN_EXIT_CODE} which is different from the expected 8028"
  exit 1
fi
17 changes: 17 additions & 0 deletions IOPool/Input/test/test_fileOpenErrorExitCode_cfg.py
@@ -0,0 +1,17 @@
import FWCore.ParameterSet.Config as cms
import argparse
import sys

parser = argparse.ArgumentParser(prog=sys.argv[0], description="Test FileOpenErrorExitCode")
parser.add_argument("--input", type=str, default=[], nargs="*", help="Optional list of input files")

argv = sys.argv[:]
if '--' in argv:
    argv.remove("--")
args, unknown = parser.parse_known_args(argv)

process = cms.Process("TEST")

process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring("/store/"+x for x in args.input)
)
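How the configuration above picks up `--input` can be checked outside of cmsRun with plain `argparse`. The argument vector below is a made-up stand-in for `sys.argv`; what cmsRun really passes to the configuration is an assumption here, and the `cms` objects are left out so the sketch runs with the standard library alone.

```python
import argparse

parser = argparse.ArgumentParser(description="Test FileOpenErrorExitCode")
parser.add_argument("--input", type=str, default=[], nargs="*",
                    help="Optional list of input files")

# Hypothetical argument vector, standing in for sys.argv.
argv = ["cmsRun", "-j", "report.xml", "cfg.py", "--", "--input", "FileThatDoesNotExist.root"]

# argparse treats everything after a bare "--" as positional, so the
# separator is dropped before parsing.
if "--" in argv:
    argv.remove("--")

# parse_known_args collects unrecognized arguments instead of exiting.
args, unknown = parser.parse_known_args(argv)

# Same transformation the configuration applies before building the source.
file_names = ["/store/" + name for name in args.input]
print(file_names)
```

Using `parse_known_args` rather than `parse_args` is what lets the parser tolerate the arguments it does not define itself.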