Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

set TMPDIR variable to soft link pointing to PWD #21419

Merged
merged 4 commits into from
Nov 24, 2017
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions GeneratorInterface/SherpaInterface/src/SherpackUtilities.cc
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
#include "GeneratorInterface/SherpaInterface/interface/SherpackUtilities.h"
#include <unistd.h>
#include <cstdlib>
namespace spu {

// functions for inflating (and deflating)
Expand Down Expand Up @@ -153,6 +154,13 @@ void zerr(int ret)
/* compress or decompress from stdin to stdout */
int Unzip(std::string infile, std::string outfile)
{
std::string tmpdir = getenv("TMPDIR");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @perrozzi - could we follow @bbockelm guidance here and first check the size of tmpdir. it seems the maximum total length is 108 for the path - so the check should be just like if (tmpdir.size() > 50 ) { ...} [probably you know better than I how long the sherpa part of that path is going to be ..]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect the Sherpa directories to be O(10) chars, cfr https://hypernews.cern.ch/HyperNews/CMS/get/edmFramework/3807/1/1/1.html
I will put >50 just to be on the safe side

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no guarantee that $TMPDIR is set, meaning that getenv could result in a nullptr.

if (tmpdir.size() > 50 ){
std::string command = "ln -s $PWD "+tmpdir+"/tmp;";
system(command.c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not check if system returns a failure value.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put printouts and confirm that things get created properly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things to consider in addition to Chris's comment:

  • What happens if this path already exists (but is owned by someone else)?
  • What happens if $TMPDIR is not unique to the job (several jobs run in the same directory, potentially owned by different users)?
  • What happens if $PWD contains spaces or special characters in it?
  • Who cleans up this symlink when the job finishes?

As written, if there's a shared $TMPDIR with a long name, one job will run successfully and then all subsequent ones on the worker node will fail.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dr15Jones @bbockelm investigating a bit further I came across this
https://www.open-mpi.org/faq/?category=osx#startup-errors-with-open-mpi-2.0.x
and the conclusion is:
The workaround for the Open MPI 2.0.x and v2.1.x release series is to set the TMPDIR environment variable to /tmp or other short directory name.
as far as I see from the example, the directory created in the TMPDIR path should be unique and related to the MPI session.
Would it be an acceptable solution to assing /tmp/ or /tmp/$USER/ as TMPDIR value in SherpackUtilities.cc? I don't see how we could move forward otherwise. Thanks a lot

Copy link
Author

@perrozzi perrozzi Dec 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most important, the written data seems to be cleaned up by MPI after use (i.e. if I point TMPDIR to any directory, it is empty after the job has either succeeded or failed)

command = tmpdir+"/tmp";
setenv("TMPDIR",command.c_str(), true);
}
int ret;
FILE *in = fopen(infile.c_str(),"r");
if (!in) return -1;
Expand Down