fix bug Can't exec /bin/sh: Argument list too long #247

Merged: 1 commit, May 5, 2016

duytintruong
Contributor

Hi Andrew,

I got the following error while running roary:

Can't exec "/bin/sh": Argument list too long at /tools/roary/lib/Bio/Roary/ParallelAllAgainstAllBlast.pm line 92, line 647671.
awk: fatal: cannot open file `_blast_results' for reading (No such file or directory)

Looking at that line (the system( "cat ..." ) call in the sub below):
sub combine_blast_results {
    my ( $self, $output_files ) = @_;
    for my $output_file ( @{$output_files} ) {
        Bio::Roary::Exceptions::FileNotFound->throw( error => "Cant find blast results: " . $output_file )
          unless ( -e $output_file );
    }
    my $output_files_param = join( ' ', @{$output_files} );
    system( "cat $output_files_param > " . $self->blast_results_file_name );
    return 1;
}

The error occurs because the argument list handed to the shell when calling "cat" is too long.
So I changed the sub as shown below, and it worked:

sub combine_blast_results {
    my ( $self, $output_files ) = @_;
    for my $output_file ( @{$output_files} ) {
        Bio::Roary::Exceptions::FileNotFound->throw( error => "Cant find blast results: " . $output_file )
          unless ( -e $output_file );
    }
    # Start from an empty results file.
    if ( -e $self->blast_results_file_name )
    {
        system( "rm " . $self->blast_results_file_name );
    }
    system( "touch " . $self->blast_results_file_name );
    # Append one file per shell call, so no command line grows with the number of files.
    for my $output_file ( @{$output_files} ) {
        system( "cat $output_file >> " . $self->blast_results_file_name );
    }
    return 1;
}

You may have a better fix than mine, and it would be great if it could be integrated into the next version of Roary.

Thanks,
Tin
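The approach above (append files one at a time instead of naming them all in one command) can also be sketched at the shell level. This is only an illustration, not the code that was merged: `combine_results` and the `chunk_*.out` file names are hypothetical, and the sketch assumes file names contain no newlines. It lets `xargs` split the names into batches that fit the exec limits, so `cat` runs a few times rather than once per file.

```shell
# Hypothetical helper: read file names from stdin (one per line) and
# concatenate their contents into the file given as the first argument.
combine_results() {
    out=$1
    : > "$out"    # create or truncate the output file once
    # Convert newline-separated names to NUL-separated ones so that names
    # with spaces survive, then let xargs call cat in safely sized batches.
    tr '\n' '\0' | xargs -0 cat >> "$out"
}

# Usage with three small stand-in files (made-up names and contents).
tmpdir=$(mktemp -d)
for i in 1 2 3; do
    echo "hit $i" > "$tmpdir/chunk_$i.out"
done
# printf is a shell builtin, so the expanded glob is never passed to exec.
printf '%s\n' "$tmpdir"/chunk_*.out | combine_results "$tmpdir/_blast_results"
```

Because the list of names travels through a pipe rather than through a command line, its length is not bounded by the argument limit.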

@andrewjpage
Member

Hi Tin,
If you run:
getconf ARG_MAX

what do you get? What is your input command? How many samples are you passing in?
Andrew

@duytintruong
Contributor Author

Hi Andrew,

I got 2097152 when running "getconf ARG_MAX". My command was:
roary -e --mafft -z -p 30 -f output -g 1000000 -i 75 input/*.gff

I passed 53 samples to roary.
Do you need the input files?

Cheers,
Tin

@duytintruong
Contributor Author

Hi Andrew,

If I execute the command:
/bin/true $(seq 1 400000)

I still get the error:
bash: /bin/true: Argument list too long

So I guess the maximum number of arguments in practice may be smaller than the number returned by "getconf ARG_MAX".

Cheers,
Tin
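A note on the experiment above: ARG_MAX is a budget in bytes for the argument strings plus the environment, not a count of arguments. The small sketch below measures how many bytes the arguments "1" to "400000" need and compares that with the local limit. On Linux there is also a separate cap on any single argument string (MAX_ARG_STRLEN, 128 KiB with 4 KiB pages), which may be the limit that matters when Perl's system() is given one long command string, since the whole command then reaches /bin/sh -c as a single argument. That last point is an assumption about this setup, not something confirmed in the thread.

```shell
# Byte budget for argv plus the environment on this machine.
arg_max=$(getconf ARG_MAX)

# Bytes needed for the strings "1" .. "400000". Each newline that wc counts
# stands in for the NUL terminator the string would carry in argv.
arg_bytes=$(seq 1 400000 | wc -c | tr -d ' ')

echo "ARG_MAX:                 $arg_max bytes"
echo "arguments of seq 400000: $arg_bytes bytes"

if [ "$arg_bytes" -gt "$arg_max" ]; then
    echo "the argument strings alone exceed ARG_MAX"
fi
```

The arguments come to 2688895 bytes, which is already more than the 2097152 reported above, so the failure of `/bin/true $(seq 1 400000)` is consistent with ARG_MAX being a byte limit.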

@andrewjpage andrewjpage merged commit 279eaea into sanger-pathogens:master May 5, 2016
@andrewjpage
Member

Thanks for the pull request. It appears to be something to do with your input data set. The proteins are chunked into 200k pieces, so with only 53 input genomes they would have to be massive to blow the argument list (back of an envelope says ~50 MBases in each genome).

@duytintruong
Contributor Author

Thanks Andrew.
You were right: those genomes were synthetic, so they were quite large.

Cheers,
Tin
