Chewbbaca update #6000

Merged
merged 6 commits into from
May 11, 2024

nilchia
Contributor

@nilchia nilchia commented May 8, 2024

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

This update resolves two issues:

  1. chewBBACA AlleleCall checks file extensions, so it does not recognize a FASTA file in the history that has no extension.
  2. ExtractCgMLST works fine within a normal chewBBACA pipeline, but it only accepts AlleleCall output. That is a problem if the user gets the allele profiles from another package or wants to analyze the joined_profile results. With this update, the user can define the inputs manually.
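
A minimal sketch of the first issue, assuming a tool that filters its inputs by file suffix. The suffix set and both function names are hypothetical and are not taken from chewBBACA; the only behavior borrowed from this PR is appending `.fa` to the link name.

```python
from pathlib import Path

# Hypothetical suffix list for illustration only; not taken from chewBBACA.
ASSUMED_FASTA_SUFFIXES = {".fa", ".fasta", ".fna"}


def looks_like_fasta(name: str) -> bool:
    """Return True if the file name ends in one of the assumed FASTA suffixes."""
    return Path(name).suffix.lower() in ASSUMED_FASTA_SUFFIXES


def link_name(element_identifier: str) -> str:
    """Mimic the change in this PR: always append '.fa' to the link name."""
    return f"{element_identifier}.fa"


if __name__ == "__main__":
    for identifier in ["sample_A", "sample_B.fasta"]:
        print(identifier, looks_like_fasta(identifier), looks_like_fasta(link_name(identifier)))
```

Under these assumptions, a history item named `sample_A` is skipped by the suffix check, while the link `sample_A.fa` passes it.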

@@ -8,7 +8,7 @@
 mkdir 'input' &&
 mkdir 'schema' &&
 #for $file in $input_file
-ln -sf '$file' 'input/${file.element_identifier}' &&
+ln -sf '$file' 'input/${file.element_identifier}.fa' &&
Member

Do we need the element identifier here? Or can we just name it input/foo.fa?

Contributor Author

AlleleCall accepts multiple input FASTAs. I think with a single fixed name each link would overwrite the previous one. Am I right?

Member

You can use foo_1, foo_2. Please search in this repo for enumerate ... We do that often.
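
The overwrite concern and the numbered-name suggestion can be sketched in Python. This is only an illustration: `force_symlink` is a rough stand-in for `ln -sf`, the source paths are hypothetical, and the sketch assumes a POSIX system where dangling symlinks are allowed.

```python
import os
import tempfile


def force_symlink(src: str, dst: str) -> None:
    """Rough equivalent of `ln -sf src dst`: replace dst if it already exists."""
    if os.path.lexists(dst):
        os.remove(dst)
    os.symlink(src, dst)


def link_all(sources, name_for):
    """Link every source into a fresh directory and return the resulting names."""
    workdir = tempfile.mkdtemp()
    for index, src in enumerate(sources):
        force_symlink(src, os.path.join(workdir, name_for(index, src)))
    return sorted(os.listdir(workdir))


sources = ["/data/dataset_1.dat", "/data/dataset_2.dat"]  # hypothetical paths
fixed = link_all(sources, lambda i, src: "foo.fa")          # same name every time
numbered = link_all(sources, lambda i, src: f"foo_{i}.fa")  # one name per input

print(fixed)
print(numbered)
```

With a fixed name only one link survives, whereas the enumerated names keep one link per input.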

Contributor Author

Cool!
I did it with:

        #for i,file in enumerate($input_file)
        ln -sf '$file' 'input/fasta_${i}.${file.ext}' &&
        #end for

One thing, the result of AlleleCall is a tabular file with row names corresponding to the FASTA file names.
So the result would be something like:
image

rather than:
image
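
As a rough illustration of the row-name concern: if the row labels follow the file names, enumerated link names hide the sample names. The identifiers below are hypothetical, and `row_label` assumes the label is the file name without its extension, which this thread does not confirm.

```python
from pathlib import Path

identifiers = ["sample_A", "sample_B"]  # hypothetical element identifiers


def row_label(link_name: str) -> str:
    """Assumption: the row label is the file name without its extension."""
    return Path(link_name).stem


# Enumerated names, as in the loop above (extension chosen arbitrarily here).
numbered = [row_label(f"fasta_{i}.fasta") for i, _ in enumerate(identifiers)]

# Names built from the element identifier, as in the original diff.
by_identifier = [row_label(f"{name}.fa") for name in identifiers]

print(numbered)
print(by_identifier)
```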

Member

Yeah, that's bad. OK, let's keep the element_identifier then, but please sanitize them, as here for example:

#set escaped_element_identifier = re.sub('[^\w\-]', '_', str($inputFile.element_identifier))
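
The same substitution can be tried in plain Python to see what it does to typical dataset names. This is a sketch: `escape_identifier` is a hypothetical helper name and the sample identifiers are made up; only the regular expression comes from the line above.

```python
import re


def escape_identifier(element_identifier) -> str:
    """Replace every character that is not a word character or '-' with '_'."""
    return re.sub(r'[^\w\-]', '_', str(element_identifier))


if __name__ == "__main__":
    for name in ["my-sample_01", "sample A (1).fasta", "a/b", "a b"]:
        print(repr(name), "->", repr(escape_identifier(name)))
```

Two consequences are worth keeping in mind. Dots are replaced too, so any extension has to be appended after escaping. Distinct identifiers such as `a/b` and `a b` map to the same escaped name, so uniqueness is not guaranteed by the substitution alone.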

@nilchia nilchia force-pushed the chewbbaca_update branch from 98202d3 to e0d3ab7 Compare May 11, 2024 21:08
@bgruening bgruening merged commit 2564b0c into galaxyproject:main May 11, 2024
14 checks passed
@nilchia nilchia deleted the chewbbaca_update branch May 12, 2024 07:04