Implicit conversion in workflow causes results to differ if the input is compressed #19143

Closed
Delphine-L opened this issue Nov 14, 2024 · 3 comments

Comments

@Delphine-L
Contributor

Describe the bug
I am using a text transformation tool on a fasta input in a workflow. The text transformation requires a txt input. If the input is a fasta file, the output of the text transformation is a fasta, but if the input is a fasta.gz file, there is a conversion step that converts the fasta to tabular before the text transformation tool, and the output of the workflow is then tabular.
The workaround I am using is to explicitly ask the user whether the inputs are compressed and to add optional decompression steps, but this adds complexity to the workflow and to the workflow form.
This is related to issue #18709, but I am opening a new issue because one of the solutions discussed there was to offer both fileA (as fasta) and fileA (as tabular) as input options, which would not solve the issue inside the workflow.

Galaxy Version and/or server at which you observed the bug
Galaxy Version: Main 24.1.3.dev0

Browser and Operating System
Operating System: macOS
Browser: Chrome

To Reproduce
Steps to reproduce the behavior:

  1. Import workflow https://usegalaxy.org/u/delphinel/w/test-implicit-conversion
  2. Import history https://usegalaxy.org/u/delphinel/h/test-implicit-conversion-fastagz-to-txt
  3. Run the workflow on both the compressed and uncompressed fastas (datasets 1 and 2)
  4. Observe that the results are different (datasets 3 and 4)

Expected behavior
Suggested solutions:

  • Add additional constraints for tools inside a workflow (if a tool input accepts txt, I could restrict it to fasta)
  • Being able to test the format of an input, so that the optional decompression step could be triggered by the detection of a compressed fasta
  • Being able to constrain a workflow input as compressed or uncompressed
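
To illustrate the second suggestion, here is a minimal sketch of format detection triggering decompression. It is standalone Python using only the standard library, not Galaxy code or a Galaxy API; the function names `is_gzip` and `ensure_uncompressed` are hypothetical. It assumes the compressed input is gzip, which can be recognized by its two leading magic bytes.

```python
import gzip
import shutil
from pathlib import Path

# gzip streams always start with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_uncompressed(src, dest):
    """Write an uncompressed copy of src to dest (hypothetical helper).

    A gzip-compressed src is decompressed; any other file is copied
    unchanged, so downstream steps always see the same plain content.
    """
    src, dest = Path(src), Path(dest)
    if is_gzip(src):
        with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    else:
        shutil.copyfile(src, dest)
    return dest
```

With a step like this, both `input.fasta` and `input.fasta.gz` would reach the text transformation as plain fasta, without asking the user whether the input is compressed.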
@Delphine-L
Contributor Author

Strangely, the issue doesn't happen on vgp.usegalaxy.org. There, the fasta.gz is converted to fasta before being used as an input:
[Screenshot 2024-11-14 at 12 05 46 PM]

@natefoo
Member

natefoo commented Nov 14, 2024

A bit more detail from my testing:

  • Run as a tool outside of a workflow, the tool's input field shows 1: Compressed fasta (as tabular) and the conversion is always fasta.gz -> fasta -> tabular, and the tool outputs tabular.
  • I can reproduce @Delphine-L's findings when run in the example workflow:
    • On usegalaxy.org, the workflow input form shows 1: Compressed Fasta (as fasta) but it converts fasta.gz -> fasta -> tabular and outputs tabular, the same as when running as a plain tool.
    • On vgp.usegalaxy.org, the workflow input form also shows 1: Compressed Fasta (as fasta) but it converts fasta.gz -> fasta and outputs fasta.

usegalaxy.org and vgp.usegalaxy.org run slightly different configs, but they use the same copy of Galaxy itself, the same datatypes_conf.xml (the sample shipped with Galaxy at the revision it is running), and the same database, and I don't see any differing config options that would affect this.

@mvdbeek
Member

mvdbeek commented Nov 15, 2024

I'm gonna close this as a duplicate of #18709 and prioritize a fix for that.

@mvdbeek mvdbeek closed this as completed Nov 15, 2024