[RFC] Add a Galaxy tool for TabPFN package (by Prof. Hutter's group) #1533

Open: anuprulez (Contributor) wants to merge 9 commits into master

The PR adds a Galaxy tool that uses the TabPFN package (DOI: 10.48550/arXiv.2207.01848) to create a general-purpose classifier.

  • It takes two input files: training data and test data, each with its corresponding labels.
  • TabPFNClassifier is fit on the training data and predicts classes for the test data.
  • Additionally, it outputs a precision-recall curve.
  • A conda package has been created from the TabPFN PyPI package (https://github.com/conda-forge/tabpfn-feedstock) because the Galaxy platform requires tool dependencies to be available as conda packages.
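
For reviewers, the fit, predict, and precision-recall steps listed above can be sketched roughly as follows. This is not the code in the PR: the helper names `precision_recall_points` and `classify_and_evaluate` are hypothetical, the sketch assumes binary labels encoded as 0/1, and it assumes column 1 of `predict_proba` holds the probability of class 1. Any classifier with scikit-learn style `fit`, `predict`, and `predict_proba` methods can be passed in.

```python
def precision_recall_points(y_true, scores):
    """Return (precision, recall) pairs, one per distinct score threshold.

    Assumes binary labels where 1 marks the positive class.
    """
    total_pos = sum(1 for y in y_true if y == 1)
    if total_pos == 0:
        raise ValueError("y_true contains no positive (1) labels")

    # Rank test samples from highest to lowest positive-class score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)

    points = []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a tied-score group.
        is_last = i == len(ranked) - 1
        if is_last or ranked[i + 1][0] != score:
            points.append((tp / (tp + fp), tp / total_pos))
    return points


def classify_and_evaluate(clf, X_train, y_train, X_test, y_test):
    """Fit clf on the training data, predict on the test data, and
    compute precision-recall points from the test labels."""
    clf.fit(X_train, y_train)
    predictions = list(clf.predict(X_test))
    # Assumption: column 1 of predict_proba is P(class == 1).
    positive_scores = [row[1] for row in clf.predict_proba(X_test)]
    return predictions, precision_recall_points(list(y_test), positive_scores)
```

With the tabpfn package installed, `clf` would be an instance of `TabPFNClassifier`; the returned points can then be plotted with recall on the x-axis and precision on the y-axis.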

Further analyses can be added as required, for example by building workflows around this tool.

A few challenges:

  • One challenge is exporting this classifier to the standard ONNX format. The standard export of PyTorch models to ONNX does not work with the trained TabPFN model. We may need ONNX export; if it cannot be made to work, we may have to export the trained model as a zipped file instead.

  • Additionally, regarding the discussion about integrating the tabpfn_client package: it requires a separate login before use, so I think integrating it into Galaxy would be complicated unless there is a way to bypass the login process.
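
To make the zipped-file fallback from the ONNX point concrete, here is a minimal sketch using only the Python standard library. It assumes the fitted classifier can be pickled, which has not been verified here for TabPFN; the function names and the `model.pkl` member name are hypothetical and not part of the PR.

```python
import pickle
import zipfile


def save_model_zip(model, path, member="model.pkl"):
    """Pickle a fitted model and store it as one member of a zip archive.

    `path` may be a filesystem path or a writable binary file object.
    """
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member, pickle.dumps(model))


def load_model_zip(path, member="model.pkl"):
    """Load a model written by save_model_zip.

    Only open archives from a trusted source: unpickling can execute
    arbitrary code.
    """
    with zipfile.ZipFile(path) as archive:
        return pickle.loads(archive.read(member))
```

Unlike ONNX, a pickle-based archive is tied to the Python and package versions used to create it, so the loading environment would need to match the one in which the model was saved.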

Thank you!

ping @bgruening

[Screenshots: tabpfn1, tabpfn2, tabpfn3]
