Import CSV, MS-DIAL, MZmine #85
Comments
Hi, Thank you for your interest in the nPYc-Toolbox. For the CSV import, the first column must be named 'Sample File Name' and contain the sample IDs. Every other column must use the feature name as its column name, with each row holding the intensities for one sample. The noFeatureParams=1 argument defines how many samples to skip. I think this should be worth a try with MS-DIAL and MZmine outputs. This method is not very well documented, as it's meant for internal use and testing. If you could provide us with some examples of MS-DIAL and MZmine outputs, I could look into adding explicit import methods.
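The layout described in this comment can be sketched as a small pre-import check. This is only an illustration under stated assumptions: the sample IDs, feature names, and intensities are made up, `check_layout` is a hypothetical helper and not part of the nPYc-Toolbox, and the check uses plain pandas rather than the toolbox's own loader.

```python
import csv
import io

import pandas as pd

# Hypothetical example table: sample IDs in the first column,
# then one column per feature holding that sample's intensities.
csv_text = """Sample File Name,5.05_536.9785m/z,6.10_301.1412m/z,7.32_188.0706m/z
Sample_01,1200.5,0.0,350.2
Sample_02,980.1,15.3,410.9
Sample_03,1105.7,12.8,0.0
"""


def check_layout(text):
    """Return a list of problems with the CSV layout (empty if none found)."""
    problems = []

    # Read the raw header, because pandas silently renames duplicate columns.
    header = next(csv.reader(io.StringIO(text)))
    if header[0] != 'Sample File Name':
        problems.append("first column must be 'Sample File Name'")

    feature_names = header[1:]
    if len(set(feature_names)) != len(feature_names):
        problems.append('feature names must be unique')

    # Every feature column should contain numeric intensities.
    table = pd.read_csv(io.StringIO(text))
    non_numeric = [col for col in table.columns[1:]
                   if not pd.api.types.is_numeric_dtype(table[col])]
    if non_numeric:
        problems.append('non-numeric intensity columns: ' + ', '.join(non_numeric))

    return problems


print(check_layout(csv_text))
```

Running a check like this on an MS-DIAL or MZmine export after reshaping it may catch layout problems before the file is handed to the toolbox.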
About CSV-Import: Do the feature names need to follow the given name structure like "5.05_536.9785m/z" then? I found another thing somewhat strange inside the XCMS-Import: Thank you for your explanations.
Hi, The Feature Names can be anything, as long as they are unique. The current XCMS import functionality was developed with standard XCMS in mind, not XCMS Online. In the standard XCMS outputs the rt is in seconds, hence the division by 60. If you have an example of XCMS Online outputs that you are happy to share alongside the MS-DIAL and MZmine examples, that would be great, so we can look into adding an explicit import option for XCMS Online as well. Also, are you comfortable programming in Python and using git to work collaboratively on this if I open a new feature branch? That would make it easier for us to test together that whatever we implement runs correctly for all your use cases.
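As a rough illustration of the two points in this comment (retention time converted from seconds to minutes, and feature names that only need to be unique), here is a minimal sketch. The function `make_feature_names` is hypothetical and not part of the nPYc-Toolbox, and the `rt_m/z` naming pattern is only assumed from the example name quoted earlier in this thread.

```python
def make_feature_names(rt_seconds, mz_values):
    """Build 'rt_mzm/z' style names, with rt converted from seconds to minutes."""
    names = []
    for rt, mz in zip(rt_seconds, mz_values):
        rt_minutes = rt / 60  # standard XCMS reports rt in seconds
        names.append(f"{rt_minutes:.2f}_{mz:.4f}m/z")

    # The only hard requirement stated above is that names are unique.
    if len(set(names)) != len(names):
        raise ValueError('feature names are not unique; add more decimals or an index')
    return names


print(make_feature_names([303.0, 366.0], [536.9785, 301.1412]))
```

If a tool already reports rt in minutes, the division by 60 would have to be skipped, otherwise the retention times end up a factor of 60 too small.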
Hi again, I'd be happy to help you implement the import of XCMS Online/MZmine/MS-DIAL data, since all of them are widely used in the metabolomics community. That should be simple coding, and I have access to some example datasets from my institute. Another thing I can offer is three additional feature filters for LC-MS datasets: a signal-to-noise filter, a peak width filter and a detection rate filter. All are recommended by current best-practice papers (Broadhurst et al., Dunn et al., and the like) with specific threshold values. Until now I have used self-made code that reads nPYc-exported CSV files to apply these filters, but with a little work they could be added to the Attributes section and SOP.
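To make the detection rate filter concrete, here is a minimal sketch using pandas. It is not the code mentioned in this comment: the function name, the definition of "detected" (a non-missing intensity above zero), and the 0.5 threshold are all assumptions chosen for illustration, not values taken from the cited papers.

```python
import pandas as pd


def detection_rate_filter(intensities, min_rate=0.5):
    """Keep features detected in at least `min_rate` of the samples.

    `intensities` has one row per sample and one column per feature.
    A feature counts as detected when its intensity is present and > 0
    (an assumption; other definitions are possible).
    """
    detected = intensities.notna() & (intensities > 0)
    rate = detected.mean(axis=0)  # fraction of samples per feature
    return intensities.loc[:, rate >= min_rate]


# Hypothetical intensities for four samples and two features.
data = pd.DataFrame({
    'feature_a': [1.0, 2.0, 0.0, 4.0],  # detected in 3 of 4 samples
    'feature_b': [0.0, 0.0, 0.0, 5.0],  # detected in 1 of 4 samples
})
print(list(detection_rate_filter(data).columns))
```

In practice such a filter is often computed on a subset of samples, such as pooled QC samples, which would mean selecting those rows before computing the rate.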
Hi, I have created a new feature branch to tackle this: https://github.com/phenomecentre/nPYc-Toolbox/tree/feature/msImportUpdates
Hi, @misch91 Thank you very much for the commit. I accidentally approved the pull request without noticing it was already a pull request into develop :S. I am reproducing your message in the PR text below for reference: Example datasets can be found here: XCMSonline MS-DIAL MZmine nPYc reimport
Thank you for the input files - I will push them to our unittest data git repo (https://github.com/phenomecentre/npc-standard-project) and prepare the other files needed for a working test example. I will start with the MZmine and MS-DIAL imports, as they are easier to get working with the files as they come out of the software. For XCMS Online I will see what can be done to minimise the need for the user to modify the output files.
Hey there,
Thx for this wonderful tool.
I read in the code that there is an option for direct .csv import for MS datasets:
def _loadCSVImport(self, path, noFeatureParams=1, variableType='Discrete'):
Unfortunately, the documentation does not provide an example of what this CSV should look like.
Sharing an example feature table with minimum requirements would be very helpful as I am trying to import MS-DIAL and MZmine (LC-MS) peak tables and assess their peak picking vs QI datasets.
Also, is the implementation of direct import from MS-DIAL or MZmine datasets planned?
I assume the metadata can then nevertheless be imported as usual via
dataset.addSampleInfo()
?