Create an interface that, given a plate, provides its ancestry and descendency #77

Open
josenavas opened this issue Jan 19, 2018 · 11 comments

Comments

@josenavas
Member

i.e., which plates/processes it comes from and which plates/processes have been generated from it.

@AmandaBirmingham AmandaBirmingham self-assigned this Jan 23, 2018
@AmandaBirmingham
Collaborator

I suggest that a sensible terminology for this is "plate lineage".

@tanaes
Collaborator

tanaes commented Apr 11, 2018

Specific questions one will likely need to answer:

  • Given a sample, what plate does it appear in?
  • Given a library plate, what DNA plates did it pull from?
  • Given a sequencing run, what DNA plates did it pull from?
  • Given a Qiita study, what sequencing runs are associated with the study?
  • Given a DNA plate, was a library plate made from it?
  • Given a library plate, what are the 4 DNA plate barcodes associated? I need to get them out of a freezer

@tanaes
Collaborator

tanaes commented Apr 11, 2018

[Image: jucpgfv6]

A rough sketch from @ElDeveloper's and my online whiteboard session. The thinking is that if a plate ID is queried, all the incoming edges could be retrieved by searching the plates of origin of the incoming compositions, and so on up to the original plated samples; all the outgoing edges could similarly be retrieved, but you would not need to "turn the corner" and retrieve other incoming compositions to downstream plates.
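
To make the "don't turn the corner" rule concrete, here is a minimal Python sketch of the traversal at the plate level. It is an illustration only: the `CHILDREN` mapping and the plate names are invented, and the real edges would have to be derived from compositions in the database rather than stored as plate-to-plate links.

```python
from collections import deque

# Hypothetical plate-level edges: each key is a plate, each value lists the
# plates made directly from it. Plate names are invented for illustration.
CHILDREN = {
    "dna_1": ["library_1"],
    "dna_2": ["library_1"],
    "library_1": ["pool_1"],
}


def invert(edges):
    """Build the reverse mapping (child -> parents) from parent -> children."""
    parents = {}
    for parent, children in edges.items():
        for child in children:
            parents.setdefault(child, []).append(parent)
    return parents


def reachable(start, edges):
    """Breadth-first walk in one direction only; it never reverses direction."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def lineage(plate):
    """Return (ancestors, descendants) of a plate as two sets."""
    return reachable(plate, invert(CHILDREN)), reachable(plate, CHILDREN)
```

With this toy data, querying `dna_1` reports `library_1` and `pool_1` as descendants but not `dna_2`: reaching `dna_2` would mean walking down to `library_1` and then back up, which is the corner the walk deliberately does not turn.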

@AmandaBirmingham
Collaborator

AmandaBirmingham commented Apr 11, 2018

Just so this doesn't get lost in old email archives, here's the pseudocode I typed out back in January for what some of the necessary db queries would look like:

# Note: throughout this pseudocode, "subtype composition" means one of the specific composition types like sample_composition,
# gdna_composition, library_prep_shotgun_composition, etc.
# "subtype process" means one of the two *_process_data tables (gdna_extraction_process_data or library_prep_16s_process_data);
# if these tables are removed from the db schema, then the functionality of "Get info for processes" becomes a little simpler.

Get ancestor processes (plate id):

    get all of plate’s wells from well by plate_id
    for each well
        get its container_id from well
        get the latest_upstream_process_id for that container_id from container
        add latest_upstream_process_id to a list of unique process ids

        get the composition_id and composition_type_id for this upstream_process_id and container_id from composition
        look up the composition_id in the appropriate subtype composition table

        <recurse up through input compositions>:
        get all the input composition ids for the subtype composition (e.g., sample_composition_id is an input to gdna_composition, gdna_composition_id is an input to library_prep_16s_composition, etc.)
        for each of these input composition ids
            look up the input composition id in the appropriate subtype composition table (e.g., if the input is a gdna_composition_id, you look it up in gdna_composition)
            get the composition_id for that subtype composition
            get the upstream_process_id of this composition from the composition table by composition_id
            add process id to the list of unique process ids
            <recurse up through input compositions>

    # at the end of this, we have a list of all the process ids that have contributed to this plate
    # either directly or indirectly

    call Get info for processes with that list as input, and return results

Get descendant processes (plate id):

    get all of plate’s wells from well by plate_id
    for each well
        get its container_id from well
        get all the composition_ids for that container_id, and their associated composition_type_ids, from composition
        for each composition_id
            look up the composition_id in the relevant subtype composition table
            get the subtype composition id
            <recurse down through input compositions>:
            look up the subtype composition id in each of the subtype composition tables that takes that kind of id as an input (e.g., if you've got a gdna_composition_id, look it up in the gdna_composition_id field of normalized_gdna_composition and in the gdna_composition_id field of library_prep_16s_composition)
            for each record found
                get the subtype composition id and the composition_id for the subtype composition having the looked-up value as an input
                look up the composition_id in the composition table and get its upstream_process_id
                add that process id to the list of unique process ids
                <recurse down through input compositions>

    # at the end of this, we have a list of all the processes to which material from this plate was an input
    # either directly or indirectly

    call Get info for processes with that list as input, and return results
 

Get info for processes (process_ids_list):

    for each unique process id in process_ids_list in ascending order of run date
        look up process_id in process
        get the process_type_id for this process
        look up the process_id in the subtype process table
        if the subtype is NOT gdna_extraction or library_prep_16s
            gather all fields that aren't the primary key or the process_id foreign key--these are the settings
        else
            get the subtype process id
            get all the records in the *_process_data table for this process subtype having this subtype process id
            for each of them
                gather all fields that aren't the primary key--these are the settings

    # return the list of process/setting info, in ascending order of run date
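
For readers who want to see how the two recursions above could collapse into single queries, here is a hedged sketch using recursive common table expressions. It is not the LabControl schema: the tables are cut down to the columns the pseudocode mentions, `composition_input` is an invented table that flattens the per-subtype input columns into one row per link, and SQLite is used only so the example is self-contained.

```python
import sqlite3

# Hypothetical, heavily simplified schema. In particular, composition_input
# is an invented table that flattens the per-subtype input columns
# (e.g. gdna_composition.sample_composition_id) into one row per link.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE process (process_id INTEGER PRIMARY KEY, run_date TEXT);
CREATE TABLE well (container_id INTEGER PRIMARY KEY, plate_id INTEGER);
CREATE TABLE composition (
    composition_id INTEGER PRIMARY KEY,
    container_id INTEGER,
    upstream_process_id INTEGER
);
CREATE TABLE composition_input (
    composition_id INTEGER,
    input_composition_id INTEGER
);
-- Toy data: sample plate 10 -> gDNA plate 20 -> library plate 30.
INSERT INTO process VALUES (1, '2018-01-01'), (2, '2018-01-02'), (3, '2018-01-03');
INSERT INTO well VALUES (100, 10), (200, 20), (300, 30);
INSERT INTO composition VALUES (1, 100, 1), (2, 200, 2), (3, 300, 3);
INSERT INTO composition_input VALUES (2, 1), (3, 2);
""")

# Compositions sitting in the wells of the queried plate.
SEED = """
    SELECT c.composition_id
    FROM well w JOIN composition c ON c.container_id = w.container_id
    WHERE w.plate_id = ?
"""

# Processes behind the collected compositions, oldest first.
PROCESSES = """
    SELECT DISTINCT p.process_id, p.run_date
    FROM lineage l
    JOIN composition c ON c.composition_id = l.composition_id
    JOIN process p ON p.process_id = c.upstream_process_id
    ORDER BY p.run_date, p.process_id
"""

# Walk from the plate's own compositions up through their inputs.
ANCESTORS = f"""
WITH RECURSIVE lineage(composition_id) AS (
    {SEED}
    UNION
    SELECT ci.input_composition_id
    FROM composition_input ci
    JOIN lineage l ON ci.composition_id = l.composition_id
)
{PROCESSES}
"""

# Walk down to every composition that used the plate's material as input.
DESCENDANTS = f"""
WITH RECURSIVE seed(composition_id) AS ({SEED}),
lineage(composition_id) AS (
    SELECT ci.composition_id
    FROM composition_input ci
    JOIN seed s ON ci.input_composition_id = s.composition_id
    UNION
    SELECT ci.composition_id
    FROM composition_input ci
    JOIN lineage l ON ci.input_composition_id = l.composition_id
)
{PROCESSES}
"""


def ancestor_processes(conn, plate_id):
    """Process ids that contributed to the plate, including the one that made it."""
    return [row[0] for row in conn.execute(ANCESTORS, (plate_id,))]


def descendant_processes(conn, plate_id):
    """Process ids that consumed material from the plate, directly or indirectly."""
    return [row[0] for row in conn.execute(DESCENDANTS, (plate_id,))]
```

`UNION` (rather than `UNION ALL`) plays the role of the "list of unique process ids" in the pseudocode and also stops the recursion if the data ever contained a cycle. Gathering the per-process settings is left out because it depends on the subtype process tables.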
 

@tanaes
Collaborator

tanaes commented Apr 13, 2018

Awesome! What would it mean to remove the *_process_data tables?

also attn: @ElDeveloper

@AmandaBirmingham
Collaborator

@tanaes Actually, the *_process_data tables have already been removed (the email I copied was written before that fix, sorry), so the true SQL queries will be a little simpler than the ones above.

@tanaes
Collaborator

tanaes commented Apr 13, 2018 via email

@tanaes
Collaborator

tanaes commented Apr 27, 2018

@qiyunzhu @mortonjt A minimal solution to this issue would be to:

  1. take a plate ID input, e.g. from the Plate List interface
  2. identify all the Composition elements within that plate
  3. for each Composition:
     • follow it 'downstream' in the database to find all Composition objects which inherit that composition
     • identify all the plate IDs in which those downstream compositions were resident
     • return a table showing all of those plate IDs.
  4. repeat for 'upstream' compositions
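
The steps above can be sketched in a few lines of Python. This is a toy model, not LabControl code: `PLATE_OF` and `INPUTS` are hypothetical in-memory stand-ins for what would really be database lookups, and every ID in them is invented.

```python
# Hypothetical in-memory stand-ins for the database; all IDs are invented.
# PLATE_OF maps each composition to the plate it resides in.
PLATE_OF = {
    "s1": "sample_plate",
    "s2": "sample_plate",
    "g1": "gdna_plate",
    "g2": "gdna_plate",
    "l1": "library_plate",
}
# INPUTS maps each composition to the compositions it was made from.
INPUTS = {"g1": ["s1"], "g2": ["s2"], "l1": ["g1"]}


def _downstream_edges():
    """Invert INPUTS so each composition points at what was made from it."""
    edges = {}
    for comp, sources in INPUTS.items():
        for src in sources:
            edges.setdefault(src, []).append(comp)
    return edges


def _plates_reached(plate_id, edges):
    """Follow edges from every composition on the plate; return plates reached."""
    stack = [c for c, p in PLATE_OF.items() if p == plate_id]  # step 2
    seen = set()
    while stack:  # step 3: follow each composition along the edges
        comp = stack.pop()
        for nxt in edges.get(comp, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted({PLATE_OF[c] for c in seen})


def lineage_table(plate_id):
    """Return (direction, plate_id) rows for both directions (steps 3 and 4)."""
    rows = [("upstream", p) for p in _plates_reached(plate_id, INPUTS)]
    rows += [("downstream", p) for p in _plates_reached(plate_id, _downstream_edges())]
    return rows
```

For example, `lineage_table("gdna_plate")` yields one upstream row for `sample_plate` and one downstream row for `library_plate`.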

@charles-cowart
Collaborator

@AmandaBirmingham Wondering if the wet-lab needs this feature for full launch, or if it can be added later.

@AmandaBirmingham
Collaborator

@charles-cowart no one from the actual wet lab has ever asked me for it/about it, so I'm not convinced of its importance.

@AmandaBirmingham
Collaborator

Note, per discussion with @charles-cowart and @wasade today: if/when we implement this functionality (particularly the descent functionality), we would then have the necessary logic to check whether a plate can be renamed (a plate whose wells have no descendants can safely be renamed). However, @wasade points out that multiple users could be using LabControl at the same time, so descendants could be changing on the fly!
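
One way to address the concurrency concern is to run the descendant check and the rename inside the same write transaction, so no descendant can be added between the two. The sketch below only illustrates that idea under stated assumptions: the schema is invented (including the `plate.name` column and the flattened `composition_input` table), `rename_plate` is a hypothetical helper, and SQLite's `BEGIN IMMEDIATE` stands in for whatever locking the production database would actually need.

```python
import sqlite3

# isolation_level=None lets the code issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE plate (plate_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE well (container_id INTEGER PRIMARY KEY, plate_id INTEGER);
CREATE TABLE composition (composition_id INTEGER PRIMARY KEY, container_id INTEGER);
CREATE TABLE composition_input (composition_id INTEGER, input_composition_id INTEGER);
-- Toy data: plate 10 feeds plate 30; nothing has been made from plate 30.
INSERT INTO plate VALUES (10, 'samples'), (30, 'library');
INSERT INTO well VALUES (100, 10), (300, 30);
INSERT INTO composition VALUES (1, 100), (3, 300);
INSERT INTO composition_input VALUES (3, 1);
""")

# A direct descendant is enough: indirect descendants imply a direct one.
HAS_DESCENDANTS = """
    SELECT 1
    FROM well w
    JOIN composition c ON c.container_id = w.container_id
    JOIN composition_input ci ON ci.input_composition_id = c.composition_id
    WHERE w.plate_id = ?
    LIMIT 1
"""


def rename_plate(conn, plate_id, new_name):
    """Rename the plate only if nothing was made from its wells.

    Returns True if the rename happened, False if it was refused.
    """
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before checking
    try:
        has_descendants = (
            conn.execute(HAS_DESCENDANTS, (plate_id,)).fetchone() is not None
        )
        if not has_descendants:
            conn.execute(
                "UPDATE plate SET name = ? WHERE plate_id = ?",
                (new_name, plate_id),
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return not has_descendants
```

Checking first and renaming in a separate transaction would leave a window in which another user could create a descendant; holding the write lock across both steps closes that window in this toy setup.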
