
Support records for theoretical calculations associated with experimental papers #58

Closed
GraemeWatt opened this issue Jul 4, 2016 · 6 comments · Fixed by #683

@GraemeWatt (Member)

Various theoretical frameworks provide multiple analyses/code/predictions that are each closely tied to an experimental paper (which might have its own HEPData record). Some examples are Rivet, MadAnalysis5, fastNLO and APPLgrid. A normal HEPData record can be used to store these results, and the normal Coordinator-Uploader-Reviewer workflow can be followed, where the Coordinator would be a senior person responsible for a particular theoretical framework.

But instead of prompting the Coordinator for a single Inspire ID (of the experimental paper), there should be an option "Theoretical analysis associated with experimental paper", where the Coordinator would enter two Inspire IDs: one for the experimental paper and one for the theoretical paper describing the framework. The HEPData record of the analysis would then be linked both to the Inspire (and HEPData, if it exists) record of the experimental paper and to the Inspire (and HEPData, if it exists) record of the theoretical paper.

A possible HEPData record of the theoretical paper might contain core code that is independent of particular experimental results. Each theoretical paper could be associated with multiple experimental analyses, and each experimental paper could be associated with multiple theoretical analyses.

@eamonnmag (Contributor)

Nice! I think that can work nicely.

@GraemeWatt (Member Author) commented Jul 4, 2016

Some comments from Klaus Rabbertz (fastNLO author) on this issue in July 2015:

> A point that is still worth considering is how to add and reference new theory tables to existing experimental data. For example, we might provide updated tables including additional features like access to electroweak corrections. Or theory colleagues might want to provide tables for new NNLO predictions that didn't even exist at the time the data were published.

At the moment (in the old HepData) I have added fastNLO tables to each of the corresponding HEPData records (for the experimental publications). But it would be better to do this in a more elegant way: give the theorists control over uploading their theoretical analyses and separate this information from the experimental data. The theoretical analyses might not be endorsed by the experimental collaborations, so they should not appear together with the experimental data, but should be linked from it.

@eamonnmag (Contributor)

Would also attach the code to these records.

@GraemeWatt (Member Author)

After discussion at this morning's IPPP workshop we decided to revive this issue following a slightly different approach to allow linking at the level of data tables rather than whole records. Similar to the proposal for linking error matrices to measurements in #140, the input submission.yaml file could contain a field like:

```yaml
related_to_table_dois: [10.17182/hepdata.34567.v1/t1, 10.17182/hepdata.89012.v1/t2]
```

A field like related_tables would be added to the DataSubmission object. The process_data_file function would persist the new field from the input YAML file to the database.
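As a rough sketch of that data flow (field and function names are taken from this comment, but the stand-in class is hypothetical; the real DataSubmission is an SQLAlchemy model):

```python
# Minimal sketch of the proposed change: process_data_file() copies the
# optional related_to_table_dois list from the parsed table YAML into a new
# related_tables field on the DataSubmission object. The class below is a
# hypothetical stand-in, not the real database model.

class DataSubmission:
    def __init__(self):
        self.related_tables = []  # proposed new field


def process_data_file(datasubmission, table_yaml):
    """Persist the new field from the input YAML (default: empty list)."""
    datasubmission.related_tables = table_yaml.get("related_to_table_dois", [])


datasub = DataSubmission()
process_data_file(datasub, {
    "related_to_table_dois": ["10.17182/hepdata.34567.v1/t1",
                              "10.17182/hepdata.89012.v1/t2"],
})
print(datasub.related_tables[0])  # → 10.17182/hepdata.34567.v1/t1
```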

When rendering a data table, the get_table_details function would retrieve related_tables from the database:

```python
table_contents["related_tables"] = datasub_record.related_tables
```

The hepdata_tables.js and table_details.html files would be modified to render the list of related tables, with the DOIs as links, e.g. "This table is related to: 10.17182/hepdata.34567.v1/t1, 10.17182/hepdata.89012.v1/t2".

A database query should also be made to find entries where the DOI of the current table matches an item in the related_tables field of other DataSubmission objects, restricted to those whose corresponding HEPSubmission object has overall_status='finished'. In that case, the matching DOIs could again be rendered as links, e.g. "This table is referred to by: 10.17182/hepdata.12345.v1/t3".
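A pure-Python sketch of that reverse lookup (hypothetical in-memory stand-ins for the database tables; in production this would be a database query joining DataSubmission and HEPSubmission):

```python
# Hypothetical sketch of the reverse query: find tables that list the
# current table's DOI in their related_tables, keeping only those whose
# parent submission has overall_status='finished'.

def tables_referring_to(current_doi, data_submissions, submission_status):
    """Return DOIs of tables that link to current_doi from finished records."""
    return [
        sub["doi"]
        for sub in data_submissions
        if current_doi in sub.get("related_tables", [])
        and submission_status.get(sub["publication_recid"]) == "finished"
    ]


subs = [
    {"doi": "10.17182/hepdata.12345.v1/t3",
     "publication_recid": 12345,
     "related_tables": ["10.17182/hepdata.34567.v1/t1"]},
    {"doi": "10.17182/hepdata.99999.v1/t1",
     "publication_recid": 99999,  # still in review: must be excluded
     "related_tables": ["10.17182/hepdata.34567.v1/t1"]},
]
status = {12345: "finished", 99999: "todo"}
print(tables_referring_to("10.17182/hepdata.34567.v1/t1", subs, status))
# → ['10.17182/hepdata.12345.v1/t3']
```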

It's probably not necessary to add information on related tables to the OpenSearch index at this stage. The submission_schema.json file of the hepdata-validator package would need to be modified to add the new related_to_table_dois field.

To allow for linking between whole HEPData records as well as individual tables (as was the original idea), a similar field named related_to_hepdata_recids could be added to the first document of the submission.yaml file, with a corresponding field (e.g. related_recids) on the HEPSubmission object of the database model. Again, bidirectional links could be created between the two related records.

@GraemeWatt (Member Author)

@ItIsJordan: following our discussion on Tuesday about the best database model for this feature, instead of adding a new field like related_tables to the existing DataSubmission object, I was thinking that it might be better to create a new object RelatedTable (__tablename__ = "relatedtable") with fields id, table_doi and related_to_table_doi. This would simplify the bidirectional linking, i.e. instead of searching the related_tables field of all DataSubmission objects, you would query the related_to_table_doi field of the (few) RelatedTable objects.

Similarly, instead of adding a new field related_recids to the existing HEPSubmission object, you would create a new object RelatedRecids (__tablename__ = "relatedrecids") with fields id, recid and related_to_recid.

The names of the new objects and their fields should be chosen to fit into the existing database model.
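A sketch of the proposed objects as plain dataclasses (the real implementation would be SQLAlchemy models with these columns; names as proposed above):

```python
# Hypothetical stand-ins for the proposed database objects. In the real
# code base these would be SQLAlchemy models; dataclasses are used here
# only to illustrate the columns and the simplified reverse lookup.
from dataclasses import dataclass


@dataclass
class RelatedTable:            # __tablename__ = "relatedtable"
    id: int
    table_doi: str
    related_to_table_doi: str


@dataclass
class RelatedRecids:           # __tablename__ = "relatedrecids"
    id: int
    recid: int
    related_to_recid: int


def referring_table_dois(doi, related_tables):
    """Bidirectional lookup: scan the (few) RelatedTable rows whose
    related_to_table_doi matches the DOI of the current table."""
    return [r.table_doi for r in related_tables
            if r.related_to_table_doi == doi]


rows = [RelatedTable(1, "10.17182/hepdata.12345.v1/t3",
                     "10.17182/hepdata.34567.v1/t1")]
print(referring_table_dois("10.17182/hepdata.34567.v1/t1", rows))
# → ['10.17182/hepdata.12345.v1/t3']
```

This keeps the forward links (what a table declares) and the reverse links (who points at this table) in one small table, instead of scanning a list column on every DataSubmission.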

@GraemeWatt (Member Author) commented May 19, 2023

Summarising the tasks needed to complete this feature:

  • Update the hepdata-validator package to support new related_to_table_dois and related_to_hepdata_recids fields (Modify JSON schema to support bidirectional linking hepdata-validator#50). Initially, modifications can be made on a new branch, but a new release of the hepdata-validator package should be made after everything is working.
  • Extend the database model to add new tables to support the bidirectional linking. Check if a migration will be required for the production database or if the new tables will be created automatically.
  • Extend the submission code to persist the new fields from the input submission.yaml file to the database. This probably only needs to be done for normal records (not in the Sandbox). DOIs are not assigned for Sandbox records and bidirectional linking will not be required, so there is no need to store information in the new database tables for Sandbox records.
  • Extend the Python/JavaScript/HTML code to extract information on related data tables or records from the database and render it on the web pages.
  • Check that deletion is working as expected, i.e. if a submission is made with the new fields, followed by another upload with different fields (or none), check that the original fields are deleted from the new database tables. Also check that deletion of a record removes the relevant information on related data tables or records.
  • Extend the tests to automatically test the new functionality, i.e. that the new fields are persisted to the database and that they appear rendered on the web pages.
  • Extend the submission documentation to mention the new fields (Explain how to use bidirectional linking hepdata-submission#13).
  • Probably also need to extend the hepdata_lib package to provide methods to write the new fields.
  • Check that the hepdata-converter package can convert from YAML to other formats (CSV, ROOT, YODA) if the new fields are present. Modifications might be needed to write the new fields in the various output formats.

@ItIsJordan, this is my understanding of what's needed after our discussion yesterday, but feel free to edit or add to these tasks.
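For reference, a submission.yaml fragment using both proposed fields might look like this (DOIs, record IDs and table metadata are illustrative):

```yaml
# First document of submission.yaml: record-level links
related_to_hepdata_recids: [12345]
---
# Per-table document: table-level links
name: "Table 1"
description: "NNLO prediction compared to the measured cross section."
data_file: data1.yaml
related_to_table_dois: [10.17182/hepdata.34567.v1/t1]
```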

@GraemeWatt GraemeWatt moved this from To do to In Progress in @HEPData Jun 28, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in @HEPData Aug 18, 2023