How to find a run uid given an image file name? #20

Closed
prjemian opened this issue Mar 29, 2022 · 9 comments · Fixed by #14

@prjemian
Contributor

This is a high priority question for the BDP project.

@prjemian prjemian added the question Further information is requested label Mar 29, 2022
@prjemian prjemian added this to the BDP M3 milestone Mar 29, 2022
@prjemian prjemian self-assigned this Mar 29, 2022
@tacaswell

You would have to build the inverse lookup table. We have talked about writing the code to do this, but never have done so.

@prjemian
Contributor Author

For now, it sounds like a custom mongoquery might be the most efficient approach.

@prjemian
Contributor Author

An extremely easy harvest, made at the time of acquisition, comes from the take_image() plan, which captures the run's uid and learns the HDF5 file name from the resource document:

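uid = yield from bp.count([adsimdet], md=_md)
print(f"DIAGNOSTIC: {uid = }")
try:
    # write the image file name to a PV
    run = cat[uid]
    print(f"DIAGNOSTIC: {run = }")
    # the first resource document of the primary stream describes the HDF5 file
    r = run.primary._get_resources()[0]
    print(f"DIAGNOSTIC: {r = }")
    hdffile = pathlib.Path(r["root"]) / r["resource_path"]
    print(f"DIAGNOSTIC: {hdffile = }, {hdffile.exists()=}")
    logger.info("Image file '%s' (exists: %s)", hdffile, hdffile.exists())
    yield from bps.mv(image_file_created, str(hdffile))
except Exception as exc:
    # the excerpt ends before its except clause; logging the failure here is an assumption
    logger.warning("Could not determine the image file name: %s", exc)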

This is how we can capture this information going forward. Suggestions from the bluesky developers on Slack for where to save it include:

  • redis key:value database (already included with queueserver installation)
  • new collection in mongodb server
  • local file

Of these, a local text file seems the easiest.

@prjemian
Contributor Author

The text file could be structured, for example as YAML, which makes it fast to append new entries and easy to load in Python:

In [30]: import yaml

In [31]: s = """
    ...: a: 1
    ...: b: 2
    ...: """

In [33]: yaml.load(s, yaml.Loader)
Out[33]: {'a': 1, 'b': 2}

@prjemian
Contributor Author

Given an HDF5 file name /tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5 from run uid=155d3536-f225-4c17-852a-6367792830f4, the entry would be:

a4700b27-2666-44cf-a86f_000: 155d3536-f225-4c17-852a-6367792830f4

We assume here that each HDF5 file appears in only one run. If we further assume that both identifiers are truly unique uuid codes, then we can record the swapped pair as well. A search given either the run uid or the HDF5 file base name then finds the other one:

a4700b27-2666-44cf-a86f_000: 155d3536-f225-4c17-852a-6367792830f4
155d3536-f225-4c17-852a-6367792830f4: a4700b27-2666-44cf-a86f_000
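Appending such a pair could look roughly like the sketch below. The function name append_xref and the file name xref.yml are made up for illustration, and only the standard library is used; this is a sketch under those assumptions, not the code used in the take_image() plan.

```python
import os
import pathlib
import tempfile


def append_xref(xref_file, run_uid, hdf_file):
    """Append both directions of the (file base name, run uid) pair.

    Each entry is one flat 'key: value' line, so appending never
    requires re-reading or rewriting the existing file.
    """
    # base name without directory or the .h5 extension
    stem = pathlib.PurePosixPath(str(hdf_file)).stem
    with open(xref_file, "a") as f:
        f.write(f"{stem}: {run_uid}\n")
        f.write(f"{run_uid}: {stem}\n")
    return stem


# demonstration in a temporary directory (hypothetical file name)
xref_path = os.path.join(tempfile.mkdtemp(), "xref.yml")
stem = append_xref(
    xref_path,
    "155d3536-f225-4c17-852a-6367792830f4",
    "/tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5",
)
with open(xref_path) as f:
    written = f.read().splitlines()
print(written)
```

Opening the file in append mode keeps each call cheap, which matters if the plan runs once per image.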

@prjemian
Contributor Author

If proceeding with a mongoquery, see: https://docs.mongodb.com/manual/reference/operator/query/

@prjemian
Contributor Author

In [27]: dl = list(cat.v1[-1].documents())

In [28]: dl[2]
Out[28]: 
('resource',
 {'spec': 'AD_HDF5',
  'root': '/',
  'resource_path': 'tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5',
  'resource_kwargs': {'frame_per_point': 1},
  'path_semantics': 'posix',
  'uid': '51d30cff-4580-4dda-a58a-2e05ea724886',
  'run_start': '155d3536-f225-4c17-852a-6367792830f4'})

In [29]: dl[3]
Out[29]: 
('datum',
 {'datum_id': '51d30cff-4580-4dda-a58a-2e05ea724886/0',
  'datum_kwargs': {'point_number': 0},
  'resource': '51d30cff-4580-4dda-a58a-2e05ea724886'})
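The resource document above carries both the file path and the run_start uid, so an inverse lookup table could be built from documents like these. A minimal sketch, assuming the (name, doc) pairs come from something like cat.v1[...].documents() as shown; the function name build_inverse_lookup is hypothetical and the sample below is copied from the output above rather than read from a catalog.

```python
import pathlib


def build_inverse_lookup(documents):
    """Map HDF5 file base name -> run uid from (name, doc) pairs."""
    lookup = {}
    for name, doc in documents:
        if name != "resource":
            continue  # only resource documents name a file
        # resource_path uses posix semantics; drop directories and extension
        stem = pathlib.PurePosixPath(doc["resource_path"]).stem
        lookup[stem] = doc["run_start"]
    return lookup


# sample documents, abbreviated from the output shown above
sample = [
    (
        "resource",
        {
            "spec": "AD_HDF5",
            "root": "/",
            "resource_path": "tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5",
            "uid": "51d30cff-4580-4dda-a58a-2e05ea724886",
            "run_start": "155d3536-f225-4c17-852a-6367792830f4",
        },
    ),
    (
        "datum",
        {
            "datum_id": "51d30cff-4580-4dda-a58a-2e05ea724886/0",
            "datum_kwargs": {"point_number": 0},
            "resource": "51d30cff-4580-4dda-a58a-2e05ea724886",
        },
    ),
]

table = build_inverse_lookup(sample)
print(table)
```

Walking every run in a catalog this way would be slow for a large database, which is why recording the pair at acquisition time looks attractive.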

@prjemian
Contributor Author

Fill out the mongoquery search dictionary {} here:

In [51]: from apstools.utils import db_query

In [52]: db_query(cat, {})
Out[52]: bdp2022:
  args:
    asset_registry_db: mongodb://dbbluesky4.xray.aps.anl.gov:27017/bdp2022-bluesky
    metadatastore_db: mongodb://dbbluesky4.xray.aps.anl.gov:27017/bdp2022-bluesky
    name: bdp2022
  description: ''
  driver: databroker._drivers.mongo_normalized.BlueskyMongoCatalog
  metadata:
    catalog_dir: /home/beams/JEMIAN/.local/share/intake/

@prjemian
Contributor Author

Example of the YAML file written by the take_image() plan:

(bdp2022) jemian@wow ~/.../bdp_controls/qserver $ tail -f xref_image_run.yml 
# file: xref_image_run.yml
# created: 2022-03-29 16:03:06.119378
# purpose: cross-reference bluesky run uid and HDF5 file name

00714a91-c33e-4e7b-90fd-2e8f385bebc9: add9e2d0-7f20-419d-a6a8_000
add9e2d0-7f20-419d-a6a8_000: 00714a91-c33e-4e7b-90fd-2e8f385bebc9
c96b08be-bf17-4623-9ee7-062effddbde9: 32b8278b-eded-42c1-85e2_000
32b8278b-eded-42c1-85e2_000: c96b08be-bf17-4623-9ee7-062effddbde9
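Reading the file back for a lookup in either direction might look like the sketch below. The function name load_xref is hypothetical, and the parser handles only the flat 'key: value' subset shown here; PyYAML's yaml.safe_load should read the same file if that package is available.

```python
def load_xref(text):
    """Parse flat 'key: value' lines into a dict, skipping comments and blanks."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(": ")
        table[key] = value
    return table


# contents copied from the file listing above
sample = """\
# file: xref_image_run.yml
# created: 2022-03-29 16:03:06.119378
# purpose: cross-reference bluesky run uid and HDF5 file name

00714a91-c33e-4e7b-90fd-2e8f385bebc9: add9e2d0-7f20-419d-a6a8_000
add9e2d0-7f20-419d-a6a8_000: 00714a91-c33e-4e7b-90fd-2e8f385bebc9
c96b08be-bf17-4623-9ee7-062effddbde9: 32b8278b-eded-42c1-85e2_000
32b8278b-eded-42c1-85e2_000: c96b08be-bf17-4623-9ee7-062effddbde9
"""

xref = load_xref(sample)
print(xref["add9e2d0-7f20-419d-a6a8_000"])  # file base name -> run uid
print(xref["c96b08be-bf17-4623-9ee7-062effddbde9"])  # run uid -> file base name
```

Because both directions are stored as ordinary keys, a single dictionary lookup serves either question.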

prjemian added a commit that referenced this issue Mar 29, 2022
prjemian added a commit that referenced this issue Mar 29, 2022
@prjemian prjemian moved this to In Review in BDP controls preparation Mar 29, 2022
Repository owner moved this from In Review to Done in BDP controls preparation Mar 29, 2022