How to find a run uid given an image file name? #20

prjemian · 2022-03-29T18:42:34Z

This is a high-priority question for the BDP project.

tacaswell · 2022-03-29T19:01:13Z

You would have to build the inverse lookup table. We have talked about writing the code to do this, but never have.

prjemian · 2022-03-29T19:14:15Z

For now, it sounds like a custom mongoquery might be the most efficient approach.

prjemian · 2022-03-29T20:03:58Z

An easy way to harvest this going forward is in the take_image() plan, which captures the run's uid and learns the HDF5 file name from the resource document:

bdp_controls/qserver/instrument/plans/image_acquisition.py

Lines 86 to 98 in bef79d1

    
    uid = yield from bp.count([adsimdet], md=_md)
    print(f"DIAGNOSTIC: {uid = }")
    try:
        # write the image file name to a PV
        run = cat[uid]
        print(f"DIAGNOSTIC: {run = }")
        r = run.primary._get_resources()[0]
        print(f"DIAGNOSTIC: {r = }")
        hdffile = pathlib.Path(r["root"]) / r["resource_path"]
        print(f"DIAGNOSTIC: {hdffile = },  {hdffile.exists()=}")
        logger.info("Image file '%s' (exists: %s)", hdffile, hdffile.exists())
        yield from bps.mv(image_file_created, str(hdffile))

This is how we can capture the information for future runs. Suggestions from the bluesky developers on Slack for where to save it include:

redis key:value database (already included with queueserver installation)
new collection in mongodb server
local file

Of these, a local text file seems the easiest to implement.

prjemian · 2022-03-29T20:06:16Z

The text file could be structured, for example as YAML, so that new entries are fast to append and the file is easy to load in Python:

In [30]: import yaml

In [31]: s = """
    ...: a: 1
    ...: b: 2
    ...: """

In [33]: yaml.load(s, yaml.Loader)
Out[33]: {'a': 1, 'b': 2}

prjemian · 2022-03-29T20:11:46Z

Given an HDF5 file name /tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5 from run uid=155d3536-f225-4c17-852a-6367792830f4, the entry would be:

a4700b27-2666-44cf-a86f_000: 155d3536-f225-4c17-852a-6367792830f4

This assumes that each HDF5 file appears in only one run. If we further assume that both identifiers are truly unique uuid codes, we can record the swapped pair as well, so that a search with either the run uid or the HDF5 file base name finds the other:

a4700b27-2666-44cf-a86f_000: 155d3536-f225-4c17-852a-6367792830f4
155d3536-f225-4c17-852a-6367792830f4: a4700b27-2666-44cf-a86f_000
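
A minimal sketch of how such a file could be appended to and read back, assuming one flat `key: value` pair per line as in the entries above. The function names `record_xref()` and `load_xref()` and the line-splitting parser are illustrative assumptions, not the code used by take_image(); the same file could instead be loaded with yaml as shown earlier.

```python
import pathlib


def record_xref(xref_file, run_uid, hdf5_file):
    """Append both directions of the (file base name, run uid) pair."""
    base = pathlib.Path(hdf5_file).stem  # drop directories and ".h5"
    with open(xref_file, "a") as f:
        f.write(f"{base}: {run_uid}\n")
        f.write(f"{run_uid}: {base}\n")


def load_xref(xref_file):
    """Read the cross-reference file into a dict, skipping comments."""
    xref = {}
    with open(xref_file) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # blank line or comment header
            key, value = line.split(": ", 1)
            xref[key] = value
    return xref
```

With both directions recorded, `load_xref(...)[name]` works whether `name` is a run uid or an HDF5 file base name. Appending is cheap because nothing already in the file has to be parsed or rewritten.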

prjemian · 2022-03-29T20:17:46Z

If proceeding with a mongoquery, see: https://docs.mongodb.com/manual/reference/operator/query/

prjemian · 2022-03-29T20:20:47Z

In [27]: dl = list(cat.v1[-1].documents())

In [28]: dl[2]
Out[28]: 
('resource',
 {'spec': 'AD_HDF5',
  'root': '/',
  'resource_path': 'tmp/docker_ioc/iocbdpad/tmp/adsimdet/2022/03/29/a4700b27-2666-44cf-a86f_000.h5',
  'resource_kwargs': {'frame_per_point': 1},
  'path_semantics': 'posix',
  'uid': '51d30cff-4580-4dda-a58a-2e05ea724886',
  'run_start': '155d3536-f225-4c17-852a-6367792830f4'})

In [29]: dl[3]
Out[29]: 
('datum',
 {'datum_id': '51d30cff-4580-4dda-a58a-2e05ea724886/0',
  'datum_kwargs': {'point_number': 0},
  'resource': '51d30cff-4580-4dda-a58a-2e05ea724886'})
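
The resource document above already carries both halves of the pair (`resource_path` and `run_start`), so an inverse lookup table could be built from the documents of existing runs. This is only a sketch: `build_file_lookup()` is a hypothetical helper, and it assumes every resource document has a `run_start` key and posix path semantics, as the one shown here does.

```python
import pathlib


def build_file_lookup(documents):
    """Map each resource file path to the uid of the run that wrote it.

    `documents` is an iterable of (name, doc) pairs, such as the list
    returned by cat.v1[-1].documents() above.
    """
    lookup = {}
    for name, doc in documents:
        if name == "resource":
            # Join root and resource_path the same way take_image() does.
            path = pathlib.PurePosixPath(doc["root"]) / doc["resource_path"]
            lookup[str(path)] = doc["run_start"]
    return lookup
```

For example, `build_file_lookup(dl)` would cover the single run loaded above. Covering a whole catalog this way means reading the documents of every run, which may be slow for large catalogs.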

prjemian · 2022-03-29T20:38:18Z

Fill out the mongoquery search dictionary {} here:

In [51]: from apstools.utils import db_query

In [52]: db_query(cat, {})
Out[52]: bdp2022:
  args:
    asset_registry_db: mongodb://dbbluesky4.xray.aps.anl.gov:27017/bdp2022-bluesky
    metadatastore_db: mongodb://dbbluesky4.xray.aps.anl.gov:27017/bdp2022-bluesky
    name: bdp2022
  description: ''
  driver: databroker._drivers.mongo_normalized.BlueskyMongoCatalog
  metadata:
    catalog_dir: /home/beams/JEMIAN/.local/share/intake/
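
One possible shape for that search dictionary, as a sketch only. The file name lives in the resource document rather than in the run's start metadata, so this assumes the query would be run against the resource documents directly; whether db_query() can reach them is not established in this thread. `resource_query()` is a hypothetical helper; `$regex` is a standard MongoDB query operator.

```python
import re


def resource_query(file_base_name):
    """Build a MongoDB query matching resource documents by file name."""
    # re.escape() keeps any special characters in the file name from
    # being read as regular expression syntax.
    return {"resource_path": {"$regex": re.escape(file_base_name)}}


query = resource_query("a4700b27-2666-44cf-a86f_000")

# Hypothetical use with pymongo (assumes direct access to the database
# and a collection named "resource"; neither is confirmed in this thread):
#   doc = client["bdp2022-bluesky"]["resource"].find_one(query)
#   run_uid = doc["run_start"]
```

A regular expression match is used because `resource_path` holds the full relative path, while the search usually starts from just the file base name.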

prjemian · 2022-03-29T21:04:39Z

Example of the YAML file written by the take_image() plan:

(bdp2022) jemian@wow ~/.../bdp_controls/qserver $ tail -f xref_image_run.yml 
# file: xref_image_run.yml
# created: 2022-03-29 16:03:06.119378
# purpose: cross-reference bluesky run uid and HDF5 file name

00714a91-c33e-4e7b-90fd-2e8f385bebc9: add9e2d0-7f20-419d-a6a8_000
add9e2d0-7f20-419d-a6a8_000: 00714a91-c33e-4e7b-90fd-2e8f385bebc9
c96b08be-bf17-4623-9ee7-062effddbde9: 32b8278b-eded-42c1-85e2_000
32b8278b-eded-42c1-85e2_000: c96b08be-bf17-4623-9ee7-062effddbde9

prjemian added the question (Further information is requested) label Mar 29, 2022

prjemian added this to the BDP M3 milestone Mar 29, 2022

prjemian self-assigned this Mar 29, 2022

prjemian added this to BDP controls preparation Mar 29, 2022

prjemian added a commit that referenced this issue Mar 29, 2022

MNT #20 ignore new xref file

e3e3c18

prjemian added a commit that referenced this issue Mar 29, 2022

MNT #20 record new xref file

75c2b6d

prjemian moved this to In Review in BDP controls preparation Mar 29, 2022

prjemian mentioned this issue Mar 29, 2022

implement bluesky-queueserver #14

Merged

prjemian closed this as completed in #14 Mar 29, 2022

Repository owner moved this from In Review to Done in BDP controls preparation Mar 29, 2022

prjemian mentioned this issue Sep 11, 2024

DOC: How to search for data? BCDA-APS/bluesky_training#320

Open
