Add a backend for rucio #300
Changes from 3 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
import json | ||
import hashlib | ||
import os.path as osp | ||
|
||
import strax | ||
from strax.storage.files import dirname_to_prefix | ||
|
||
export, __all__ = strax.exporter() | ||
|
||
|
||
@export | ||
class rucio(strax.StorageBackend): | ||
"""Get data from a rucio directory | ||
""" | ||
|
||
def get_metadata(self, dirname, **kwargs): | ||
dirname = str(dirname) | ||
prefix = dirname_to_prefix(dirname) | ||
metadata_json = f'{prefix}-metadata.json' | ||
fn = rucio_path(metadata_json, dirname) | ||
with open(fn, mode='r') as f: | ||
return json.loads(f.read()) | ||
|
||
def _read_chunk(self, dirname, chunk_info, dtype, compressor): | ||
#print('yes') | ||
fn = rucio_path(chunk_info['filename'], dirname) | ||
return strax.load_file(fn, dtype=dtype, compressor=compressor) | ||
|
||
def _saver(self, dirname, metadata): | ||
raise NotImplementedError( | ||
"Cannot save directly into rucio, upload with admix instead") | ||
|
||
|
||
def rucio_path(filename, dirname): | ||
root_path ='/dali/lgrandi/rucio' | ||
scope = "xnt_"+dirname.split('-')[0] | ||
rucio_did = "{0}:{1}".format(scope,filename) | ||
rucio_md5 = hashlib.md5(rucio_did.encode('utf-8')).hexdigest() | ||
t1 = rucio_md5[0:2] | ||
t2 = rucio_md5[2:4] | ||
return osp.join(root_path,scope,t1,t2,filename) |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,41 @@ | ||||||
import json | ||||||
import hashlib | ||||||
import os.path as osp | ||||||
|
||||||
import strax | ||||||
from strax.storage.files import dirname_to_prefix | ||||||
|
||||||
export, __all__ = strax.exporter() | ||||||
|
||||||
|
||||||
@export | ||||||
class rucio(strax.StorageBackend): | ||||||
"""Get data from a rucio directory | ||||||
""" | ||||||
|
||||||
def get_metadata(self, dirname, **kwargs): | ||||||
dirname = str(dirname) | ||||||
prefix = dirname_to_prefix(dirname) | ||||||
metadata_json = f'{prefix}-metadata.json' | ||||||
fn = rucio_path(metadata_json, dirname) | ||||||
Can you raise the same error as here if the metadata is not available? Strax may rely on this later:
Yes. Furthermore, I got much more output with this error than without 😅
Alright, thanks for adding it. |
||||||
with open(fn, mode='r') as f: | ||||||
return json.loads(f.read()) | ||||||
|
||||||
def _read_chunk(self, dirname, chunk_info, dtype, compressor): | ||||||
#print('yes') | ||||||
I think you forgot to remove this line ;)
Sorry 😅 |
||||||
fn = rucio_path(chunk_info['filename'], dirname) | ||||||
return strax.load_file(fn, dtype=dtype, compressor=compressor) | ||||||
|
||||||
def _saver(self, dirname, metadata): | ||||||
raise NotImplementedError( | ||||||
"Cannot save directly into rucio, upload with admix instead") | ||||||
|
||||||
|
||||||
def rucio_path(filename, dirname): | ||||||
This bit is a bit hard to follow without comments (sorry, I'm not an expert in the rucio naming convention). What do the paths look like, and are we sure it's always these hard-coded conversions?
The paths look like '/dali/lgrandi/rucio/xnt_008710/1f/fb/peaklets-b7dgmtzaef-000047'. It is a hard-coded convention in the Rucio code; you can see it here: https://github.com/rucio/rucio/blob/671fe6253981eb632aae3c9ddfe54eb83e63fd1a/lib/rucio/rse/protocols/protocol.py#L114 |
||||||
root_path ='/dali/lgrandi/rucio' | ||||||
We shouldn't hard-code it like this, as this might change to e.g. '/dali/lgrandi/xenonnt/rucio'. I guess you want to get the info from the dirname.
Unfortunately, dirname does not include the root path 😕
Hmm, okay. So how about this: you pass it on at initialization:
You would also have to change these lines:
Line 15 in 6404ae5
Line 25 in 6404ae5
If you want some help, I can also commit this idea to prevent hard-coding? |
||||||
scope = "xnt_"+dirname.split('-')[0] | ||||||
rucio_did = "{0}:{1}".format(scope,filename) | ||||||
rucio_md5 = hashlib.md5(rucio_did.encode('utf-8')).hexdigest() | ||||||
t1 = rucio_md5[0:2] | ||||||
t2 = rucio_md5[2:4] | ||||||
return osp.join(root_path,scope,t1,t2,filename) |
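For readers unfamiliar with the convention discussed in the review, the deterministic path mapping can be sketched as below. This is a minimal sketch, not the code that was merged: the `root_dir` parameter is an assumption that follows the reviewer's suggestion to pass the root path in rather than hard-code it, and the example `dirname` is illustrative. The `xnt_` scope prefix and the two MD5-derived subdirectories are taken from the diff above.

```python
import hashlib
import os.path as osp


def rucio_path(root_dir, filename, dirname):
    """Return the on-disk path of a file under the deterministic layout
    <root_dir>/<scope>/<md5[0:2]>/<md5[2:4]>/<filename>.

    root_dir is passed in (an assumption, following the review
    discussion) instead of being hard-coded.
    """
    # The scope is "xnt_" plus the part of dirname before the first dash
    scope = "xnt_" + dirname.split('-')[0]
    # The rucio data identifier (DID) is "<scope>:<filename>"
    rucio_did = f"{scope}:{filename}"
    # The first four hex digits of the DID's MD5 select the two subdirectories
    rucio_md5 = hashlib.md5(rucio_did.encode('utf-8')).hexdigest()
    return osp.join(root_dir, scope, rucio_md5[0:2], rucio_md5[2:4], filename)


# Illustrative call; the dirname format here is an assumption
print(rucio_path('/dali/lgrandi/rucio',
                 'peaklets-b7dgmtzaef-000047',
                 '008710-peaklets-b7dgmtzaef'))
```

With the root passed as an argument, a move to another location such as '/dali/lgrandi/xenonnt/rucio' would only require changing the value supplied by the caller.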
you accidentally uploaded this file twice. Can you remove this one?
Done, thanks for the remark
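On the missing-metadata request in the review of `get_metadata`: one way to fail loudly when the metadata file is absent might look like the sketch below. The exception the reviewer linked to is not reproduced on this page, so `MetadataNotAvailable` is a hypothetical stand-in for it, and `load_metadata` and `md_path` are illustrative names rather than part of strax.

```python
import json
import os.path as osp


class MetadataNotAvailable(Exception):
    """Hypothetical stand-in for the strax exception linked in the review."""


def load_metadata(md_path):
    """Load a metadata JSON file, raising a dedicated error if it is absent."""
    if not osp.exists(md_path):
        # Raise a specific error so callers can tell "no data here"
        # apart from other I/O failures
        raise MetadataNotAvailable(f"No metadata found at {md_path}")
    with open(md_path, mode='r') as f:
        return json.load(f)
```

Raising a dedicated exception rather than letting `open` fail with `FileNotFoundError` lets calling code catch exactly this case, which appears to be what the reviewer meant by strax relying on it later.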