-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor flashloader #329
Refactor flashloader #329
Conversation
Pull Request Test Coverage Report for Build 9733289561Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put in a few comments, but honestly this is way too large to properly review. Also, the three pull requests you opened seem to contain the same code changes in different stages of modification. I don't understand really how they relate, and which is supposed to change and contain what. I don't really see how I can review this in the current state.
When testing with the tutorial 4, I get the following error:
|
I have figured a way to make reviewing slightly easier. I will put all the classes in the loader file, and not have different modules. So at least the main branch and this are easy to compare against each other. I will address your issues in that commit |
…et accepts parquet_paths which is useful for loading only subset of files
… to appropriate directory
…lace with pulserSignAdc
@zain-sohail you are still working on this, can you let me know when you consider it done, so I can review a coverged version? |
There's work more on feature side than refactoring but I put some of those features here since I didn't wanna do it on older version. I'd suggest we merge this to V1 branch. Basically now:
Older changes:
|
Can you merge/rebase and update typing before I review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems mostly fine to me. The code for generating the dataframes is indeed apparently several times faster than the old one, but I don't completely understand it. Some more and fine-grained description would be helpful.
For future tracking, a benchmark of the buffer file generation would also be great.
I made some detailed commends and questions in-between.
One thing does not seem to work for me with the new version, the saving of h5:
saving data to binned.h5
saving data to binned.h5
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[54], line 1
----> 1 sp.save('binned.h5')
File /mnt/pcshare/users/Laurenz/AreaB/sed/sed/sed/core/processor.py:2503, in SedProcessor.save(self, faddr, **kwds)
2497 to_tiff(
2498 data=data,
2499 faddr=faddr,
2500 **kwds,
2501 )
2502 elif extension in (".h5", ".hdf5"):
-> 2503 to_h5(
2504 data=data,
2505 faddr=faddr,
2506 **kwds,
2507 )
2508 elif extension in (".nxs", ".nexus"):
2509 try:
File /mnt/pcshare/users/Laurenz/AreaB/sed/sed/sed/io/hdf5.py:128, in to_h5(data, faddr, mode)
122 if "metadata" in data.attrs and isinstance(
123 data.attrs["metadata"],
124 dict,
125 ):
126 meta_group = h5_file.create_group("metadata")
--> 128 recursive_write_metadata(meta_group, data.attrs["metadata"])
130 print("Saving complete!")
File /mnt/pcshare/users/Laurenz/AreaB/sed/sed/sed/io/hdf5.py:44, in recursive_write_metadata(h5group, node)
42 elif isinstance(item, dict):
43 group = h5group.create_group(key)
---> 44 recursive_write_metadata(group, item)
45 else:
46 try:
File /mnt/pcshare/users/Laurenz/AreaB/sed/sed/sed/io/hdf5.py:43, in recursive_write_metadata(h5group, node)
41 print(f"Saved {key} as string.")
42 elif isinstance(item, dict):
---> 43 group = h5group.create_group(key)
44 recursive_write_metadata(group, item)
45 else:
File /mnt/pcshare/users/Laurenz/AreaB/sed/poetry_envs/virtualenvs/sed-processor-3qnpZCFI-py3.9/lib/python3.9/site-packages/h5py/_hl/group.py:62, in Group.create_group(self, name, track_order)
59 track_order = h5.get_config().track_order
61 with phil:
---> 62 name, lcpl = self._e(name, lcpl=True)
63 gcpl = Group._gcpl_crt_order if track_order else None
64 gid = h5g.create(self.id, name, lcpl=lcpl, gcpl=gcpl)
File /mnt/pcshare/users/Laurenz/AreaB/sed/poetry_envs/virtualenvs/sed-processor-3qnpZCFI-py3.9/lib/python3.9/site-packages/h5py/_hl/base.py:206, in CommonStateObject._e(self, name, lcpl)
204 coding = h5t.CSET_UTF8
205 else:
--> 206 raise TypeError(f"A name should be string or bytes, not {type(name)}")
208 if lcpl:
209 return name, get_lcpl(coding)
TypeError: A name should be string or bytes, not <class 'int'>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM. Some small comments only left.
The time_stamps are tuples, so they are being saved as str. We could either convert them to to an array, or split them into two channels before saving.
Summary
The FlashLoader is modularized into 2 classes. The fel module contains these classes and can be reused by SXP (already done in #331) and the lab setup at DESY.
'fel' module
DataFrameCreator:
BufferFileHandler:
Tests should also be available.