-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor with recent changes #36
Conversation
This function was in both Filterbank and Waterfall classes, but was slightly different. The waterfall invocation was marginally better, albeit pretty much identical. This has been moved to Filterbank only, the Waterfall class will still have access through inheritance.
This functionality can be done simply with np.array.size, or np.prod(np.array.shape)
EE: Please check this does what you think it should.
(CamelCase for classes in PEP8)
Also made read_header return the parsed dictionary, for consistency with FilReader.read_header() Now both H5 and Fil readers return a dictionary, which is useful.
Other methods such as _setup_freqs and _setup_chans
At top setting plt.rcParams['axes.formatter.useoffset'] = False instead of calling ax.get_xaxis().get_major_formatter().set_useOffset(False) This means fewer lines of code, but should be more resilient to matplotlib versions that don't support this API.
plot_time_series was failing if orientation=None
We are not strictly following PEP8 but fixing some simple things
This function was a helper, that was never used and confusing. No functionality has been lost.
although blob_dim is a keyword argument, it isn't actually used.
Not sure we still need / use this functionality (generation of a filterbank file from a header and data blob), perhaps it could be removed in the future.
Here is the output of code coverage:
Which is telling us what fraction of code gets executed by the unit tests. We don't have any tests for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been removing the dependence of waterfall.py from filterbank.py. Please move this function back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Danny, this will do for now. We can think in the mid future on how to either make it more general, or just add each telescope at a time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't remember why I pulled these out from sigproc. Did you double check they where the same code?
Hey @jeenriquez , I think the pull request grew too large to see your specific comments? Re: waterfall and filterbank dependece, how would you feel long-term about fully deprecating Filterbank in favor of Waterfall? Re: sigproc calls in waterfall, was this to remove dependencies maybe? The earliest versions of Waterfall were pretty much stand-alone right? |
Hey @telegraphic, yeah, I was just looking into this, I thought I was giving comments for each individual commit. I'll make a summary and then give one single comment. |
No need for this in filterbank -- use waterfall if you need this functionality.
Hi Danny, I had a look at all your changes. Quite a lot! and I think in general all looks very good! I would like to merge soon, since I think version is more stable that the current half way of the refactor. But not sure what happens then when all these comments. Could you reply to my comments below? Consolidated blank_dc functionI have been removing the dependence of waterfall.py from filterbank.py. Please move this function back. (blanc_dc). Removing duplicated code from sigproc.py:I can't remember why I pulled these out from sigproc. Maybe I did to be able to edit it and avoid breaking the original. Did you double check they where the same code? Follow-up on removal of gen_from_header.Good question, I don’t think we are using this anymore. For now, unless is “on the way” we could keep it and just keep an eye. Updating unit testsThis made me think that it would be good to have a monthly meeting to discuss the status of blimpy, as well as the new developments. Updated to build h5py betterCould you explain this one to me. How does it work now? This may need to be updated in the readme. More bytestring Py3 shennanigansWondering about the use of six here. If I understand this correctly that is the preferred option for python 2.5. https://docs.python.org/3/howto/pyporting.html Py3 throws a hissy fit with None comparisons, making these more robustMissing to change Updates for #31 #32 #33 #34 plotting issuesThe plotting issues are #30 to #33, this seems to be affecting the comments on the issues. Not sure if there is a way to fix it. calc_n_coarse_channels not working at Parkes, changing logic to fixAre you planning to make it work also for the other Parkes resolutions? |
Hey Emilio, thanks for going through it all! Tests on big filesWe definitely need some tests with full-size data products from GBT + PKS. Shall we set up a dedicated test platform somewhere? Could be at GBT or PKS, or even in the cloud -- at Berkeley might make the most sense though?
Data in HDF5 and FIL for all resolutions. Consolidated blank_dc functionWe can add this back for now. Maybe this is part of a bigger discussion about Filterbank vs Waterfall -- should we totally deprecate Filterbank? I am pretty pro-consolidation in general and Waterfall has more functionality. Here's the current state: Waterfall inherits the functions from Filterbank:
Things with Waterfall overrides:
New methods that Waterfall adds:
My thought is that we move everything apart from barebones functionality from Filterbank to Waterfall. We would remove
instead of
But it seems like this isn't worth supporting anymore, so I would get rid of the read functions. Removing duplicated code from sigproc.py:I did double-check the code was the same. Follow-up on removal of gen_from_headerIt is a little useful to be able to take a header and data and make a Filterbank object, so we can keep the functionality for now but earmark it for future tidy-up or removal. Updating unit testsAgreed - we should schedule one in every 2 weeks, and cancel it if we have no changes to discuss (keep meetings to a minimum!). Or we could have a 'software stack' meeting? Updated to build h5py betterThis one is only for the Travis CI testing stuff. Basically, ubuntu puts the hdf5 libraries in an unusual place and bitshuffle / h5py don't always find them. The CFLAGS include just points to the location of the hdf5 libraries for inclusion when building. More bytestring Py3 shennanigansAgreed, I don't think we should support Py 2.5! Maybe not even Py 2.6? We might be able to get rid of six in the future, but at the moment it's being used to catch something that Py3 doesn't support, and to reroute that code so Py3 is happy. Py3 throws a hissy fit with None comparisons, making these more robustGood catch, changed. I really hate Py 3 for these, I wonder if there is a neater way to do it that I'm not aware of? Updates for #31 #32 #33 NOT # 34 plotting issuesOops, sorry, yeah I think this is pretty hard to change :(. calc_n_coarse_channels not working at Parkes, changing logic to fixThis code probably needs some robust testing. Really would be nice to have N_COARSE_CHANS in the metadata headers :/. Here's the code as it currently stands. I think it will fail with Parkes, but will raise an error making it clear that it has failed. def calc_n_coarse_chan(self):
""" This makes an attempt to calculate the number of coarse channels in a given file.
Note:
This is unlikely to work on non-Breakthrough Listen data, as a-priori knowledge of
the digitizer system is required.
"""
nchans = int(self.header[b'nchans'])
# Do we have a file with enough channels that it has coarse channelization?
if nchans >= 2**20:
# Does the common FFT length of 2^20 divide through without a remainder?
# This should work for most GBT and all Parkes hires data
if nchans % 2**20 == 0:
n_coarse_chan = nchans // 2**20
return n_coarse_chan
# Early GBT data has non-2^N FFT length, check if it is GBT data
elif self.header[b'telescope_id'] == 6:
coarse_chan_bw = 2.9296875
bandwidth = abs(self.f_stop - self.f_start)
n_coarse_chan = int(bandwidth / coarse_chan_bw)
return n_coarse_chan
else:
raise RuntimeError("Couldn't figure out n_coarse_chan")
else:
raise RuntimeError("This function currently only works for hires BL Parkes or GBT data.") |
Refactor with recent changes
There is quite a lot in here!
The main change is creation of a generic
Reader
class, that subclasses H5 and FIL readers. This is on the same lines as the earlier subclass ofFilterbank
byWaterfall
, and again is driven to avoid code duplication and to use OOP design patterns.I also added a new test, which runs through and compares the
Waterfall
loaded Voyager data in both fil and h5 formats.The rest of the changes are docstrings/typo/readability changes.