-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hla 810 main body #7
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #7 +/- ##
========================================
- Coverage 7.84% 3.41% -4.43%
========================================
Files 3 4 +1
Lines 153 351 +198
========================================
Hits 12 12
- Misses 141 339 +198 ☔ View full report in Codecov by Sentry. |
src/aa_remove_data/remove_data.py
Outdated
) | ||
args = parser.parse_args() | ||
|
||
if args.new_filename is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some behaviour I noticed. If you specify a new filename, you still get a backup file. I don't think we need a backup file when the new file is being written elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I'm thinking about the following behaviour:
If no new filename and no new backup filename is given:
new file overwrites existing file
backup called {existing_filename}_backup.pb is created
If a new filename is given but no backup:
Create new file
No need for a backup
If a new filename and a backup filename is given:
Create new file
Also create backup
def aa_remove_data_before(): | ||
parser = argparse.ArgumentParser() | ||
parser = add_generic_args(parser) | ||
parser.add_argument( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timestamp part of this is not very clear, I'm struggling to follow the syntax of the input. We can discuss this one in person
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed to take the argument as a single data type so that it can be placed in any order in the cli command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now taken as a string in the form month,day,hour,minute,second,nanosecond. Could also be month.day hh:mm:ss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed in a code review. We will look to implement typer to clean up the CLI
A comment on the layout. I noticed that remove_data.py has a lot of functions, but no class. I later see in the testing branch that you have to import the whole file as remove_data to be able to use it. Edit: We can talk about this more in person |
I noticed that aa-reduce-by-factor doesn't keep the first line and instead skips to the first 'factor'. I think it would be good to start with the first line and then reduce by a factor from there. |
Using aa-reduce-by-factor, I noticed that if you add a --block value it will keep blocks of whatever was specified, but then also skip that same amount, rather than using the factor value. For example. If I specify a block of 5 and factor of 2, it will delete 5 lines and keep 5 lines, rather than delete 2 keep 5. |
Added algorithms and entry points to remove data in different ways:
aa-reduce-data-freq - reduces the frequency of data in a PB file by setting a maximum time period between adjacent data points.
aa-reduce-data-by-factor - reduces the size of data in a PB file by keeping points at regular intervals and removing the rest.
aa-remove-data-every-nth - reduces the size of data in a PB file by removing data points at regular intervals.
aa-remove-data-before - removes all data points before a certain time stamp.
aa-remove-data-after - removes all data points after a certain time stamp.