Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hla 810 main body #7

Open
wants to merge 28 commits into
base: main
Choose a base branch
from
Open

Hla 810 main body #7

wants to merge 28 commits into from

Conversation

jacob720
Copy link
Collaborator

Added algorithms and entry points to remove data in different ways:

aa-reduce-data-freq - reduces the frequency of data in a PB file by setting a maximum time period between adjacent data points.
aa-reduce-data-by-factor - reduces the size of data in a PB file by keeping points at regular intervals and removing the rest.
aa-remove-data-every-nth - reduces the size of data in a PB file by removing data points at regular intervals.
aa-remove-data-before - removes all data points before a certain time stamp.
aa-remove-data-after - removes all data points after a certain time stamp.

Copy link

codecov bot commented Jan 17, 2025

Codecov Report

Attention: Patch coverage is 0% with 201 lines in your changes missing coverage. Please review.

Project coverage is 3.41%. Comparing base (fbbcdaa) to head (0383488).

Files with missing lines Patch % Lines
src/aa_remove_data/remove_data.py 0.00% 198 Missing ⚠️
src/aa_remove_data/pb_utils.py 0.00% 3 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##            main      #7      +/-   ##
========================================
- Coverage   7.84%   3.41%   -4.43%     
========================================
  Files          3       4       +1     
  Lines        153     351     +198     
========================================
  Hits          12      12              
- Misses       141     339     +198     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

)
args = parser.parse_args()

if args.new_filename is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some behaviour I noticed. If you specify a new filename, you still get a backup file. I don't think we need a backup file when the new file is being written elsewhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I'm thinking about the following behaviour:
If no new filename and no new backup filename is given:
new file overwrites existing file
backup called {existing_filename}_backup.pb is created
If a new filename is given but no backup:
Create new file
No need for a backup
If a new filename and a backup filename is given:
Create new file
Also create backup

def aa_remove_data_before():
parser = argparse.ArgumentParser()
parser = add_generic_args(parser)
parser.add_argument(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamp part of this is not very clear, I'm struggling to follow the syntax of the input. We can discuss this one in person

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed to take the argument as a single data type so that it can be placed in any order in the cli command

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now taken as a string in the form month,day,hour,minute,second,nanosecond. Could also be month.day hh:mm:ss.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in a code review. We will look to implement typer to clean up the CLI

@MichaelStubbings
Copy link
Collaborator

MichaelStubbings commented Jan 27, 2025

A comment on the layout. I noticed that remove_data.py has a lot of functions, but no class. I later see in the testing branch that you have to import the whole file as remove_data to be able to use it.

Edit: We can talk about this more in person

@MichaelStubbings
Copy link
Collaborator

I noticed that aa-reduce-by-factor doesn't keep the first line and instead skips to the first 'factor'. I think it would be good to start with the first line and then reduce by a factor from there.

@MichaelStubbings
Copy link
Collaborator

Using aa-reduce-by-factor, I noticed that if you add a --block value it will keep blocks of whatever was specified, but then also skip that same amount, rather than using the factor value.

For example. If I specify a block of 5 and factor of 2, it will delete 5 lines and keep 5 lines, rather than delete 2 keep 5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants