Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Approximation downloaded data #126

Merged

Conversation

renaudjester
Copy link
Collaborator

Fix CMT-63

@renaudjester renaudjester changed the base branch from main to copernicusmarine-toolbox-v2 September 4, 2024 15:44
@renaudjester renaudjester requested a review from uriii3 September 4, 2024 15:45
estimated_size_message += f"{estimate_size:.3f} MB"

download_estimated_size = 0
for variable_name in dataset.data_vars:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to iterate through the variables? do they have different points or different types of description within the same dataset? (I would expect that the see temperature and the wind might be have values over the same points, so we would only need to multiply for the number of variables, no?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they have different steps?

Copy link
Collaborator Author

@renaudjester renaudjester Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because they might have different types (not sure but possible)
But imagine the wind is in float64 and the thetao is in float32

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And what would be the problem there? the amount of information per step (per chunk) would be different, but not the way to check for the amount of points to account, no? I think I'm not yet fully understanding the methodology followed, still checking the code hehe

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't be a problem but it would change the result.

Also some variables don't have depth, so they are lighter than others

for var in service.variables
if var.short_name == variable_name
][0].coordinates
if coord.coordinate_id == coordinate_name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to have a for for all the dimensions and then do them one by one? is there any case in which we are using 'lon' and then 'longitude' or something like that? or what is this logic for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just that we are in a loop over the coordinates of the dataset object and what we want for this given coordinate is its metadata, so this loop is to get the metadata of the specific coordinate

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you are right though, there could be some performance improvements haha

@renaudjester renaudjester force-pushed the approximation-downloaded-data branch from d34b64c to b3b48d7 Compare September 5, 2024 12:09
@uriii3
Copy link
Collaborator

uriii3 commented Sep 5, 2024

Then I assume is good, no? although maybe I would like a short explanation backend wise, I don't think anything is going to break and we can upgrade the methodology with the feedback we get, no?

@renaudjester renaudjester merged commit a9c6ea9 into copernicusmarine-toolbox-v2 Sep 9, 2024
2 checks passed
@renaudjester renaudjester deleted the approximation-downloaded-data branch September 9, 2024 08:46
renaudjester added a commit that referenced this pull request Oct 28, 2024
We now display an approximation of the amount of data that will be downloaded next to the expected size of the final file.
renaudjester added a commit that referenced this pull request Oct 28, 2024
We now display an approximation of the amount of data that will be downloaded next to the expected size of the final file.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants