Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify JPL and UMD virtual computing environments #44

Open
wkiri opened this issue Jul 20, 2021 · 14 comments
Open

Unify JPL and UMD virtual computing environments #44

wkiri opened this issue Jul 20, 2021 · 14 comments
Assignees

Comments

@wkiri
Copy link
Collaborator

wkiri commented Jul 20, 2021

We are seeing slightly different behavior in DORA runs between the JPL and UMD virtual environments, which probably means a different version of some Python package(s) are employed, and likely can be resolved by updating setup.py to specify those package versions as well.

@hannah-rae noted these issues:

  • When running the planetary_rover PNG sample case, the PNG images seem to be read in using a different ordering
  • When analyzing the planetary_rover PNG images, the selections are roughly the same but the scores are different.
@wkiri
Copy link
Collaborator Author

wkiri commented Aug 31, 2021

@hannah-rae Was this resolved?

@hannah-rae
Copy link
Contributor

No, not yet, but it is on my to do list.

@bdubayah
Copy link
Contributor

I think the issue might be partly related to how os.listdir or glob.glob lists directories on different machines (used when the image loader loads images). When listing the files in the fmnist or planetary test directory, I get different orderings on the UMD cluster vs my local machine. This causes a labels.csv file to be totally wrong between machines. One fix could be sorting directory contents once they're loaded, and making sure labels correspond to that order. Or, labels files could use the filename/sample id rather than it's index.

@jakehlee
Copy link
Collaborator

@wkiri and I ran into this when running experiments for our DEMUD paper - every glob.glob() or os.listdir() call should be wrapped by a sorted(), the lists/iterators they return is in some arbitrary order determined by the individual filesystem.

https://docs.python.org/3/library/os.html#os.listdir

Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order...

@wkiri
Copy link
Collaborator Author

wkiri commented Sep 18, 2021

My preference would be for labels.csv to use an identifier for each item (as noted by @bdubayah) instead of relying on ordering. In addition to increasing robustness across machines, it would mean we can more easily change the experiment to include/exclude items without having to regenerate every line in this file. This makes sense for individually named items like the images in an image data set. It's less clear how it would work for some of our other data set types. Ideas welcome :)

@bdubayah
Copy link
Contributor

What does everyone think of this approach? I changed a few lines in the data loader so that each sample would have a string id (just converted the sample indexes to a string for tabular data), and then in the results organization used data id rather than data index to make the comparison plot (so you could run a modified experiment with a exhaustive labels file). We would still need to change the combined plot script but wanted to get thoughts first.

@wkiri
Copy link
Collaborator Author

wkiri commented Oct 14, 2021

@bdubayah Yes, this looks great!

I think the update in dora_results_organization.py to read string names instead of integers from the labels file should also occur in combined_plot_script.py. It looks like some additional changes are needed to the latter script too. I will work on this. In the meantime, is this branch ready to merge? (issue44-unify-envs)

@bdubayah
Copy link
Contributor

Yes, it's good to go (aside from the combined plot issues you mentioned). I think the labels files for the experiments will need to be updated as well.

@wkiri
Copy link
Collaborator Author

wkiri commented Oct 14, 2021

That's right. I'm updating the planetary experiment label files, but it's a good point that this will trigger updates needed for the other use cases too.

@bdubayah
Copy link
Contributor

👍 Should I merge this to master or did you want to include the combined plot script in the same PR?

@wkiri
Copy link
Collaborator Author

wkiri commented Oct 14, 2021

@bdubayah Let me commit the updates to that script. It's worth alerting the team that this merge may break compatibility with experiments until folks update their label files, too.

@wkiri
Copy link
Collaborator Author

wkiri commented Oct 14, 2021

@bdubayah Ok, it should be ready if you want to take a look.

Note that I also changed the y axis to start from 0, since it's possible for an algorithm to not select at least one novel item in the beginning.

@bdubayah
Copy link
Contributor

Looks good to me!

@wkiri
Copy link
Collaborator Author

wkiri commented Oct 15, 2021

@bdubayah Feel free to PR when ready!

bdubayah added a commit that referenced this issue Oct 17, 2021
- Rely on data id instead of row index to match results to labels file, addressing an issue where os.listdir provides inconsistent ordering across machines
- Use data id from labels.csv instead of row number for comparison plots, allowing experiments to be run on a subset of the test data
- Ensure all data ids are strings

Co-authored-by: Kiri Wagstaff <[email protected]>
@bdubayah bdubayah mentioned this issue Oct 20, 2021
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants