
why no pickle? #360

Closed
mathematicalmichael opened this issue Sep 24, 2019 · 10 comments · Fixed by #382

@mathematicalmichael
Collaborator

Minimal working example:

import pickle
import bet.sample as sample
import os

s_set = sample.sample_set(2)
base_path = '/home/jovyan/work/'
model_path = os.path.join(base_path, 'BET')
with open(os.path.join(model_path, 'model.pkl'), 'wb') as out:
    pickle.dump(s_set, out)
    print('dumped to %s' % out.name)

with open(os.path.join(model_path, 'model.pkl'), 'rb') as inp:
    s_set_load = pickle.load(inp)
    print('loaded %s' % inp.name)

print(s_set, s_set_load)      

Is there any problem with saving/loading like this?
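
For reference, a rough sketch of the scipy.io route the library has relied on so far (the dict key and array shape below are made up for illustration); unlike pickle, it round-trips a dict of arrays rather than the object itself:

# Illustrative only: the .mat route flattens things into a dict of arrays
# that has to be rebuilt by hand on load, whereas pickle.dump/load
# round-trips the whole sample_set object.
import numpy as np
import scipy.io as sio

values = np.random.rand(10, 2)               # stand-in for a sample_set's values
sio.savemat('model.mat', {'values': values})

data = sio.loadmat('model.mat')              # plain dict of arrays
print(data['values'].shape)                  # (10, 2); object structure is lost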

@smattis smattis added this to the Version 3 milestone Nov 22, 2019
@smattis
Contributor

smattis commented Nov 22, 2019

I agree that pickle would be more efficient, and we should move towards it for version 3.
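
For concreteness, a minimal sketch of what pickle-based save/load helpers could look like (the function names here are hypothetical, not the actual v3 API; the real implementation may handle extensions and parallel globalization differently):

# Hypothetical sketch only -- not the actual BET API.
import pickle

def save_object(obj, file_name):
    # Serialize any sample/discretization object in one call.
    with open(file_name, 'wb') as f:
        pickle.dump(obj, f)

def load_object(file_name):
    # Inverse of save_object; returns the reconstructed object.
    with open(file_name, 'rb') as f:
        return pickle.load(f)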

@mathematicalmichael
Collaborator Author

@smattis do you think you could handle the move to pickle or should I attempt it?

@smattis
Contributor

smattis commented Nov 25, 2019

I probably have more time to commit to it.

@mathematicalmichael
Collaborator Author

mathematicalmichael commented Dec 22, 2019

I might have just done this as part of addressing SciPy upgrades and handling adaptive sampling... stay tuned.

@mathematicalmichael
Collaborator Author

Yes. I think I figured it out.
sio.savemat changed its behavior to mimic that of pickle (a safer approach: closing out file buffers instead of keeping them open). What breaks in adaptiveSampling when upgrading scipy is exactly the same error that gets thrown once the change to pickle is implemented.

In other words: pickle + old scipy = same problem as no pickle + new scipy.

I'll put in thirty minutes trying to resolve it, but if I can't, I'm going to throw in a "skip test in parallel" band-aid for the time being so I can unblock myself from testing SciPy upgrades with the new sampling approach.
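
Roughly what I have in mind for that band-aid (this assumes mpi4py for the communicator; the class and test names are placeholders, not the actual test suite):

# Hypothetical sketch: skip the adaptive-sampling hotstart test when running
# under MPI with more than one process, until the pickle/scipy issue is fixed.
import unittest
from mpi4py import MPI

comm = MPI.COMM_WORLD

class TestAdaptiveSampling(unittest.TestCase):
    @unittest.skipIf(comm.Get_size() > 1,
                     "hotstart from a partial run fails in parallel with new scipy")
    def test_hotstart_partial_run(self):
        ...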

@mathematicalmichael
Collaborator Author

Confirmed: everything in the library works except for adaptive sampling when I upgrade scipy. Skipping in parallel for now. It's actually just the hotstart=1 (incomplete run) case that fails, so I'm simply removing that one from the test.

@mathematicalmichael
Collaborator Author

Well, I got it to work without skipping. I don't have much confidence in the hot start from a partial run (the tests for it look kind of incomplete), but it's better than skipping it altogether.

@mathematicalmichael
Collaborator Author

mathematicalmichael commented Dec 29, 2019

update: I am now successfully running code from 2016-2017 using the new BET (with pickling).
Steps:

  1. Run 2to3 -w *.py in the examples directory (I'm on @mpilosov's fork of BET, on some dev branches corresponding to different experiments). Optionally run autopep8 -i *.py for styling.

  2. Install a couple of undocumented dependencies (dolfin, pyprind) needed by scripts that aren't part of BET; conda/pip actually made this seamless.

  3. My scripts had hard-coded .mat extensions in a couple of locations; that was literally the only thing I had to swap out. sed would work, but I only had five of them (found with grep), so I just changed them manually. I removed the extensions entirely, because the dev branch of modern BET that I have installed doesn't add .pkl; it just leaves extensions off to play nicer with backwards compatibility (we supported a lack of extensions before anyway... they mean nothing in the end, so I support this choice). A small sketch of the extension handling follows this list.
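
Illustrative only (strip_mat_extension is a throwaway helper for this example, not anything in BET): dropping a hard-coded .mat before handing the name to whatever save routine the script calls.

# Illustrative sketch: drop a trailing .mat so save names match the new
# extension-less convention; any other name is left alone.
import os

def strip_mat_extension(file_name):
    root, ext = os.path.splitext(file_name)
    return root if ext == '.mat' else file_name

print(strip_mat_extension('rand_N_80_trial_0.mat'))
# prints: rand_N_80_trial_0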

So, that's great! My scripts create a bunch of directories (in each of these are the individual save files for each trial):

.
├── checkskew_and_plot.py
├── estimatedist_build_pointers.py
├── estimatedist_compute_metrics.py
├── estimatedist_funs.py
├── estimatedist_plotdata.py
├── estimatedist.py
├── estimatedist_setup.py
├── file_struct.txt
├── __pycache__
│   ├── estimatedist_funs.cpython-37.pyc
│   └── estimatedist_setup.cpython-37.pyc
└── results_heatrod_3
    ├── est_discretizations
    │   ├── rand_N_1280
    │   ├── rand_N_160
    │   ├── rand_N_20
    │   ├── rand_N_2560
    │   ├── rand_N_320
    │   ├── rand_N_40
    │   ├── rand_N_640
    │   └── rand_N_80
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_0.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_10.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_11.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_12.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_13.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_14.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_15.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_16.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_17.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_18.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_19.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_1.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_20.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_21.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_22.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_23.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_24.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_2.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_3.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_4.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_5.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_6.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_7.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_8.npy
    │       ├── ptr_from_rand_I_100000_to_rand_N_80_trial_9.npy
    │       ├── rand_N_80_trial_0
    │       ├── rand_N_80_trial_1
    │       ├── rand_N_80_trial_10
    │       ├── rand_N_80_trial_11
    │       ├── rand_N_80_trial_12
    │       ├── rand_N_80_trial_13
    │       ├── rand_N_80_trial_14
    │       ├── rand_N_80_trial_15
    │       ├── rand_N_80_trial_16
    │       ├── rand_N_80_trial_17
    │       ├── rand_N_80_trial_18
    │       ├── rand_N_80_trial_19
    │       ├── rand_N_80_trial_2
    │       ├── rand_N_80_trial_20
    │       ├── rand_N_80_trial_21
    │       ├── rand_N_80_trial_22
    │       ├── rand_N_80_trial_23
    │       ├── rand_N_80_trial_24
    │       ├── rand_N_80_trial_3
    │       ├── rand_N_80_trial_4
    │       ├── rand_N_80_trial_5
    │       ├── rand_N_80_trial_6
    │       ├── rand_N_80_trial_7
    │       ├── rand_N_80_trial_8
    │       └── rand_N_80_trial_9
    ├── est_solutions
    │   ├── QoI_choice_1
    │   │   ├── reg_M_1
    │   │   ├── reg_M_4
    │   │   └── reg_M_9
    │   └── QoI_choice_2
    │       ├── reg_M_1
    │       ├── reg_M_4
    │       └── reg_M_9
    ├── integration_sets
    │   └── rand_IntSet_100000
    ├── postprocess_rectscale_6
    │   └── reg_BigN_40000
    │       └── reg_M_1
    └── ref_solutions
        ├── Partition_Disc-reg_M_1
        ├── Partition_Disc-reg_M_4
        ├── Partition_Disc-reg_M_9
        └── reg_BigN_40000
            ├── ptr_from_rand_I_100000_to_reg_BigN_40000.npy
            ├── Reference_Disc-reg_BigN_40000
            ├── reg_M_1
            ├── reg_M_4
            └── reg_M_9

29 directories, 357 files



This allows for swapping out metrics and choices of observed distribution to create convergence plots really quickly.

I'll close out this issue once I get the develop branch fully validated. The fact that these simulations run with so few changes is a good validation test.

I'll create a separate repo to house the example above.

@mathematicalmichael
Collaborator Author

(PS that example takes up 1/3 GB).

@mathematicalmichael
Collaborator Author

pickle is the default in v3
