best practices for structuring nested experiment runs #416
Well, I think your basic idea is right. However, given that you want to run the same experiment just with a different configuration (model, data), you can reuse the same experiment object and just update the config. It would look something like this:

from sacred import Experiment

ex = Experiment("generic_experiment")

@ex.main
def run(dataset, model):
    ...

for dataset in datasets:
    for model in models:
        ex.run(config_updates={"dataset": dataset, "model": model},
               options={'--name': f"{dataset}_{model}"})

For a more complete example you might look at Klaus' code: https://github.com/Qwlouse/Binding/blob/master/run_evaluation.py
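If you also want each of those runs to show up as a separate record in a dashboard such as Omniboard, the same loop just needs an observer attached to the experiment. A minimal sketch, using a FileStorageObserver for simplicity (Omniboard itself reads from a MongoObserver), with placeholder dataset and model names; the observer constructor form may differ slightly between Sacred versions:

from sacred import Experiment
from sacred.observers import FileStorageObserver

ex = Experiment("generic_experiment")
ex.observers.append(FileStorageObserver("my_runs"))  # one subdirectory per run

@ex.main
def run(dataset, model):
    ...  # placeholder: train `model` on `dataset`

datasets = ["mnist", "cifar10"]   # placeholder dataset names
models = ["logreg", "small_cnn"]  # placeholder model names

for dataset in datasets:
    for model in models:
        # every ex.run() call is recorded as its own run by the observer
        ex.run(config_updates={"dataset": dataset, "model": model},
               options={"--name": f"{dataset}_{model}"})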
@JarnoRFB thanks for the reply. In the end that's what I ended up doing, but it was a bit more involved since I was using different files. The other thing that I would like to ask is how one would go about saving the validation folds in each experiment. Should those be stored in different columns?
Sorry, but I cannot quite follow. A minimal code example would greatly help here.
What exactly do you want to save from the validation fold? If it is just a metric, e.g. accuracy, why not save it as a metric of the run? You can call _run.log_scalar() for each validation fold. It would also be nice if you could ask such general questions under the
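For concreteness, a minimal sketch of logging one value per validation fold with _run.log_scalar; the experiment name, metric name, and accuracy values below are placeholders:

from sacred import Experiment

ex = Experiment("cv_example")

@ex.main
def run(_run):
    # hypothetical cross-validation results, one accuracy per fold
    fold_accuracies = [0.81, 0.84, 0.79]
    for fold, acc in enumerate(fold_accuracies):
        # each call appends one (step, value) entry to the "validation.accuracy" metric
        _run.log_scalar("validation.accuracy", acc, fold)

if __name__ == "__main__":
    ex.run_commandline()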
I apologize for the confusion; let me provide an MWE, as you requested, in order to make things clear.
Yup, that's doable, but when you examine it through Omniboard I think it shows the validation loss of the last training epoch and not the best validation loss.
Thanks for the pointer, I wasn't aware of it; from now on I'll post related stuff there.
I believe that if you do not set the step explicitly, it will append to the metrics array. If you want to see the current best validation metric, you could set it as a result: while the experiment is running by assigning to _run.result, and for the final result by returning the value from the main function. See also https://sacred.readthedocs.io/en/latest/collected_information.html#live-information. This way it would be displayed in Omniboard in the result column. On the ingredient issue I unfortunately cannot comment without looking a bit deeper into it. I have not really used ingredients myself. But do I see it right that you want to access parameters from the ingredient config in the experiment's main function?
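A minimal sketch of that pattern; the epoch loop, the evaluate helper, and the numbers are placeholders, and the _run.result assignment relies on the intermediate-result mechanism described in the linked docs:

from sacred import Experiment

ex = Experiment("best_val_example")

def evaluate(epoch):
    # placeholder for a real validation routine
    return 0.70 + 0.02 * epoch

@ex.main
def run(_run):
    best_val_acc = 0.0
    for epoch in range(10):
        val_acc = evaluate(epoch)
        _run.log_scalar("validation.accuracy", val_acc, epoch)
        if val_acc > best_val_acc:
            best_val_acc = val_acc
            _run.result = best_val_acc  # intermediate result, picked up on the next heartbeat
    return best_val_acc  # final result shown in Omniboard's result column

if __name__ == "__main__":
    ex.run_commandline()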
Exactly, without having to again use
Hi @kirk86, ingredients create their own namespace in the configuration, as if the values were part of a dictionary with the name of the ingredient. If you slightly modify your example to use a Python-compatible name for the ingredient, you can access it from there:

# src.py
import sacred

ingred = sacred.Ingredient('default_params')
ingred.add_config('some/yaml/file')     # <--- added config
ingred.add_config('another/yaml/file')  # <--- added config

class MyModel(object):
    @ingred.capture
    def __init__(self):
        pass  # do some stuff...

# experiment script, importing from src.py
import sacred
from src import MyModel, ingred

ex = sacred.Experiment('test_exper', ingredients=[ingred])

@ex.main  # <--- this also works as a capture decorator
def main(default_params):
    param1 = default_params['param1']
    param2 = default_params['param2']
    MyModel()
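As a side note (a sketch, not from the thread): because the ingredient's values live under the default_params namespace, they can also be overridden per run under that same key, e.g. from the command line via "with default_params.param1=42", or programmatically, assuming nested config_updates are resolved into the ingredient's namespace and using the ex object defined just above:

# hypothetical override of one ingredient config entry for a single run
ex.run(config_updates={"default_params": {"param1": 42}})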
@Qwlouse
@JarnoRFB, I came up with a similar solution, but I'm having problems passing the dataset as an argument for the run. Is there any way to pass data to the run directly?
@pedropalb Sorry, not quite sure what you mean. I guess in the example I meant that dataset is just an identifier, like a name or a path, in the config.
@JarnoRFB I see! I misunderstood your example. I've been using the dataset path as a config entry. But now I need to run multiple times with the same dataset. Passing the dataset path is not an option anymore, since I would have to load it from the data path and preprocess it in every single call to the train command. I need a way to pass the same loaded and preprocessed dataset to multiple runs:

from sacred import Experiment

ex = Experiment("generic_experiment")

dataset = None

@ex.command
def train(model):
    global dataset
    ...

@ex.command
def run(dataset_paths, models):
    global dataset
    for dataset_path in dataset_paths:
        dataset = load_dataset(dataset_path)  # load and preprocess once per dataset
        for model in models:
            ex.run('train', config_updates={"model": model},
                   options={'--name': f"{dataset_path}_{model}"})

But I'm wondering if there is a better and more elegant way to do it. Thanks!
Hi folks,
I was wondering if there's a way to structure experiments for each individual choice of dataset and algorithm.
For instance, you could have something in your code base like this:
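Roughly a nested loop over all dataset/algorithm combinations driven by a single training routine; the names below (datasets, algorithms, train) are hypothetical placeholders:

datasets = ["mnist", "cifar10"]           # placeholder dataset names
algorithms = ["logreg", "random_forest"]  # placeholder algorithm names

def train(dataset_name, algo_name):
    # placeholder: load the dataset, build the model, fit it
    print(f"training {algo_name} on {dataset_name}")

for dataset_name in datasets:
    for algo_name in algorithms:
        train(dataset_name, algo_name)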
Eventually, since these runs are based on individual combinations of dataset and algorithm, you would want to have an experiment for each of them.
How would you go about doing that?
One very bad way I came up with is the following:
Please let me know what's a better way of doing that. Thank you!