Support for hidden cells executed only at test time #72

Closed
adamjstewart opened this issue Jul 10, 2022 · 9 comments
Labels
enhancement New feature or request

Comments

@adamjstewart

Is your feature request related to a problem? Please describe.

Many notebooks are extremely time intensive to run. In machine learning, notebooks may involve downloading very large datasets or training a model over hundreds of epochs. A single notebook may take hours to run, but we would still like to test these notebooks quickly in CI.

Describe the solution you'd like

nbmake already supports skipping cells. It would be useful to also support hidden cells that are executed only by nbmake. With this, I could have a normal cell like:

download = True
num_epochs = 100

that is executed by all users, then a hidden cell containing:

download = False
num_epochs = 1

that is only executed by nbmake. This would keep testing time down while keeping the notebook as simple as possible.

Hidden cells definitely aren't the only way to implement this. For my use case, I really just need to change the values of a couple of variables. If nbmake offered a way to do this in a configuration file, automatically replacing variables with the configured values, that would also suffice. Instead of a configuration file, I would also be fine with including this metadata in the notebook itself.

My goal is for the notebook to appear as simple as possible and not include any visible code that is specific to nbmake testing. Any solution that offers this is equally valid to me, and I definitely welcome alternative proposals.

Describe alternatives you've considered

Some alternatives I've come across:

  • Use sed to replace the variable value during CI (difficult on Windows, doesn't work locally)
  • Use environment variables like num_epochs = os.environ.get('NUM_EPOCHS', 100) (ugly, doesn't work locally)
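For reference, here is a minimal sketch of the environment-variable alternative. One detail worth noting: `os.environ.get` returns a string whenever the variable is set, so the value needs converting before it can be used as an epoch count. The helper name `env_int` and its `environ` parameter are hypothetical, introduced only for illustration.

```python
import os


def env_int(name, default, environ=None):
    """Return an integer setting from the environment, or `default` if unset.

    `environ` is a hypothetical hook so the lookup can be exercised with a
    plain dict instead of the real process environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    # Environment values are always strings, so convert explicitly.
    return default if raw is None else int(raw)


# In the notebook: 100 epochs by default, overridden when NUM_EPOCHS is set.
num_epochs = env_int("NUM_EPOCHS", 100)
```

This keeps the default visible in the notebook, but the test-specific lookup is still visible to readers, which is the drawback described above.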

Additional context

Some examples of these alternatives in the wild:

@alex-treebeard
Member

Hi Adam, this makes sense as an ask.

Do you have a preference as to how you would like to specify these hidden cells?

E.g. we could (a) inject some expressions into metadata tags, or (b) reference another notebook on the filesystem to include.

@adamjstewart
Author

The former sounds simpler to me.

@alex-treebeard
Member

alex-treebeard commented Jul 12, 2022

Ok, I'm having a think about this one for now.

Increasingly, I can see that it's necessary for nbmake users to customise execution in order to mock dependencies/config and assert things.

As a result, I'm considering exposing the nbclient instance so that you can execute python code during the test.

From your perspective, this may mean you tag a cell with my_module.set_epochs so you can invoke a python function to run this code.

Input welcome!

@adamjstewart
Author

I'm not familiar with nbclient, so I can't offer much of an opinion on whether this is the right thing to do. Is this akin to a hidden cell that runs only during testing, or more like an alternative cell that runs instead of the one seen by users?

@alex-treebeard
Member

We would write test code outside of the ipynb file that you are testing (e.g. in a python script or another notebook).

My assumption is that your users may want to run the ipynb file after seeing the docs, therefore it should not contain test code.

@adamjstewart
Author

Yes, the ipynb file should be able to run normally without the test code.

@alex-treebeard alex-treebeard mentioned this issue Jul 15, 2022
@alex-treebeard
Member

@adamjstewart I have released v1.3.3a1 which allows you to mock variables after they are defined in a cell.

To do so, please add cell metadata to the cell which defines the variables. Nbmake will apply your mocks after the cell succeeds.

Example:

  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbmake": {
     "mock": {
      "x": 2,
      "y": "fish",
      "z": {
       "x": 42
      }
     }
    }
   },
   "outputs": [],
   "source": [
    "x = 5\n",
    "y = 'y'"
   ]
  },
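To make the described behaviour concrete, here is a rough simulation of "apply the mocks after the cell succeeds", using the example cell above. This is not nbmake's implementation: it runs the cell source with plain `exec` rather than a Jupyter kernel, and the function name `run_cell_with_mocks` is hypothetical.

```python
# The example cell, reduced to the fields this sketch needs.
cell = {
    "cell_type": "code",
    "metadata": {"nbmake": {"mock": {"x": 2, "y": "fish", "z": {"x": 42}}}},
    "source": ["x = 5\n", "y = 'y'"],
}


def run_cell_with_mocks(cell, namespace):
    """Run the cell source, then overwrite variables with the mock values."""
    # Step 1: execute the cell as written (x = 5, y = 'y').
    exec("".join(cell["source"]), namespace)
    # Step 2: only after the cell succeeds, apply the values under
    # metadata.nbmake.mock, replacing or creating variables.
    mocks = cell.get("metadata", {}).get("nbmake", {}).get("mock", {})
    namespace.update(mocks)
    return namespace


ns = run_cell_with_mocks(cell, {})
print(ns["x"], ns["y"], ns["z"])  # prints: 2 fish {'x': 42}
```

Later cells would then see the mocked values rather than the ones written in the notebook.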

Please provide feedback! It will determine if we continue in this direction.

@adamjstewart
Author

Will test this out when I get a chance!

@alex-treebeard
Member

Awesome, I've added docs in the readme and made this feature available in 1.3.3
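Applied to the original request, the mock metadata could be added to the cell that defines `download` and `num_epochs`. The sketch below edits an in-memory notebook dictionary using only the standard library. The helper `add_mock` is hypothetical, and only the `nbmake` / `mock` metadata keys come from the example earlier in this thread.

```python
import json


def add_mock(notebook, cell_index, mocks):
    """Attach nbmake mock values to one cell of a notebook dict (hypothetical helper)."""
    cell = notebook["cells"][cell_index]
    # Create the metadata.nbmake mapping if it does not exist yet.
    cell.setdefault("metadata", {}).setdefault("nbmake", {})["mock"] = mocks
    return notebook


# A minimal stand-in for a notebook loaded with json.load(...).
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["download = True\n", "num_epochs = 100"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

add_mock(notebook, 0, {"download": False, "num_epochs": 1})
print(json.dumps(notebook["cells"][0]["metadata"], indent=1))
```

The visible source of the cell is unchanged, so users still see `download = True` and `num_epochs = 100`, while the test-time values live only in the cell metadata.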

@alex-treebeard alex-treebeard added the enhancement New feature or request label Jul 19, 2022