Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: doctest for code-cell #290

Open
jakevdp opened this issue Jan 15, 2021 · 13 comments
Open

Feature Request: doctest for code-cell #290

jakevdp opened this issue Jan 15, 2021 · 13 comments
Labels
enhancement New feature or request

Comments

@jakevdp
Copy link
Contributor

jakevdp commented Jan 15, 2021

It would be useful to be able to doctest myst-nb notebooks within a sphinx project. For example, sphinx.ext.doctest provides the testcode and testoutput directives that let you do something like this in myst-md:

```{testcode}
1 + 2
```

```{testoutput}
4
```

It would be nice if a similar mechanism existed to specify expected outputs for myst-nb code-cells; e.g.

```{code-cell}
1 + 1
```

```{code-cell-output}
2
```

If this were automatically tested in the same way that testcode and testoutput are, it would make it easier to detect when myst-nb docs get stale.

@jakevdp jakevdp added the enhancement New feature or request label Jan 15, 2021
@welcome
Copy link

welcome bot commented Jan 15, 2021

Thanks for opening your first issue here! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out EBP's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.

If your issue is a feature request, others may react to it, to raise its prominence (see Feature Voting).

Welcome to the EBP community! 🎉

@choldgraf
Copy link
Member

I think this is a cool idea...I wonder if this could also be part of the spec of myst notebooks so that they could include outputs as well?

@jakevdp
Copy link
Contributor Author

jakevdp commented Jan 20, 2021

If the team is open to it, I'd like to work on adding code-cell-output to the myst notebook spec. Some semi-organized thoughts:

Jupyter notebooks can have multiple outputs per cell; this could be represented in myst-nb by having multiple code-cell-output blocks immediately following a code-cell block. They main types of output are listed here: they are execute_result/display_data, stream, and error.

I think it would make sense for the spec to be something like this:

```{code-cell}
import sys
print('this is the stderr content', file=sys.stderr)
print('this is the stdout content')
display(1 + 1)
raise ValueError("some error reason")
```

```{code-cell-output} display_result     // display_result is default, and thus optional
:type: text/plain                        // text/plain is the default for display_result, and thus optional
2
```

```{code-cell-output} stream
:type: stdout                            // stdout is the default for stream, and thus optional
this is the stdout content
```

```{code-cell-output} stream
:type: stderr
this is the stderr content
```

```{code-cell-output} error
:ename: ValueError                       // optional
:evalue: error string                    // optional
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
----> 1 raise ValueError("error string")

ValueError: error string
```

If a code-cell-output did not immediately follow either a code-cell or another code-cell-output, it would result in a parsing error.

Note that this is tabling the idea of doctesting for the time being, but this seems like a good foundation on which doctesting could be built.

Would the team support an approach along these lines?

@chrisjsewell
Copy link
Member

Heya,

Perhaps code-cell-assert would be a better name, since it is an assertion of what the output should be, not the actual output.

If a code-cell-output did not immediately follow either a code-cell or another code-cell-output, it would result in a parsing error.

this is probably the tricky part, note I'm changing how the AST works quite a bit soon (currently on a personal branch), so I would be wary of relying on exactly how it is now.

Just to note some other potential solutions (not to say they are better than your one); with pytest-notebook and nbval you can have a fully populated notebook then test that the same outputs are present on re-execution.
Then the other way would be to have these expected outputs in the code cell's metadata, something like

```{code-cell}
---
expected:
  outputs/0/output_type: stream
  outputs/0/text: printed 
  outputs/1/output_type: error
  outputs/1/evalue: oopsie!
---

print("printed")
raise ValueError("oopsie!")
```

@jakevdp
Copy link
Contributor Author

jakevdp commented Jan 20, 2021

I was imagining this as a more general thing than asserting outputs. Currently ipynb can store code cell outputs within the file structure, while myst-nb has no mechanism to do so. I'm proposing a general mechanism whereby myst-nb could store cell outputs.

This would allow approaches like pytest-notebook or nbval to be used on myst-nb notebooks, because these tools pre-suppose that the notebook file is able to store its expected outputs.

Does that make sense?

@chrisjsewell
Copy link
Member

I'm proposing a general mechanism whereby myst-nb could store cell outputs.

eurgh, personally I don't think its a good idea to be storing outputs in a Markdown file; trying to store heirarchical data in a non-heirarchical format is just going to lead to overly verbose, unreadable files.
You might as well just use an ipynb then, or if you wanted it to be a bit more human-readable then have a simple two-way conversion from JSON <-> YAML

At any rate, I would not really want to support this until it was supported in jupytext, i.e. there has to be two-way conversion between the Markdown files and Notebooks: see mwouts/jupytext#220 for discussion

@chrisjsewell
Copy link
Member

I'm open to discussion, I'm just not convinced it is a good idea to basically try to turn a Markdown file into a JSON one

@jakevdp
Copy link
Contributor Author

jakevdp commented Jan 20, 2021

OK, fair enough. If you’re not open to storing outputs in markdown, are you open to potentially supporting a different way to validate code output in-situ, since the testing tools you mentioned require the presence of outputs in the file?

@choldgraf
Copy link
Member

good point about jupytext @chrisjsewell - agreed that this is probably where text-based outputs should be instead of something that myst-specific

@jakevdp for context, we use jupytext for all of the myst <--> ipynb conversion. Jupytext isn't myst-specific, it can also output to a bunch of other text-based formats. I know that @mwouts, the lead dev on jupytext, is really interested in how to combine outputs with text-based notebooks (discussed in mwouts/jupytext#220).

Perhaps we could brainstorm a bit in that issue? Or @mwouts could provide his latest thinking on this topic? I do think it warrants a bit of thought and discussion that'd benefit from multiple perspectives, since we do run the risk of just re-creating all of the problems with ipynb un-readability but in a markdwon format instead of json. That said, I think this is a pretty commonly asked-for feature so it would be nice to have some kind of option for people.

Shall we take conversation around "generic outputs for text-based notebooks" into a different issue (or the jupytext one) so that we can continue discussing the right way to handle assert-style statements in a myst notebook?

back to testing now

One other thought that we could consider following for the output tests is something similar to pytest-regressions. Basically write the output structure to a file, and then test that the file is un-changed in subsequent runs. Benefits are that you don't clutter the md file with outputs, but OTOH it means those outputs aren't as glance-able to sanity check. (but in the future, if mwouts/jupytext#220 is resolved, perhaps those external assets could be a part of the jupytext spec, and thus could have some nice UI support, just thinking out loud here)

@jakevdp
Copy link
Contributor Author

jakevdp commented Jan 20, 2021

Thanks for the context, @choldgraf. I was aware of the code at jupytext, but I thought that this repo was the source of truth for the myst-nb spec. If it makes sense to discuss there instead, then we can do that.

@choldgraf
Copy link
Member

I do think that this repo is the "source of truth", it's just that one of MyST's principles is to piggy-back off of other projects and their patterns/standards as much as possible, rather than invent things that are myst-specific. I think it'd potentially be a more wide-reaching and impactful feature if outputs were supported in Jupytext, and then we could inherit those into MyST-NB and start upstreaming improvements. Does that make sense?

@mwouts
Copy link

mwouts commented Jan 21, 2021

Hi everyone, yes the project to add outputs to Jupytext's Markdown formats mwouts/jupytext#220 has been there for a while!

What I am waiting for is the following: I'd like to find a satisfactory way to include a text file within a Markdown file. By satisfactory I mean that (just like for images, and for HTML files)

  • the inclusion is just a one liner in the .md file
  • and there is at least one editor (VS Code?) / renderer (GitHub) that can show the result of the inclusion when looking at the .md file.

Do you think this will happen?

@psychemedia
Copy link
Contributor

Re: {testcode} directives, I note a complementary approach taking in testbook where a test script loads in appropriately labeled cells from a specified notebook and then executes and tests against those.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants