Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor CurveAnalysis tutorial description of ScatterTable #1398

Merged
merged 3 commits into from
Feb 10, 2024

Conversation

wshanks
Copy link
Collaborator

@wshanks wshanks commented Feb 8, 2024

This change refactors the description of ScatterTable in the CurveAnalysis tutorial to try to provide a more intuitive picture of what series and category are. It tries to the split the discussion into two parts: first a high level description of what the terms are intended to mean and then a walkthrough of the default CurveAnalysis flow pointing out how the terms are used in practice.

The short version is that series and category could be used as arbitrary labels, but in practice category is used to distinguish stages of data processing ("raw", "formatted", "fitted") while series is used to label data associated with a specific function (a model but not necessarily one written out as a fit model for the class). Together series and category (and analysis if considering a CompositeCurveAnalysis subclass) should distinguish a specific y vs x curve.

This change refactors the description of `ScatterTable` in the
`CurveAnalysis` tutorial to try to provide a more intuitive picture of
what series and category are. It tries to the split the discussion into
two parts: first a high level description of what the terms are intended
to mean and then a walkthrough of the default `CurveAnalysis` flow
pointing out how the terms are used in practice.

The short version is that `series` and `category` could be used as arbitrary
labels, but in practice `category` is used to distinguish stages of data
processing ("raw", "formatted", "fitted") while `series` is used to
label data associated with a specific function (a model but not
necessarily one written out as a fit model for the class). Together
`series` and `category` (and `analysis` if considering a
`CompositeCurveAnalysis` subclass) should distinguish a specific y vs x
curve.
@wshanks
Copy link
Collaborator Author

wshanks commented Feb 8, 2024

I marked this as a draft because I haven't tried getting the sphinx syntax correct yet.

@wshanks wshanks added the backport stable potential The issue or PR might be minimal and/or import enough to backport to stable label Feb 8, 2024
@coruscating coruscating added this to the Release 0.7 milestone Feb 8, 2024
Copy link
Collaborator

@nkanazawa1989 nkanazawa1989 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Will. I really like new documentation. It's very easy to understand thanks to good amount of examples. I also checked html artifact and docs look okey. I think we can merge after you undraft the PR.

for which the default behavior is to average all of the y values within a
series with the same x value.
The formatted data are added to the :class:`.ScatterTable` with the same
`series` labels and the category of ``"formatted"``.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`series` labels and the category of ``"formatted"``.
`series` labels and the category of ``"formatted"``.

Only this formatted is formatted as a code. I'm fine with this is intentional :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I changed this to italics. I am unsure about what formatting is most consistent here. There are table column names, string table values, and generic terms like "series" as a set of data rather than the column for labeling that set of data. I tried to use italics for the column names whether referring to the column names specifically (or to the concept in the context of the column name but not for the concept on its own -- I didn't italicize all use of "series") and similarly use italicized and quoted text when referring to table values like "fitted". I could also see using code as the way to interact with this table would be to write out these names and values as strings in the code.

@wshanks wshanks marked this pull request as ready for review February 10, 2024 04:26
@wshanks
Copy link
Collaborator Author

wshanks commented Feb 10, 2024

Thanks @nkanazawa1989!

By the way, one other thing I noticed while working on this was that some terminology might not be ideal. For instance, the "raw" data is what comes out of the data processor, so from some perspective it is more "processed" than "raw".

Also, I wonder if CurveAnalysis should be refactored a little to be more readable. Maybe the documentation should be enough but someone wanting to subclass it and replace parts might look at the code to under exactly how it works. One small example is that table = self._format_data(self._run_data_processing(experiment_data.data())) might become two lines so it is more clear visually that there are two steps that the curve analysis writer could override.

@wshanks wshanks enabled auto-merge February 10, 2024 04:34
@wshanks wshanks added this pull request to the merge queue Feb 10, 2024
Merged via the queue into qiskit-community:main with commit 3d211bf Feb 10, 2024
11 checks passed
mergify bot pushed a commit that referenced this pull request Feb 10, 2024
This change refactors the description of `ScatterTable` in the
`CurveAnalysis` tutorial to try to provide a more intuitive picture of
what series and category are. It tries to the split the discussion into
two parts: first a high level description of what the terms are intended
to mean and then a walkthrough of the default `CurveAnalysis` flow
pointing out how the terms are used in practice.

The short version is that `series` and `category` could be used as
arbitrary labels, but in practice `category` is used to distinguish
stages of data processing ("raw", "formatted", "fitted") while `series`
is used to label data associated with a specific function (a model but
not necessarily one written out as a fit model for the class). Together
`series` and `category` (and `analysis` if considering a
`CompositeCurveAnalysis` subclass) should distinguish a specific y vs x
curve.

(cherry picked from commit 3d211bf)
coruscating pushed a commit that referenced this pull request Feb 11, 2024
…1398) (#1405)

This is an automatic backport of pull request #1398 done by
[Mergify](https://mergify.com).


---


<details>
<summary>Mergify commands and options</summary>

<br />

More conditions and actions can be found in the
[documentation](https://docs.mergify.com/).

You can also trigger Mergify actions by commenting on this pull request:

- `@Mergifyio refresh` will re-evaluate the rules
- `@Mergifyio rebase` will rebase this PR on its base branch
- `@Mergifyio update` will merge the base branch into this PR
- `@Mergifyio backport <destination>` will backport this PR on
`<destination>` branch

Additionally, on Mergify [dashboard](https://dashboard.mergify.com) you
can:

- look at your merge queues
- generate the Mergify configuration with the config editor.

Finally, you can contact us on https://mergify.com
</details>

Co-authored-by: Will Shanks <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport stable potential The issue or PR might be minimal and/or import enough to backport to stable
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants