Add fit model to CurveAnalysis #726

nkanazawa1989
Collaborator

Summary

This PR is the first step toward introducing group in #715. Its purpose is to add a fit model class to CurveAnalysis and to remove curve_fitter from the analysis options.

Details and comments

According to the review comments from @yaelbh and @eggerdj, the complexity of CurveAnalysis is likely to be an obstacle to extending the class further. This PR introduces a new class FitModel and its subclasses SingleFitFunction and CompositeFitFunction, which users can also inspect to understand the model without reading the code. For example:

>>> from qiskit_experiments.curve_analysis.standard_analysis import ErrorAmplificationAnalysis
>>> ErrorAmplificationAnalysis().fit_model

SingleFitFunction(x, amp, d_theta, phase_offset, base, angle_per_gate)

>>> from qiskit_experiments.library.characterization.analysis import FineAmplitudeAnalysis
>>> FineAmplitudeAnalysis().fit_model

CompositeFitFunction(x, amp, d_theta, base; @ Fixed angle_per_gate, phase_offset)

Now you can see that FineAmplitudeAnalysis performs SPAM correction with an extra experiment, and that some parameters are fixed during the fit.

From a software point of view, the complexity of multi-objective optimization is entirely offloaded to the fit model, and the rest of the code doesn't need to care how the fit model is implemented. If the analysis has only a single curve, it implicitly uses SingleFitFunction, which implements lightweight logic to compute the y data.

@nkanazawa1989 nkanazawa1989 requested review from yaelbh and eggerdj March 8, 2022 23:02
@yaelbh
Collaborator

yaelbh commented Mar 9, 2022

@nkanazawa1989 I need more documentation in order to review this PR. Required are both inline comments inside methods, to describe the inner variables, their data structures and the manipulations that they're going through; and more detailed doc strings, with more explanation and examples.

Specifically I started with the new file fit_model.py. It looks like, when fit_models is not None, then its length should be equal to the number of fit functions. This is what I understand, but I'm not sure if I understand correctly, because such a requirement is not documented nor checked. I also understand that fixed_parameters, although it is a list like all the other constructor arguments, is not restricted to any length, because (again, to my understanding) it's a list of parameter names that are shared among all the fit functions. If so, then it should be documented, otherwise one may tend to think it's again (like the previous arguments) a list where each element corresponds to a fit function.

Trying to figure out what's the role of the _uniton_params data member, I thought it'd be easier if I first review the sub-class SingleFitFunction. But then in the args documentation, I don't understand what you mean by "Composite X values array" and "Variadic argument provided from the fitter". Also, the types are missing (I only know that x is an array, but I don't know the type of elements of the array).

When strings are involved, as in signatures and fit models, it's important to specify the expected syntax of the strings. Namely, how to write a string that conforms to a certain signature.

@nkanazawa1989 nkanazawa1989 force-pushed the upgrade/cleanup-curve-analysis-add-fit-model branch from 79193a8 to 8be5f0c Compare March 9, 2022 10:00
@nkanazawa1989
Collaborator Author

I added more documentation. Basically, this PR consolidates the parameter-mapping code, currently distributed across multiple functions and files, into a single place as FitModel, so nothing changes drastically (i.e. this is just a cleanup). The FitModel is mainly called internally, so I didn't provide much description of the data model (which also remains unchanged). Anyway, I hope the new documentation added in 8be5f0c helps your review.

This is the logic that currently implements the curve analysis:
https://github.com/Qiskit/qiskit-experiments/blob/d27fe41cec995072fa1db02a14d8437028d81a4e/qiskit_experiments/curve_analysis/curve_fit.py#L27-L35

and parameter mapping is done in
https://github.com/Qiskit/qiskit-experiments/blob/d27fe41cec995072fa1db02a14d8437028d81a4e/qiskit_experiments/curve_analysis/curve_fit.py#L326-L332

I also removed Composite to avoid confusion. For the variadic arguments, you can refer to the API docs of scipy curve_fit, although there are no detailed docs there either.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html

Contributor

@eggerdj eggerdj left a comment

I think we should use this PR as an opportunity to refactor how fixed_parameters are defined. Overall, the class FitModel and its subclasses seem like a nice idea.

nkanazawa1989 and others added 4 commits March 9, 2022 23:14
upgrade:
- |
Analysis option `curve_fitter` of the :class:`CurveAnalysis` has be removed
because of the serialization problem. To use custom curve fitting library,
Collaborator

Which serialization problem?

Collaborator Author

You know, the callable is not serializable.
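
To make the point above concrete, here is a minimal sketch of what "not serializable" typically looks like for a callable. It uses Python's standard `json` module as a stand-in, which is an assumption: the thread does not say which serializer is affected. The function `my_fitter` and both option dictionaries are hypothetical.

```python
import json


def my_fitter(x, y):
    """Hypothetical custom fitter that a user might pass as an option."""
    return {"x": list(x), "y": list(y)}


# Hypothetical option dictionaries, for illustration only.
options_with_callable = {"curve_fitter": my_fitter, "p0": {"amp": 1.0}}
options_with_plain_data = {"p0": {"amp": 1.0}}

try:
    json.dumps(options_with_callable)
    callable_ok = True
except TypeError:
    # The json encoder has no representation for function objects.
    callable_ok = False

# Plain data survives a round trip unchanged.
plain_ok = json.loads(json.dumps(options_with_plain_data)) == options_with_plain_data

print(callable_ok, plain_ok)
```

Options that hold only plain data can be stored and reloaded, while an option holding a function object cannot be encoded without extra machinery.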

Contributor

Suggested change
- because of the serialization problem. To use custom curve fitting library,
+ because callable lambda functions are not serializable. To use a custom curve fitting library,

?

Collaborator Author

This is not quite true. Please check the update
cd19b81

:meth:`CurveAnalysis.curve_fit` has been added to the curve analysis and
its subclasses. Now you can directly access to the core fitting code
with bare numpy arrays representing data to be fit.
This may help debugging of new fit function.
Collaborator

Suggested change
- This may help debugging of new fit function.
+ This may help debugging of new fit functions.

Collaborator

Please add a reference to an example that demonstrates how to use the new feature for debugging fit functions.

Collaborator Author

There is no example, but users can directly input ndarrays without constructing an ExperimentData container. Since there is no user guide on how to construct experiment data, it is very tough to test fit functions in the curve analysis without reading the experiment data code, which is super long. I can remove

This may help debugging of new fit functions.

if I need to write more docs just for it.
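
As a rough illustration of debugging a fit function with bare arrays, the sketch below calls `scipy.optimize.curve_fit` directly on NumPy data. It does not use `CurveAnalysis.curve_fit`, whose signature is not shown in this thread; the model is the `spam_cal` function from the fine-amplitude example in this conversation, and the "true" values and initial guess are made up.

```python
import numpy as np
from scipy.optimize import curve_fit


def spam_cal(x, amp, base):
    """Linear SPAM calibration model taken from the fine-amplitude example."""
    return base + 0.5 * amp * (2 * x - 1)


# Synthetic, noise-free data generated from known parameters (assumed values).
true_params = (0.9, 0.48)
x = np.linspace(0.0, 1.0, 11)
y = spam_cal(x, *true_params)

# Fit with bare ndarrays; p0 is an arbitrary starting guess.
popt, pcov = curve_fit(spam_cal, x, y, p0=[1.0, 0.5])

print(popt)
```

If the fit recovers the parameters used to generate the data, the fit function itself is at least self-consistent before it is wired into an analysis class.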

nkanazawa1989 and others added 2 commits March 14, 2022 23:51
Co-authored-by: Yael Ben-Haim <[email protected]>
@nkanazawa1989 nkanazawa1989 force-pushed the upgrade/cleanup-curve-analysis-add-fit-model branch from 95855cb to 17560c3 Compare March 14, 2022 15:09
@nkanazawa1989 nkanazawa1989 force-pushed the upgrade/cleanup-curve-analysis-add-fit-model branch from 08f7cbc to a7f5375 Compare March 14, 2022 17:46
Contributor

@eggerdj eggerdj left a comment

I'm still a bit confused by this PR. See my questions, which try to clarify. It seems like there is a tight coupling between SeriesDef and FitModel. This is fine, but the code does not make it very explicit. I wonder if both classes are needed, or if SeriesDef should be called FitModelConfig and FitModel should be initialized from the config, i.e. FitModel.from_config(config: FitModelConfig)?


y = np.zeros(x.size, dtype=float)
for i, (func, sig) in enumerate(zip(self._fit_functions, self._signatures)):
    inds = self.data_allocation == i
Contributor

If you are so worried about performance, wouldn't a dict be better to have O(1) lookup? E.g.

inds = self.data_allocation[i]

This should replace O(n) for the == with O(1) for the dict look-up.
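
The suggestion above can be sketched as follows. Only the name `data_allocation` comes from the quoted snippet; the toy fit functions, the free functions standing in for methods, and `index_lookup` are hypothetical. The first version recomputes the boolean mask on every evaluation, and the second computes the index arrays once and looks them up by series index.

```python
import numpy as np

# data_allocation[k] is the index of the fit function that owns data point k.
data_allocation = np.array([0, 0, 1, 1, 1, 0, 1])
x = np.linspace(0.0, 1.0, data_allocation.size)

# Toy fit functions sharing a single parameter `a` (illustration only).
fit_functions = [
    lambda x, a: a * x,
    lambda x, a: a * x**2,
]


def eval_with_mask(x, a):
    """Recompute the boolean mask for each series on every call."""
    y = np.zeros(x.size, dtype=float)
    for i, func in enumerate(fit_functions):
        inds = data_allocation == i  # O(n) comparison per series per call
        y[inds] = func(x[inds], a)
    return y


# Computed once, e.g. when the data is loaded.
index_lookup = {
    i: np.flatnonzero(data_allocation == i) for i in range(len(fit_functions))
}


def eval_with_lookup(x, a):
    """Reuse precomputed index arrays; the dict access itself is O(1)."""
    y = np.zeros(x.size, dtype=float)
    for i, func in enumerate(fit_functions):
        inds = index_lookup[i]
        y[inds] = func(x[inds], a)
    return y


print(np.allclose(eval_with_mask(x, 2.0), eval_with_lookup(x, 2.0)))
```

Gathering `x[inds]` still costs time proportional to the size of each series, so the saving is limited to the repeated comparison; whether it matters in practice depends on how often the fitter evaluates the model.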


function = SingleFitFunction(
    fit_functions=[child_function],
    signatures=[["par0", "par1"]],
Contributor

What happens if I give signatures=[["dummy1", "dummy2"]], i.e. the names do not match? I thought that part of this whole exercise was so that we do not need to worry about the signature of the fit_functions. Looking at SeriesDef, I see that SeriesDef figures this out. Would it be more natural to do this inspection in FitModel instead of SeriesDef, or in something like FitModel.from_series_def?
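
One way the inspection discussed here could work is sketched below with the standard `inspect` module. This is not the PR's implementation: `parameter_names` and `check_signature` are hypothetical helpers, and `child_function` is a stand-in with the parameter names used in the test snippet above.

```python
import inspect


def child_function(x, par0, par1):
    """Stand-in fit function with the parameter names from the test snippet."""
    return par0 * x + par1


def parameter_names(func):
    """Return the argument names after the first one (the x values)."""
    return list(inspect.signature(func).parameters)[1:]


def check_signature(func, declared):
    """Raise if a user-declared signature disagrees with the function itself."""
    inferred = parameter_names(func)
    if list(declared) != inferred:
        raise ValueError(
            f"Declared signature {list(declared)} does not match {inferred}."
        )
    return inferred


inferred = check_signature(child_function, ["par0", "par1"])

try:
    check_signature(child_function, ["dummy1", "dummy2"])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

print(inferred, mismatch_detected)
```

With the names inferred from the function, a separately supplied signature becomes redundant, and a mismatch can be reported early instead of surfacing during the fit.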

Co-authored-by: Daniel J. Egger <[email protected]>
@nkanazawa1989 nkanazawa1989 force-pushed the upgrade/cleanup-curve-analysis-add-fit-model branch from 593c393 to ad70b0d Compare March 15, 2022 13:53
@nkanazawa1989 nkanazawa1989 force-pushed the upgrade/cleanup-curve-analysis-add-fit-model branch from 8c71a8a to 50d984a Compare March 15, 2022 14:32
@nkanazawa1989
Collaborator Author

nkanazawa1989 commented Mar 15, 2022

Thanks @eggerdj. Indeed, I'm thinking of FitModel.from_config(config: List[SeriesDef]) in the follow-up. Originally this PR was meant only to add the fit model with minimum functionality. Note that SeriesDef and FitModel are different objects. The series def contains the filtering keywords for data extraction and the default drawing properties; thus, series def = fit model + data processor + drawer. I intentionally excluded filter kwargs and draw options from the fit model to avoid creating a super complicated object that can do everything.

(EDIT)
Adding a factory method can be done within this PR. This is good for simplification and probably for showing explicitly the coupling between series defs and the fit model. See 3c9040f

Contributor

@eggerdj eggerdj left a comment

I am pretty much fine with this PR; my comments are only minor. @yaelbh can you take a second look?

@nkanazawa1989 nkanazawa1989 mentioned this pull request Mar 16, 2022
7 tasks
Co-authored-by: Daniel J. Egger <[email protected]>
@nkanazawa1989 nkanazawa1989 force-pushed the upgrade/cleanup-curve-analysis-add-fit-model branch from ddedefa to 08c65fc Compare March 16, 2022 17:05
nkanazawa1989 added a commit to nkanazawa1989/qiskit-experiments that referenced this pull request Mar 17, 2022
@zlatko-minev

The lmfit package (Non-Linear Least-Squares Minimization and Curve-Fitting for Python) is pretty powerful, with a lot of out-of-the-box features, and I've used it with good success in the past for fitting qubit experiments on 1D and 2D real and complex data.

It has the Parameter and Parameters classes, which can also be populated by a guess function that the user can override for the initial guess. It tracks convergence etc. pretty nicely and ultimately makes model fitting and composition of models possible with a decently nice user interface.

There are a lot of examples here: https://lmfit.github.io/lmfit-py/examples/index.html

@nkanazawa1989
Collaborator Author

Thanks @zlatko-minev. I investigated lmfit in detail. It looks like a great tool, and we could offload most of the functionality to its fitter. However, there is a blocker in multi-objective optimization (called "multiple data sets" in the lmfit example).

This is quite important in our experiment library (e.g. Ramsey XY, fine amplitude, HEAT, CR Hamiltonian tomography, IRB, etc.), but the library provides no model interface for it.

TODO: this could/should be using the Model interface / built-in models!

For example, in the fine amp example,

from lmfit import Model, CompositeModel
import numpy as np
import matplotlib.pyplot as plt

def spam_cal(x, amp, base):
    return base + 0.5 * amp * (2 * x - 1)

def ping_pong(x, amp, d_theta, phase_offset, base, angle_per_gate):
    return 0.5 * amp * np.cos((d_theta + angle_per_gate) * x - phase_offset) + base

# x values for each model
x1 = np.array([0, 1])
x2 = np.arange(0, 15)

# some parameters
amp = 0.9
d_theta = 0.01
phase_offset = np.pi/2
base = 0.48
angle_per_gate = np.pi

# simulated y data
data1 = spam_cal(x1, amp, base)
data2 = ping_pong(x2, amp, d_theta, phase_offset, base, angle_per_gate)

# standalone model
# How to combine them? Currently lmfit doesn't provide interface to combine.
# Note that this is not CompositeModel where it takes the same x-value and different parameters.
m1 = Model(spam_cal)
m2 = Model(ping_pong)

# so we need to dynamically generate function like below
def composite_func(x, amp, base, d_theta, phase_offset, angle_per_gate, separator):
    # giving these as kwargs is more efficient but lmfit Model parses signature for the model
    # **kwargs doesn't provide any explicit parameter name
    params = {
        "amp": amp,
        "base": base, 
        "d_theta": d_theta, 
        "phase_offset": phase_offset, 
        "angle_per_gate": angle_per_gate,
    } 
    y = []
    for mi, xi in zip((m1, m2), np.split(x, separator)):
        y.append(mi.eval(x=xi, **{p: params[p] for p in mi.param_names}))
    return np.concatenate(y, 0)

# this is what we need
composite_model = Model(composite_func)

Considering this, I concluded that we cannot adopt the lmfit library until it supports multi-objective models. However, I still think it is a great library, because it can provide more statistical information about the fit and it can hide the uncertainties package (unumpy functions) from the fit model. I'll keep watching this library.

@zlatko-minev

That's interesting. I definitely used it for Ramsey XY fitting and multiple data sets before. You just concatenate the data from the different sets into one large one-dimensional array; in my experience that usually works pretty well.

@nkanazawa1989
Collaborator Author

Yes, their fitter (minimize) itself supports multi-objective fitting with a concatenated array, so we would need to use the fitter directly. However, this gives no gain over what we have now; it would just replace the fitter scipy.optimize.curve_fit with lmfit.minimize. The Model interface they implemented (and that I'm going to implement with this PR) is a nice object for cleaning up the data structure, so perhaps it's worth implementing a MultiModel class ourselves.

Another approach would be to merge this PR as-is, write LmfitCurveAnalysis, and gradually deprecate CurveAnalysis to migrate.

@nkanazawa1989
Collaborator Author

nkanazawa1989 commented Mar 29, 2022

Perhaps the concatenated array that you mention is a technique for using the Model object, i.e. combining two fit functions into a single function with a concatenated array. The tricky part is that we allow sub-models to have different signatures, for example,

F(x, a, b, c) = F1(x1, a, b) \oplus F2(x2, b, c) where x = x1 \oplus x2 (concatenated)

In this case we need to dynamically generate F from the signatures of F1 and F2 without using Python's eval. This is the reason I gave up on lmfit.
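
A minimal sketch of the construction F = F1 ⊕ F2 described above, built from the sub-model signatures with a closure rather than `eval`. The functions `f1`, `f2` and the helper `make_composite` are hypothetical and are not the implementation in this PR. The composite takes the union of the parameter names as positional values and splits the concatenated x array at fixed indices.

```python
import inspect

import numpy as np


def f1(x, a, b):
    return a * x + b


def f2(x, b, c):
    return b * np.exp(-c * x)


def make_composite(funcs, split_indices):
    """Build F(x, *params) over the union of the sub-model parameters."""
    signatures = [list(inspect.signature(f).parameters)[1:] for f in funcs]
    # Union of parameter names, keeping first-seen order: a, b, c.
    union = list(dict.fromkeys(name for sig in signatures for name in sig))

    def composite(x, *params):
        if len(params) != len(union):
            raise TypeError(f"Expected {len(union)} parameters: {union}")
        values = dict(zip(union, params))
        parts = np.split(np.asarray(x, dtype=float), split_indices)
        return np.concatenate(
            [
                func(xi, *(values[name] for name in sig))
                for func, sig, xi in zip(funcs, signatures, parts)
            ]
        )

    return composite, union


x1 = np.array([0.0, 1.0, 2.0])
x2 = np.array([0.0, 0.5])
x = np.concatenate([x1, x2])

composite, names = make_composite([f1, f2], split_indices=[x1.size])
y = composite(x, 1.0, 2.0, 3.0)  # a=1, b=2 (shared), c=3

print(names, y)
```

Because `composite` accepts `*params`, its own signature carries no explicit parameter names. A fitter that infers parameters from the signature therefore needs them supplied separately, for example through an explicit `p0` in `scipy.optimize.curve_fit`; this is the same limitation noted for `**kwargs` in the lmfit example earlier in the thread.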

@nkanazawa1989
Collaborator Author

Superseded by #806, which uses a different implementation based on LMFIT.

@nkanazawa1989 nkanazawa1989 deleted the upgrade/cleanup-curve-analysis-add-fit-model branch October 27, 2022 06:57
4 participants