CurveAnalysis refactoring #737

nkanazawa1989 · 2022-03-16T16:45:20Z

Background

The CurveAnalysis can perform multi-objective optimization with different fit functions. However, if we want to perform two independent fittings, we need to implement the experiment as batch experiment. This is quite heavy coding/maintenance overhead and doesn't scale. For example, we need to update CR Ham tomo analysis from the fitting of F(px0, py0, pz0, px1, py1, pz1, b) to a set of F0(px, py, pz, b) @ ctrl=0 and F1(px, py, pz, b) @ ctrl=1 for speedup (see #668). This is naive implementation of this analysis

Current structure

CrossResonanceHamiltonian (experiment)
CrossResonanceHamiltonianAnalysis (analysis)

Naive implementation

TomographyElement(BaseExperiment) (experiment for each control state)
CrossResonanceHamiltonian(BatchExperiment) (experiment for batch, instantiate sub experiments)
TomographyElementAnalysis(CurveAnalysis) (analysis for each control state)
CrossResonanceHamiltonianAnalysis(CompositeAnalysis) (analysis for batch; compute Hamiltonian coefficient)

Here CrossResonanceHamiltonian should implement weird logic to synchronize all experiments, analysis, transpile, run options of itself to sub experiments so that it behaves as if a single experiment (since user doesn't expect one is running it as a batch experiment). As you can see, this is pretty much overhead for just separating fit functions.

New implementation

Here we are going to introduce new concept group to the curve analysis.

group: A collection of series.
series: A collection of curve.
curve: A definition of fit function (fitter), data filtering keywords (data processing), and drawing options (drawer)
To avoid complication, the group is implemented on the separate class.

New CR Hamiltonian analysis may look like

class CrossResonanceHamiltonianAnalysis(GroupCurveAnalysis):
    __group__ = {
        "ctrl0": [
          curve.SeriesDef(
              name="x|c=0",
              fit_func=lambda x, t_off, px0, py0, pz0, b: curve.fit_function.bloch_oscillation_x(
                  x + t_off, px=px0, py=py0, pz=pz0, baseline=b
              ),
              filter_kwargs={"control_state": 0, "meas_basis": "x"},
          ),
          curve.SeriesDef(
              name="y|c=0",
              fit_func=lambda x, t_off, px0, py0, pz0, b: curve.fit_function.bloch_oscillation_y(
                  x + t_off, px=px0, py=py0, pz=pz0, baseline=b
              ),
              filter_kwargs={"control_state": 0, "meas_basis": "y"},
          ),
          curve.SeriesDef(
              name="z|c=0",
              fit_func=lambda x, t_off, px0, py0, pz0, b: curve.fit_function.bloch_oscillation_z(
                  x + t_off, px=px0, py=py0, pz=pz0, baseline=b
              ),
              filter_kwargs={"control_state": 0, "meas_basis": "z"},
          ),  
        ],
        "ctrl1": [
          curve.SeriesDef(
              name="x|c=1",
              fit_func=lambda x, t_off, px1, py1, pz1, b: curve.fit_function.bloch_oscillation_x(
                  x + t_off, px=px1, py=py1, pz=pz1, baseline=b
              ),
              filter_kwargs={"control_state": 1, "meas_basis": "x"},
          ),
          curve.SeriesDef(
              name="y|c=1",
              fit_func=lambda x, t_off, px1, py1, pz1, b: curve.fit_function.bloch_oscillation_y(
                  x + t_off, px=px1, py=py1, pz=pz1, baseline=b
              ),
              filter_kwargs={"control_state": 1, "meas_basis": "y"},
          ),
          curve.SeriesDef(
              name="z|c=1",
              fit_func=lambda x, t_off, px1, py1, pz1, b: curve.fit_function.bloch_oscillation_z(
                  x + t_off, px=px1, py=py1, pz=pz1, baseline=b
              ),
              filter_kwargs={"control_state": 1, "meas_basis": "z"},
          ),      
        ]
    }

Since group curve analysis is an outer-loop of CurveAnalysis, i.e. it repeats curve fitting for each group, the inheritance of curve analysis GroupCurveAnalysis(CurveAnalysis) doesn't make sense (we need to implement _run_analysis in a loop). In addition, there is attribute name inconsistency, for example, self._fit_model (curve analysis) -> self._fit_models (group curve analysis), it would be better to introduce base curve analysis that doesn't implement _run_analysis, where the reusable functionalities are implemented as hook-like methods.

class BaseCurveAnalysis(BaseAnalysis):
    @classmethod
    def _default_options(cls, ) -> Options:
    @classmethod
    def _curve_fit(cls, ...) -> FitData:
    def _generate_fit_guesses(self, user_opt: FitOptions, curve_data: CurveData) -> List[FitOptions]:
    def _format_data(self, xdata, ydata, y_err_data, data_allocation, weights) -> CurveData:
    def _evaluate_quality(self, fit_data: FitData) -> str:
    def _run_curve_analysis(self, curve_data: CurveData, fit_model: FitModel) -> FitData:
    def _create_analysis_results(self, fit_result: FitData) -> List[AnalysisResultData]:
    def _create_raw_data(self, fit_data: CurveData) -> AnalysisResultData:

Then each curve analysis subclass can be implemented as

class CurveAnalysis(BaseCurveAnalysis):
    
    __series__: List[SeriesDef]
    self._fit_model: FitModel
    
    @property
    def fit_model(self) -> FitModel:
    @property
    def parameters(self) -> List[str]:
    def _extra_database_entry(self, fit_data: FitData) -> List[AnalysisResultData]: 
    def _run_analysis(self, experiment_data: ExperimentData) -> Tuple[List[AnalysisResultData], List[Figure]]:
        #### pseudo code ####
        # do data extraction
        # do _run_curve_analysis
        # do _create_analysis_results

class GroupCurveAnalysis(BaseCurveAnalysis):
    
    __group__: Dict[str, List[SeriesDef]]
    self._fit_models: List[FitModel]

    def component_fit_model(self, index: Optional[Union[str, int]]) -> FitModel:
    def component_parameters(self, index: Optional[Union[str, int]]) -> List[str]:
    @property
    def parameters(self) -> List[str]:
    def _extra_database_entry(self, fit_data: List[FitData]) -> List[AnalysisResultData]: 
    def _run_analysis(self, experiment_data: ExperimentData) -> Tuple[List[AnalysisResultData], List[Figure]]:
        #### pseudo code ####
        # can be called with thread pool executor for speedup, if the performance matters
        for group in self.__group__: 
            # do data extraction
            # do _run_curve_analysis  
            # do _create_analysis_results

Note: Separation of fit function will drastically reduce unittest execution time for CR Ham tomo from ~100s to ~10s

Currently there are some blockers to implement this framework.

Too tight coupling to class attribute and external fit function
This will be solved with Add fit model to CurveAnalysis #726 where a self-contained and portable fit model represetation is introduced.
Internal state for processed data
Currently curve analysis saves raw data and formatted data as internal state, and calls these data set from the method self._data. This is problematic because this internal state is not aware of group, so it will be overridden in the loop. It is possible to make it group-aware, but this will introduce unnecessary coupling to group in the standard curve analysis class. Fortunately, this internal state is only used by RB experiments, thus we can drop this internal state with refactoring of RB.
Drawer
Another reason of having processed data is to draw raw data together with the formatted data. In the figures of this tutorial, transparent gray crosses are the raw values and bold symbols are formatted (weight averaged) values. To drop the processed data from the internal state, we need to update the drawer so that it can draw raw data points immediately.
Data processor (optional, not necessary for group curve analysis)
Data processor can be moved to BaseAnalysis. This will solve many issues regarding the run option necessary to instantiate the processor. Basically the processor will be a base analysis option (DataProcessor is already serializable), and a data processing subroutine will be inserted just before _run_analysis is called. Experiment instance can instantiate a custom processor within the _finalize call and attach this to analysis. This can eventually address Add an IQ plot for meas_level=1 #614 since the base analysis can draw IQ plot when the output of data processor is IQ format (then it appends IQ plot to figures). However, this indicates all experiments in the codebase should use DataProcessor, and this is indeed overkill to just extract count dictionary (e.g. QV experiment) or compute probability without standard error (e.g. tomography, where the fitter doesn't take care of stdev). Performance optimization should be done before it is globally used within the experiments, perhaps rust?

Steps

The step of refactoring will be

Add fit model Add fit model to CurveAnalysis #726
Drawer update (I think mixin is good approach to separately implement heavy drawer functionality while keeping good connection) Curve analysis drawer refactoring #738
RB refactoring Cleanup RB module #762
Introduce base curve analysis and switch baseclass of curve analsis CurveAnalysis base class #765
Introduce group curve analysis
Update CR Ham tomo analysis with group
Update HEAT HEAT experiments #625

The text was updated successfully, but these errors were encountered:

nkanazawa1989 · 2022-03-17T19:42:16Z

We can add _finalize_fit method that is called after curve fit. #732 for use case.

nkanazawa1989 added the enhancement New feature or request label Mar 16, 2022

nkanazawa1989 self-assigned this Mar 16, 2022

nkanazawa1989 added this to the Release 0.3 milestone Mar 16, 2022

nkanazawa1989 added Changelog: API Change Include in the "Changed" section of the changelog Changelog: New Feature Include in the "Added" section of the changelog labels Mar 16, 2022

This was referenced Apr 4, 2022

Cleanup RB module #762

Merged

CurveAnalysis base class #765

Merged

chriseclectic removed this from the Release 0.3 milestone Apr 20, 2022

coruscating added this to the Release 0.4 milestone Apr 27, 2022

nkanazawa1989 mentioned this issue Jun 9, 2022

Add multi-group curve analysis #824

Merged

nkanazawa1989 closed this as completed in #824 Jul 11, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CurveAnalysis refactoring #737

CurveAnalysis refactoring #737

nkanazawa1989 commented Mar 16, 2022 •

edited

Loading

nkanazawa1989 commented Mar 17, 2022

CurveAnalysis refactoring #737

CurveAnalysis refactoring #737

Comments

nkanazawa1989 commented Mar 16, 2022 • edited Loading

Background

New implementation

Steps

nkanazawa1989 commented Mar 17, 2022

nkanazawa1989 commented Mar 16, 2022 •

edited

Loading