Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CurveAnalysis refactoring #737

Closed
6 of 7 tasks
nkanazawa1989 opened this issue Mar 16, 2022 · 1 comment · Fixed by #824
Closed
6 of 7 tasks

CurveAnalysis refactoring #737

nkanazawa1989 opened this issue Mar 16, 2022 · 1 comment · Fixed by #824
Assignees
Labels
Changelog: API Change Include in the "Changed" section of the changelog Changelog: New Feature Include in the "Added" section of the changelog enhancement New feature or request
Milestone

Comments

@nkanazawa1989
Copy link
Collaborator

nkanazawa1989 commented Mar 16, 2022

Background

The CurveAnalysis can perform multi-objective optimization with different fit functions. However, if we want to perform two independent fittings, we need to implement the experiment as batch experiment. This is quite heavy coding/maintenance overhead and doesn't scale. For example, we need to update CR Ham tomo analysis from the fitting of F(px0, py0, pz0, px1, py1, pz1, b) to a set of F0(px, py, pz, b) @ ctrl=0 and F1(px, py, pz, b) @ ctrl=1 for speedup (see #668). This is naive implementation of this analysis

Current structure

  • CrossResonanceHamiltonian (experiment)
  • CrossResonanceHamiltonianAnalysis (analysis)

Naive implementation

  • TomographyElement(BaseExperiment) (experiment for each control state)
  • CrossResonanceHamiltonian(BatchExperiment) (experiment for batch, instantiate sub experiments)
  • TomographyElementAnalysis(CurveAnalysis) (analysis for each control state)
  • CrossResonanceHamiltonianAnalysis(CompositeAnalysis) (analysis for batch; compute Hamiltonian coefficient)

Here CrossResonanceHamiltonian should implement weird logic to synchronize all experiments, analysis, transpile, run options of itself to sub experiments so that it behaves as if a single experiment (since user doesn't expect one is running it as a batch experiment). As you can see, this is pretty much overhead for just separating fit functions.

New implementation

Here we are going to introduce new concept group to the curve analysis.

  • group: A collection of series.
  • series: A collection of curve.
  • curve: A definition of fit function (fitter), data filtering keywords (data processing), and drawing options (drawer)
    To avoid complication, the group is implemented on the separate class.

New CR Hamiltonian analysis may look like

class CrossResonanceHamiltonianAnalysis(GroupCurveAnalysis):
    __group__ = {
        "ctrl0": [
          curve.SeriesDef(
              name="x|c=0",
              fit_func=lambda x, t_off, px0, py0, pz0, b: curve.fit_function.bloch_oscillation_x(
                  x + t_off, px=px0, py=py0, pz=pz0, baseline=b
              ),
              filter_kwargs={"control_state": 0, "meas_basis": "x"},
          ),
          curve.SeriesDef(
              name="y|c=0",
              fit_func=lambda x, t_off, px0, py0, pz0, b: curve.fit_function.bloch_oscillation_y(
                  x + t_off, px=px0, py=py0, pz=pz0, baseline=b
              ),
              filter_kwargs={"control_state": 0, "meas_basis": "y"},
          ),
          curve.SeriesDef(
              name="z|c=0",
              fit_func=lambda x, t_off, px0, py0, pz0, b: curve.fit_function.bloch_oscillation_z(
                  x + t_off, px=px0, py=py0, pz=pz0, baseline=b
              ),
              filter_kwargs={"control_state": 0, "meas_basis": "z"},
          ),  
        ],
        "ctrl1": [
          curve.SeriesDef(
              name="x|c=1",
              fit_func=lambda x, t_off, px1, py1, pz1, b: curve.fit_function.bloch_oscillation_x(
                  x + t_off, px=px1, py=py1, pz=pz1, baseline=b
              ),
              filter_kwargs={"control_state": 1, "meas_basis": "x"},
          ),
          curve.SeriesDef(
              name="y|c=1",
              fit_func=lambda x, t_off, px1, py1, pz1, b: curve.fit_function.bloch_oscillation_y(
                  x + t_off, px=px1, py=py1, pz=pz1, baseline=b
              ),
              filter_kwargs={"control_state": 1, "meas_basis": "y"},
          ),
          curve.SeriesDef(
              name="z|c=1",
              fit_func=lambda x, t_off, px1, py1, pz1, b: curve.fit_function.bloch_oscillation_z(
                  x + t_off, px=px1, py=py1, pz=pz1, baseline=b
              ),
              filter_kwargs={"control_state": 1, "meas_basis": "z"},
          ),      
        ]
    }

Since group curve analysis is an outer-loop of CurveAnalysis, i.e. it repeats curve fitting for each group, the inheritance of curve analysis GroupCurveAnalysis(CurveAnalysis) doesn't make sense (we need to implement _run_analysis in a loop). In addition, there is attribute name inconsistency, for example, self._fit_model (curve analysis) -> self._fit_models (group curve analysis), it would be better to introduce base curve analysis that doesn't implement _run_analysis, where the reusable functionalities are implemented as hook-like methods.

class BaseCurveAnalysis(BaseAnalysis):
    @classmethod
    def _default_options(cls, ) -> Options:
    @classmethod
    def _curve_fit(cls, ...) -> FitData:
    def _generate_fit_guesses(self, user_opt: FitOptions, curve_data: CurveData) -> List[FitOptions]:
    def _format_data(self, xdata, ydata, y_err_data, data_allocation, weights) -> CurveData:
    def _evaluate_quality(self, fit_data: FitData) -> str:
    def _run_curve_analysis(self, curve_data: CurveData, fit_model: FitModel) -> FitData:
    def _create_analysis_results(self, fit_result: FitData) -> List[AnalysisResultData]:
    def _create_raw_data(self, fit_data: CurveData) -> AnalysisResultData:

Then each curve analysis subclass can be implemented as

class CurveAnalysis(BaseCurveAnalysis):
    
    __series__: List[SeriesDef]
    self._fit_model: FitModel
    
    @property
    def fit_model(self) -> FitModel:
    @property
    def parameters(self) -> List[str]:
    def _extra_database_entry(self, fit_data: FitData) -> List[AnalysisResultData]: 
    def _run_analysis(self, experiment_data: ExperimentData) -> Tuple[List[AnalysisResultData], List[Figure]]:
        #### pseudo code ####
        # do data extraction
        # do _run_curve_analysis
        # do _create_analysis_results

class GroupCurveAnalysis(BaseCurveAnalysis):
    
    __group__: Dict[str, List[SeriesDef]]
    self._fit_models: List[FitModel]

    def component_fit_model(self, index: Optional[Union[str, int]]) -> FitModel:
    def component_parameters(self, index: Optional[Union[str, int]]) -> List[str]:
    @property
    def parameters(self) -> List[str]:
    def _extra_database_entry(self, fit_data: List[FitData]) -> List[AnalysisResultData]: 
    def _run_analysis(self, experiment_data: ExperimentData) -> Tuple[List[AnalysisResultData], List[Figure]]:
        #### pseudo code ####
        # can be called with thread pool executor for speedup, if the performance matters
        for group in self.__group__: 
            # do data extraction
            # do _run_curve_analysis  
            # do _create_analysis_results

Note: Separation of fit function will drastically reduce unittest execution time for CR Ham tomo from ~100s to ~10s

Currently there are some blockers to implement this framework.

  • Too tight coupling to class attribute and external fit function
    This will be solved with Add fit model to CurveAnalysis #726 where a self-contained and portable fit model represetation is introduced.
  • Internal state for processed data
    Currently curve analysis saves raw data and formatted data as internal state, and calls these data set from the method self._data. This is problematic because this internal state is not aware of group, so it will be overridden in the loop. It is possible to make it group-aware, but this will introduce unnecessary coupling to group in the standard curve analysis class. Fortunately, this internal state is only used by RB experiments, thus we can drop this internal state with refactoring of RB.
  • Drawer
    Another reason of having processed data is to draw raw data together with the formatted data. In the figures of this tutorial, transparent gray crosses are the raw values and bold symbols are formatted (weight averaged) values. To drop the processed data from the internal state, we need to update the drawer so that it can draw raw data points immediately.
  • Data processor (optional, not necessary for group curve analysis)
    Data processor can be moved to BaseAnalysis. This will solve many issues regarding the run option necessary to instantiate the processor. Basically the processor will be a base analysis option (DataProcessor is already serializable), and a data processing subroutine will be inserted just before _run_analysis is called. Experiment instance can instantiate a custom processor within the _finalize call and attach this to analysis. This can eventually address Add an IQ plot for meas_level=1 #614 since the base analysis can draw IQ plot when the output of data processor is IQ format (then it appends IQ plot to figures). However, this indicates all experiments in the codebase should use DataProcessor, and this is indeed overkill to just extract count dictionary (e.g. QV experiment) or compute probability without standard error (e.g. tomography, where the fitter doesn't take care of stdev). Performance optimization should be done before it is globally used within the experiments, perhaps rust?

Steps

The step of refactoring will be

@nkanazawa1989 nkanazawa1989 added the enhancement New feature or request label Mar 16, 2022
@nkanazawa1989 nkanazawa1989 self-assigned this Mar 16, 2022
@nkanazawa1989 nkanazawa1989 added this to the Release 0.3 milestone Mar 16, 2022
@nkanazawa1989 nkanazawa1989 added Changelog: API Change Include in the "Changed" section of the changelog Changelog: New Feature Include in the "Added" section of the changelog labels Mar 16, 2022
@nkanazawa1989
Copy link
Collaborator Author

We can add _finalize_fit method that is called after curve fit. #732 for use case.

This was referenced Apr 4, 2022
@chriseclectic chriseclectic removed this from the Release 0.3 milestone Apr 20, 2022
@coruscating coruscating added this to the Release 0.4 milestone Apr 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: API Change Include in the "Changed" section of the changelog Changelog: New Feature Include in the "Added" section of the changelog enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants