FriedrichFoerster edited this page May 27, 2020 · 2 revisions

Autofocused classification in 3D

Overview

In addition to CPCA and MCO, there is a newer classification method: Autofocused Classification for 3D subtomograms (AC3D). AC3D is an unsupervised clustering algorithm whose basic idea is to focus the classification automatically on the most variable parts of the 3D structures. The resulting similarity score (the focused score) improves the discriminative ability of the classification.

Please refer to the paper for more details: Autofocused 3D Classification of Cryoelectron Subtomograms, Y. Chen et al., Structure 2014.

Script description

Here we assume you already have the subtomograms aligned, either by template matching or subtomogram alignment.

Subtomogram classification using auto_focus_classify.py

This script should be run in parallel:
mpirun -np "numberOfCPUs" pytom PathToPytom/classification/auto_focus_classify.py
The parameters are explained below:
  • -p: Aligned particle list.
  • -k: Number of expected classes.
  • -f: Maximum frequency included in the calculation (in bands).
  • -i: Number of iterations to run (by default 10).
  • -a: Run the classification WITHOUT alignment (if the particle list is already aligned).
  • -s: Potential translational offset of the particle from the center (radius in pixels; only specify this if the particle list is not aligned).
  • -m: Alignment mask. This mask is used only for alignment. Only specify it if the particle list is not aligned.
  • -c: Focused mask. This mask is used for constraining the calculation of the focused mask. (Optional)
  • -r: Use external references as the starting point (optional). By default, AC3D uses K-means++ to generate the starting class centers. If you wish to provide your own starting class centers, specify them here (filenames separated by commas).
  • -n: Noise percentage (between 0 and 1). If you estimate that your dataset contains a certain fraction of noise outliers, specify it here.
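As an illustration of how the options above combine, here is a minimal Python sketch that assembles the command line for an already aligned particle list. The helper name `build_ac3d_command`, the file name `aligned_particles.xml`, and the numeric values are hypothetical placeholders; only the flags and the script path are taken from this page.

```python
import shlex


def build_ac3d_command(script_path, particle_list, num_classes, max_frequency,
                       num_cpus, iterations=10, already_aligned=True):
    """Assemble the mpirun call for auto_focus_classify.py as an argument list.

    Hypothetical helper: it only strings together the flags documented on
    this page and does not validate them against the script itself.
    """
    cmd = [
        "mpirun", "-np", str(num_cpus),
        "pytom", script_path,
        "-p", particle_list,        # aligned particle list
        "-k", str(num_classes),     # number of expected classes
        "-f", str(max_frequency),   # maximum frequency (in bands)
        "-i", str(iterations),      # number of iterations
    ]
    if already_aligned:
        cmd.append("-a")            # classify WITHOUT alignment
    return cmd


cmd = build_ac3d_command(
    script_path="PathToPytom/classification/auto_focus_classify.py",
    particle_list="aligned_particles.xml",  # hypothetical file name
    num_classes=4,                           # hypothetical value
    max_frequency=15,                        # hypothetical value
    num_cpus=8,                              # hypothetical value
)
print(shlex.join(cmd))  # shell-quoted command, ready to paste into a terminal
```

Building the command as a list keeps each argument separate, so it can also be passed directly to `subprocess.run` without shell quoting issues.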
Two further parameters control the calculation of the difference mask:
  • -g: Particle density threshold for calculating the difference map (optional, by default 0). The two other most common choices are -2 and 2. With -2, all values of the subtomogram below -2 sigma are used for calculating the difference mask (negative values count). With 2, all values of the subtomogram above 2 sigma are used (positive values count). With 0, all values are used.
  • -t: STD threshold for the difference map (optional, by default 0.4). This value should be between 0 and 1. With 1, only the location of the peak value is set to 1 in the difference map (too much discriminative ability). With 0, all locations with a value above the average of the STD are set to 1 (not enough discriminative ability).
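To make the two thresholds more concrete, here is a small NumPy sketch of the behavior described for -g and -t. It is not the PyTom implementation. The function names are hypothetical, sigma is assumed to be measured relative to the volume mean, and the cutoff for -t is assumed to interpolate linearly between the mean (t = 0) and the peak (t = 1) of the STD map. Only the two end points of that interpolation are stated on this page.

```python
import numpy as np


def select_density(volume, g=0.0):
    """Boolean selection of voxels following the -g convention.

    Assumption: sigma is the standard deviation of the volume and is
    measured relative to the volume mean.
    """
    sigma = volume.std()
    centered = volume - volume.mean()
    if g > 0:
        return centered > g * sigma   # keep only strongly positive density
    if g < 0:
        return centered < g * sigma   # keep only strongly negative density
    return np.ones(volume.shape, dtype=bool)  # g == 0: use all values


def binarize_std_map(std_map, t=0.4):
    """Binary difference mask from an STD map following the -t convention.

    Assumption: the cutoff moves linearly from the mean of the map (t = 0)
    to its peak value (t = 1).
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t should be between 0 and 1")
    cutoff = (1.0 - t) * std_map.mean() + t * std_map.max()
    return (std_map >= cutoff).astype(np.uint8)


# Toy example on a random 8x8x8 volume (synthetic data, for illustration only).
rng = np.random.default_rng(0)
toy_volume = rng.normal(size=(8, 8, 8))
dark_voxels = select_density(toy_volume, g=-2)
loose_mask = binarize_std_map(np.abs(toy_volume), t=0.0)
strict_mask = binarize_std_map(np.abs(toy_volume), t=1.0)
```

In this sketch, a higher `t` always yields a mask that is a subset of the mask for a lower `t`, which matches the description of 1 as the most selective setting and 0 as the least selective one.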

If you have further questions, please email: [email protected].