Subtomogram Averaging (STA) in RELION5

We will now use the particle positions obtained from Template Matching to perform STA on our target of interest (here, cytosolic ribosomes).

In the following sections, the parameters you need to set in the RELION Running tab will depend on your system, e.g. whether you run on a local machine or through a computing cluster.

The list below shows whether each job type runs on GPU or CPU:

Make pseudo-subtomos (CPU)
3D initial model (single GPU)
3D classification (GPU normally, but CPU if not doing alignments)
3D auto-refine (GPU)
3D multi-body (GPU)
Tomo reconstruct particle (CPU)
Tomo CTF refinement (CPU)
Tomo frame alignment (CPU)
Local resolution (CPU)

Extract your particles and check your average
Refining and cleaning your particles
Fast alignment using 3D classification with a single class
3D classification without alignment
Creating a mask
3D Refinement
Post-processing
More classification or going to High-resolution?
CTF and alignment refinement cycle
Run CTF refinement
Run Bayesian polishing
What to do next?

Extract your particles and check your average

The first step to begin Subtomogram Averaging (STA) is to run an Extract Subtomos job. This job will extract cropped subtomograms from all tilts, around the centre of the particle positions we identified using Template Matching.

Typically, we start with a binning of 4 or sometimes even 8. We use high binning in the initial steps because the focus is on cleaning and roughly aligning the particles, so high-resolution information is not essential at this stage. In this example, we start at bin 4. You want the box size to be larger than your target, ideally at least ~1.5 times larger. Ribosomes are approximately 350 Å wide, which corresponds to about 46 pixels at bin 4 (350 Å / (1.91 Å × 4) ≈ 46 px for our unbinned pixel size of 1.91 Å).

In general, it’s best to use box sizes built from small factors such as 2 and 3 (not necessarily pure powers of them). These “magic numbers” optimise computational performance. You can find more about that here: https://blake.bcm.edu/emanwiki/EMAN2/BoxSize. In our case, 46 is not ideal, and we want something about 1.5 to 2 times larger anyway. A box size of 72 or 84 pixels would be appropriate; let’s use 84. Of course, the larger the box, the longer the computation time.
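The box-size reasoning above can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the "even size with prime factors no larger than 7" filter is an assumption used here as a rough stand-in for the benchmarked list on the EMAN2 page, which remains the reference.

```python
import math

def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (for n >= 2)."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    return max(largest, n) if n > 1 else largest

def candidate_box_sizes(particle_angstrom, pixel_size, binning,
                        min_factor=1.5, max_factor=2.0, max_prime=7):
    """Return (particle width in binned px, list of candidate box sizes in px)."""
    binned_pixel = pixel_size * binning               # Å per binned pixel
    particle_px = particle_angstrom / binned_pixel    # particle width in binned pixels
    low = math.ceil(particle_px * min_factor)
    high = math.floor(particle_px * max_factor)
    # Assumption: keep even sizes whose prime factors are all <= max_prime.
    sizes = [n for n in range(low, high + 1)
             if n % 2 == 0 and largest_prime_factor(n) <= max_prime]
    return particle_px, sizes

# Values from this tutorial: 350 Å ribosome, 1.91 Å unbinned pixel size, bin 4.
particle_px, sizes = candidate_box_sizes(350, 1.91, 4)
print(f"Particle width: {particle_px:.1f} px")
print("Candidate box sizes:", sizes)
```

With these inputs, both 72 and 84 appear among the candidates, which matches the choice made in this tutorial.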

You want to use the new RELION5 “Write out as 2D stacks” option for faster processing. As inputs, we used the merged particles from Gain1 and Gain2 and created a tomograms.star containing both the Gain1 and Gain2 tomograms. Merging particle lists can be done in RELION (Join star files) or with pytom_merge_stars.py from pyTOM. For the concatenated tomograms.star, you can simply copy the rows below the header of the Gain1 file, paste them into the tomograms.star of Gain2, and save the result as tomograms_gain1n2.star.
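If you prefer not to copy and paste by hand, the concatenation of the two tomograms.star files could be scripted roughly as follows. This is a minimal sketch, not a RELION or pyTOM tool: the function names are hypothetical, and it assumes each file contains a single table (one header followed by data rows) with identical columns in the same order. Check the output before using it.

```python
from pathlib import Path

def split_star(text):
    """Split a single-table STAR file into (header lines, data rows)."""
    lines = text.splitlines()
    # Assumption: the header ends at the last column-label line (starting with "_").
    last_label = max(i for i, line in enumerate(lines)
                     if line.strip().startswith("_"))
    header = lines[: last_label + 1]
    rows = [line for line in lines[last_label + 1:] if line.strip()]
    return header, rows

def column_labels(header):
    """Return the column names in order, ignoring the '#N' index suffixes."""
    return [line.split()[0] for line in header if line.strip().startswith("_")]

def merge_tomogram_stars(path_a, path_b, path_out):
    """Append the rows of file A below the rows of file B, keeping B's header."""
    header_a, rows_a = split_star(Path(path_a).read_text())
    header_b, rows_b = split_star(Path(path_b).read_text())
    if column_labels(header_a) != column_labels(header_b):
        raise ValueError("Column labels differ; rows cannot simply be concatenated.")
    merged = header_b + rows_b + rows_a
    Path(path_out).write_text("\n".join(merged) + "\n")
    return len(rows_a) + len(rows_b)   # total number of tomograms written
```

A call such as `merge_tomogram_stars("Gain1/tomograms.star", "Gain2/tomograms.star", "tomograms_gain1n2.star")` would then write the combined file; the two input paths are placeholders for wherever your own files live.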

At this stage, you can check how the extracted particles look.

To do this, run a Reconstruct Particle job. In the I/O tab, use the optimisation_set.star file that was generated during the Extract Subtomo job. In the Average section, make sure to set the Box size and Binning factor to exactly the same values you used in the extraction step.

Once the job is complete, you can open the resulting average in Chimera, ChimeraX, or IMOD. The reconstructed volume is typically saved as data_merged.mrc or merged.mrc inside the ~/ReconstructParticleTomo/job00X directory. The result should resemble your expected structure—in this case, a low-resolution ribosome.

This works well here because we used Template Matching, which provides initial orientation angles. If your particles were picked using a different method, you might need to generate an initial model using the 3D Initial Model job instead.

This initial average will serve as a reference for the next steps, such as 3D classification or refinement.

Refining and cleaning your particles

To improve your average, you have a few key options—mainly alignment and classification. You can perform both simultaneously or run them separately, depending on your goal.

Just like in single-particle analysis, achieving high resolution requires a homogeneous particle set. That’s where classification comes in—it helps you sort out different types of particles or complexes (e.g., ribosomes vs. HSP60), remove junk or noise, and identify structural heterogeneity within a complex (e.g. different translational states of ribosomes).

The first step here is to clean out false positives that may have been picked during Template Matching.

A good approach is to first align all particles together using a quick local refinement. Since our particles already have initial orientations from pyTOM, a fast alignment with local searches only should be sufficient at this stage.

Once you’ve done that, you can proceed to classification without alignment to separate good particles from bad ones, or to distinguish between different structural states.

Fast alignment using 3D classification with a single class

To align your particles, you have two options: 3D Refinement and 3D Classification. 3D Refinement is the more accurate of the two, but it is somewhat more time-consuming than 3D Classification because of how it handles the particles. For this first step we can use 3D Classification, which is a bit less accurate but faster.

Let’s launch an alignment with the 3D classification using only one class.

I/O Tab

Input images STAR file: Use the particles.star file generated by the PseudoSubtomo job.
Reference map: Use the merged.mrc file produced by the ReconstructParticleTomo job.

Reference Tab

Initial low-pass filter: Start with 50 Å. Later, set this to a value slightly larger in Å (i.e. slightly lower resolution) than the resolution estimated in the previous step. Low-pass filtering the reference in this way is important to avoid over-fitting.
Example: If 3D Classification gave you a resolution of 22 Å, set this to 25 Å in 3D Refinement.
Symmetry: Our complex is not symmetric, so we leave this at C1. If yours is symmetric, I would recommend not applying symmetry initially and applying it only in later stages.

CTF Tab

CTF Correction: Always enable this.
Ignore CTFs until first peak: It can be useful to turn on at the beginning, especially for initial alignment and classification. This option acts like a low-pass filter.

Optimisation Tab

Number of classes: Set to the number of desired classes. Current case: We are using 1 class.
T parameter: You will have to experiment with this value. Launching several jobs with different T values (e.g. 0.5, 1, 2 and 4) and comparing the results is usually a good strategy.
Number of iterations: Start with the default of 25.
Mask diameter: Should be about 90% of the box size.
Example: For box size = 84 px, pixel size = 1.91 Å, binning = 4:
84 × 1.91 × 4 × 0.9 ≈ 578 Å
Limit resolution E-step to: To avoid overfitting and noisy reconstructions, set this to the Nyquist resolution at your current binning.
Example: 1.91 × 4 × 2 = ~15 Å
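The two worked examples in this tab (mask diameter and E-step resolution limit) can be recomputed for any box size, pixel size and binning with a small helper. The function names below are hypothetical, and the 0.9 fraction is simply the rule of thumb given above.

```python
def binned_pixel_size(pixel_size, binning):
    """Pixel size in Å after binning."""
    return pixel_size * binning

def mask_diameter(box_size_px, pixel_size, binning, fraction=0.9):
    """Mask diameter in Å, as a fraction of the box edge."""
    return box_size_px * binned_pixel_size(pixel_size, binning) * fraction

def nyquist_resolution(pixel_size, binning):
    """Nyquist limit in Å: twice the binned pixel size."""
    return 2 * binned_pixel_size(pixel_size, binning)

# Values from this tutorial: box 84 px, 1.91 Å unbinned pixel size, bin 4.
print(f"Mask diameter: {mask_diameter(84, 1.91, 4):.0f} Å")
print(f"Nyquist limit: {nyquist_resolution(1.91, 4):.2f} Å")
```

Remember to recompute both values whenever you change the binning or the box size in later rounds.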

Sampling Tab

Perform alignment: Set to Yes (since we are aligning). If not aligning, set to No.
Offset search range and step: These define the translational search (in pixels), i.e. how far the particle is allowed to shift within the box. Our particles should already be centred thanks to TM, and we are working at high binning, so it is best to use small values, such as a range of 2 and a step of 1.
Local searches only: Enable this, and use the suggested parameters. This overrides the Angular sampling interval. We only want to perform local searches because our particles are already well aligned from TM.

Helix Tab

Not applicable in our case; leave it untouched. This tab is only relevant if you are working with something that has helical symmetry, e.g. a filament.

Compute Tab

Configure settings as shown.
Enable GPU acceleration only if performing image alignment. If not aligning, disable GPU acceleration.

Adapt the running parameters, and press Run!

When running refinement or classification in RELION, it’s helpful to keep track of progress using a few key metrics in the output files.

You can run these commands from the terminal, in the folder of the job.

Monitor resolution over iterations: This will give you the resolution for each iteration. Ideally, it should improve (i.e., the number decreases) as refinement proceeds.
```
 grep _rlnCurrentResolution run_it???_model.star
```
Follow class population and resolution: If you’re running classification, use relion_star_printtable to examine how the classes evolve:
```
 relion_star_printtable run_it001_model.star data_model_classes rlnClassDistribution rlnEstimatedResolution
```
Monitor class convergence (optimal class changes): Track how well the classes are stabilising. This number should start high (near 1) and trend toward 0 as the classes converge:
```
 grep _rlnChangesOptimalClasses run_it???_optimiser.star
```
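To pick the iteration at which the resolution stops improving without reading the numbers by eye, the output of the first grep command can be parsed with a short script. This is a sketch under assumptions: it expects the `filename:label value` lines that grep prints when several files match, the function names and the 0.01 Å tolerance are hypothetical, and the numbers in `example` are made up purely to show the input format (they are not results from this dataset).

```python
import re

# Matches lines such as "run_it003_model.star:_rlnCurrentResolution 25.0"
LINE_PATTERN = re.compile(
    r"run_it(\d{3})_model\.star:_rlnCurrentResolution\s+(\S+)")

def parse_resolutions(grep_output):
    """Return {iteration: resolution in Å} parsed from the grep output."""
    resolutions = {}
    for line in grep_output.splitlines():
        match = LINE_PATTERN.search(line)
        if match:
            resolutions[int(match.group(1))] = float(match.group(2))
    return resolutions

def first_plateau_iteration(resolutions, tolerance=0.01):
    """Return the first iteration that no later iteration beats by more than tolerance Å."""
    iterations = sorted(resolutions)
    for i, iteration in enumerate(iterations):
        later = [resolutions[j] for j in iterations[i + 1:]]
        if all(resolutions[iteration] - value <= tolerance for value in later):
            return iteration
    return None  # only reached when no resolution lines were found

# Illustrative, made-up input: paste your own grep output here instead.
example = """run_it001_model.star:_rlnCurrentResolution 40.0
run_it002_model.star:_rlnCurrentResolution 30.5
run_it003_model.star:_rlnCurrentResolution 25.0
run_it004_model.star:_rlnCurrentResolution 25.0"""

print(first_plateau_iteration(parse_resolutions(example)))
```

The iteration returned is only a suggestion; you should still open the corresponding volume and inspect it before continuing from that iteration.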

Let’s see how the resolution evolved over time in that particular case:

You can see that from iteration 20 it stopped decreasing. This is the iteration that we are going to select to continue.

You can also see the resolution at each iteration in the RELION GUI, in the log window. Double click on the log window and it will pop up. You will be able to scroll through the log and check the resolution.

We can also open our volumes in ChimeraX and check whether they improved. Treat this step as almost always mandatory! Even if the reported resolution is improving, the volume could look worse, so always check your volumes!

Result:

You can see that at iteration 1, our ribosome is very smooth; you can only distinguish the large and small subunit. At iteration 20, we start seeing domains.

3D classification without alignment

Your particles should now be all aligned, more or less the same way. The next step would be to run classification without alignment, but this might be different for you.

Let’s again launch a Class3D job.