Tuesday, 22 January 2013

Remerging xia2 data to a lower resolution (technical)

A fairly common question I get is how to remerge data from xia2 to a lower resolution without rerunning the whole xia2 process. There are three answers to this. The first is that xia2 does not do this, which is not so helpful. The second is that this will be in the pipeline at some point, along with the ability to retrospectively change other decisions and have xia2 just run with them. The third, which follows, is that there is a relatively straightforward way of doing this with tools included with xia2 and CCP4.

The Example

The example to be considered here is some myoglobin data, where diffraction went past the edge of the detector, and xia2 gave:


For AUTOMATIC/DEFAULT/SAD
High resolution limit                             1.20    5.39    1.20
Low resolution limit                             39.15   39.15    1.24
Completeness                                     87.8    96.0    18.7
Multiplicity                                      7.4     9.3     1.6
I/sigma                                          43.3   108.4     2.9
Rmerge                                          0.025   0.013   0.206
Rmeas(I)                                        0.031   0.018   0.314
Rmeas(I+/-)                                     0.028   0.014   0.284
Rpim(I)                                         0.011   0.006   0.214
Rpim(I+/-)                                      0.013   0.007   0.195
Wilson B factor                                 6.648
Partial bias                                    0.000   0.000   0.000
Anomalous completeness                           79.4    95.7     8.3
Anomalous multiplicity                            3.6     5.0     1.1
Anomalous correlation                            0.418   0.506  -0.119
Anomalous slope                                 1.345   0.000   0.000
dF/F                                            0.037
dI/s(dI)                                        1.104
Total observations                              424674  6953    1428
Total unique                                    57065   746     879

From processing with xia2 -3d ... Clearly, with the outer shell only 18.7% complete, it would be nice to have somewhere close to complete data there, so a limit of around 1.35 Å would be appropriate (judging from the Scala log file). So, this is what we need to do:

mkdir remerge
cd remerge/
get_ccp4_commands ../LogFiles/AUTOMATIC_DEFAULT_scala.log > runit

The last command reads the Scala log file and essentially reconstructs the input needed to rerun the job, saved here as runit:

HKLIN /tmp/example/DEFAULT/scale/AUTOMATIC_DEFAULT_sorted.mtz
/tmp/example/DEFAULT/scale/AUTOMATIC_DEFAULT_scaled.mtz /tmp/example/DEFAULT/scale/AUTOMATIC_DEFAULT_scaled.mtz
SYMINFO /dls_sw/apps/ccp4/64/6.3.0/13nov2012/ccp4-6.3.0/lib/data/syminfo.lib
bins 20
resolution 1.200000
run 1 batch 1 to 321
resolution run 1 high 1.200000
name run 1 project AUTOMATIC crystal DEFAULT dataset SAD
scales constant
exclude sdmin 2.0
sdcorrection fixsdb noadjust norefine both 1.0 0.0
anomalous on
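
The change to the keywords is small enough to script. What follows is only a minimal sketch, not part of xia2 or CCP4: set_resolution is a hypothetical helper that reads Scala keywords on stdin, drops the per-run resolution line and replaces the overall limit with the value given. It touches the keyword lines only; the scala command line around them still has to be written by hand.

```shell
# Hypothetical helper (not a xia2 or CCP4 tool): rewrite Scala keywords read
# from stdin so that the overall high-resolution limit becomes $1.
set_resolution() {
    limit="$1"
    # Drop the per-run limit first, then rewrite the overall limit.
    sed -e '/^resolution run /d' \
        -e "s/^resolution .*/resolution ${limit}/"
}

# A few of the keyword lines reconstructed from the log, as sample input.
keywords='bins 20
resolution 1.200000
run 1 batch 1 to 321
resolution run 1 high 1.200000
scales constant'

new_keywords=$(printf '%s\n' "$keywords" | set_resolution 1.35)
printf '%s\n' "$new_keywords"
```

The per-run line is deleted before the substitution so that only one resolution keyword is left in the output.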

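So we can edit runit to do what we want: wrap the keywords in a scala command, write the output to remerge.mtz, drop the SYMINFO and per-run resolution lines and set the limit to 1.35: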

scala \
HKLIN /tmp/example/DEFAULT/scale/AUTOMATIC_DEFAULT_sorted.mtz \
HKLOUT remerge.mtz << eof
bins 20
resolution 1.35
run 1 batch 1 to 321
name run 1 project AUTOMATIC crystal DEFAULT dataset SAD
scales constant
exclude sdmin 2.0
sdcorrection fixsdb noadjust norefine both 1.0 0.0
anomalous on
eof

and run:

bash runit

to get

...

                                           Overall  InnerShell  OuterShell
  Low resolution limit                       39.15     39.15      1.39
  High resolution limit                       1.35      6.04      1.35

  Rmerge                                     0.024     0.013     0.108
  Rmerge in top intensity bin                0.014        -         -
  Rmeas (within I+/I-)                       0.027     0.014     0.139
  Rmeas (all I+ & I-)                        0.030     0.018     0.140
  Rpim (within I+/I-)                        0.012     0.007     0.086
  Rpim (all I+ & I-)                         0.010     0.006     0.070
  Fractional partial bias                    0.000     0.000     0.000
  Total number of observations              397951      4729     12624
  Total number unique                        46162       521      3333
  Mean((I)/sd(I))                             52.1     107.9      10.5
  Completeness                                99.7      94.4      97.9
  Multiplicity                                 8.6       9.1       3.8

  Anomalous completeness                      96.6      93.9      78.8
  Anomalous multiplicity                       3.9       4.9       1.9
  DelAnom correlation between half-sets      0.410     0.587     0.115
  Mid-Slope of Anom Normal Probability       1.422       -         -
...

i.e. essentially complete data (99.7% overall, 97.9% in the outer shell). To get this output in Scalepack format, adjust the script to add a SCALEPACK file and the matching output keyword:

scala \
HKLIN /tmp/example/DEFAULT/scale/AUTOMATIC_DEFAULT_sorted.mtz \
SCALEPACK unmerged.sca \
HKLOUT remerge.mtz << eof
bins 20
output polish unmerged
resolution 1.35
run 1 batch 1 to 321
name run 1 project AUTOMATIC crystal DEFAULT dataset SAD
scales constant
exclude sdmin 2.0
sdcorrection fixsdb noadjust norefine both 1.0 0.0
anomalous on
eof

and run again - now you can see why I usually just advise running xia2 again with the -resolution flag :o)
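
If you do end up repeating this for several limits, the hand edit can be wrapped in a small function. This is a sketch under assumptions: write_runit is a hypothetical name, the keywords and batch range are copied from the example above and will differ for your own job, and the function only prints the script rather than running Scala.

```shell
# Hypothetical helper: print a Scala remerge script for a chosen limit.
# $1 = high-resolution limit, $2 = sorted MTZ file from the xia2 scale step.
# Keywords and batch range are those of the example in this post.
write_runit() {
    limit="$1"
    hklin="$2"
    # Unquoted heredoc: variables expand and \\ becomes a single backslash.
    cat <<EOF
scala \\
HKLIN ${hklin} \\
HKLOUT remerge.mtz << eof
bins 20
resolution ${limit}
run 1 batch 1 to 321
name run 1 project AUTOMATIC crystal DEFAULT dataset SAD
scales constant
exclude sdmin 2.0
sdcorrection fixsdb noadjust norefine both 1.0 0.0
anomalous on
eof
EOF
}

runit_text=$(write_runit 1.35 /tmp/example/DEFAULT/scale/AUTOMATIC_DEFAULT_sorted.mtz)
printf '%s\n' "$runit_text"
```

Redirecting the output to a file, for example write_runit 1.35 ... > runit, gives something to run with bash runit as before.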
