Processing ETBR bosentan data (5XPR)

The following describes how the endothelin ET_B receptor + bosentan datasets can be processed using KAMO (documentation in Japanese / English).

References

Original paper
- Shihoya et al. (2017) "X-ray structures of endothelin ET_B receptor bound to clinical antagonist bosentan and its analog." Nature Structural & Molecular Biology doi: 10.1038/nsmb.3450 PDB: 5XPR

Raw data

Available in Zenodo.
Collected on BL32XU, SPring-8
MX225HS CCD detector (2x2 binning), 18×10 μm² beam, 1 Å wavelength, 250.0 mm camera length
10°/dataset, 0.2°/frame (shutterless)
16 datasets collected automatically (ZOO system) from 2 cryoloops
P3₂21; a=b= 74.7, c= 218.9 Å
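
As a quick check of the collection parameters listed above, the number of frames per dataset and in total follows directly from the rotation range and the oscillation width. This is only a worked-arithmetic sketch; the variable names are illustrative.

```shell
#!/bin/sh
# Values taken from the raw-data summary above.
osc_per_dataset=10    # degrees per dataset
osc_per_frame=0.2     # degrees per frame
n_datasets=16

# awk does the floating-point division; %.0f rounds to the nearest integer.
frames_per_dataset=$(awk -v t="$osc_per_dataset" -v f="$osc_per_frame" \
    'BEGIN { printf "%.0f", t / f }')
total_frames=$((frames_per_dataset * n_datasets))

echo "frames per dataset: ${frames_per_dataset}"   # 50
echo "total frames:       ${total_frames}"         # 800
```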

How data were processed in the original paper

The GUI command 'kamo' was run with default parameters; that is, XDS (ver. May 1, 2016, BUILT=20160617) was used for integration and no prior crystal information was supplied. All 16 datasets were indexed and integrated with consistent unit cells:

[ 1] 16 members:
 Averaged P1 Cell= 74.52 74.72 218.74 90.08 90.29 119.75
 Members= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
 Possible symmetries:
   freq symmetry     a      b      c     alpha  beta   gamma reindex
      0 P 1         74.52  74.72 218.74  90.08  90.29 119.75 a,b,c
      0 P 1 2 1     74.52 218.74  74.72  89.92 119.75  89.71 a,-c,b
      0 C 1 2 1     74.72 129.40 218.74  89.62  90.08  90.34 b,-2*a-b,c
      0 C 1 2 1    129.09  74.90 218.74  90.37  90.12  90.18 a-b,a+b,c
      0 C 1 2 1     74.52 129.75 218.74  90.26  90.29  89.84 a,a+2*b,c
      0 C 1 2 1    129.75  74.52 218.74  89.71  90.26  90.16 a+2*b,-a,c
      0 C 1 2 1    129.40  74.72 218.74  90.08  90.38  89.66 2*a+b,b,c
      0 C 1 2 1     74.90 129.09 218.74  89.88  90.37  89.82 a+b,-a+b,c
      0 C 2 2 2     74.52 129.75 218.74  90.26  90.29  89.84 a,a+2*b,c
      0 C 2 2 2     74.72 129.40 218.74  89.62  89.92  89.66 b,2*a+b,-c+1/4
      0 C 2 2 2     74.90 129.09 218.74  89.88  90.37  89.82 a+b,-a+b,c
      0 P 3         74.52  74.72 218.74  90.08  90.29 119.75 a,b,c
      0 P 3 1 2     74.72  74.52 218.74  89.71  89.92 119.75 b,a,-c
     14 P 3 2 1     74.52  74.72 218.74  90.08  90.29 119.75 a,b,c
      0 P 6         74.52  74.72 218.74  90.08  90.29 119.75 a,b,c
      2 P 6 2 2     74.52  74.72 218.74  90.08  90.29 119.75 a,b,c

As P321 was the most frequent symmetry (14 of the 16 datasets), it was assumed, and the XDS_ASCII files were reindexed to P321.
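
The "most frequent" choice can be read straight off the freq column of the table above. As a minimal sketch, assuming the table text has been saved or piped as plain text (the function name is hypothetical and not part of KAMO), awk can pick the row with the highest frequency:

```shell
#!/bin/sh
# Print the symmetry with the highest "freq" value and that frequency.
# Each data row is: freq, symmetry (2-4 tokens), six cell parameters, reindex operator.
most_frequent_symmetry() {
    awk '
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            # The symmetry name spans fields 2 .. NF-7.
            sg = $2
            for (i = 3; i <= NF - 7; i++) sg = sg " " $i
            if ($1 + 0 > best) { best = $1 + 0; best_sg = sg }
        }
        END { if (best_sg != "") print best_sg, best }
    ' "$@"
}

printf '%s\n' \
    '      0 P 3         74.52  74.72 218.74  90.08  90.29 119.75 a,b,c' \
    '     14 P 3 2 1     74.52  74.72 218.74  90.08  90.29 119.75 a,b,c' \
    '      2 P 6 2 2     74.52  74.72 218.74  90.08  90.29 119.75 a,b,c' \
    | most_frequent_symmetry
# prints: P 3 2 1 14
```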

Because P321 is lower than the highest possible symmetry (P622; their unit cells match exactly), an indexing ambiguity had to be resolved: for each dataset, the (h,k,l) and (-h,-k,l) indexing choices need to be tested so that all datasets are indexed consistently. To do this, just type

kamo.resolve_indexing_ambiguity formerge.lst

The selective-breeding algorithm developed by Kabsch (2014) converged in 2 cycles, and 7 datasets were reindexed.
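
To make the two indexing choices concrete, the toy sketch below applies the alternative operator (h,k,l) -> (-h,-k,l) to a reflection list. It is not how KAMO reindexes files internally; the input is assumed to be a hypothetical whitespace-separated list with h, k, l in the first three columns, and the function name is illustrative.

```shell
#!/bin/sh
# Toy example: apply the alternative indexing operator (h,k,l) -> (-h,-k,l).
# Real XDS_ASCII.HKL files also carry header lines that would have to be kept.
reindex_hkl() {
    # "0 - x" rather than "-x" avoids printing "-0" for zero indices.
    awk '{ $1 = 0 - $1; $2 = 0 - $2; print }' "$@"
}

printf '1 2 3 100.0\n-4 5 -6 20.5\n' | reindex_hkl
# prints:
# -1 -2 3 100.0
# 4 -5 -6 20.5
```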

Next, the template script merged_blend.sh was edited to use the updated list file (with appropriately reindexed files).

#!/bin/sh
# settings
dmin=3.5
anomalous=false # true or false
lstin=formerge_reindexed.lst
use_ramdisk=true # set to false if memory or space in /tmp is limited
# _______/setting

kamo.multi_merge \
        workdir=blend_${dmin}A_framecc_b \
        lstin=${lstin} d_min=${dmin} anomalous=${anomalous} \
        space_group=None reference.data=None \
        program=xscale xscale.reference=bmin \
        reject_method=framecc+lpstats rejection.lpstats.stats=em.b \
        clustering=blend blend.min_cmpl=90 blend.min_redun=2 blend.max_LCV=None blend.max_aLCV=None \
        xscale.use_tmpdir_if_available=${use_ramdisk} \
        batch.engine=sge batch.par_run=merging batch.nproc_each=8 nproc=8 batch.sge_pe_name=par
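
The lstin= edit described above can of course be made in any text editor. As a sketch of doing it from the command line, the helper below (a hypothetical function, not part of KAMO) rewrites only the lstin= line and leaves the rest of the script, including its permissions, untouched.

```shell
#!/bin/sh
# Replace the lstin= setting in a merge script with a new list file name.
# Usage: set_lstin SCRIPT LIST
set_lstin() {
    script=$1
    list=$2
    tmp="${script}.tmp.$$"
    # Write the edited copy first, then copy it back over the original so
    # that the original file mode (e.g. the executable bit) is preserved.
    awk -v new="$list" '
        /^lstin=/ { print "lstin=" new; next }
        { print }
    ' "$script" > "$tmp" && cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For the case described here, this would be called as set_lstin merged_blend.sh formerge_reindexed.lst.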

After running this script, the largest cluster was found to have the best statistics. However, the inner-shell R-meas was somewhat high (12.0%; blend_3.5A_framecc_b/cluster_0015/run_03; 14 datasets):

 SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
 RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  CC(1/2)  Anomal  SigAno   Nano
   LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    10.45        2737     418       432       96.8%      11.2%     11.3%     2712   17.50     12.0%    98.3*     9    0.966     172
     7.41        4900     685       687       99.7%      13.9%     11.9%     4879   15.87     14.8%    99.0*    -4    0.966     383
     6.05        6359     861       874       98.5%      22.9%     22.4%     6339    8.93     24.6%    96.8*    -4    0.769     514
     5.25        7154     978       982       99.6%      34.8%     36.7%     7123    5.93     37.5%    95.5*     5    0.789     603
     4.69        8372    1125      1130       99.6%      34.3%     35.1%     8338    6.19     37.0%    95.2*     0    0.756     735
     4.29        9068    1182      1190       99.3%      43.3%     46.3%     9043    5.30     46.4%    93.9*    -3    0.770     785
     3.97       10100    1324      1337       99.0%      75.1%     86.2%    10066    3.09     80.4%    85.5*    -6    0.693     883
     3.71       10765    1428      1440       99.2%     115.8%    141.0%    10734    1.94    123.9%    69.1*     2    0.671     946
     3.50       11403    1466      1479       99.1%     227.2%    285.4%    11352    0.98    242.3%    35.4*     3    0.612    1006
    total       70858    9467      9551       99.1%      33.4%     36.8%    70586    5.62     35.8%    98.2*     0    0.735    6027
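
The inner-shell R-meas is the R-meas column of the first resolution shell in the table above. As a small sketch, assuming the table is available as whitespace-separated text in the layout shown (14 columns per shell row; the function name is hypothetical), it can be pulled out with awk:

```shell
#!/bin/sh
# Print the resolution limit and R-meas (in %) of the innermost shell.
inner_shell_rmeas() {
    awk '
        # Shell rows have 14 columns and start with a numeric resolution limit;
        # header lines and the "total" row do not match.
        NF == 14 && $1 ~ /^[0-9]+\.[0-9]+$/ {
            sub(/%$/, "", $10)   # column 10 is R-meas; drop the percent sign
            print $1, $10
            exit                 # only the first (innermost) shell is wanted
        }
    ' "$@"
}

printf '%s\n' \
    '    10.45        2737     418       432       96.8%      11.2%     11.3%     2712   17.50     12.0%    98.3*     9    0.966     172' \
    | inner_shell_rmeas
# prints: 10.45 12.0
```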

We found that increasing the NBATCH= value to 50 and setting the low-resolution limit to 30 Å improved the statistics. Finally, the high-resolution limit was set to 3.6 Å based on the paired-refinement result. Here is the final result:

 SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
 RESOLUTION     NUMBER OF REFLECTIONS    COMPLETENESS R-FACTOR  R-FACTOR COMPARED I/SIGMA   R-meas  CC(1/2)  Anomal  SigAno   Nano
   LIMIT     OBSERVED  UNIQUE  POSSIBLE     OF DATA   observed  expected                                      Corr

    10.23        2849     431       463       93.1%       4.6%      4.7%     2825   32.60      4.9%    99.8*     3    0.990     184
     7.45        4484     636       637       99.8%       6.0%      5.8%     4465   25.89      6.4%    99.8*     0    0.908     354
     6.15        5695     783       795       98.5%      16.2%     16.1%     5677   12.32     17.4%    98.9*     3    0.816     461
     5.35        6628     907       911       99.6%      29.6%     30.6%     6598    7.69     31.8%    98.1*     4    0.805     563
     4.80        7794    1027      1032       99.5%      30.9%     31.0%     7767    8.08     33.1%    98.5*     2    0.819     670
     4.39        8193    1088      1095       99.4%      35.8%     35.9%     8167    7.20     38.4%    96.3*     0    0.805     719
     4.07        9345    1213      1221       99.3%      65.9%     70.2%     9315    4.25     70.4%    89.8*     1    0.773     813
     3.82        9750    1274      1286       99.1%     100.3%    112.5%     9721    2.62    107.1%    76.0*     0    0.702     853
     3.60       10231    1332      1347       98.9%     181.9%    205.1%    10194    1.50    194.1%    54.5*     4    0.678     900
    total       64969    8691      8787       98.9%      28.0%     29.8%    64729    8.49     29.9%    99.3*     2    0.779    5517
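
This page does not say how the NBATCH= and resolution settings were passed to XSCALE. One possibility, assuming they are set per input file in XSCALE.INP through XSCALE's NBATCH= and INCLUDE_RESOLUTION_RANGE= keywords, is sketched below. The function name and the file paths in the example are hypothetical.

```shell
#!/bin/sh
# Emit a copy of an XSCALE.INP in which every INPUT_FILE= entry is followed by
# the given resolution range and NBATCH value.
# Usage: add_xscale_params D_MAX D_MIN NBATCH [XSCALE.INP]   (reads stdin if no file)
add_xscale_params() {
    d_max=$1
    d_min=$2
    nb=$3
    shift 3
    awk -v dmax="$d_max" -v dmin="$d_min" -v nb="$nb" '
        # Drop any earlier per-file settings so they are not duplicated.
        /^[ \t]*(INCLUDE_RESOLUTION_RANGE|NBATCH)=/ { next }
        { print }
        /^[ \t]*INPUT_FILE=/ {
            print "  INCLUDE_RESOLUTION_RANGE= " dmax " " dmin
            print "  NBATCH= " nb
        }
    ' "$@"
}

printf 'OUTPUT_FILE= xscale.hkl\nINPUT_FILE= a/XDS_ASCII.HKL\nINPUT_FILE= b/XDS_ASCII.HKL\n' \
    | add_xscale_params 30 3.6 50
```

The result is written to standard output, so it can be inspected before it replaces the XSCALE.INP that is actually used.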
