Processing ETBR bosentan data (5XPR)
The following describes how the endothelin ETB receptor + bosentan datasets can be processed using KAMO (documentation in Japanese / English).
- Original paper
- Shihoya et al. (2017) "X-ray structures of endothelin ETB receptor bound to clinical antagonist bosentan and its analog." Nature Structural & Molecular Biology doi: 10.1038/nsmb.3450 PDB: 5XPR
- Data available on Zenodo.
- Collected on BL32XU, SPring-8
- MX225HS CCD detector (2×2 binning), 18 × 10 μm² beam, 1 Å wavelength, 250.0 mm camera length
- 10°/dataset, 0.2°/frame (shutterless)
- 16 datasets collected automatically (ZOO system) from 2 cryoloops
- Space group P3221; a = b = 74.7 Å, c = 218.9 Å
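The collection parameters listed above fix the number of frames that each dataset contributes. As a minimal sketch of that arithmetic (the function name `frames_per_dataset` is only illustrative and is not part of KAMO):

```sh
#!/bin/sh
# Frames per dataset and totals implied by the collection parameters above:
# 10 deg per dataset, 0.2 deg per frame, 16 datasets.
frames_per_dataset() {  # usage: frames_per_dataset WEDGE_DEG STEP_DEG
  awk -v wedge="$1" -v step="$2" 'BEGIN { printf "%.0f\n", wedge / step }'
}

n_datasets=16
per_set=$(frames_per_dataset 10 0.2)
echo "frames per dataset: $per_set"
echo "frames in total:    $((per_set * n_datasets))"
echo "rotation in total:  $((10 * n_datasets)) deg"
```

Each dataset is therefore a thin 50-frame wedge, which is why merging all 16 is needed to obtain a complete dataset.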
The GUI command 'kamo' was run with default parameters; that is, XDS (ver. May 1, 2016, BUILT=20160617) was used for integration and no prior crystal information was supplied. All 16 datasets were indexed and integrated with consistent unit cells:
[ 1] 16 members:
Averaged P1 Cell= 74.52 74.72 218.74 90.08 90.29 119.75
Members= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
Possible symmetries:
freq symmetry a b c alpha beta gamma reindex
0 P 1 74.52 74.72 218.74 90.08 90.29 119.75 a,b,c
0 P 1 2 1 74.52 218.74 74.72 89.92 119.75 89.71 a,-c,b
0 C 1 2 1 74.72 129.40 218.74 89.62 90.08 90.34 b,-2*a-b,c
0 C 1 2 1 129.09 74.90 218.74 90.37 90.12 90.18 a-b,a+b,c
0 C 1 2 1 74.52 129.75 218.74 90.26 90.29 89.84 a,a+2*b,c
0 C 1 2 1 129.75 74.52 218.74 89.71 90.26 90.16 a+2*b,-a,c
0 C 1 2 1 129.40 74.72 218.74 90.08 90.38 89.66 2*a+b,b,c
0 C 1 2 1 74.90 129.09 218.74 89.88 90.37 89.82 a+b,-a+b,c
0 C 2 2 2 74.52 129.75 218.74 90.26 90.29 89.84 a,a+2*b,c
0 C 2 2 2 74.72 129.40 218.74 89.62 89.92 89.66 b,2*a+b,-c+1/4
0 C 2 2 2 74.90 129.09 218.74 89.88 90.37 89.82 a+b,-a+b,c
0 P 3 74.52 74.72 218.74 90.08 90.29 119.75 a,b,c
0 P 3 1 2 74.72 74.52 218.74 89.71 89.92 119.75 b,a,-c
14 P 3 2 1 74.52 74.72 218.74 90.08 90.29 119.75 a,b,c
0 P 6 74.52 74.72 218.74 90.08 90.29 119.75 a,b,c
2 P 6 2 2 74.52 74.72 218.74 90.08 90.29 119.75 a,b,c
Because P321 was the most frequent symmetry (14 of the 16 datasets), it was assumed, and the XDS_ASCII files were reindexed in P321.
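The choice of the most frequent symmetry can be reproduced from the table with a few lines of awk. This is only a sketch for reading the listing above, not how KAMO makes the decision internally. It assumes each row has the layout shown there: a frequency, a space-group symbol that may contain spaces, six cell parameters and a reindex operator. The function name `most_frequent_symmetry` is hypothetical.

```sh
#!/bin/sh
# Pick the most frequently assigned symmetry from rows of the form
#   freq  symbol (may contain spaces)  a b c alpha beta gamma  reindex
# The rows fed in below are copied from the listing above.
most_frequent_symmetry() {
  awk '
    BEGIN { best = -1 }
    NF >= 9 && $1 ~ /^[0-9]+$/ {
      # the symbol spans fields 2 .. NF-7; the last 7 fields are cell + operator
      sym = $2
      for (i = 3; i <= NF - 7; i++) sym = sym " " $i
      if ($1 + 0 > best) { best = $1 + 0; best_sym = sym }
    }
    END { print best_sym, best }
  '
}

most_frequent_symmetry <<'EOF'
 0 P 1 74.52 74.72 218.74 90.08 90.29 119.75 a,b,c
 0 P 3 1 2 74.72 74.52 218.74 89.71 89.92 119.75 b,a,-c
14 P 3 2 1 74.52 74.72 218.74 90.08 90.29 119.75 a,b,c
 0 P 6 74.52 74.72 218.74 90.08 90.29 119.75 a,b,c
 2 P 6 2 2 74.52 74.72 218.74 90.08 90.29 119.75 a,b,c
EOF
```

For these rows the function reports P 3 2 1 with a frequency of 14.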
Because P321 has lower symmetry than the highest symmetry compatible with the lattice (P622, whose unit cell matches exactly), an indexing ambiguity had to be resolved: for each dataset, both the (h,k,l) and (-h,-k,l) operators need to be tested so that all datasets share a consistent indexing mode. To do this, just type
kamo.resolve_indexing_ambiguity formerge.lst
The selective-breeding algorithm developed by Kabsch (2014) converged in 2 cycles, and 7 datasets were reindexed.
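To make the ambiguity concrete, the sketch below compares a dataset with a reference under both indexing choices and reports the correlation of intensities for each. It is a simpler, reference-based check and not the selective-breeding algorithm that kamo.resolve_indexing_ambiguity runs. The intensities are made up, the function name `cc_with_reference` is hypothetical, and symmetry reduction to the asymmetric unit is ignored.

```sh
#!/bin/sh
# Toy illustration with made-up intensities (not taken from the 5XPR data).
# Correlate a dataset with a reference using either (h,k,l) or (-h,-k,l).
cc_with_reference() {  # usage: cc_with_reference REF DATA FLIP(0|1)
  awk -v flip="$3" '
    NR == FNR { ref[$1 "," $2 "," $3] = $4; next }    # first file: reference
    {
      h = $1; k = $2; l = $3
      if (flip + 0 == 1) { h = 0 - h; k = 0 - k }      # apply (-h,-k,l)
      key = h "," k "," l
      if (key in ref) {
        x = ref[key]; y = $4; n++
        sx += x; sy += y; sxx += x * x; syy += y * y; sxy += x * y
      }
    }
    END {
      # Pearson correlation over the common reflections
      num = n * sxy - sx * sy
      den = sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
      printf "%.3f\n", (den > 0 ? num / den : 0)
    }
  ' "$1" "$2"
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# columns: h k l intensity
cat > "$tmp/ref.hkl" <<'EOF'
1 2 3 100
2 1 3 400
-1 -2 3 400
-2 -1 3 100
1 3 5 250
3 1 5 60
-1 -3 5 60
-3 -1 5 250
EOF

# the same crystal, but indexed in the alternative mode
cat > "$tmp/data.hkl" <<'EOF'
1 2 3 390
2 1 3 110
1 3 5 70
3 1 5 240
EOF

cc_same=$(cc_with_reference "$tmp/ref.hkl" "$tmp/data.hkl" 0)
cc_flip=$(cc_with_reference "$tmp/ref.hkl" "$tmp/data.hkl" 1)
echo "CC with (h,k,l):   $cc_same"
echo "CC with (-h,-k,l): $cc_flip"
```

With these toy values the correlation is negative for (h,k,l) and close to 1 for (-h,-k,l), so this dataset would be reindexed. The advantage of the approach used above is that it needs no reference dataset.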
Next, the template script merged_blend.sh was edited to use the updated list file (with appropriately reindexed files).
#!/bin/sh
# settings
dmin=3.5
anomalous=false # true or false
lstin=formerge_reindexed.lst
use_ramdisk=true # set to false if memory or free space in /tmp is limited
# _______/setting
kamo.multi_merge \
workdir=blend_${dmin}A_framecc_b \
lstin=${lstin} d_min=${dmin} anomalous=${anomalous} \
space_group=None reference.data=None \
program=xscale xscale.reference=bmin \
reject_method=framecc+lpstats rejection.lpstats.stats=em.b \
clustering=blend blend.min_cmpl=90 blend.min_redun=2 blend.max_LCV=None blend.max_aLCV=None \
xscale.use_tmpdir_if_available=${use_ramdisk} \
batch.engine=sge batch.par_run=merging batch.nproc_each=8 nproc=8 batch.sge_pe_name=par
After running this script, the largest cluster (blend_3.5A_framecc_b/cluster_0015/run_03; 14 datasets) was found to have the best statistics. However, its inner-shell R-meas value was rather high (12.0%):
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas CC(1/2) Anomal SigAno Nano
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr
10.45 2737 418 432 96.8% 11.2% 11.3% 2712 17.50 12.0% 98.3* 9 0.966 172
7.41 4900 685 687 99.7% 13.9% 11.9% 4879 15.87 14.8% 99.0* -4 0.966 383
6.05 6359 861 874 98.5% 22.9% 22.4% 6339 8.93 24.6% 96.8* -4 0.769 514
5.25 7154 978 982 99.6% 34.8% 36.7% 7123 5.93 37.5% 95.5* 5 0.789 603
4.69 8372 1125 1130 99.6% 34.3% 35.1% 8338 6.19 37.0% 95.2* 0 0.756 735
4.29 9068 1182 1190 99.3% 43.3% 46.3% 9043 5.30 46.4% 93.9* -3 0.770 785
3.97 10100 1324 1337 99.0% 75.1% 86.2% 10066 3.09 80.4% 85.5* -6 0.693 883
3.71 10765 1428 1440 99.2% 115.8% 141.0% 10734 1.94 123.9% 69.1* 2 0.671 946
3.50 11403 1466 1479 99.1% 227.2% 285.4% 11352 0.98 242.3% 35.4* 3 0.612 1006
total 70858 9467 9551 99.1% 33.4% 36.8% 70586 5.62 35.8% 98.2* 0 0.735 6027
We found that increasing the NBATCH= value to 50 and setting the low-resolution limit to 30 Å improved the statistics. Finally, the high-resolution limit was set to 3.6 Å based on the paired-refinement result. Here is the final result:
SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
RESOLUTION NUMBER OF REFLECTIONS COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA R-meas CC(1/2) Anomal SigAno Nano
LIMIT OBSERVED UNIQUE POSSIBLE OF DATA observed expected Corr
10.23 2849 431 463 93.1% 4.6% 4.7% 2825 32.60 4.9% 99.8* 3 0.990 184
7.45 4484 636 637 99.8% 6.0% 5.8% 4465 25.89 6.4% 99.8* 0 0.908 354
6.15 5695 783 795 98.5% 16.2% 16.1% 5677 12.32 17.4% 98.9* 3 0.816 461
5.35 6628 907 911 99.6% 29.6% 30.6% 6598 7.69 31.8% 98.1* 4 0.805 563
4.80 7794 1027 1032 99.5% 30.9% 31.0% 7767 8.08 33.1% 98.5* 2 0.819 670
4.39 8193 1088 1095 99.4% 35.8% 35.9% 8167 7.20 38.4% 96.3* 0 0.805 719
4.07 9345 1213 1221 99.3% 65.9% 70.2% 9315 4.25 70.4% 89.8* 1 0.773 813
3.82 9750 1274 1286 99.1% 100.3% 112.5% 9721 2.62 107.1% 76.0* 0 0.702 853
3.60 10231 1332 1347 98.9% 181.9% 205.1% 10194 1.50 194.1% 54.5* 4 0.678 900
total 64969 8691 8787 98.9% 28.0% 29.8% 64729 8.49 29.9% 99.3* 2 0.779 5517
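The two statistics tables are easier to compare once the key columns are pulled out. The sketch below assumes the 14-column row layout shown above and extracts the resolution limit, I/SIGMA, R-meas and CC(1/2). The function name `shell_stats` is hypothetical, and the rows are copied from the two tables.

```sh
#!/bin/sh
# Print: resolution limit, I/sigma, R-meas (%), CC(1/2) (%)
# for XSCALE statistics rows with 14 whitespace-separated columns.
shell_stats() {
  awk 'NF == 14 {
    rmeas = $10; cc = $11
    sub(/%$/, "", rmeas)    # drop the percent sign
    sub(/\*$/, "", cc)      # drop the significance marker
    print $1, $9, rmeas, cc
  }'
}

echo "first run (d_min = 3.5):"
shell_stats <<'EOF'
10.45 2737 418 432 96.8% 11.2% 11.3% 2712 17.50 12.0% 98.3* 9 0.966 172
total 70858 9467 9551 99.1% 33.4% 36.8% 70586 5.62 35.8% 98.2* 0 0.735 6027
EOF

echo "final run (d_min = 3.6):"
shell_stats <<'EOF'
10.23 2849 431 463 93.1% 4.6% 4.7% 2825 32.60 4.9% 99.8* 3 0.990 184
total 64969 8691 8787 98.9% 28.0% 29.8% 64729 8.49 29.9% 99.3* 2 0.779 5517
EOF
```

Read side by side, the inner-shell R-meas drops from 12.0% to 4.9% and the inner-shell I/SIGMA rises from 17.50 to 32.60, while the overall CC(1/2) improves from 98.2% to 99.3%.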