<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>The nonadaptive nature of the H1N1 2009 Swine Flu pandemic contrasts with the adaptive facilitation of transmission to a new host</title>
<meta name="Subject" content="BMC Evolutionary Biology 2011, 11:6. doi:10.1186/1471-2148-11-6"/>
<meta name="Keywords" content=" "/>
<meta name="Author" content="Juwaeriah Abdussamad"/>
<meta name="Creator" content="Arbortext Advanced Print Publisher 10.0.1082/W Unicode"/>
<meta name="Producer" content="Acrobat Distiller 9.0.0 (Windows)"/>
<meta name="CreationDate" content=""/>
</head>
<body>
<pre>
Abdussamad and Aris-Brosou BMC Evolutionary Biology 2011, 11:6
http://www.biomedcentral.com/1471-2148/11/6
RESEARCH ARTICLE
Open Access
The nonadaptive nature of the H1N1 2009
Swine Flu pandemic contrasts with the adaptive
facilitation of transmission to a new host
Juwaeriah Abdussamad1,2, Stéphane Aris-Brosou1,3,4*
Abstract
Background: The emergence of the 2009 H1N1 Influenza pandemic followed a multiple reassortment event from
viruses originally circulating in swine and humans, but the adaptive nature of this emergence is poorly
understood.
Results: Here we base our analysis on 1180 complete genomes of H1N1 viruses sampled in North America
between 2000 and 2010 in swine and human hosts. We show that while transmission to a human host might
require an adaptive phase in the HA and NA antigens, the emergence of the 2009 pandemic was essentially
nonadaptive. A more detailed analysis of the NA protein shows that the 2009 pandemic sequence is characterized
by novel epitopes and by a particular substitution in loop 150, which is responsible for a nonadaptive structural
change tightly associated with the emergence of the pandemic.
Conclusions: Because this substitution was not present in the 1918 H1N1 pandemic virus, we posit that the
emergence of pandemics is due to epistatic interactions between sites distributed over different segments.
Altogether, our results are consistent with population dynamics models that highlight the epistatic and
nonadaptive rise of novel epitopes in viral populations, followed by their demise when the resulting virus is too
virulent.
Background
Viruses are the cause of several deadly diseases such as
yellow fever, dengue, hepatitis or seasonal Influenza.
The etiologic agent of the latter, the Influenza virus, can
cause mild to severe illnesses depending on the Influenza type and strain. The case of the 2009 H1N1 outbreak, first detected in humans in early 2009 [1], was
caused by an antigenically novel strain that led the World
Health Organization to declare the outbreak as the first
Influenza pandemic of the 21st century. The emergence
of such viruses in the human population has since
attracted intense scrutiny, with a particular focus on two
of their properties: virulence and interspecies transmission [2].
The H1N1 virus is an Influenza A virus that belongs
to the family of orthomyxoviruses, and has a segmented
negative single-stranded RNA genome made of eight
* Correspondence: [email protected]
1 Department of Biology, University of Ottawa, Ottawa, Canada
Full list of author information is available at the end of the article
segments that each encode 1-2 proteins necessary for
virus attachment to host cells and spread of viral infection. By approximate order of decreasing sizes, these
genes code for polymerase subunits (PB2, PB1 and PA),
the hemagglutinin (HA) and neuraminidase (NA) antigens, a nucleoprotein (NP), a ribonucleoprotein exporter
(NS2, also called NEP), an interferon antagonist (NS1),
an ion channel protein (M2) and a matrix protein (M1).
Two other proteins, PB1-F2 [3] and PB1-N40 [4], whose
roles are now emerging, have also been characterized.
This segmented genome is constantly evolving either by
accumulating mutations, which generally lead to small
antigenic differences (“antigenic drift”) or by exchanging
genomic segments, a process termed reassortment,
which, when occurring between different subtypes, can
lead to dramatic changes in antigenic properties, also
called “antigenic shift” (e.g., [5]).
The actual changes that may have led to the emergence of past pandemics start to become clearer thanks
to a number of studies. For instance, the first pandemic
© 2011 Abdussamad and Aris-Brosou; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the
Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly cited.
of the 20th century in 1918, also known as ‘Spanish Flu’,
was caused by an H1N1 virus, which was isolated and
sequenced from a casualty preserved in the Alaskan permafrost [6]. Structural and genetic studies have shown
that this particular 1918 virus lacked a cleavage site in
HA [7], that virulence was determined by several proteins including HA, the replication complex, NS1 and
PB1-F2 [2,8], while HA and PB2 played an important
role in viral transmissibility [9]. The precise origin of
this 1918 virus is however difficult to trace back in time
due to the absence of genetic information on the viruses
circulating before the 20th century.
The emergence of the 2009 H1N1 pandemic is, on the
other hand, not as well understood. Structural information revealed that the 2009 HA protein had a striking
similarity to its 1918 counterpart [10]. In a landmark
study, Smith and collaborators showed that the etiologic
agent of the 2009 pandemic had three key features: (i)
the polymerase genes as well as HA, NP and the NS
genes emerged from triple reassortant North-American
swine viruses while the NA and M genes originated
from avian-like swine viruses, (ii) the pandemic
viruses diversified about a year before the onset of the
pandemic and (iii) a long branch separated the
diversification of these pandemic viruses from their first
emergence [11]. These authors suggested that the long
branch leading to the diversification of the pandemic
viruses reflects both a long unsampled history and mild
evidence for positive selection, but they did not fully
characterize the adaptive nature of the pandemic. It is
also unclear whether the actual host-switch events from
non-human animals to humans have an adaptive nature.
Here we revisit the adaptive nature of the 2009 H1N1
pandemic with a detailed analysis of the role of selection
in (i) the emergence of this virus, and (ii) its adaptation
to human hosts. On the basis of an extended data set
compared to [11], we show that while the acquisition of
efficient human-to-human transmission was driven by
positive selection, the emergence of the 2009 H1N1
pandemic was essentially nonadaptive, and resulted
from stochastic processes, which in turn are expected to
make the prediction of such dramatic events difficult.
Results and Discussion
Phylodynamics of the 2009 H1N1 pandemic
We downloaded 1180 complete Influenza A genomes of
the H1N1 subtype in North America collected between
year 2000 and 2010, and selected only the gene
sequences with at most 99.99% similarity for each of the
ten “canonical” protein-coding genes (see Methods).
This clustering step allowed us to perform all phylogenetic analyses in a reasonable timeframe while conserving most of the sequence diversity present in the
original data set. In order to reconstruct rooted
phylogenetic trees for each of these genes, we used the
‘relaxed molecular clock’ approach implemented in
BEAST [12] (see [13] for rooting a tree with a clock),
where tip dates were set to the collection year of each
virus. A calibration scheme at a finer time-scale was not
used because the information about the collection
month was missing from some of the sampled genomes.
The substitution models selected by the Akaike Information Criterion [14] were all GTR + Γ + I, except for
PB1 (GTR + Γ), M2, M1 and NS2 (TVM + Γ) and NS1
(TVM + I). Since TVM-based models are not implemented in BEAUti, we employed the next best AIC
model, which in all cases was based on GTR, for the
relaxed clock analyses with BEAST.
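The clustering step described above can be illustrated with a minimal sketch. It assumes that the sequences of a gene are already aligned to the same length and applies a simple greedy filter; the function names (identity, cluster_representatives) and the toy sequences are hypothetical, and the procedure actually used is the one given in the Methods.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def cluster_representatives(seqs, threshold=0.9999):
    """Greedy filter (an assumption, not the authors' tool): keep a sequence
    only if its identity to every sequence kept so far is at most `threshold`."""
    kept = []
    for name, seq in seqs:
        if not any(identity(seq, rep) > threshold for _, rep in kept):
            kept.append((name, seq))
    return kept


# Toy data: "b" duplicates "a" and is dropped; "c" differs at one site and is kept.
toy = [("a", "ACGTACGTAC"), ("b", "ACGTACGTAC"), ("c", "ACGTACGTAT")]
representatives = cluster_representatives(toy)
print([name for name, _ in representatives])
```

With genes of one to a few thousand nucleotides, a 99.99% threshold removes little more than exact duplicates, which is in line with the statement that most of the sequence diversity is conserved.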
The results show that H1N1 sequences across all ten
genes have very similar histories (Figure 1 for the NA
gene; see Additional File 1 and 2). If we assume that the
ancestral H1N1 genome is of swine or other nonhuman origin [11], there were a minimum of three
host-switch events to human: two occurred on internal
(“deep”) branches, one of which led to the 2009 pandemic. This particular host-switch event was placed on
the long branch sustaining the 2009 clade rather than
on the short branch leading to the Mexico-swine-2009
genome because the position of this latter genome
within the 2009 clade is weakly supported (posterior
probabilities ≤ 0.20 over the ten genes analyzed). The
third host-switch event occurred on a terminal branch
of the tree (Figure 1). It is notable that only one of the
two internal host-switch events led to a pandemic,
which suggests that the two processes of host-switch
event and ‘pandemicity’ are not tightly coupled, as
already suggested by the 2005 H5N1 viruses. In the rest
of the text, we will denote the part of the tree that leads
to the human 2009 pandemic sequences as the “pandemic clade”, while all the other human sequences are
part of the “non-pandemic clade” (Figure 1).
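One way to make the notion of a minimum number of host-switch events concrete is Fitch parsimony on a rooted tree whose tips are labelled by host. The sketch below is an illustration under that assumption only: the text does not state that Fitch parsimony was used, and the toy topology is hypothetical and is not the tree of Figure 1.

```python
def fitch(tree):
    """Return (possible_states, changes) for a rooted binary tree.

    A leaf is a host label (str); an internal node is a pair (left, right).
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_changes = fitch(left)
    right_states, right_changes = fitch(right)
    shared = left_states.intersection(right_states)
    changes = left_changes + right_changes
    if shared:
        return shared, changes
    # Disjoint child sets: at least one host change is needed below this node.
    return left_states.union(right_states), changes + 1


# Hypothetical toy topology, not the tree of Figure 1.
toy_tree = (("swine", ("human", "human")), ("swine", ("swine", "human")))
root_states, min_changes = fitch(toy_tree)
print(sorted(root_states), min_changes)
```

The count returned is a lower bound on the number of host changes; it does not by itself say on which branch each change occurred, which is why weakly supported placements such as that of the Mexico-swine-2009 genome matter.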
The relaxed clock analyses also allow us to derive
three additional results on (i) the population dynamics
of the different segments and genes, (ii) their rates of
evolution and (iii) their coalescence times. First, the
results of the skyline analyses show that the population
dynamics of the different segments and genes exhibit
two sharply contrasting trajectories (Figure 2). While most
segments followed similar and downward dynamics in
the past, a decoupling event or a series of such events
took place at most five years before 2009, ca. 2004,
when seven of the ten genes underwent a rapid expansion suggestive of a selective sweep. The time resolution
of our analyses is too low for us to derive more accurate
dates, but the suddenness of this expansion suggests
that it would have been difficult to forecast as it represents a dramatic departure from the previously decreasing trend. These “expanding genes” include one of the
polymerase genes (PB1), the two antigenic determinants
(HA and NA), and the genes on the last two segments,
M and NS. On the other hand, two of the polymerase
genes (PB2 and PA) as well as the nucleoprotein (NP)
underwent a steady decrease in terms of scaled effective
population size (Neτ). Note first that these estimates are
relative to the viral population, not to the host’s
dynamics, and therefore represent the incidence of the
virus rather than its prevalence [15]. Second, segment
dynamics are not linked to the origin of the segments or
genes, as PA and NP, which come from a North American avian and a classical avian source, respectively [2],
still exhibit similar dynamics (Figure 2A). Yet, such a
decoupling of segment dynamics is not atypical in Influenza A viruses (see [16]). One notable difference with
the latter study, however, is that our reconstruction goes
40 years back in time before the 2009 outbreak without
encountering any of the oscillations reconstructed over
a 14-year period for H3N2 viruses [16]. A potential
explanation is that the pattern observed here is due to
our smaller effective sample size (after clustering of
sequences at the 99.99% similarity level), and/or to the
lower temporal resolution of our analysis. In spite of
these potential confounding factors, the lack of oscillations detected in our results might also reflect the lack
of evidence for seasonality in H1N1 dynamics, which is
consistent with the dominant incidence of H3N2 viruses
in the human population between 1968 (the year of the
‘Hong Kong Flu’) and 2009 [16]. While in the face of
the 2009 pandemic it makes sense that Neτ for both the
HA and NA antigens increased, it is unclear (i) why Neτ
decreased for some segments and (ii) why a decoupling
is inferred within the polymerase genes, setting PB2,
which has a role in host restriction (e.g., [17]), apart.
This decoupling of segments cannot be due to our
sequence clustering that eliminated highly similar
sequences, but under-sampling of genomes cannot be
ruled out (see below).
[Figure 1: time-scaled phylogenetic tree of the NA gene. Tip labels give accession number, subtype, country, host and collection year; the swine clade, the non-pandemic clade and the pandemic clade are annotated. Horizontal axis: absolute time (in years before 2009).]
Figure 1 Phylogeny estimated for the NA protein-coding gene. Sequences from the 2009 H1N1 pandemic and the clade sustaining them
are in red. Host-switch events are marked with a star (*): in black for swine-to-human events and in gray for human-to-swine events. Branch
lengths are proportional to time; the horizontal axis below the tree gives the scale.
[Figure 2: two panels, (A) and (B). Horizontal axis: time (in years before 2009); vertical axis: Neτ.]
Figure 2 Skyline reconstructions of the demographics of all ten protein-coding genes. The effective population size (Ne) scaled to
generation time (τ) is plotted against absolute time, expressed in years before 2009. (A): the three genes with decreasing Ne × τ,
represented by filled symbols. (B): the seven genes with the recent increase in Ne × τ.
Second, the posterior distributions of the absolute
rates of evolution are summarized in Figure 3. These
rates are similar to those estimated in previous studies
(e.g., [11,16]), and our results suggest that there is
extensive rate heterogeneity between the different segments of the Influenza A genomes of H1N1 viruses, and
even within segments as demonstrated in particular by
the posterior estimates for M2 and M1 (Figure 3). Post hoc comparisons of rates sampled from their posterior
distributions, either by means of Tukey HSD or pairwise
t tests, show significant differences at the α = 0.001
level, even under the very conservative Bonferroni correction. Therefore, Influenza A viruses of different subtypes evolve at different rates as reviewed before [18],
and each of their protein-coding genes, even on the
same segment, exhibits significant rate heterogeneity.
Summarizing rates of evolution of Influenza A viruses
and possibly other segmented RNA viruses by a single
number might therefore not give a realistic picture of
the extensive rate variation found in these viruses.
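To give a sense of how conservative the Bonferroni correction mentioned above is, the following sketch works out the per-comparison threshold. It assumes that every unordered pair of the ten genes is compared once; the text does not spell out the exact set of comparisons, so the count is an assumption.

```python
from math import comb

genes = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M2", "M1", "NS2", "NS1"]
alpha = 0.001  # family-wise significance level quoted in the text

# Assumption: every unordered pair of genes is compared once.
n_comparisons = comb(len(genes), 2)
per_test_alpha = alpha / n_comparisons

print(n_comparisons, per_test_alpha)
```

Under this assumption each individual pairwise test has to reach a threshold 45 times smaller than 0.001 to be declared significant.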
Third, Table 1 shows that the pandemic and non-pandemic H1N1 protein-coding genes analyzed here coalesced on average 65 years before 2009, that is, around
1944 (SEM = 18.52 years, excluding PB1 and NS1). NS1
and in particular PB1 have both been circulating for
much longer periods of time (since 1878 and 1728,
respectively; Table 1). Keeping in mind that the accuracy
of the estimated dates depends on the density of sampled
[Figure 3: box-and-whisker plot. Horizontal axis: the 10 standard genes of the Influenza A genome (PB2, PB1, PA, HA, NP, NA, M2, M1, NS2, NS1); vertical axis: absolute mean rates (sub/site).]
Figure 3 Box-and-whisker plot of posterior absolute mean
rates of evolution for all ten protein-coding genes. The mean
rates were sampled by BEAST from their respective posterior
distributions.
genomes, three points can be made here: (i) these dates
are much deeper than in [11], where time to the most
recent common ancestor (TMRCA) of the sampled
sequences goes back to ~ 1985, due to the lower breadth
of their sampling strategy. The inclusion of the 1918 Brevig genome A/Brevig Mission/1/1918 [6] for instance
would only pull this root age further back in time; (ii) we
also detected variation of coalescence times within segments: the protein-coding genes on segments 7 and 8,
M2-M1 on the one hand and NS2-NS1 on the other
hand, coalesced at slightly different dates, although their
95% HPDs overlap - which might be due to a combination of short sequences and small numbers of variable
sites in the overlapping genes on segments 7 and 8; (iii)
the observation that different segments share the same
coalescence times has already been documented and
interpreted as evidence for the correlated evolution and
co-transmission of segments [19], so that there would be
genetic linkage between segments. However, [19] found
that coalescence times were shared by the PB2/PB1/PA/
NP/M segments, while our Table 1 suggests that PB2/
PA/HA/NA/NS2 have similar root age and therefore
could be linked. The different linkage groups or constellations might be a characteristic of the different viruses
studied (avian Influenza viruses of different subtypes in
[19] vs. H1N1 in human and swine here). However, it is
also possible that such gene constellations are highly
labile both in time and across subtypes.
A way to test this lability hypothesis is to estimate the
TMRCA of the pandemic sequences. Unlike the TMRCA
of the sampled H1N1 sequences, the emergence of the
Table 1 Estimated dates for the gene-specific ages of the root of the sampled H1N1 sequences, the divergence of the
pandemic clade (MRCApandemic), and for the diversification of the 2009 pandemic sequences (Pandemic age)
Gene   Root     95% HPDroot      MRCApandemic   95% HPD MRCApandemic   Pandemic age   95% HPDpandemic
PB2    74.28    105.75-46.09     8.54           12.06-5.35             1.00           1.60-0.48
PB1    281.40   491.80-109.34    15.44          23.53-9.30             4.37           6.65-2.32
PA     79.71    111.65-49.45     13.56          19.67-7.88             1.03           1.58-0.56
HA     83.87    129.77-45.50     7.79           12.57-3.96             1.19           1.62-0.82
NP     55.39    95.16-22.36      7.88           11.34-5.08             0.93           1.60-0.37
NA     74.61    93.17-57.07      43.03          56.95-28.56            1.53           2.08-0.94
M2     29.51    39.69-19.98      14.76          20.99-8.72             1.30           1.31-1.20
M1     48.69    66.69-31.12      34.07          49.57-17.94            1.70           2.63-0.88
NS2    70.70    97.57-46.96      7.35           9.67-5.38              1.22           1.84-0.81
NS1    130.95   189.46-78.78     9.16           12.32-6.38             1.11           1.55-0.76
All dates are in years before 2009.
Notes–HPD: Highest Posterior Density; MRCA: Most Recent Common Ancestor.
pandemic sequences shows a very consistent date across
all segments and genes at 1.22 years before 2009, that is
during the last semester of 2007 (SEM = 0.25 year). This
date is slightly older than previous estimates that put the
TMRCA of pandemic sequences sometime between mid-2008 [11] and early January 2009 [1]; this difference can be
due to relaxation of selective constraints that are not
directly accounted for here [11], slight differences in
model specifications ([1] used a coalescent prior with
exponential growth rather than the skyline model used
here) and to our generally broader (but less dense) sampling of genomes. Based on the synchrony argument
used above and in [19], this result suggests the formation
of a new gene constellation in late 2007. The emergence of this constellation could be the consequence of a
selective sweep, as suggested in the case of avian Influenza viruses [19], but it could also be due to a demographic bottleneck in the viral population or other
nonadaptive processes.
The lability hypothesis has a corollary that is easily testable: although the reassortment events that led to the
emergence of the pandemic strain have a history that
goes back to the early 1990s [11], consistent with the
TMRCA estimated here (MRCApandemic in Table 1), the
coalescence times of the genomes analyzed here occurred
only shortly before the pandemic. The most recent common ancestor of the pandemic clade (MRCApandemic) has
a mean age of 16.16 years before 2009 (SEM = 12.36
years; Table 1), which corresponds to the end of 1992.
The 14.94 years (= 16.16 - 1.22) gap separating this
MRCApandemic from the pandemic clade represents a long
period of time when sequences leading to the 2009 pandemic were not sampled [11]. But the long branch leading to the pandemic clade (Additional File 2) could also
be due to the simultaneous action of positive selection.
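The averages and calendar dates quoted in this section can be rechecked from the point estimates of Table 1. The sketch below is such a check. It relies on several assumptions: the PA and HA values are read from the flattened table in column order, the average pandemic age leaves PB1 out (the quoted 1.22 is reproduced only with that exclusion), and the spread is computed as a sample standard deviation, which is the statistic that reproduces the quoted 18.52, 12.36 and 0.25. The helper name summarize is hypothetical.

```python
from statistics import mean, stdev

# Point estimates from Table 1, in years before 2009.
root_age = {"PB2": 74.28, "PB1": 281.40, "PA": 79.71, "HA": 83.87, "NP": 55.39,
            "NA": 74.61, "M2": 29.51, "M1": 48.69, "NS2": 70.70, "NS1": 130.95}
mrca_pandemic = {"PB2": 8.54, "PB1": 15.44, "PA": 13.56, "HA": 7.79, "NP": 7.88,
                 "NA": 43.03, "M2": 14.76, "M1": 34.07, "NS2": 7.35, "NS1": 9.16}
pandemic_age = {"PB2": 1.00, "PB1": 4.37, "PA": 1.03, "HA": 1.19, "NP": 0.93,
                "NA": 1.53, "M2": 1.30, "M1": 1.70, "NS2": 1.22, "NS1": 1.11}


def summarize(ages, exclude=()):
    """Mean and sample standard deviation over the genes not excluded."""
    values = [age for gene, age in ages.items() if gene not in exclude]
    return mean(values), stdev(values)


# PB1 and NS1 are excluded from the root average, as stated in the text.
root_mean, root_sd = summarize(root_age, exclude=("PB1", "NS1"))
mrca_mean, mrca_sd = summarize(mrca_pandemic)
# Assumption: PB1 is left out here, since this is what reproduces 1.22.
pand_mean, pand_sd = summarize(pandemic_age, exclude=("PB1",))

print("root:", round(root_mean, 1), round(root_sd, 2), "calendar year", round(2009 - root_mean))
print("MRCApandemic:", round(mrca_mean, 2), round(mrca_sd, 2))
print("pandemic age:", round(pand_mean, 2), round(pand_sd, 2))
print("gap:", round(mrca_mean - pand_mean, 2))
```

The gap computed from unrounded means can differ in the last digit from the 14.94 years obtained by subtracting the rounded values 16.16 and 1.22.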
Test of positive selection for the 2009 H1N1 pandemic
To test the hypothesis that the long branch leading to
the 2009 pandemic might represent the action of positive selection, we performed a branch-site test of positive selection along this branch in all ten protein-coding
genes of the H1N1 Influenza A genome. The results,
presented in Table 2, demonstrate quite clearly
that none of the ten protein-coding genes shows any
evidence for positive selection, thereby suggesting that
this long branch reflects exclusively a period of 15 years
of unsampled history, and hence a dramatic failure of
the current surveillance system of circulating Influenza
viruses [11].
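The comparison behind such a test can be sketched as a likelihood ratio test between the null model (H0) and the alternative model (H1). The example below assumes a chi-square reference distribution with one degree of freedom, which is a common convention for the branch-site test but is not stated in this section, and it takes the HA and NA log-likelihoods as they appear in Table 2. The function name lrt_p_value is hypothetical.

```python
from math import erfc, sqrt


def lrt_p_value(lnl_null, lnl_alt):
    """Statistic and p-value of a likelihood ratio test, assuming a
    chi-square reference distribution with one degree of freedom."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return statistic, erfc(sqrt(statistic / 2.0))


# Log-likelihoods (H0, H1) as listed in Table 2 for HA and NA.
for gene, lnl0, lnl1 in [("HA", -10294.19, -10293.78), ("NA", -8085.11, -8085.01)]:
    stat, p = lrt_p_value(lnl0, lnl1)
    print(gene, round(stat, 2), round(p, 3))
```

For the genes whose two log-likelihoods are identical in Table 2, the statistic is zero and the p-value is 1 under the same assumption; the values printed for HA and NA can be compared with the p-value column of Table 2.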
One potential caveat with our analysis is that codon
models assume that all nonsynonymous differences
observed in the data are fixed [20]. However, when data
are sampled at the population level, as is most likely the
case here, it is possible that most of the observed differences do in fact represent segregating polymorphisms.
This is known to render the use of non-synonymous to
synonymous rate ratios (ω’s) potentially problematic, as
estimated ω ratios can take values < 1 within a population even in the presence of very strong positive selection [21]. As some of the nonsynonymous differences in
our data are potentially transient polymorphisms, we
reanalyzed the same data with two tests based on population genetics principles. First, we employed the McDonald-Kreitman test (MKT), which is a two-population
neutrality test that compares the ratio of fixed nonsynonymous to synonymous differences to the ratio of
polymorphic nonsynonymous to synonymous differences
[22]. Here, a first “population” consisted of the
sequences from the pandemic clade, while the other
“population” contained all the remaining sequences in
order to match the specification of the codon-based test
Table 2 Neutrality tests and test of positive selection for the human 2009 H1N1 pandemic
Model
PB2
H1
PB1
pD
0.573
109
-10489.98
-10151.99
-10151.99
0.671
0.072
117
110
0.169
111
-9736.96
p-value
ω
pω sites (95%)
0.037
1.000
0.967 na
1.000
0.000 none
0.023
0.940 na
1.000
0.023
0.940 none
0.032
0.955 na
0.999
0.032
0.955 none
-9736.96
0.327
H0
H1
ln L
-10489.98
116
0.770
np
108
H0
H1
PA
p-MKT
H0
HA
H0
158
-10294.19
0.067
0.862 na
0.844
0.026
159
110
-10293.78
-6402.29
0.362
NP
H1
H0
1.000
0.039
0.007 none
0.969 na
H1
0.982
0.392
111
-6402.29
0.989
1.000
0.000 none
154
-8085.11
-8085.01
NA
H0
H1
M2
0.015
0.034
155
224
0.007
225
-1699.98
0.846 na
0.655
2.345
0.001 none
0.132
0.000 na
0.976
1.171
0.384 none
-1699.98
0.436
H0
H1
0.075
M1
H0
142
-3175.82
0.025
0.973 na
0.894
0.101
143
286
-3175.82
-2341.50
1.000
NS2
H1
H0
1.000
0.082
0.000 none
0.865 na
H1
0.901
0.005
287
-2341.50
0.984
1.000
0.000 none
182