<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Sol Messing</title>
<link>https://solmessing.netlify.app/</link>
<atom:link href="https://solmessing.netlify.app/index.xml" rel="self" type="application/rss+xml" />
<description>Sol Messing</description>
<generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Mon, 24 Oct 2022 00:00:00 +0000</lastBuildDate>
<image>
<url>https://solmessing.netlify.app/media/icon_huedd5de82286bc1d0a1509b535f624c76_25974_512x512_fill_lanczos_center_3.png</url>
<title>Sol Messing</title>
<link>https://solmessing.netlify.app/</link>
</image>
<item>
<title>Example Talk</title>
<link>https://solmessing.netlify.app/talk/example-talk/</link>
<pubDate>Sat, 01 Jun 2030 13:00:00 +0000</pubDate>
<guid>https://solmessing.netlify.app/talk/example-talk/</guid>
<description><div class="alert alert-note">
<div>
Click on the <strong>Slides</strong> button above to view the built-in slides feature.
</div>
</div>
<p>Slides can be added in a few ways:</p>
<ul>
<li><strong>Create</strong> slides using Wowchemy&rsquo;s <a href="https://wowchemy.com/docs/managing-content/#create-slides" target="_blank" rel="noopener"><em>Slides</em></a> feature and link using <code>slides</code> parameter in the front matter of the talk file</li>
<li><strong>Upload</strong> an existing slide deck to <code>static/</code> and link using <code>url_slides</code> parameter in the front matter of the talk file</li>
<li><strong>Embed</strong> your slides (e.g. Google Slides) or presentation video on this page using <a href="https://wowchemy.com/docs/writing-markdown-latex/" target="_blank" rel="noopener">shortcodes</a>.</li>
</ul>
<p>Further event details, including <a href="https://wowchemy.com/docs/writing-markdown-latex/" target="_blank" rel="noopener">page elements</a> such as image galleries, can be added to the body of this page.</p>
</description>
</item>
<item>
<title>An Early Election 2024 Forecast</title>
<link>https://solmessing.netlify.app/post/election_projection_regularized_swing/</link>
<pubDate>Thu, 11 Jan 2024 00:00:00 +0000</pubDate>
<guid>https://solmessing.netlify.app/post/election_projection_regularized_swing/</guid>
<description><p>Early projections for 2024 based on previous Presidential and House returns slightly favor Republicans. These projections are completely unrelated to Biden’s recent polling numbers.</p>
<!-- ![](/img/Map_for_2024_JS_Swing.png "A simple forecasted map for 2024. Created https://www.270towin.com/maps/WWE2B.") -->
<p>Here&rsquo;s the story behind this approach: In early 2020, I ran battleground state election forecasts for Acronym. The results suggested Georgia would be extremely competitive—and Acronym spent more $ there than many other non-profit actors. After the election, we could see that those projections had much lower forecasting error than polling data <a href="https://solomonmg.github.io/post/what-the-polls-got-wrong-in-2020/" target="_blank" rel="noopener">https://solomonmg.github.io/post/what-the-polls-got-wrong-in-2020/</a>.</p>
<p>Because this approach does not use polling data, it&rsquo;s not susceptible to any of the potential problems with polls I discuss in that post: undecided voters breaking late, low-education non-response, bad likely-voter modeling, partisan non-response, shy Trumpers, etc.</p>
<p>The core idea behind this approach is a fact not emphasized enough in most stats/ML courses: if you’re going to predict something, it’s very hard to do better than using the same variable at t - 1, if you can. And we can. This approach goes one step further: it looks at the direction that variable has been moving and assumes things are likely to keep moving in that same direction.</p>
<p>What that means for presidential election forecasts: for each state, estimate the &ldquo;swing&rdquo; from 2016 to 2020 for president and 2018-2022 for the U.S. house; then simply add that to 2020 presidential returns. Then those estimates of state-level swing are regularized&mdash;mathematically ``nudged&rsquo;&rsquo; toward national trends, which you’ll like if you believe “uniform swing” is particularly important. The projected state-level swing is weighted 60-40 toward presidential results.</p>
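<p>The projection step described above can be sketched in a few lines. (Python here purely for illustration; the actual analysis is in R, and every number below is invented rather than real state data.)</p>

```python
# Sketch of the regularized-swing projection described above.
# All numbers are illustrative; the shrinkage weight is a stand-in
# for the regularization actually used, not the author's exact code.

def project_state(dem_2020, pres_swing, house_swing,
                  natl_pres_swing, natl_house_swing, shrink=0.5):
    """Project a state's Dem vote share: nudge each state-level swing
    toward the national mean swing, blend 60-40 (pres vs. House),
    then add the blended swing to the 2020 presidential result."""
    reg_pres = shrink * natl_pres_swing + (1 - shrink) * pres_swing
    reg_house = shrink * natl_house_swing + (1 - shrink) * house_swing
    blended = 0.6 * reg_pres + 0.4 * reg_house
    return dem_2020 + blended

# Example: a state at 49.5% Dem in 2020, with a -1.0 pt presidential
# swing (2016 to 2020) and a -4.0 pt midterm swing (2018 to 2022),
# against national mean swings of +1.0 and -2.0.
print(round(project_state(49.5, -1.0, -4.0, 1.0, -2.0), 2))  # prints 48.3
```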
<p>Here&rsquo;s a cleaner plot showing the actual forecast values in potential 2024 battleground states:</p>
<p>
<figure id="figure-a-simple-forecast-for-2024-battleground-states-hat-tip-to-tom-cunninghamhttpstecunninghamgithubio-who-suggested-this-plot-design">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/EstStateDemVote.png" alt="A simple forecast for 2024 battleground states. Hat tip to [Tom Cunningham](https://tecunningham.github.io) who suggested this plot design." loading="lazy" data-zoomable /></div>
</div><figcaption>
A simple forecast for 2024 battleground states. Hat tip to <a href="https://tecunningham.github.io" target="_blank" rel="noopener">Tom Cunningham</a> who suggested this plot design.
</figcaption></figure>
</p>
<p>Data and code are available here: <a href="https://github.com/SolomonMg/election_projection_regularized_swing" target="_blank" rel="noopener">https://github.com/SolomonMg/election_projection_regularized_swing</a>. Thanks to the <a href="https://electionlab.mit.edu" target="_blank" rel="noopener">MIT Election Data + Science Lab</a> for curating <a href="https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/42MVDX" target="_blank" rel="noopener">these</a> <a href="https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/IG0UN2" target="_blank" rel="noopener">data</a>.</p>
<h3 id="electoral-math">Electoral Math:</h3>
<p>I&rsquo;m going to rely on <a href="https://www.270towin.com" target="_blank" rel="noopener">www.270towin.com</a> to translate these projections into an electoral map. A better way to do this might be to come up with conservative estimates of error and simulate a few thousand elections, but I&rsquo;m not estimating an extremely rigorous Bayesian model nor including enough extant data to really justify a FiveThirtyEight style forecast.</p>
<p>If you call any state with less than a 3-point margin either way a &ldquo;tossup,&rdquo; here&rsquo;s what the electoral map looks like:</p>
<p>
<figure id="figure-elecotral-map-for-2024-lower-than-a-3-margin-is-a-tossup-created-httpswww270towincommapswwe2b">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/Map_for_2024_JS_Swing.png" alt="Elecotral map for 2024, lower than a 3% margin is a tossup. Created https://www.270towin.com/maps/WWE2B." loading="lazy" data-zoomable /></div>
</div><figcaption>
Electoral map for 2024, lower than a 3% margin is a tossup. Created <a href="https://www.270towin.com/maps/WWE2B" target="_blank" rel="noopener">https://www.270towin.com/maps/WWE2B</a>.
</figcaption></figure>
</p>
<p>That looks OK for Biden, but if you really trust this approach, you might want to say anything lower than 2% is a tossup. Then the electoral math looks very bad for Biden:</p>
<p>
<figure id="figure-elecotral-map-for-2024-lower-than-a-2-margin-is-a-tossup-created-httpswww270towincommapswwexg">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/Map_for_2024_JS_margin2Swing.png" alt="Elecotral map for 2024, lower than a 2% margin is a tossup. Created https://www.270towin.com/maps/WWExg." loading="lazy" data-zoomable /></div>
</div><figcaption>
Electoral map for 2024, lower than a 2% margin is a tossup. Created <a href="https://www.270towin.com/maps/WWExg" target="_blank" rel="noopener">https://www.270towin.com/maps/WWExg</a>.
</figcaption></figure>
</p>
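<p>The two maps above come from simple thresholding of the projected margins. A minimal sketch (Python for illustration; these state margins are made up, not the forecast&rsquo;s actual values):</p>

```python
# Classify projected Dem-minus-Rep margins (in points) into map labels.
# The margins below are invented for illustration only.

def classify(margin, tossup=3.0):
    """Label a projected margin; anything inside the tossup band is a tossup."""
    if abs(margin) >= tossup:
        return "Dem" if margin > 0 else "Rep"
    return "Tossup"

margins = {"AZ": -1.2, "GA": -2.5, "MI": 3.4, "NV": 0.8, "PA": 2.1, "WI": 1.9}
print({s: classify(m, tossup=3.0) for s, m in margins.items()})
# With the stricter 2-point band, fewer states are tossups:
print({s: classify(m, tossup=2.0) for s, m in margins.items()})
```

Tightening the band from 3 points to 2 points is exactly the move in the post: states just outside the narrower band flip from "Tossup" to a called state.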
<h3 id="observation-polarization-and-accuracy">Observation: Polarization and Accuracy</h3>
<p>These projections essentially assume party identification, demographic trends, and voting behavior will mostly continue in the same general direction as in the past. They should have a lot of appeal if you think polarization means most people have already made up their minds about their vote for President, that presidential campaign effects are relatively small (in equilibrium at least), and/or that “demographics are destiny.” What&rsquo;s more, the results are regularized toward national trends, which you’ll like if you believe that <a href="https://press.uchicago.edu/ucp/books/book/chicago/I/bo27596045.html" target="_blank" rel="noopener">local politics has been ``nationalized,&rsquo;&rsquo; as Dan Hopkins argues</a> and thus that <a href="https://projects.fivethirtyeight.com/2020-swing-states/" target="_blank" rel="noopener">“uniform swing” in the electorate is an increasingly important factor explaining state-level election results</a>&mdash;even though Florida bucked the national trend in 2020.</p>
<p>In fact, over time, as polarization seems to worsen, this approach improves in accuracy:</p>
<p>
<figure id="figure-backtested-forecasts-improve-in-accuracy-over-time-and-2020-was-far-easier-to-predict-than-past-elections">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/Battleground_MAE_projections.png" alt="Backtested forecasts improve in accuracy over time, and 2020 was far easier to predict than past elections." loading="lazy" data-zoomable /></div>
</div><figcaption>
Backtested forecasts improve in accuracy over time, and 2020 was far easier to predict than past elections.
</figcaption></figure>
</p>
<p>However, these projections do not account for events since 2022. Older voters pass away and younger voters become eligible, changing the makeup of the electorate. Public opinion may shift with economic conditions (inflation, income, unemployment), policy developments such as those related to abortion, international affairs like the Gaza conflict, or candidate attributes like Biden’s age or Trump’s legal troubles.</p>
<h3 id="observation-patterns-in-us-elections">Observation: Patterns in U.S. Elections</h3>
<p>These projections also do not explicitly model well-known voting patterns, instead relying on the change from one cycle to the next to get reasonable estimates. The most notable pattern is that the president’s party almost always loses seats in the House in midterm elections. <a href="https://www.jstor.org/stable/2130810" target="_blank" rel="noopener">https://www.jstor.org/stable/2130810</a> <a href="https://fivethirtyeight.com/features/why-the-presidents-party-almost-always-has-a-bad-midterm/" target="_blank" rel="noopener">https://fivethirtyeight.com/features/why-the-presidents-party-almost-always-has-a-bad-midterm/</a></p>
<p>Because the model looks only at the state-level difference between the last two <em>midterm</em> cycles, these projections capture how midterm returns <em>change</em> in each state. That goes some way toward correcting for the consistently weaker midterm performance of the president’s party, and may in part reflect changes in sentiment toward the president.</p>
<p>A less reliable trend that’s held since FDR’s time is that incumbent presidents have tended to win a higher percentage of the popular vote in their bid for a second term&mdash;Obama in 2012 was a notable exception. <a href="https://www.presidency.ucsb.edu/statistics/data/presidential-election-mandates" target="_blank" rel="noopener">https://www.presidency.ucsb.edu/statistics/data/presidential-election-mandates</a> What’s more, House midterm results seem to be particularly bad just before an incumbent is voted out of office. It&rsquo;s not clear whether this is a bug or a feature, or how reliably this method would pick it up, but it&rsquo;s worth pointing out.</p>
<h3 id="observation-accuracy-over-previous-election-results">Observation: Accuracy over Previous Election Results</h3>
<p>If it&rsquo;s hard to do better than election returns at t - 1, does this approach actually do better? Yes, by a little. Including all states, these projections have lower mean absolute error (MAE). For some reason these projections miss badly in 2004; excluding that year (the earliest for which I can compute projections from the MIT data), they do in fact perform quite a bit better than simply relying on previous presidential election results alone.</p>
<p>
<figure id="figure-accuracy-over-time-for-projections-compared-with-simply-using-the-previous-election">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/AllStates_MAE_projections.png" alt="Accuracy over time for projections compared with simply using the previous election. " loading="lazy" data-zoomable /></div>
</div><figcaption>
Accuracy over time for projections compared with simply using the previous election.
</figcaption></figure>
</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-r" data-lang="r"><span class="line"><span class="cl"><span class="o">&gt;</span> <span class="n">bt_dat</span> <span class="o">%&gt;%</span> <span class="nf">group_by</span><span class="p">(</span><span class="n">proj_type</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">summarise</span><span class="p">(</span><span class="nf">mean</span><span class="p">(</span><span class="n">mae</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># A tibble: 2 × 2</span>
</span></span><span class="line"><span class="cl"> <span class="n">proj_type</span> <span class="nf">`mean</span><span class="p">(</span><span class="n">mae</span><span class="p">)</span><span class="n">`
</span></span></span><span class="line"><span class="cl"><span class="n"> &lt;chr&gt; &lt;dbl&gt;
</span></span></span><span class="line"><span class="cl"><span class="n">1 prev pres 3.11
</span></span></span><span class="line"><span class="cl"><span class="n">2 proj 2.65
</span></span></span><span class="line"><span class="cl"><span class="n">
</span></span></span><span class="line"><span class="cl"><span class="n">&gt; bt_dat %&gt;% filter(years != 2004) %&gt;% group_by(proj_type) %&gt;% summarise(mean(mae))
</span></span></span><span class="line"><span class="cl"><span class="n"># A tibble: 2 × 2
</span></span></span><span class="line"><span class="cl"><span class="n"> proj_type `</span><span class="nf">mean</span><span class="p">(</span><span class="n">mae</span><span class="p">)</span><span class="n">`</span>
</span></span><span class="line"><span class="cl"> <span class="o">&lt;</span><span class="n">chr</span><span class="o">&gt;</span> <span class="o">&lt;</span><span class="n">dbl</span><span class="o">&gt;</span>
</span></span><span class="line"><span class="cl"><span class="m">1</span> <span class="n">prev</span> <span class="n">pres</span> <span class="m">3.36</span>
</span></span><span class="line"><span class="cl"><span class="m">2</span> <span class="n">proj</span> <span class="m">2.51</span>
</span></span></code></pre></div><p>Results are more subtle if we restrict to battleground states:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-r" data-lang="r"><span class="line"><span class="cl"><span class="o">&gt;</span> <span class="n">bt_dat</span> <span class="o">%&gt;%</span> <span class="nf">group_by</span><span class="p">(</span><span class="n">proj_type</span><span class="p">)</span> <span class="o">%&gt;%</span> <span class="nf">summarise</span><span class="p">(</span><span class="nf">mean</span><span class="p">(</span><span class="n">mae</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1"># A tibble: 2 × 2</span>
</span></span><span class="line"><span class="cl"> <span class="n">proj_type</span> <span class="nf">`mean</span><span class="p">(</span><span class="n">mae</span><span class="p">)</span><span class="n">`
</span></span></span><span class="line"><span class="cl"><span class="n"> &lt;chr&gt; &lt;dbl&gt;
</span></span></span><span class="line"><span class="cl"><span class="n">1 prev pres 1.19
</span></span></span><span class="line"><span class="cl"><span class="n">2 proj 1.16
</span></span></span><span class="line"><span class="cl"><span class="n">&gt; bt_dat %&gt;% filter(years != 2004) %&gt;% group_by(proj_type) %&gt;% summarise(mean(mae))
</span></span></span><span class="line"><span class="cl"><span class="n"># A tibble: 2 × 2
</span></span></span><span class="line"><span class="cl"><span class="n"> proj_type `</span><span class="nf">mean</span><span class="p">(</span><span class="n">mae</span><span class="p">)</span><span class="n">`</span>
</span></span><span class="line"><span class="cl"> <span class="o">&lt;</span><span class="n">chr</span><span class="o">&gt;</span> <span class="o">&lt;</span><span class="n">dbl</span><span class="o">&gt;</span>
</span></span><span class="line"><span class="cl"><span class="m">1</span> <span class="n">prev</span> <span class="n">pres</span> <span class="m">1.28</span>
</span></span><span class="line"><span class="cl"><span class="m">2</span> <span class="n">proj</span> <span class="m">0.995</span>
</span></span></code></pre></div><h3 id="observation-regularization-toward-0-or-the-mean">Observation: Regularization toward 0 or the Mean?</h3>
<p>Here’s the map with shrinkage toward 0, which will move the estimates toward the prior year&rsquo;s election. Biden does worse in WI and slightly worse in AZ and PA, because the presidential swing estimates get pulled down toward zero instead of up toward the nation-wide state-level mean (3.5%). But he does better in NC, where the relatively good house results get pulled toward zero instead of down to the state-level average midterm swing (-11.5%).</p>
<p>
<figure id="figure-a-simple-forecasted-map-for-2024-shrinkage-toward-zero-created-httpswww270towincommaps66rzw">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/Map_for_2024_JS_0_Swing.png" alt="A simple forecasted map for 2024, shrinkage toward zero. Created https://www.270towin.com/maps/66rZw." loading="lazy" data-zoomable /></div>
</div><figcaption>
A simple forecasted map for 2024, shrinkage toward zero. Created <a href="https://www.270towin.com/maps/66rZw" target="_blank" rel="noopener">https://www.270towin.com/maps/66rZw</a>.
</figcaption></figure>
</p>
<p>However, based on my updated backtesting, the MAE estimates are worse when you shrink toward zero, which is what I did back in 2020. This makes me feel good because the mathematical/statistical theory says that shrinking toward the group mean should produce high-quality estimates, while there&rsquo;s not much theory suggesting that shrinking toward zero should improve estimation.</p>
<h3 id="methological-details">Methological Details</h3>
<p>For each state it estimates the &ldquo;swing&rdquo; from 2016 to 2020 for president and 2018-2022 for the U.S. house; then simply adds that to 2020 presidential returns. The projected state-level swing is weighted 60-40 toward presidential results.</p>
<p>Now, the tricky bit is that I estimate “swing” using a James-Stein-adjusted state-level slope. This method “shrinks” each slope toward 50-50 or toward the average slope. Shrinking toward 50-50 is what I used in 2020, but recent corrections to my backtesting scripts reveal that it has a slightly higher mean absolute error going back to the 2004 election.</p>
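<p>The difference between shrinking toward the group mean and shrinking toward zero (50-50) can be seen in a toy example. This Python sketch uses a fixed shrinkage weight as a simplified stand-in for the James-Stein factor, which in the real estimator depends on the estimated sampling variance:</p>

```python
import statistics

def shrink(swings, target=None, weight=0.5):
    """Pull each state-level swing partway toward a target.
    target=None shrinks toward the group mean (James-Stein style);
    target=0.0 reproduces the shrink-toward-zero variant used in 2020.
    'weight' is a fixed stand-in for the James-Stein factor, which
    in the real estimator depends on sampling variance."""
    if target is None:
        target = statistics.mean(swings)
    return [weight * target + (1 - weight) * s for s in swings]

state_swings = [3.0, -1.0, 5.0, 1.0]     # illustrative only; mean is 2.0
print(shrink(state_swings))              # prints [2.5, 0.5, 3.5, 1.5]
print(shrink(state_swings, target=0.0))  # prints [1.5, -0.5, 2.5, 0.5]
```

Note how the two targets move individual states in opposite directions: a state with a -1.0 swing is pulled up toward the group mean but pulled toward zero in the 2020 variant, which is exactly why the two maps in this post disagree in states like WI and NC.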
<p>I&rsquo;ve since updated the approach in a number of important ways, based on backtesting (looking at how well the method performs on past elections). I now regularize (or &ldquo;shrink&rdquo;) fewer quantities, and do so toward the mean instead of toward zero. I should also note that the original code I used had a few minor errors, which I&rsquo;ve since fixed.</p>
</description>
</item>
<item>
<title>Disaggregating 'Ideological Segregation'</title>
<link>https://solmessing.netlify.app/post/thoughts-on-election-2020/</link>
<pubDate>Wed, 02 Aug 2023 00:00:00 +0000</pubDate>
<guid>https://solmessing.netlify.app/post/thoughts-on-election-2020/</guid>
<description><h3 id="tldr">TLDR:</h3>
<ol>
<li>[UPDATED SEPT 30] Yesterday, Science <a href="https://solmessing.netlify.app/pdf/science.adk1211.pdf">published a letter I wrote</a> arguing that there is little evidence of algorithmic bias in Facebook’s feed ranking system that would serve to increase ideological segregation, also known as <a href="https://books.google.com/books/about/The_Filter_Bubble.html?id=Qn2ZnjzCE3gC" target="_blank" rel="noopener">the &ldquo;Filter Bubble&rdquo; hypothesis</a>.</li>
<li>This contradicts claims in <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a> that Newsfeed ranking increases ideological segregation. This claim was the main piece of evidence in the Science <a href="https://www.science.org/toc/science/381/6656" target="_blank" rel="noopener">Special Issue on Meta</a> that might support the controversial cover that suggested that Meta’s algorithms are “Wired to Split.”</li>
<li>The issue is that while domain-level analysis suggests feed-ranking increases ideological segregation, URL-level analysis shows <em>no difference</em> in ideological segregation before and after feed-ranking.</li>
<li>And we should strongly prefer their URL-level analysis. Domain-level analysis <em>effectively mislabels highly partisan content</em> as &ldquo;moderate/mixed,&rdquo; especially on websites like YouTube, Reddit, and Twitter (<a href="https://ori.hhs.gov/education/products/niu_authorship/mistakes/09mistake-a.htm" target="_blank" rel="noopener">aggregation bias/ecological fallacy</a>).</li>
<li>Interestingly, the authors seem to agree&mdash;the discussion section points out problems with domain-level analysis.</li>
<li>Another <em>Science</em> paper from the same issue, <a href="https://www.science.org/doi/10.1126/science.abp9364" target="_blank" rel="noopener">Guess et al 2023</a> shows (in the SM) that Newsfeed ranking actually <em>decreases</em> exposure to <em>political content</em> from like-minded sources compared with reverse-chronological feed ranking.</li>
<li>The evidence in the 4 recent papers is not consistent with a meaningful Filter Bubble effect in 2020; nor does it support the notion that Meta&rsquo;s algorithms are &ldquo;Wired to Split.&rdquo;</li>
<li>Furthermore, domain-level aggregation bias is a big issue in a great deal of past research on ideological segregation, because domain-level analysis <em>understates</em> media polarization. Because <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a> gives both URL- and domain-level estimates, we can see the magnitude of aggregation bias. It&rsquo;s huge.</li>
<li>I make a number of other observations about what we know about whether social media is polarizing and discuss implications for the controversial Science cover and Meta&rsquo;s flawed claims that this research is exculpatory.</li>
</ol>
<!-- 6. None of this is the last word on social media algorithms---as [González-Bailón et al 2023](https://www.science.org/doi/full/10.1126/science.ade7138) point out, we need additional research on friend/page/group/etc recommender systems, which may polarize the graph itself. -->
<h3 id="introicymi">Intro/ICYMI</h3>
<details>
<summary>Click to expand</summary>
<p>Last week saw the release of a series of <a href="https://www.science.org/toc/science/381/6656" target="_blank" rel="noopener">excellent papers in <em>Science</em></a>. I was particularly interested in <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a>, which measures &ldquo;ideological segregation.&rdquo; This concept is based on <a href="https://web.stanford.edu/~gentzkow/research/echo_chambers.pdf" target="_blank" rel="noopener">Matt Gentzkow and Jesse Shapiro&rsquo;s 2011 work</a>. As they note, &ldquo;The index ranges from 0 (all conservative and liberal visits are to the same outlet) to 1 (conservatives only visit 100% conservative outlets and liberals only visit 100% liberal outlets).&rdquo;</p>
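<p>The flavor of this measure can be illustrated with a toy calculation. The Python sketch below (with invented visit counts) computes an isolation-style index: each group&rsquo;s visit-weighted average exposure to conservative-leaning outlets, and then the gap between the two groups. It is meant only to convey the intuition, not to reproduce the paper&rsquo;s exact formula:</p>

```python
# Toy isolation-style index (invented visit counts; a sketch of the
# Gentzkow-Shapiro-flavored measure, not the paper's exact formula).

# outlet name maps to (conservative visits, liberal visits)
visits = {
    "outlet_A": (90, 10),   # heavily conservative audience
    "outlet_B": (50, 50),   # mixed audience
    "outlet_C": (10, 90),   # heavily liberal audience
}

def isolation(visits):
    """Visit-weighted average conservative share of the outlets each
    group visits; the index is conservative exposure minus liberal
    exposure (0 when both groups visit the same mix of outlets)."""
    def exposure(side):
        total = sum(v[side] for v in visits.values())
        return sum(v[side] * v[0] / (v[0] + v[1]) for v in visits.values()) / total
    return exposure(0) - exposure(1)

print(round(isolation(visits), 3))  # prints 0.427
```

Aggregation bias enters when the unit is too coarse: if "outlet" is a whole domain like YouTube, its mixed overall audience hides the segregated partisan channels within it, pushing an index like this toward zero.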
<p>This paper also replicates and extends my own work with Eytan Bakshy and Lada Adamic, also published in <a href="https://solomonmg.github.io/pdf/Science-2015-Bakshy-1130-2.pdf" target="_blank" rel="noopener"><em>Science</em> in 2015</a>.</p>
<p>To be clear, <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a> goes a lot further, examining how these factors vary over time, investigating clusters of isolated partisan media organizations, and patterns in the consumption of misinformation. They find (1) ideological segregation is high; (2) ideological segregation “increases after algorithmic curation,” consistent with the “Filter Bubble” hypothesis; (3) there is a substantial right-wing “echo chamber” in which conservatives are essentially siloed from the rest of the site; and (4) misinformation thrives in that echo chamber.</p>
<p>I&rsquo;ve had many years to think about issues related to these questions, after working with similar data from 2012 in my dissertation, and co-authoring a <a href="https://solomonmg.github.io/pdf/Science-2015-Bakshy-1130-2.pdf" target="_blank" rel="noopener">Science paper</a> while working at Facebook using 2014 data. I also saw the design (but not results) presented at the <a href="https://www.ssrc.org/programs/digital-platforms-initiative/2023-ssrc-workshop-on-the-economics-of-social-media/" target="_blank" rel="noopener">2023 SSRC Workshop on the Economics of Social Media</a>, though I did not notice these issues until I saw the final paper.</p>
<p>I put together these thoughts after discussion and feedback from Dean Eckles and Tom Cunningham, former colleagues at Stanford, Facebook, and Twitter.</p>
<!-- [Click here to read the backstory on Echo Chambers, Filter Bubbles, and Selective Exposure, including where my own past work fits in](#a-brief-history-of-echo-chambers-filter-bubbles-and-selective-exposure) -->
</details>
<h3 id="a-brief-history-of-echo-chambers-filter-bubbles-and-selective-exposure">A Brief History of Echo Chambers, Filter Bubbles, and Selective Exposure</h3>
<details>
<summary>Click to expand</summary>
<!-- 20 years ago I worked as a foreign media analyst and I noticed that a great deal of misinformation circulating in news websites in Indonesia and the Middle East came from Alex Jones' InfoWars (yes he's been around for a long time). I became fascinated with the question of how people get their media and how technology changes that. -->
<p>The conventional academic wisdom when I started my PhD was that we shouldn&rsquo;t expect to see much in the way of media effects (<a href="https://books.google.com/books/about/The_Effects_of_Mass_Communication.html?id=CzcGAQAAIAAJ" target="_blank" rel="noopener">Klapper 1960</a>) because people tended to &ldquo;select into&rdquo; content that reinforced their views (<a href="https://www.jstor.org/stable/2747198" target="_blank" rel="noopener">Sears and Freedman 1967</a>).</p>
<p>In 2007, Cass Sunstein wrote <a href="https://www.jstor.org/stable/j.ctt7tbsw" target="_blank" rel="noopener">&ldquo;Republic.com 2.0&rdquo;</a>, which warned that the internet could allow us to even more easily isolate ourselves into &ldquo;information cocoons&rdquo; and &ldquo;echo chambers.&rdquo; Technology allows us to &ldquo;filter&rdquo; exactly what we want to see, and design our own programming. Cass also suggested this could lead to polarization.</p>
<!-- What's more, new media should be expected to further strengthen this "minimal effects" hypothesis ([Bennett and Iyengar 2008](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=a2fcebd37ba1e919f662a287841b0356603cd0a4)). -->
<p>In 2009-2010, Sean Westwood and I ran a series of studies suggesting that <a href="https://journals.sagepub.com/doi/10.1177/0093650212466406" target="_blank" rel="noopener">popularity and social cues</a> in news aggregators and social media websites might be a way out of selective news consumption&mdash;we might be <em>more</em> exposed to cross-cutting views on platforms that feature a social component, and furthermore, this social component seemed to be more important than the media &ldquo;source&rdquo; label. That was great in the abstract, but what happens on actual websites that people use?</p>
<p>The extent to which widely used social media platforms might allow us to exit &ldquo;echo chambers&rdquo; depended on the extent of cross-partisan friendships and interactions on platforms like Facebook. I did a PhD internship at Facebook to look into that, and the question of whether encountering <a href="https://www.dropbox.com/s/nu39148ukbab34r/CH7brief.pdf?raw=true" target="_blank" rel="noopener">political news on social media was ideologically polarizing</a> (note that the results are not as well powered as I would like). That work evolved into dissertation chapters and eventually our <a href="https://solomonmg.github.io/pdf/Science-2015-Bakshy-1130-2.pdf" target="_blank" rel="noopener">Science paper</a>.</p>
<p>Now <a href="https://en.wikipedia.org/wiki/Eli_Pariser" target="_blank" rel="noopener">Eli Pariser</a> had just published a book on &ldquo;<a href="https://books.google.com/books/about/The_Filter_Bubble.html?id=Qn2ZnjzCE3gC" target="_blank" rel="noopener">Filter Bubbles</a>&rdquo; suggesting that media technologies like Google Search and Facebook NewsFeed not only allowed us to ignore the &ldquo;other side,&rdquo; but actively filtered out search results and friends&rsquo; posts with perspectives from the other side.</p>
<p>What&rsquo;s more, a lot of people in the Human Computer Interaction (HCI) world were very interested in how one might examine this empirically, and I started collecting data with Eytan Bakshy that would do just that. The paper would allow us to quantify echo chambers created by our network of contacts, filter bubbles, and partisan selective exposure in social media by looking at exposure to <a href="https://www.jstor.org/stable/3117813" target="_blank" rel="noopener">ideologically &ldquo;cross-cutting&rdquo; content</a>.</p>
<p>We defined a few key components:</p>
<p><strong>Random</strong> - The set of content (external URLs) shared on Facebook writ large.</p>
<p><strong>Potential</strong> - The set of content shared by one&rsquo;s friends.</p>
<p><strong>Exposed</strong> - The set of content appearing in one&rsquo;s Newsfeed.</p>
<p><strong>Selected</strong> - The set of content one clicks on.</p>
<p><strong>Endorsed</strong> - The set of content one &lsquo;likes&rsquo;.</p>
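<p>To make these stages concrete, here&rsquo;s a minimal toy sketch (hypothetical alignment scores for a single made-up liberal user; not the study&rsquo;s data or methodology) of how one might track the share of ideologically cross-cutting content surviving from <strong>Potential</strong> to <strong>Exposed</strong> to <strong>Selected</strong>:</p>

```python
# Toy sketch of the exposure funnel (hypothetical data, not the paper's
# pipeline). Each story gets an ideological alignment score in [-1, 1];
# for a liberal user, positive scores count as "cross-cutting."
potential = [-0.8, -0.5, -0.2, 0.1, 0.4, 0.7]  # shared by friends
exposed   = [-0.8, -0.5, -0.2, 0.4]            # surfaced by feed ranking
selected  = [-0.8, -0.5]                       # actually clicked

def cross_cutting_share(scores):
    """Fraction of items from the 'other side' for this user."""
    return sum(s > 0 for s in scores) / len(scores)

for stage, scores in [("Potential", potential),
                      ("Exposed", exposed),
                      ("Selected", selected)]:
    print(stage, cross_cutting_share(scores))
# Potential 0.5, Exposed 0.25, Selected 0.0 — the share of
# cross-cutting content shrinks at each stage of the funnel.
```

The point of measuring the funnel this way is that each stage isolates a different mechanism: friend networks (Potential), the ranking algorithm (Exposed), and individual choice (Selected).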
<p>
<figure id="figure-figure-6-from-messing-2013-our-original-analysis-of-the-distribution-of-ideologically-aligned-content-on-facebook-using-data-from-2012">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/Ch6Fig6.5.jpg" alt="Figure 6 from Messing 2013: Our original analysis of the distribution of ideologically-aligned content on Facebook using data from 2012." loading="lazy" data-zoomable /></div>
</div><figcaption>
Figure 6 from Messing 2013: Our original analysis of the distribution of ideologically-aligned content on Facebook using data from 2012.
</figcaption></figure>
</p>
<p>
<figure id="figure-figure-3b-from-bakshy-et-al-2015-data-from-2014-quantifying-cross-cutting-content-on-facebook">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/ScienceBakshyFig3B.jpg" alt="Figure 3B from Bakshy et al 2015: Data from 2014 quantifying cross-cutting content on Facebook." loading="lazy" data-zoomable /></div>
</div><figcaption>
Figure 3B from Bakshy et al 2015: Data from 2014 quantifying cross-cutting content on Facebook.
</figcaption></figure>
</p>
<p>When I started studying social media in 2009, almost no one was interested; by 2014, people understood that social media was an important force reshaping media, if not society more broadly. And <em>Science</em> was particularly interested in the role algorithms played in this environment.</p>
<p>We published our results in <a href="https://solomonmg.github.io/pdf/Science-2015-Bakshy-1130-2.pdf" target="_blank" rel="noopener">Science</a>, and the response from many was &ldquo;well this is smaller than expected,&rdquo; including a piece in <em>Wired</em> from <a href="https://www.wired.com/2015/05/did-facebooks-big-study-kill-my-filter-bubble-thesis/" target="_blank" rel="noopener">Eli Pariser himself</a>. David Lazer, one of the lead authors of <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a>, also <a href="https://education.biu.ac.il/sites/education/files/shared/science-2015-lazer-1090-1.pdf" target="_blank" rel="noopener">wrote a perspective in <em>Science</em></a>.</p>
<p>However, the piece immediately drew a great deal of criticism. This is in part because I wrote that exposure was driven more by individual choices than algorithms, which ignored the potential influence of friend recommendation systems and, more broadly, swept aside the extent to which interfaces structure interactions on websites, which my own dissertation work had shown was quite substantial.</p>
<p>The study also had important limitations, many of which I addressed in a post suggesting how <a href="https://solomonmg.github.io/post/exposure-to-ideologically-diverse-response/" target="_blank" rel="noopener">future work could provide a more robust picture</a>.</p>
<p>I was (much later) tech lead for Social Science One (2018-2020), which gave external researchers access to data (the <a href="https://solmessing.netlify.app/pdf/Facebook_DP_URLs_Dataset.pdf">&lsquo;Condor&rsquo; URLs data set</a>) via differential privacy. My goal was to enable the kind of research done in <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a>. However, it soon became clear that Social Science One&rsquo;s data sharing model, and differential privacy in particular, was not suitable for ground-breaking research.</p>
<p>I personally advocated (with Facebook Researcher and longtime colleague <a href="https://twitter.com/anniefranco" target="_blank" rel="noopener">Annie Franco</a>) for the collaboration model used in the Election 2020 project, wherein external researchers would collaborate with Facebook researchers. That model would have to shield the research from any interference from Facebook&rsquo;s Communications and Policy arm, which might attempt to interfere with the interpretation or publication of any resulting papers, creating ethical conflicts.</p>
<p>I advocated for pre-registration to accomplish this, not merely to ensure scientific rigor but to protect against conflicts of interest and selective reporting of results. However, I left Facebook in January of 2020 and have not been deeply involved in the project since.</p>
<p><a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a> does in fact accomplish most if not all of what I recommended future research do and goes much further than our original study, and the authors should be applauded for it. The paper shows that on Facebook (1) ideological segregation is high (in fact it&rsquo;s arguably higher than implied in the paper); (2) there is a substantial right wing &ldquo;echo chamber&rdquo; in which conservatives are siloed from the rest of the site; and (3) misinformation thrives in that echo chamber. When they start to talk about the filter bubble, though, things get more complicated.</p>
</details>
<h3 id="is-there-a-filter-bubble-on-facebook">Is there a Filter Bubble on Facebook?</h3>
<p>Are Facebook&rsquo;s algorithms &ldquo;Wired to split&rdquo; the public? This question is at the core of the recent <em>Science</em> issue, and it&rsquo;s hotly debated in the field of algorithmic bias. A suspicion that the answer is &ldquo;yes&rdquo; has motivated a number of policy and regulatory actions. Armed with unprecedented data, <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a> seeks to answer this and other related questions.</p>
<p>There are three relevant claims that answer this question in the text: (1) &ldquo;ideological segregation is high and increases as we shift from potential exposure to actual exposure to engagement&rdquo; in the abstract, (2) &ldquo;The algorithmic promotion of compatible content from this inventory is positively associated with an increase in the observed segregation as we move from potential to exposed audiences&rdquo; in the discussion section, and (3) &ldquo;Segregation scores drawn from exposed audiences are higher than those based on potential audiences &hellip; (the difference between potential and engaged audiences is only visible at the domain level),&rdquo; in the caption of Figure 2.</p>
<p>These statements are generally confirmatory of algorithmic segregation, aka the <a href="https://books.google.com/books/about/The_Filter_Bubble.html?id=Qn2ZnjzCE3gC" target="_blank" rel="noopener">Filter Bubble hypothesis</a>.</p>
<p>But look at Figure 2, on which these claims seem to be based. Figure 2B shows an increase in observed segregation as you move from potential to exposed audiences. BUT Figure 2C&mdash;describing the same phenomenon&mdash;does <em>not</em> (as noted in the caption).</p>
<p>
<figure id="figure-when-viewed-at-the-level-of-the-url-2c-the-study-is-consistent-with-a-negligible-filter-bubble-effect">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/BailonFig2BC.jpeg" alt="Figure 2 (B &amp;amp; C) from González-Bailón et al 2023: When viewed at the level of the URL (2C), the study is consistent with a &lt;em&gt;negligible&lt;/em&gt; Filter Bubble effect" loading="lazy" data-zoomable /></div>
</div><figcaption>
When viewed at the level of the URL (2C), the study is consistent with a negligible Filter Bubble effect
</figcaption></figure>
</p>
<h3 id="so-which-is-it">So which is it?</h3>
<p>First, what&rsquo;s the difference between these two figures? 2B aggregates things at the domain level (e.g., <a href="https://www.yahoo.com" target="_blank" rel="noopener">www.yahoo.com</a>) while 2C aggregates things at the URL level (e.g., <a href="https://www.yahoo.com/news/pence-trumps-indictment-anyone-puts-002049678.html%29" target="_blank" rel="noopener">https://www.yahoo.com/news/pence-trumps-indictment-anyone-puts-002049678.html)</a>. That means 2B treats all shares from yahoo.com the same, while 2C looks at each story separately.</p>
<p>If you&rsquo;re like me, when you think of political news, you have in mind domains like FoxNews.com or MSNBC.com, where it&rsquo;s likely that the website itself has a distinct partisan flavor.</p>
<p>But YouTube.com and Twitter.com, both of which obviously host a ton of far left, far right, and &ldquo;mixed&rdquo; content, appear in the &ldquo;Top 100 Domains by Views&rdquo; in the study&rsquo;s SM. And indeed, Figure S10 below shows that there is an array of domains that host some far right content and some far left content.</p>
<p>
<figure id="figure-some-domains-host-some-far-right-content-urls-and-some-far-left-content">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/BailonFigS10.jpeg" alt="Figure S10 from González-Bailón et al 2023: Some domains host some far right content and some far left content." loading="lazy" data-zoomable /></div>
</div><figcaption>
Some domains host some far right content (URLs) and some far left content.
</figcaption></figure>
</p>
<!-- There's also just something about content that makes an argument that we seem to want to share, as suggested by analysis in my [dissertation](https://www.dropbox.com/s/zfw1d9j60hqjil7/sudiss.pdf?raw=true), which shows that NYT editorials are more likely than content from other sections to be shared (via email). Consider two OpEds at the NYT---one by conservative columnist Bret Stevens, the other by well-known liberal Nicholas Kristof. The former is going to be shared more by conservatives, the latter more by liberals. -->
<p>But even if we&rsquo;re talking about NYTimes.com, it&rsquo;s not hard to see that, for example, conservatives might be more likely to share conservative op-eds from Bret Stephens, while liberals may be more likely to share op-eds from Nicholas Kristof.</p>
<p>If you aggregate your analysis to the domain level, you&rsquo;ll miss this aspect of media polarization.</p>
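<p>A minimal toy example (made-up view counts and a simplified segregation measure, not the paper&rsquo;s data or its exact index) shows how domain-level pooling can completely erase segregation that is stark at the URL level:</p>

```python
# Toy example (made-up numbers, not the paper's data or exact index): how
# domain-level aggregation can hide URL-level audience segregation.
from collections import defaultdict

# Each record: (domain, url_path, conservative_views, liberal_views).
# Each outlet hosts one right-leaning and one left-leaning story.
views = [
    ("nytimes.com", "/oped-right", 80, 20),
    ("nytimes.com", "/oped-left",  20, 80),
    ("youtube.com", "/clip-right", 90, 10),
    ("youtube.com", "/clip-left",  10, 90),
]

def segregation(groups):
    """View-weighted mean absolute deviation of each item's
    conservative audience share from an even 50/50 split."""
    total = sum(c + l for c, l in groups.values())
    return sum((c + l) / total * abs(c / (c + l) - 0.5)
               for c, l in groups.values())

# URL level: keep every story separate.
by_url = {(d, u): (c, l) for d, u, c, l in views}

# Domain level: pool all stories from the same outlet.
by_domain = defaultdict(lambda: [0, 0])
for d, _, c, l in views:
    by_domain[d][0] += c
    by_domain[d][1] += l

print(segregation(by_url))     # ~0.35: strongly segregated audiences
print(segregation(by_domain))  # 0.0: segregation vanishes entirely
```

Each outlet&rsquo;s right-leaning and left-leaning stories cancel out when pooled, so the domain-level audience looks perfectly balanced even though no individual story has a balanced audience.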
<h3 id="domains-or-urls">Domains or URLs?</h3>
<p>So which is right? <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a> suggests that we should prefer the URL-level analysis. Here&rsquo;s the passage in the discussion section, which describes why analyzing media polarization at the level of the domain is problematic:</p>
<p>&ldquo;As a result of social curation, exposure to URLs is systematically more segregated than exposure to domains&hellip; <em>A focus on domains rather than URLs will likely understate, perhaps substantially, the degree of segregation in news consumption online.</em>&rdquo; (Emphasis added)</p>
<p>What&rsquo;s more, past work (co-authored by one of the lead authors) shows that <a href="https://osf.io/vbwer" target="_blank" rel="noopener">domain-level analysis can indeed mask “curation bubbles”</a> in which &ldquo;specific stories attract different partisan audiences than is typical for the outlets that produced them.&rdquo;</p>
<p>You can see this in the data clear as day&mdash;let&rsquo;s go back to Figure 2A, which shows a massive increase in estimated segregation when using URLs rather than domains:</p>
<p>
<figure id="figure-figure-2a-shows-much-higher-levels-of-audience-segregation-when-you-look-at-the-url-level-rather-than-the-domain-level">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/BailonFig2.jpg" alt="Figure 2 from González-Bailón et al 2023: There are much higher levels of audience segregation when you look at the URL level rather than the domain level" loading="lazy" data-zoomable /></div>
</div><figcaption>
Figure 2A shows much higher levels of audience segregation when you look at the URL level rather than the domain level
</figcaption></figure>
</p>
<p>Returning to Figures 2B and 2C, it seems that the potential audience at domain level is <em>artificially less segregated</em>, due to the aggregation at the level of the domain.</p>
<h3 id="what-explains-the-filter-bubble-discrepency">What explains the &lsquo;Filter Bubble discrepancy&rsquo;?</h3>
<p>One possibility is that posts linking to content from &ldquo;mixed&rdquo; domains like YouTube, Reddit, Twitter, Yahoo, etc. do not score as well in feed-ranking. It&rsquo;s possible that partisan content on these domains is more likely to be downranked as misinformation or spam, or maybe Facebook-native videos (which render faster/better) have an edge over YouTube, or perhaps there are domain-level features in feed ranking, or maybe there is some other reason that content from &rsquo;non-mixed&rsquo; domains just performs better in the rather complex recommendation system that powers Newsfeed ranking.</p>
<p>Regardless, that would explain the results in Figure 2&mdash;making the potential audience look artificially broader than the actual audience, when you analyze content at the domain level.</p>
<h3 id="what-about-the-reverse-chron-experiment">What about the Reverse-Chron experiment?!</h3>
<p>Surely we can paint a fuller picture of the impact of algorithmic ranking on media polarization with that other excellent recent <em>Science</em> paper which looked at the <em>causal</em> effect of turning off Newsfeed ranking. Maybe we can cross-reference that paper and get a clearer picture of what&rsquo;s happening.</p>
<p><a href="https://www.science.org/doi/10.1126/science.abp9364" target="_blank" rel="noopener">Guess et al 2023</a> shows that Newsfeed induces proportionally <em>more</em> exposure to cross-cutting sources but also more exposure to like-minded sources. It reduces exposure to moderate or mixed sources. Importantly, this covers not just links to news, but all content that everyone posts, including life updates, pictures, videos, etc.</p>
<p>
<figure >
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/GuessFig.jpg" alt="Figure 2 from Guess et al 2023: Compared to a reverse chronologically-ranked feed, FB&amp;rsquo;s ranking system induces proportionally more exposure to &amp;ldquo;like-minded&amp;rdquo; sources but also more to cross-cutting sources, defined at the level of the entity posting." loading="lazy" data-zoomable /></div>
</div></figure>
</p>
<p>Ok, but what about political content? In the supplementary materials, we see that Newsfeed ranking actually <em>decreases</em> exposure to <em>political content</em> from like-minded sources.</p>
<p>
<figure >
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/GuessChronScienceTabS20.jpg" alt="Figure S20 from Guess et al 2023: A reverse chronologically-ranked Newsfeed induces proportionally less exposure to &amp;ldquo;like-minded&amp;rdquo; sources but also less to cross-cutting sources, defined at the level of the entity posting." loading="lazy" data-zoomable /></div>
</div></figure>
</p>
<p>What about exposure to political content posted by cross-cutting sources? The SM doesn&rsquo;t provide that, but it does provide a paragraph noting that Newsfeed <em>decreased</em> exposure to political news from partisan sources relative to reverse-chron!</p>
<p>
<figure >
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/GuessChronScienceS3.3.jpg" alt="S3.3 in Guess et al 2023: A reverse chronologically-ranked Newsfeed induces proportionally less exposure to &amp;ldquo;like-minded&amp;rdquo; sources but also less to cross-cutting sources, defined at the level of the entity posting." loading="lazy" data-zoomable /></div>
</div></figure>
</p>
<p>Now a big caveat here is that it&rsquo;s clear from the main results that political content is not doing well in Newsfeed ranking. Note that there are <a href="https://www.wsj.com/articles/facebook-politics-controls-zuckerberg-meta-11672929976" target="_blank" rel="noopener">reports that the company decided to downrank political and news content in 2021</a>.</p>
<p>Regardless, the results in the SM are not at all suggestive of a filter bubble, at least during the 2020 election&mdash;the experimental results suggest that if anything feedranking is showing us <em>less</em> polarizing content than we would see with reverse-chron.</p>
<h3 id="how-fb-groups-impact-estimates-of-algorithmic-segregation">How FB Groups impact estimates of algorithmic segregation</h3>
<p>I sent a much earlier draft of this to <a href="https://www.asc.upenn.edu/people/faculty/sandra-gonzalez-bailon-phd" target="_blank" rel="noopener">Sandra González-Bailón</a> and <a href="https://cssh.northeastern.edu/faculty/david-lazer/" target="_blank" rel="noopener">David Lazer</a>, lead authors for <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a>. Sandra pointed me to Figure S14, which does show a slight increase in segregation post-ranking for URLs shared by users and pages, but shows the <em>opposite</em> for content shared in those often-contentious Facebook groups.</p>
<!-- People may be especially likely to come across "persons dissimilar to themselves, and with modes of thought and action unlike those with which they are familiar" [(Mill cited in Mutz and Mondak, 2006)](https://www.polisci.upenn.edu/sites/default/files/mutz_mondak_2006.pdf). And it seems likely based on the plot below that group posts are treated differently from page posts and friend posts in Facebook feedranking. -->
<p>So it&rsquo;s really <em>not</em> that there&rsquo;s no difference at all pre- and post-ranking, just that the difference is on average zero once you include groups. Of course, the population of people who see content from groups and/or pages in Newsfeed may be unusual, and future work should dig into this variation.</p>
<p>I should also note that this small but real difference seems more or less consistent with what we <a href="https://www.science.org/doi/10.1126/science.aaa1160" target="_blank" rel="noopener">found in past work</a>, which only examined news shared by users (excluding pages and groups).</p>
<p>
<figure >
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/BailonFigS14.jpg" alt="Figure S14 from González-Bailón et al 2023: There is a modest filter bubble effect among users and pages, and a &amp;ldquo;reverse filter bubble&amp;rdquo; for content shared in groups." loading="lazy" data-zoomable /></div>
</div></figure>
</p>
<h3 id="the-algorithm-and-the-most-politically-engaged">The algorithm and the most politically engaged</h3>
<p>There is also a hint of an increase in segregation post-ranking among the most politically engaged 10% of Facebook users. However, the paper notes that this trend &ldquo;is only clear for domain-level data,&rdquo; which we&rsquo;ve already established should not be used here. (Note that they define high political interest users as those in the &ldquo;top 10% of engagement&hellip; (comments, likes, reactions, reshares) with content classified as political on Facebook.&rdquo;)</p>
<p>
<figure >
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/BailonFigS19.jpg" alt="Figure S19 shows a modest filter bubble effect among the most engaged users at the peak of the 2020 election." loading="lazy" data-zoomable /></div>
</div></figure>
</p>
<p>A similar pattern holds for the top 1% of users (Figure S23 in the SM).</p>
<h3 id="algorithmic-segregation-and-ideology">Algorithmic segregation and ideology</h3>
<p>Feed ranking seems to expose both conservatives and liberals to more liberal content. This is consistent with my priors that conservative content is more likely to violate policy and be taken down or subject to &ldquo;soft actioning&rdquo; (e.g., downranking) for borderline violations.</p>
<p>So now things get messy&mdash;should we really say liberals are in a filter bubble (and conservatives aren&rsquo;t) if misinformation is included in that calculation?</p>
<!-- exposure to cross cutting content was thought to [reduce political participation](https://www.jstor.org/stable/3088437). -->
<p>
<figure id="figure-feedranking-exposes-you-to-slightly-more-liberal-content">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/BailonTabS8.jpeg" alt="Feedranking exposes you to slightly more liberal content, presumably due to misinformation actioning." loading="lazy" data-zoomable /></div>
</div><figcaption>
Feedranking exposes you to slightly more liberal content.
</figcaption></figure>
</p>
<p>We can see a similar pattern when we look at exposure to cross-cutting content, which is the measure our 2015 Science paper used. Conservatives see more liberal content in feed than their friends share.</p>
<p>
<figure id="figure-replication-of-bakshy-et-al-2015">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/BailonFigS11.jpg" alt="Replication of Bakshy et al 2015, showing little effect of feed-ranking among content shared by friends. However, when including page and group content, feed-ranking plays a more important role&amp;mdash;exposing liberals to proportionally less cross-cutting content and conservatives to comparatively more." loading="lazy" data-zoomable /></div>
</div><figcaption>
Replication of Bakshy et al 2015
</figcaption></figure>
</p>
<p>We can also see that things have changed a lot since we wrote our original Science piece in 2015. Liberals seem to see <em>far</em> more cross-cutting content, conservatives less.</p>
<!-- What about how this compares to others ways we get media on the internet? The evidence is not great here, but [Flaxman et al 2016](https://academic.oup.com/poq/article/80/S1/298/2223402) show that ideological segregation is higher for search than for social media websites, but lower for news aggregators and direct navigation to news websites. -->
<h3 id="brevity-is-a-double-edged-sword">Brevity is a double-edged sword</h3>
<p><em>Science</em> gives authors only limited space, and it would have been difficult to dig into everything I&rsquo;ve written about in 2 pages. I also know better than almost anyone just how much work went into these papers (I would bet thousands of hours for each of several authors), and how difficult it can be to explain everything perfectly when you&rsquo;re pulling off such a big lift. I should also point out that these papers are very nuanced, well-caveated, and careful not to overstate their results regarding the filter bubble or algorithmic polarization.</p>
<p>Still, I do wish this work had squarely focused on URL-level analyses.</p>
<h3 id="what-this-means-for-other-studies">What this means for other studies</h3>
<p>This also means that past estimates of ideological segregation based on domain-level analysis probably <em>understate</em> media polarization in a big way. This includes those based on <a href="https://journalqd.org/article/view/2586/2683" target="_blank" rel="noopener">Facebook data</a>, <a href="https://onlinelibrary.wiley.com/doi/epdf/10.1111/ajps.12589" target="_blank" rel="noopener">browser data</a>, data describing various <a href="https://academic.oup.com/poq/article/80/S1/298/2223402" target="_blank" rel="noopener">platforms</a>, or simply websites across the <a href="https://web.stanford.edu/~gentzkow/research/echo_chambers.pdf" target="_blank" rel="noopener">internet</a>. I have said this for a long time, but I did not think the magnitude was as large as shown in <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a>.</p>
<h3 id="so-is-facebook-polarizing">So, is Facebook polarizing?</h3>
<p>It&rsquo;s very hard to give a good answer to this question. In &ldquo;The Paradox of Minimal Effects,&rdquo; Stephen Ansolabehere points out that re-election depends overwhelmingly on whether the country is prosperous and at peace, not on what happens in media politics. This is thought to be because people selectively consume media, which serves mainly to reinforce their beliefs; while at the same time, the sum total of people&rsquo;s private lived experiences corresponds reasonably well to aggregated economic data.</p>
<p>There&rsquo;s an argument that social media exists somewhere in between the conventional media and one&rsquo;s lived experiences. And what about evidence? As Sean Westwood and I have shown, partisan selectivity is far less severe when you <a href="https://journals.sagepub.com/doi/10.1177/0093650212466406" target="_blank" rel="noopener">add a social element to news consumption</a>. What&rsquo;s more, field-experimental work I did shows that <a href="https://www.dropbox.com/s/nu39148ukbab34r/CH7brief.pdf?raw=true" target="_blank" rel="noopener">increasing the prominence of political news in Facebook&rsquo;s Newsfeed</a> shifted issue positions toward the majority of news encountered (left-leaning), particularly among political moderates.</p>
<p>Tom Cunningham recently wrote a nice <a href="https://tecunningham.github.io/posts/2023-07-27-meta-2020-elections-experiments.html#other-evidence-on-media-and-polarization" target="_blank" rel="noopener">summary of some of the evidence related to the question of whether any kind of media might increase affective polarization</a>, which we discussed at length.</p>
<p>The evidence suggests any effect is likely small. First, we see that while social media is a global phenomenon, affective polarization is not&mdash;the <a href="https://direct.mit.edu/rest/article-abstract/doi/10.1162/rest_a_01160/109262/Cross-Country-Trends-in-Affective-Polarization?redirectedFrom=fulltext" target="_blank" rel="noopener">UK, Japan, and Germany have seen affective depolarization</a>.</p>
<p>Second, perhaps the highest quality experimental study on this question I&rsquo;ve seen is <a href="https://osf.io/jrw26/" target="_blank" rel="noopener">Broockman and Kalla (2022)</a>, which finds that paying heavy Fox News viewers to watch CNN has generally depolarizing effects, though, as Tom points out, it finds null effects on traditional measures of affective polarization.</p>
<p>Third, Tom and I have also discussed an excellent experimental study attempting to shed light on this: &ldquo;<a href="https://www.aeaweb.org/articles?id=10.1257/aer.20190658" target="_blank" rel="noopener">Welfare Effects of Social Media</a>,&rdquo; which concludes that Facebook is likely polarizing. They find that &ldquo;deactivating Facebook for the four weeks before the 2018 US midterm election&hellip; makes people less informed, it also makes them less polarized by at least some measures, consistent with the concern that social media have played some role in the recent rise of polarization in the United States.&rdquo;</p>
<p>The study defines political polarization in an unusual way: its polarization index includes congenial media exposure, i.e., how much news you see from your own side. Most political scientists would consider congenial media exposure <em>the thing that might cause polarization</em>, but not an aspect of polarization in and of itself.</p>
<p>
<figure id="figure-allcott-et-al-2020-figure-3-demonstrates-the-biggest-effect-is-on-a-measure-of-media-exposure">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/AlcottetalFigure3.jpg" alt="Allcott et al 2020 Figure 3" loading="lazy" data-zoomable /></div>
</div><figcaption>
Allcott et al 2020 Figure 3 demonstrates the biggest effect is on a measure of media exposure
</figcaption></figure>
</p>
<p>They do explain that their effects on affective polarization are not significant and they don&rsquo;t try to hide what&rsquo;s going into the measure. But you have to read beyond the abstract and the media headlines to really understand this point.</p>
<!-- They make the following claim in a footnote: "Online Appendix Table A16 shows that the effect on the political polarization index is *robust to excluding each of the seven individual component variables in turn*, although the point estimate moves toward zero and the unadjusted p-value rises to 0.09 when omitting congenial news exposure." -->
<p>Notably, in a robustness test in the appendix, the effect on the polarization index loses statistical significance when you exclude this variable.</p>
<p>
<figure id="figure-allcott-et-al-2020-figure-a16-shows-that-the-effect-on-polarization-is-not-significant-at-p005-if-you-exclude-congenial-media-exposure">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/AlcottetalFigureA16.jpg" alt="Allcott et al 2020 Figure A16" loading="lazy" data-zoomable /></div>
</div><figcaption>
Allcott et al 2020 Figure A16 shows that the effect on polarization is not significant at P&lt;0.05 if you exclude congenial media exposure.
</figcaption></figure>
</p>
<p>That means the folks who didn&rsquo;t deactivate Facebook had higher levels of political knowledge and higher levels of issue polarization. This makes sense because if a person doesn&rsquo;t know where the parties stand on an issue, she is less likely to be sure about where she ought to stand.</p>
<h3 id="implications-for-how-this-was-publicized">Implications for how this was publicized</h3>
<p>All of this is relevant in light of the controversial <em>Science</em> cover, which suggests Facebook&rsquo;s <em>algorithms</em> are &ldquo;Wired to Split&rdquo; us. That may be true, but the evidence across all four <em>Science</em> and <em>Nature</em> papers is not decisive on this question.</p>
<p>None of the experiments published so far show an impact on affective or ideological polarization. What&rsquo;s more, the proper URL-level analyses in <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a> show only a modest &lsquo;Filter Bubble&rsquo; for certain subsets of the data and for political news, and the reverse-chronological feed-ranking experiment shows that Newsfeed ranking feeds us <em>less</em> polarized political content than we would see in a reverse-chron Newsfeed.</p>
<p>Of course, the spin from the Meta Comms team that these <a href="https://www.wsj.com/articles/does-facebook-polarize-users-meta-disagrees-with-partners-over-research-conclusions-24fde67a" target="_blank" rel="noopener">results are exculpatory</a> is also highly problematic. This claim is not only wrong but amateurish and self-defeating from a strategic perspective, and I was surprised to read about it.</p>
<p>For all the amazing work done to produce the experimental results, the data are too noisy to detect small but potentially compounding effects on polarization as suggested in <a href="https://statmodeling.stat.columbia.edu/author/dean/" target="_blank" rel="noopener">a post-publication review from Dean Eckles</a> and <a href="https://tecunningham.github.io/posts/2023-07-27-meta-2020-elections-experiments.html" target="_blank" rel="noopener">Tom Cunningham</a>.</p>
<p>What&rsquo;s more, even the excellent work done here in <a href="https://www.science.org/doi/full/10.1126/science.ade7138" target="_blank" rel="noopener">González-Bailón et al 2023</a> does not speak to the question of effects on polarization from other key recommender systems at Facebook: the People You May Know (PYMK) algorithm, which facilitates network connections on the website, along with the Pages You Might Like (PYML) and Groups You Might Like (GYML) algorithms. The authors make a similar point in the Supplementary Materials, S3.2&mdash;pointing out that inventory, or the &ldquo;potential audience,&rdquo; &ldquo;results from another curation process determining the structure and composition of the Facebook graph, which itself results from social and algorithmic dynamics.&rdquo;</p>
<p>This means that, based on the sum total of evidence we have, we should <em>not</em> conclude that exposure is all about individual choices and not algorithms (a point I should have better emphasized <a href="https://solomonmg.github.io/pdf/Science-2015-Bakshy-1130-2.pdf" target="_blank" rel="noopener">in past work</a>)&mdash;algorithms may play an important role, and, as usual, more research is needed.</p>
<p>It seems to me that both <em>Science</em> and Meta Comms are going beyond the data here.</p>
<p>
<figure id="figure-wired-to-split">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/ScienceCoverWiredtoSplit.jpeg" alt="The Science Cover suggests Facebook&amp;rsquo;s algorithms are &amp;ldquo;Wired to Split&amp;rdquo; us." loading="lazy" data-zoomable /></div>
</div><figcaption>
Wired to Split
</figcaption></figure>
</p>
<h3 id="disclosures">Disclosures</h3>
<p>As noted above, from 2018-2020, I was tech lead for Social Science One, which gave external researchers <em>direct</em> access to data (the <a href="https://solmessing.netlify.app/pdf/Facebook_DP_URLs_Dataset.pdf">&lsquo;Condor&rsquo; URLs data set</a>) via differential privacy. However, that project has not yielded much research output for a number of organizational and operational reasons, including the fact that differential privacy is not yet suitable for such a complex project.</p>
<p>While at Facebook, I personally advocated (with Annie Franco) for the collaboration model used in the Election 2020 project, wherein external researchers would collaborate with Facebook researchers. That model would have to shield the research from any interference by Facebook&rsquo;s Communications and Policy arm, since such interference would violate scientific ethics. It would involve pre-registration, not merely to ensure scientific rigor but to protect against conflicts of interest and selective reporting of results. However, I left Facebook in January of 2020 and have not been deeply involved in the project since.</p>
<p>I recently left Twitter (requesting to be in the first rounds of layoffs after Elon Musk took over) and started a job at NYU&rsquo;s CSMaP lab when my employment with Twitter ended. There are authors affiliated with my lab on the paper, including one of the PIs, Josh Tucker. My graduate school advisor, Shanto Iyengar, is also on the paper, and I consider the majority of the authors to be my colleagues and friends.</p>
<p>See also my <a href="https://solmessing.netlify.app/disclosures/">disclosures page</a>.</p>
<!-- I do wish the samples and timeframes for the ranking experiments were bigger so we could understand potentially smaller effects, which may be very important. --></description>
</item>
<item>
<title>On BlueSky</title>
<link>https://solmessing.netlify.app/post/bluesky-quasi-decentralized-social-network/</link>
<pubDate>Mon, 03 Apr 2023 00:00:00 +0000</pubDate>
<guid>https://solmessing.netlify.app/post/bluesky-quasi-decentralized-social-network/</guid>
<description><p>TL/DR Summary</p>
<ol>
<li>BlueSky has a chance to dethrone Twitter right now, but that path is narrow. </li>
<li>Its exclusive, invite-only model means its user base is now small, elite, and homogeneous, with few bad actors. Almost everyone likes it. But the real test will come when it opens to the public. </li>
<li>It is designed for true account portability and in theory should prevent a single company from owning the entire network as it scales up. </li>
<li>However, it’s unclear if an ecosystem of small companies can do the job of content moderation in the same ways that centralized social networks do. The same is true of running modern feed-ranking and follow-recommendation systems. </li>
<li>There will be growing pressure to make money using ads to cover costs as the network scales up, which will incentivize centralizing key data and resources, undermining the original model.</li>
<li>Future possibilities include: (1) BlueSky remains de-facto centralized, &ldquo;in beta&rdquo; until it can get composable moderation right, which turns out to be the foreseeable future; (2) big players (Google, Facebook) join the party and dominate the ecosystem; (3) small, unmoderated, ad-free apps proliferate and the network becomes overrun with spam, NSFW content, hate, scams, and the other gifts that come with a lack of moderation.</li>
</ol>
<hr>
<p>Pretty much everyone at Twitter&mdash;and especially Jack Dorsey&mdash;has long known that BlueSky could replace Twitter. When I joined Twitter in 2021, I soon learned our CEO was terribly unpopular internally, sporting a job approval rating under 40 percent, by far the lowest of any executive at the company.</p>
<p>In fact, Jack was obsessed with decentralization. He seemed convinced that it was a mistake to have Twitter organized as a corporation, and he would rant about this on company-wide calls, which he seemed to be taking from caves in South Asia. This was when everyone else at the company was desperately trying to increase revenues to save the company from implosion.</p>
<p>
<figure id="figure-this-photo-of-jack-dorsey-captures-his-general-aspect-on-many-all-hands-calls">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/JackDorseyInCaveCreditTwitter.jpg" alt="A photo of young Jack Dorsey in a cave." loading="lazy" data-zoomable /></div>
</div><figcaption>
This photo of Jack Dorsey captures his general aspect on many all-hands calls.
</figcaption></figure>
</p>
<p>Enter BlueSky, which would decentralize Twitter. Jack launched the initiative in 2019, and his plan was to migrate Twitter to this new protocol. It puts user data, including posts and follow lists, on open, public Portable Data Servers (PDSs), enabling true account portability. Any business or organization could index those servers, or what I will call the “BlueSkyVerse” (technically the <a href="https://blueskyweb.xyz/blog/10-18-2022-the-at-protocol" target="_blank" rel="noopener">AT Protocol</a>), rank posts, and create a front-end interface.</p>
<p>
<figure id="figure-the-bluesky-app-reads-posts-and-the-follow-graph-from-portable-data-servers-centralizing-them-in-an-index-ranking-and-dislaying-them-for-users">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/BlueSkyVerse-illustration.jpg" alt="A node labelled as BlueSky or App sits atop various Portable Data Server (PDS) nodes, with arrows (edges) pointing to them. A caption to the right of the App node reads &amp;ldquo;indexing, ranking, moderation, UX&amp;rdquo; and a caption to the right of the PDS nodes reads &amp;ldquo;Open AT protocol: user posts, likes, follow graph.&amp;rdquo;" loading="lazy" data-zoomable /></div>
</div><figcaption>
The BlueSky App reads posts and the follow graph from Portable Data Servers, centralizing them in an index, ranking, and displaying them for users.
</figcaption></figure>
</p>
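<p>To make the architecture in the figure above concrete, here is a minimal sketch, in Python, of what an app-level indexer might do: read posts and follow data from several PDSs and merge them into one reverse-chronological index. The server hosts and record fields here are hypothetical, illustrating the idea rather than the real AT Protocol API.</p>

```python
# Hypothetical sketch of a BlueSkyVerse-style indexer: the app pulls posts
# from many Portable Data Servers (PDSs) and centralizes them in one index.
# Hosts and fields are made up for illustration; this is not the AT Protocol.
from dataclasses import dataclass, field

@dataclass
class Post:
    author: str
    text: str
    created_at: float  # Unix timestamp

@dataclass
class PDS:
    host: str
    posts: list = field(default_factory=list)

def build_index(servers):
    """Merge posts from every PDS into one reverse-chronological index."""
    merged = [post for pds in servers for post in pds.posts]
    return sorted(merged, key=lambda p: p.created_at, reverse=True)

servers = [
    PDS("pds-a.example", [Post("alice", "hello bluesky", 100.0)]),
    PDS("pds-b.example", [Post("bob", "account portability!", 200.0)]),
]
timeline = build_index(servers)
assert [p.author for p in timeline] == ["bob", "alice"]
```

The key design point is that the index is a downstream product: any party able to crawl the PDSs could build its own.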
<p>But wait a minute! Remember during Elon Musk’s acquisition how everyone said that the value of twitter isn’t the tech, but rather the network of creators and the communities that exist there? If you decouple that network from the platform you give up your most valuable asset—Google, Meta, others can index the network, develop a user interface, create some algorithms, show ads, and eat your lunch.</p>
<p>And yet, Jack was about to do just that, filling Twitter’s moat by turning its most valuable asset into a protocol. Of course, this did not go over well with employees who weren&rsquo;t independently wealthy, nor the board, who eventually pushed him out.</p>
<p>BlueSky nicely captures the essence of Jack’s reign as half-time CEO: how little he cared about Twitter as a business and how much he cared about Twitter as an ecosystem.</p>
<p>But back to the question everyone cares about right now: will this new system lead to a better social network, or set of networks? Is this finally the Twitter alternative we’re looking for?</p>
<p>Make no mistake about it&mdash;BlueSky was designed by Twitter to replace Twitter. This makes it very different from the other new social media protocols, apps, etc. that we’ve seen come on the scene of late. As <a href="https://mastodon.social/@gruber/110314523447694321" target="_blank" rel="noopener">John Gruber put it</a>, “If you hated Twitter, you’ll like Mastodon. If you liked Twitter, you’ll love BlueSky.”</p>
<p>So it’s a contender, despite how hard it is to start a social network from scratch. And don’t get any funny ideas about a post-surveillance-capitalism social network&mdash;if BlueSky takes off, it will most likely devolve into a less-moderated, less-profitable version of Twitter, Inc (aka Twitter 1.0). It will indeed encourage competition for front-end interfaces to explore the BlueSkyVerse. But the biggest challenges that social networks have to face—content moderation, discoverability, and monetization—require big technical and infrastructural investments to do well. They may only be viable for well-capitalized companies that generate big profits.</p>
<p>But of course, I would be very nervous if I still worked at Twitter.</p>
<p><strong>Will it work?</strong></p>
<p>Now is a unique opportunity for a Twitter rival. Twitter CEO Elon Musk tends to say <a href="https://www.cnn.com/2022/10/30/business/musk-tweet-pelosi-conspiracy/index.html" target="_blank" rel="noopener">all manner of nutty things</a>, he has <a href="https://techcrunch.com/2022/11/21/elon-musk-twitter-netzdg-test/" target="_blank" rel="noopener">decimated Twitter&rsquo;s trust and safety org</a>, and cut staffing by more than 80%. And the company slashed infrastructure budgets needed for automated content moderation&mdash;internal sources say the company has cut $3bn since peak spending prior to the recession, while external accounts say <a href="https://www.reuters.com/technology/musk-orders-twitter-cut-infrastructure-costs-by-1-bln-sources-2022-11-03/" target="_blank" rel="noopener">Musk ordered a $1bn cut himself</a>.</p>
<p>It shows: in the wake of the Allen massacre on Saturday, <a href="https://www.dallasnews.com/news/2023/05/11/gore-conspiracies-spread-on-elon-musks-loosely-moderated-twitter-after-allen-shooting/" target="_blank" rel="noopener">graphic videos and misinformation spread across the platform</a>. Advertisers don&rsquo;t want to risk putting their brands next to that kind of content and <a href="https://www.cnbc.com/2022/11/01/ad-giant-ipg-advises-brands-to-pause-twitter-spending.html" target="_blank" rel="noopener">many have suspended advertising</a> on the platform.</p>
<p>We’ve all wondered which alternative social media system might replace Twitter. Could it be Mastodon, Spoutible, Post News, maybe Substack Notes? Or perhaps Truth Social or Gab or Gettr?!</p>
<p>
<figure id="figure-the-only-ads-i-saw-on-gabcom-were-ads-for-advertising-on-gabcom">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/NoAdsonGab.jpg" alt="A screenshot from Gab.com, with a post showing a flag of the UN with the text: &amp;ldquo;Need to burn a flag? Make it this one.&amp;rdquo;" loading="lazy" data-zoomable /></div>
</div><figcaption>
The only ads I saw on Gab.com were ads for advertising on Gab.com.
</figcaption></figure>
</p>
<p>I’m guessing it’s not going to be those other networks. The new centralized social network entrants—Spoutible, Post News, and Substack Notes—feel sterile and inauthentic when you first get started, partially because they are built around conventional media outlets, partially because they didn’t pay enough attention to discoverability in onboarding. Gettr/Gab/Truth Social have libertarian-borderline-right-wing moderation setups, and the vast majority of people on Twitter have little interest in a right-wing echo chamber where there&rsquo;s no one to troll.</p>
<p>
<figure id="figure-you-can-get-the-best-designers-and-engineers-on-the-planet-but-if-you-show-people-a-blank-timeline-and-recommendations-to-follow-a-bunch-of-people-theyve-never-heard-of-no-one-is-going-to-use-your-platform">
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/SpoutableBlankTimeline.jpg" alt="A screenshot from Spoutible showing a blank timeline." loading="lazy" data-zoomable /></div>
</div><figcaption>
You can get the best designers and engineers on the planet but if you show people a blank timeline and recommendations to follow a bunch of people they&rsquo;ve never heard of, no one is going to use your platform.
</figcaption></figure>
</p>
<p>Mastodon is losing steam for many reasons&mdash;onboarding is terribly confusing, it’s broken into communal servers that are all very different but that all seem uptight. Moderation there has been characterized as “<a href="https://mastodon.social/@gruber/110328355532624579" target="_blank" rel="noopener">petulant nannyism</a>.”</p>
<p>Like Twitter, and <a href="https://werd.io/2023/the-fediverse-and-the-at-protocol" target="_blank" rel="noopener">unlike Mastodon</a>, BlueSky can surface content from this entire web of activity across the BlueSkyVerse and delight you with memes and witticisms, many of which were about <a href="https://faineg.substack.com/p/how-i-accidentally-ruined-bluesky" target="_blank" rel="noopener">”Sexy” ALF (yes, the 80s TV star)</a> when I signed up.</p>
<p>Many beta users say BlueSky feels like a breath of fresh air, like a throwback to early Twitter. For now, BlueSky is invite-only and so missing are the scammers, crypto bros, right-wing nuts, and tone policing randos looking for followers you find on Twitter. It feels more communal and less exhausting. Unclear how long that will last.</p>
<p>So maybe BlueSky has a legit claim to the Throne of Discourse, post-Twitter.</p>
<p><strong>Content Moderation</strong></p>
<p>First of all, content moderation is not just a “nice-to-have” thing that keeps the press happy. Facebook and others have found that content moderation <a href="http://tecunningham.github.io/2023-04-28-ranking-by-engagement.html" target="_blank" rel="noopener">increases retention</a>. And look at the flip side: most people don’t want to hang out at what Mike Masnick calls “<a href="https://www.techdirt.com/2023/05/04/on-social-media-nazi-bars-tradeoffs-and-the-impossibility-of-content-moderation-at-scale/" target="_blank" rel="noopener">Nazi bars</a>,” which is what platforms with permissive moderation policies will often become known for, whether they are actually Nazis or just radical free-speech advocates. Once that happens, kiss a lot of your core user base and ad revenue goodbye&mdash;which is what seems to be happening at Twitter.</p>
<p>Of course, content moderation is the bane of the modern social media network. It’s expensive, <a href="https://www.techdirt.com/2019/11/20/masnicks-impossibility-theorem-content-moderation-scale-is-impossible-to-do-well/" target="_blank" rel="noopener">it will always be wrong</a>, it can easily create a PR dumpster fire, and its benefits are extremely difficult to measure. This new protocol was designed with content moderation in mind so let me break that down before talking about the problems that will surely come up.</p>
<p>On BlueSky, speech happens on your PDS, but reach happens on the centralized app—Bluesky for now. And they <a href="https://blueskyweb.xyz/blog/4-13-2023-moderation" target="_blank" rel="noopener">are in fact moderating</a>, so if they find a post that violates policy, they may take it off their app. It’s still up on the PDSs, it’s just not indexed in BlueSky. So great, it allows for a slightly truer form of “freedom of speech but not reach.”</p>
<p>How does this actually work? The BlueSky team wants to create a “<a href="https://blueskyweb.xyz/blog/4-13-2023-moderation" target="_blank" rel="noopener">moderation ecosystem</a>,” in which labels (“spam”, “nsfw”) can be created by anyone, and apps like BlueSky can then choose what labels to act upon. Right now, it’s completely centralized at BlueSky, and they have an automated layer and decisions are made by “server administrators.” Eventually though, there will be other label sources, other apps besides BlueSky and many servers beyond bsky.social. They’re proposing a “choose your own moderation” approach.</p>
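<p>The &ldquo;choose your own moderation&rdquo; idea can be sketched in a few lines: independent labeling services attach labels (&ldquo;spam&rdquo;, &ldquo;nsfw&rdquo;) to posts, and each app decides which labels it acts on by de-indexing. All names below are hypothetical, not the actual AT Protocol labeling API.</p>

```python
# Illustrative sketch of composable moderation: multiple labelers emit
# (post_id -> label) judgments; an app filters its feed using only the
# labels it has chosen to enforce. Names are hypothetical.

def collect_labels(label_sources):
    """Aggregate labels for each post id from every labeling service."""
    labels = {}
    for source in label_sources:
        for post_id, label in source.items():
            labels.setdefault(post_id, set()).add(label)
    return labels

def moderated_feed(posts, labels, blocked_labels):
    """De-index posts carrying any label this particular app acts upon."""
    return [p for p in posts if not (labels.get(p["id"], set()) & blocked_labels)]

posts = [{"id": 1, "text": "cool meme"}, {"id": 2, "text": "buy followers now"}]
labeler_a = {2: "spam"}   # one labeling service
labeler_b = {2: "nsfw"}   # another, independent service
labels = collect_labels([labeler_a, labeler_b])
feed = moderated_feed(posts, labels, blocked_labels={"spam"})
assert [p["id"] for p in feed] == [1]
```

Note that the labeled post is only dropped from this app&rsquo;s feed; it still lives on its PDS, which is exactly the &ldquo;freedom of speech, not reach&rdquo; dynamic described above.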
<p>OK what are the downsides?</p>
<p>First, there are key parts of moderation that raise questions under this framework. If you doxx someone’s home address for targeted harassment, post a bunch of Child Sexual Abuse Material (CSAM) or non-consensual sexual imagery, it feels insufficient to merely de-index those posts. There are cases where it <a href="https://www.nytimes.com/2023/05/03/technology/dorsey-musk-twitter-bluesky-nostr.html" target="_blank" rel="noopener">may not be legally sufficient</a> under the Digital Services Act, NetzDG, or U.S. Copyright Law.</p>
<p>The spam-detection arms race is another example—the more open you are about how it works, the faster the spammers get around your detection systems. Somewhat relatedly, the fact that blocklists are public on BlueSky due to the BlueSkyVerse architecture is <a href="https://twitter.com/MattBinder/status/1652142389165797377?s=20" target="_blank" rel="noopener">already stirring controversy</a>.</p>
<p>Finally, a big part of a healthy information ecosystem is keeping bad actors off your platform in the first place. In centralized networks, that’s often done by IP screening, cell phone/text message screening, email validation, and/or by using other private data. But a PDS hosts public data, so the centralized app would need to create parallel user accounts to collect and maintain that data.</p>
<p>All that means it’s difficult to see an alternative to a world where BlueSky and other AT apps need to start collecting private user data, even if it’s inconsistent with the clean decentralized, portable data model illustrated above. The line between PDS and user account will get very fuzzy very fast.</p>
<p>And, once apps do this for content moderation, wouldn’t they also wish to do it for advertising as well? Content moderation isn’t free.</p>
<p>Right now, signups are based on invites, which helps keep out bad actors. But eventually BlueSky will need to open up fully once it’s out of beta.</p>
<p>When that happens, the job of content moderation will be far more complex than in a place like Mastodon, because the BlueSky architecture is meant to enable “scale and global discoverability.” With Mastodon/the Fediverse, each server has its own policies, norms, and content moderation, which is far simpler in its small, federated worlds. In the BlueSkyVerse, you have no choice but to scale up moderation.</p>
<p><strong>Recommender systems in the BlueSkyVerse</strong></p>
<p>Will BlueSky be incentivized to build a feed-ranking system into their product and start logging the vast scope of data that inspired the phrase “surveillance capitalism?” They have already started down that path—in fact they’ve built the BlueSkyVerse to facilitate global discovery—large scale indexing and ranking across all PDSs in the network.</p>
<p>Right now, the “What’s Hot” feed does global discovery, but in a way that is pretty basic—it’s showing popular stuff from the last 30 minutes. For now this is fine, <a href="http://tecunningham.github.io/2023-04-28-ranking-by-engagement.html" target="_blank" rel="noopener">it’s the core of most modern recommender systems</a> in social media websites.</p>
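<p>A feed like that is simple enough to sketch: keep posts from roughly the last 30 minutes and rank them by raw engagement. This is an illustration of the general idea described above, not BlueSky&rsquo;s actual implementation.</p>

```python
# Minimal "What's Hot"-style ranking sketch: recency filter + popularity sort.
# The field names and the 30-minute window are assumptions for illustration.
import time

def whats_hot(posts, now=None, window_secs=30 * 60):
    """Rank posts from the recent window by total engagement."""
    now = time.time() if now is None else now
    recent = [p for p in posts if now - p["created_at"] <= window_secs]
    return sorted(recent, key=lambda p: p["likes"] + p["reposts"], reverse=True)

now = 10_000.0
posts = [
    {"id": "old", "created_at": now - 3600, "likes": 900, "reposts": 10},  # too old
    {"id": "hot", "created_at": now - 600, "likes": 40, "reposts": 12},
    {"id": "new", "created_at": now - 60, "likes": 5, "reposts": 0},
]
ranked = whats_hot(posts, now=now)
assert [p["id"] for p in ranked] == ["hot", "new"]
```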
<p>Contrast this with Mastodon, where you can technically follow people from another server but the system isn’t designed so servers index each other and form one network. This is an important reason I think BlueSky could have legs, but Mastodon will probably not replace Twitter.</p>
<p>Setting aside any monetary pressures facing BlueSky for a minute, I suspect they will be driven toward increased data collection and deployment, simply because you need to do that to move the metrics that tell you your product is improving. This may be further cemented by the culture of modern engineering organizations—where engineering leaders and PMs ruthlessly focus on moving a “north star” metric, which is almost always some variant of time spent. Time spent, daily active users, session counts—these are measures of whether you’re making your product better; the fact that they are all highly correlated with potential ad revenue is coincidental.</p>
<p>Of course, to do anything like what Twitter and Facebook do with their recommender systems—for both follow recommendations and for feed-ranking—will require a lot more resources. For the follow graph, that entails predicting which users are likely to form mutual follow relationships or satisfactory follow-only relationships, which can be done with shortcuts but is ultimately a difficult (graph machine learning) problem. For feed-ranking, that requires predicting what users are likely to interact with what content, which both Twitter and Facebook had entire divisions of engineers and data scientists working on.</p>
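<p>At its core, that kind of feed-ranking reduces to scoring each candidate post by a weighted sum of predicted engagement probabilities. The sketch below is a toy version of that pattern; the weights and the precomputed model outputs are made up for illustration, whereas real systems learn these predictions from massive logs of user interactions.</p>

```python
# Hedged sketch of engagement-prediction ranking: per-action probability
# predictions are combined into one score per post via fixed weights.
# Weights and predictor outputs here are hypothetical.

def score(post, weights, predict):
    """Weighted sum of predicted engagement probabilities for one post."""
    return sum(w * predict(post, action) for action, w in weights.items())

def rank_feed(posts, weights, predict):
    return sorted(posts, key=lambda p: score(p, weights, predict), reverse=True)

def predict(post, action):
    # Toy predictor: pretend each post carries precomputed model outputs.
    return post["p"][action]

weights = {"like": 1.0, "reply": 13.5, "repost": 1.0}  # illustrative weights
posts = [
    {"id": "a", "p": {"like": 0.30, "reply": 0.01, "repost": 0.05}},  # score 0.485
    {"id": "b", "p": {"like": 0.10, "reply": 0.04, "repost": 0.02}},  # score 0.66
]
ranked = rank_feed(posts, weights, predict)
assert [p["id"] for p in ranked] == ["b", "a"]
```

The hard (and expensive) part is not this arithmetic but producing good probability estimates at scale, which is what those divisions of engineers and data scientists actually work on.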
<p><strong>Pressures to centralize and monetize the BlueSkyVerse</strong>
Venture capitalists and startups in Silicon Valley are always talking about “moats.” If you invest a great deal of resources to build a technology or a new marketplace, what’s to stop a competitor from drinking your milkshake?</p>
<p>There’s an influential idea among “Web 3.0” circles, which is that Facebook, Instagram, and Twitter are the landlords of castles you can’t leave. That’s not supposed to happen this time—the BlueSkyVerse was designed around account portability and front-end/algorithmic competition. The hope is this will create an ecosystem of small companies doing bits and pieces of what big social media companies do today.</p>
<p>At the same time, everything I’ve seen so far suggests that large investments are going to be required to even start playing in the BlueSkyVerse—there are barriers to entry on data processing to even index it as users grow, to create a legit feed and UX, and to do content moderation at that kind of scale. Jack has given millions to the BlueSky team to get the system to where it is today.</p>
<p>So what happens if the BlueSkyVerse really takes off? We might indeed see real competition for front-end apps that do custom algorithmic ranking and figure out innovative ways to moderate content. We might see further media fragmentation—perhaps front-end providers will try to differentiate themselves by topic or political orientation like television channels do.</p>
<p>But running a modern social media website is expensive. If it grows as big as Twitter, indexing the BlueSkyVerse will become a challenge, same for running modern recommender systems. And if you want ad revenue you need content moderation, which you can’t solve with AI alone—you need humans in the loop, which means you don’t get the kind of economies of scale you’d see with automated systems. What’s more, you often need sensitive user data to do these things well, and you need bespoke solutions to new adversarial tactics you find. So it’s hard to fully rely on an external company for these solutions, as the creators of BlueSky seemed to envision.</p>
<p><strong>The future of the network</strong></p>
<p>I see a few possibilities if BlueSky gets really big: the first is that BlueSky the app simply dominates this system—they moved first, they understand the system, they can do content moderation, they figure out how to scale up, and they may decide to sell ads. At the same time if BlueSky does become “Twitter 3.0,” there have to be consequences to the fact that I can simply take my posts and follow-graph to a competing service and still be on the same network.</p>
<p>Or maybe not. Maybe they will realize that the challenges of content moderation favor keeping the network as is, and the BlueSkyVerse will remain closed for a long time. Perhaps forever.</p>
<p>But if it does really launch and open up, it seems likely that established tech starts to play—Google jumps in, dedicates a small fraction of the resources it used to fund Google+, indexes the BlueSkyVerse in a day, and boom… has a competitor to Facebook. Maybe Facebook jumps in too, but that’s a tricky proposition because once part of Facebook/Instagram has true account portability what happens to the rest of the company?</p>
<p>Of course, another outcome that seems likely is a conservative social media front-end provider. Maybe Truth Social integrates with the BlueSkyVerse. It won’t make much money because many in that demographic seem happy with Twitter for now, and there will be substantial brand risk for potential advertisers.</p>
<p>Finally, we might see pure anarchy. In this &ldquo;race to the bottom&rdquo; scenario, a set of small, unmoderated, ad-free apps proliferate. Since people don&rsquo;t like ads, they use these apps. The network becomes overrun with spam, NSFW content, hate, scams, and the other gifts that come with a lack of moderation. Of course, it&rsquo;s unclear whether these apps would be tolerated by the app stores, but this is one direction things might generally go.</p>
</description>
</item>
<item>
<title>What can we learn from 'The Algorithm,' Twitter's partial open-sourcing of its feed-ranking recommendation system?</title>
<link>https://solmessing.netlify.app/post/twitter-the-algorithm/</link>
<pubDate>Mon, 03 Apr 2023 00:00:00 +0000</pubDate>
<guid>https://solmessing.netlify.app/post/twitter-the-algorithm/</guid>
<description><p>Last Friday (2023-03-31) Twitter released what it calls “the algorithm,” which appears to be a highly redacted, incomplete portion of the code that governs the “For You” home timeline ranking system. And I saw nothing to suggest that the parts of the code they put in the GitHub repository weren’t authentic.</p>
<p>It’s highly unusual for a tech company to open up a product at the core of its monetization strategy. The thinking is that the more engaging the content you show people right when they log in, the more likely they are to stick around. And the more you keep people logged in, the more they see ads. And the more data you can get to show them better ads!</p>
<p><strong>Transparency, or a distraction from closing the API?</strong></p>
<p>Is this a step forward for transparency as Musk and Twitter would claim? I am skeptical. You can’t learn much from this release in and of itself&mdash;you need the underlying model features, parameters, and data to really understand the algorithm. Those combine into a system that&rsquo;s effectively different for everyone! So even if you had all that, you&rsquo;d likely need to algorithmically audit the system to really get a handle on it.</p>
<p>And Twitter made it <a href="https://www.wired.com/story/twitter-data-api-prices-out-nearly-everyone/" target="_blank" rel="noopener">prohibitively expensive</a> for external researchers to get that data through its API with the recent price updates ($500k/yr). So at the same time Twitter is releasing this code, it has made it incredibly difficult for researchers to <em>audit</em> this code.</p>
<p><strong>What&rsquo;s in the code? Gossip and Rumors</strong></p>
<p><strong>Ukraine</strong> There were some <a href="https://twitter.com/SolomonMg/status/1642845123531751425?s=20" target="_blank" rel="noopener">initial reports</a> that Twitter was downranking tweets about Ukraine. I looked at the code and can tell you those claims are wrong&mdash;Twitter has an audio-only <a href="https://www.clubhouse.com" target="_blank" rel="noopener">Clubhouse</a> clone called Spaces, and that code is for that product, not ordinary tweets in the home timeline. What&rsquo;s more, this is likely a label related only to <strong>crisis misinformation</strong>, as per Twitter&rsquo;s <a href="https://help.twitter.com/en/rules-and-policies/crisis-misinformation" target="_blank" rel="noopener">Crisis Misinformation Policy</a>.</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Uh, this code looks like it *only* relates to Twitter&#39;s Spaces product. If so it is *not* used for down-ranking ordinary tweets in home timeline ranking. <br><br>It&#39;s called here with the other twitter Spaces spaces visibility models and labels: <a href="https://t.co/7n8Tl7WVlJ">https://t.co/7n8Tl7WVlJ</a> <a href="https://t.co/stSThUTvUe">https://t.co/stSThUTvUe</a></p>&mdash; Sol Messing (@SolomonMg) <a href="https://twitter.com/SolomonMg/status/1642560420392103936?ref_src=twsrc%5Etfw">April 2, 2023</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p><strong>Musk Metrics</strong> One of the most interesting things we learned from the code is that Twitter created an entire suite of metrics about Elon Musk’s personal twitter experience. The code shows they fed those metrics to the experimentation platform (Duck Duck Goose, or DDG), which at least historically has been used to evaluate whether or not to ship products.</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Twitter’s algorithm specifically labels whether the Tweet author is Elon Musk<br><br>“author_is_elon”<br><br>besides the Democrat, Republican and “Power User” labels<a href="https://t.co/fhpBjdfifX">https://t.co/fhpBjdfifX</a> <a href="https://t.co/orCPvfMTb9">pic.twitter.com/orCPvfMTb9</a></p>&mdash; Jane Manchun Wong (@wongmjane) <a href="https://twitter.com/wongmjane/status/1641884551189512192?ref_src=twsrc%5Etfw">March 31, 2023</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>This episode is consistent with reporting that engineers are very concerned about how any features they ship <a href="https://www.theverge.com/2023/2/9/23593099/elon-musk-twitter-fires-engineer-declining-reach-ftc-concerns" target="_blank" rel="noopener">affect the CEO&rsquo;s personal experience on Twitter</a>. And other <a href="https://arstechnica.com/tech-policy/2023/02/report-musk-had-twitter-engineers-boost-his-tweets-after-biden-got-more-views/" target="_blank" rel="noopener">reporting has suggested that a Musk-centric boost feature may have shipped</a>; you would want exactly this kind of instrumentation to understand how that worked in practice.</p>
<p><strong>Republican, Democrat Metrics</strong> We also learned that Twitter is logging similar metrics for lists of prominent Democratic and Republican accounts, <a href="https://www.yahoo.com/entertainment/twitters-recommendation-algorithm-is-now-on-github-200511112.html" target="_blank" rel="noopener">ostensibly to understand</a> whether any features they ship affect those sets of accounts equally. Now we know that <a href="https://www.nature.com/articles/s41467-022-34769-6" target="_blank" rel="noopener">conservative accounts tend to share more misinformation than liberal accounts both on Twitter</a> and <a href="https://www.science.org/doi/full/10.1126/sciadv.aau4586" target="_blank" rel="noopener">on Facebook</a>. And <a href="https://www.washingtonpost.com/technology/2023/02/08/house-republicans-twitter-files-collusion/" target="_blank" rel="noopener">Musk has alleged that Democrats and Big Tech are colluding</a> to enforce policy violations unequally across parties.</p>
<p>But if you have these &ldquo;partisan equality&rdquo; stats as part of your ship criteria, perhaps on equal footing with policy-violation frequency, you can see how <strong>this could really affect the types of health and safety features that actually make it to the site in production</strong>.</p>
<p>This code was then comically removed via pull requests from Twitter. Because once you delete something on GitHub, it just goes away. Right?</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">which specific user groups, you might wonder? <a href="https://t.co/wYSsUKm1pA">pic.twitter.com/wYSsUKm1pA</a></p>&mdash; Colin Fraser (@colin_fraser) <a href="https://twitter.com/colin_fraser/status/1641960748233662464?ref_src=twsrc%5Etfw">April 1, 2023</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p><strong>Twitter Blue Boost</strong> What’s more, we sorta knew that Twitter Blue users get a boost in feed ranking, but the code makes it clear that it could double your score among people who don&rsquo;t follow you, and quadruple it for those who do.</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Not sure what I expected, but interesting that first pull request on newly open-sourced <a href="https://twitter.com/twitter?ref_src=twsrc%5Etfw">@Twitter</a> algo (<a href="https://t.co/kAVP0zdzki">https://t.co/kAVP0zdzki</a>) is to downweight verified user multipliers. <a href="https://t.co/PfqIdVTDnk">pic.twitter.com/PfqIdVTDnk</a></p>&mdash; Caitlin Hudon (@beeonaposy) <a href="https://twitter.com/beeonaposy/status/1641878347557883910?ref_src=twsrc%5Etfw">March 31, 2023</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
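<p>To make those multipliers concrete, here is a minimal, hypothetical sketch. The rough 2x/4x values match the figures above, but the function and argument names are mine, not Twitter&rsquo;s:</p>

```python
# Hypothetical sketch of the Blue-verified ranking boost described above.
# The 2x (non-followers) and 4x (followers) multipliers are the rough values
# reported from the released code; all names here are invented.

def apply_blue_boost(score: float,
                     author_is_blue_verified: bool,
                     viewer_follows_author: bool) -> float:
    """Scale a tweet's ranking score when the author pays for Twitter Blue."""
    if not author_is_blue_verified:
        return score
    # Followers of the author see a larger boost than non-followers.
    multiplier = 4.0 if viewer_follows_author else 2.0
    return score * multiplier
```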
<p>As <a href="https://twitter.com/jonathanstray/status/1642200687101501441" target="_blank" rel="noopener">Jonathan Stray pointed out</a>, if this counts as paid promotion, the FTC might require Twitter to label your tweets as ads. Now we kind of already knew this from Musk&rsquo;s Twitter Blue announcement, but having evidence in the code might cross a different line for the FTC.</p>
<p><strong>So what about the ackshual algorithm? What does this say about feed ranking?</strong></p>
<p>The code itself is there but it’s missing specifics&mdash;key parameters, feature sets, and model weights are absent or abstracted. And obviously the data.</p>
<p>The most critical thing we learned about Twitter’s ranking algorithm is probably from a readme file that former Facebook Data Scientist <a href="https://twitter.com/jeff4llen" target="_blank" rel="noopener">Jeff Allen</a> found. If we take that at face value, a fav (a Twitter like) is worth half a retweet. A reply is worth 13.5 retweets, and a reply that gets a response from the tweet&rsquo;s author is worth a whopping 75 retweets!</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">According to the Heavy Ranker readme, it looks like this is the &quot;For you&quot; feed ranking formula is<br><br>Each &quot;is_X&quot; is a predicted probability the user will take that action on the Tweet.<br><br>Replies are the most important signal. Very similar to MSI for FB.<a href="https://t.co/Bmv7qg4voc">https://t.co/Bmv7qg4voc</a> <a href="https://t.co/lWfaUboT6q">pic.twitter.com/lWfaUboT6q</a></p>&mdash; Jeff Allen (@jeff4llen) <a href="https://twitter.com/jeff4llen/status/1641901988047626241?ref_src=twsrc%5Etfw">March 31, 2023</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
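<p>Taken at face value, the readme describes a weighted sum of predicted engagement probabilities. The sketch below is my illustration, not Twitter&rsquo;s code: the weights are the values listed in the Heavy Ranker readme (fav 0.5, retweet 1.0, reply 13.5, author-engaged reply 75.0), while the function name and exact feature keys are assumptions:</p>

```python
# Illustrative sketch of the Heavy Ranker's scoring formula: a weighted sum
# of predicted per-action probabilities. Weights follow the released readme;
# the feature-name strings and the function itself are simplified assumptions.

ACTION_WEIGHTS = {
    "is_favorited": 0.5,                          # fav (like)
    "is_retweeted": 1.0,                          # retweet
    "is_replied": 13.5,                           # reply
    "is_replied_reply_engaged_by_author": 75.0,   # reply the author engages with
}

def heavy_ranker_score(predicted_probs: dict) -> float:
    """Combine predicted action probabilities into a single ranking score."""
    return sum(weight * predicted_probs.get(action, 0.0)
               for action, weight in ACTION_WEIGHTS.items())

# Even a small chance of an author-engaged reply can outweigh a near-certain fav:
score_fav = heavy_ranker_score({"is_favorited": 0.9})                           # 0.45
score_reply = heavy_ranker_score({"is_replied_reply_engaged_by_author": 0.01})  # 0.75
```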
<p>Now it’s not quite that simple&mdash;what about when a tweet is first posted and there’s no data? Twitter’s deep learning system (in the heavy ranker) will do some heavy lifting and predict the likelihood of each of these actions based on the tweet author, their network, any initial engagements, the tweet text, and thousands of signals and embeddings.</p>
<p>Of course, what happens in the first few minutes after a tweet is posted deeply shapes who sees and engages with it later.</p>
<p>[And the way this is implemented in practice is that the model handles all cases, but as you get more and more real time data on a tweet, those real time features dominate everything else and push those probabilities close to 1, see <a href="https://twitter.com/SolomonMg/status/1642154005588504577?s=20" target="_blank" rel="noopener">discussion here</a>.]</p>
<p>Now I should point out that there are some spammy accounts claiming to have found ranking parameters in the code. They’re wrong: those parameters are used to <a href="https://github.com/twitter/the-algorithm/blob/7f90d0ca342b928b479b512ec51ac2c3821f5922/src/java/com/twitter/search/README.md" target="_blank" rel="noopener">retrieve tweets from your network for candidate generation only</a>. <a href="https://lucene.apache.org" target="_blank" rel="noopener">Lucene</a> is an open-source search tool.</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Uh, this is for search query results, not home timeline ranking. <br><br>See the thing that says &quot;luceneScoreParams&quot;??<br><br>Lucene is a search library: <a href="https://t.co/993crKVHJi">https://t.co/993crKVHJi</a> <a href="https://t.co/AdGtjCbOlx">https://t.co/AdGtjCbOlx</a></p>&mdash; Sol Messing (@SolomonMg) <a href="https://twitter.com/SolomonMg/status/1642563414970060800?ref_src=twsrc%5Etfw">April 2, 2023</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>I should point out, however, that some of the &ldquo;Earlybird&rdquo; code was at one point used in timeline ranking, and it appears that <a href="https://github.com/twitter/the-algorithm/blob/7f90d0ca342b928b479b512ec51ac2c3821f5922/cr-mixer/server/src/main/scala/com/twitter/cr_mixer/similarity_engine/EarlybirdTensorflowBasedSimilarityEngine.scala" target="_blank" rel="noopener">it may be used in cr-mixer</a>, which handles candidate generation for <a href="https://github.com/twitter/the-algorithm/blob/7f90d0ca342b928b479b512ec51ac2c3821f5922/cr-mixer/README.md" target="_blank" rel="noopener">out-of-network tweets</a>.</p>
<p>Interestingly, <a href="https://github.com/twitter/the-algorithm/blob/main/home-mixer/server/src/main/scala/com/twitter/home_mixer/functional_component/filter/OutOfNetworkCompetitorURLFilter.scala" target="_blank" rel="noopener">Twitter appears to remove competitor URLs</a>, perhaps only for tweets that are outside of network (you don&rsquo;t follow the author).</p>
<p><strong>What else goes into &ldquo;the Algorithm&rdquo;?</strong></p>
<p>What gets ranked in the first place? The other piece here is the &ldquo;TikTok&rdquo; part of the ranking algorithm, which is also incomplete without the models/data/parameters/etc. What I mean is the code that takes content from across the platform and says “I’m going to put this into your queue for the heavy ranker to sort out.”</p>
<p>Now on Twitter often that historically meant tweets posted by or replied to by accounts you follow. But, Twitter realized it could find a lot more content for that heavy ranker magic.</p>
<p>There’s a complex system that inserts tweets into your queue for ranking. This is called <strong>candidate generation</strong> in the “recommendation system” subfield of applied computing.</p>
<p>If, like me, you follow a lot of people on Twitter, about <strong>half</strong> of the candidate tweets in Twitter’s ranked “For you” timeline at any given time are from people you follow.</p>
<p>Now, if you don’t follow a ton of people, or if you have a new account, you can run out of these tweets, and then Twitter will try to find additional candidates so that you have ranked content. If so, this system governs what’s in your home timeline much like TikTok does&mdash;gathering content it predicts you’ll like from across the platform.</p>
<p>This takes place in <a href="https://github.com/twitter/the-algorithm/blob/7f90d0ca342b928b479b512ec51ac2c3821f5922/cr-mixer/README.md" target="_blank" rel="noopener">cr-mixer</a>, and although some of the <a href="https://github.com/twitter/the-algorithm/blob/7f90d0ca342b928b479b512ec51ac2c3821f5922/cr-mixer/server/src/main/scala/com/twitter/cr_mixer/candidate_generation/CandidateSourcesRouter.scala" target="_blank" rel="noopener">high-level function calls are there</a>, much of the code and the models appear to be missing, and many files come with this warning at the top: &ldquo;This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.&rdquo;</p>
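<p>A toy sketch of that backfill logic, assuming nothing beyond the behavior described above (the names and interface are mine, not cr-mixer&rsquo;s):</p>

```python
# Illustrative candidate mixing: fill the ranking queue from in-network
# sources first, then backfill with out-of-network recommendations so the
# heavy ranker always has a full queue to score. Not Twitter's actual code.

def mix_candidates(in_network: list,
                   out_of_network: list,
                   target_size: int) -> list:
    """Return up to target_size candidate tweets for the heavy ranker."""
    candidates = in_network[:target_size]
    shortfall = target_size - len(candidates)
    if shortfall > 0:
        # A sparse follow graph or a new account triggers out-of-network backfill.
        candidates += out_of_network[:shortfall]
    return candidates
```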
<p>Twitter seems to have made public some of the systems underlying candidate generation, including its <a href="https://github.com/twitter/the-algorithm/tree/7f90d0ca342b928b479b512ec51ac2c3821f5922/src/scala/com/twitter/simclusters_v2" target="_blank" rel="noopener">SimCluster model</a>.</p>
<p>BTW, I’d like to give a shout out to <a href="https://twitter.com/vboykis" target="_blank" rel="noopener">Vicki Boykis</a>, and <a href="https://twitter.com/igorbrigadir" target="_blank" rel="noopener">Igor Brigadir</a> who are <a href="https://github.com/igorbrigadir/awesome-twitter-algo" target="_blank" rel="noopener">doing amazing work to map out the codebase</a> and unearth exactly what’s missing and what’s not.</p>
<p><strong>Trust and Safety</strong></p>
<p>A lot of the code related to Trust and Safety is missing, presumably to prevent bad actors from learning too much and gaming those systems. However, there do seem to be some specifics about the kinds of things Twitter considers borderline or violating that I don’t think were previously public. There are a bunch of safety parameters in the code, some of which are in Twitter’s policy documents, but some are not.</p>
<p>There are entries like “HighCryptospamScore” that <a href="https://github.com/twitter/the-algorithm/blob/7f90d0ca342b928b479b512ec51ac2c3821f5922/visibilitylib/src/main/scala/com/twitter/visibility/rules/DownrankingRules.scala" target="_blank" rel="noopener">appear in the code</a>, which may give scammers hints about how to craft tweets to get around detection systems. The same is true for <a href="https://github.com/twitter/the-algorithm/blob/7f90d0ca342b928b479b512ec51ac2c3821f5922/visibilitylib/src/main/scala/com/twitter/visibility/models/TweetSafetyLabel.scala#L115" target="_blank" rel="noopener">code that contains links</a> to “UntrustedUrl,” “TweetContainsHatefulConductSlur” for low, medium and high severity.</p>
<p>There’s also a reference to a “Do Not Amplify” <a href="https://github.com/twitter/the-algorithm/blob/7f90d0ca342b928b479b512ec51ac2c3821f5922/visibilitylib/src/main/scala/com/twitter/visibility/models/SpaceSafetyLabelType.scala#L26" target="_blank" rel="noopener">parameter in the code</a>, which was discussed in the Twitter Files but does not seem to be publicly documented in its policies. There are entries like “AgathaSpam,” which refers to a proprietary embedding used across the codebase. Twitter also has a bunch of visibility rules hardcoded in Scala that might be useful to bad actors trying to game the system, outlining what rules are in play for all tweets, new users, user mentions, liked tweets, realtime spam detection, etc. Finally, some of the consequences for those violations are <a href="https://github.com/twitter/the-algorithm/blob/7f90d0ca342b928b479b512ec51ac2c3821f5922/visibilitylib/src/main/scala/com/twitter/visibility/rules/Action.scala" target="_blank" rel="noopener">spelled out in Scala</a> as well.</p>
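<p>Illustratively, those hardcoded rules amount to a label-to-action lookup. The label strings below appear in the linked files, but the mapping structure and the &ldquo;Downrank&rdquo;/&ldquo;Allow&rdquo; actions are my simplification of the richer Action hierarchy in the code:</p>

```python
# Simplified, illustrative label-to-action lookup. The label strings appear
# in the released Scala; the two actions here are a toy simplification.

SAFETY_LABEL_ACTIONS = {
    "HighCryptospamScore": "Downrank",
    "UntrustedUrl": "Downrank",
    "TweetContainsHatefulConductSlur": "Downrank",
    "DoNotAmplify": "Downrank",
}

def action_for_labels(labels: set) -> str:
    """Return the action triggered by a tweet's safety labels, if any."""
    for label in labels:
        if label in SAFETY_LABEL_ACTIONS:
            return SAFETY_LABEL_ACTIONS[label]
    return "Allow"
```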
<p>Of course, it&rsquo;s really hard to know with certainty whether any of this was public somewhere before this release.</p>
</description>
</item>
<item>
<title>A 2 million-person, campaign-wide field experiment shows how digital advertising affects voter turnout</title>
<link>https://solmessing.netlify.app/publication/aggarwal20232/</link>
<pubDate>Sun, 01 Jan 2023 00:00:00 +0000</pubDate>
<guid>https://solmessing.netlify.app/publication/aggarwal20232/</guid>
<description><ul>
<li>Supplementary materials at end of manuscript</li>
<li><a href="https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YMKVA1" target="_blank" rel="noopener">Replication materials</a></li>
<li>Media coverage: <a href="https://www.nature.com/articles/d41586-023-00073-6" target="_blank" rel="noopener">Nature</a></li>
</ul>
</description>
</item>
<item>
<title>Past vote data outperformed the polls. How did it go so wrong?</title>
<link>https://solmessing.netlify.app/post/what-the-polls-got-wrong-in-2020/</link>
<pubDate>Sun, 08 Nov 2020 00:00:00 +0000</pubDate>
<guid>https://solmessing.netlify.app/post/what-the-polls-got-wrong-in-2020/</guid>
<description><p>It’s becoming clear that the 2020 polls underestimated Trump’s support by anywhere from 4 to 8 points depending on your accounting&mdash;a significantly worse miss than in 2016, when <a href="https://fivethirtyeight.com/features/the-polls-are-all-right/" target="_blank" rel="noopener">state polls were off but the national polls did relatively well</a>.</p>
<p>In fact, this year we were better off using projections based on past vote history in each state to predict how things would go in battleground states, as I&rsquo;ll show below.</p>
<p>But I also want to start to ask questions about what happened this time around. The polling from 2018 looked encouraging, convincing many pollsters that the post-2016 reckoning had fixed many issues called out in the <a href="https://www.aapor.org/Education-Resources/Reports/An-Evaluation-of-2016-Election-Polls-in-the-U-S.aspx" target="_blank" rel="noopener">2016 AAPOR report on election polling</a>. After 2018, FiveThirtyEight wrote that the <a href="https://fivethirtyeight.com/features/the-polls-are-all-right/" target="_blank" rel="noopener">&ldquo;Polls are Alright&rdquo;</a>.</p>
<p>But the second Miami-Dade reported its 2020 results, we knew something was probably wrong with the polls.</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Here&#39;s another chart of polls vs. returns that splits the data by how much the polls underestimated Trump. One place where the polls most underestimated Trump is Wisconsin (off by -9 points). Note returns are not yet verified and states are still finalizing their counts. <a href="https://t.co/iM8mjqoAuK">pic.twitter.com/iM8mjqoAuK</a></p>&mdash; Stefan (@stefanjwojcik) <a href="https://twitter.com/stefanjwojcik/status/1325786708022079488?ref_src=twsrc%5Etfw">November 9, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>As Stefan notes (we worked together at Pew Research Center&rsquo;s Data Labs), the error seems slightly lower in key battleground states, though the polls missed big in WI, perhaps in part due to its horrifically bad voter file data.</p>
<p>Unlike 2016, both state and national polls appeared to underestimate Trump&rsquo;s support, as this early (Nov 7) analysis from <a href="https://twitter.com/thomasjwood" target="_blank" rel="noopener">Tom Wood</a> shows:</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Current as to the afternoon on the 7th, and with Senate results too. <a href="https://t.co/EGZGarRPNj">pic.twitter.com/EGZGarRPNj</a></p>&mdash; Tom Wood (@thomasjwood) <a href="https://twitter.com/thomasjwood/status/1325199348553162752?ref_src=twsrc%5Etfw">November 7, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<!-- [![normal](/img/TWpollingerror.jpeg)](https://twitter.com/thomasjwood/status/1325199348553162752) -->
<h2 id="polling-versus-past-votes">Polling versus past votes</h2>
<p>Perhaps what surprised me the most about polling this time around was when I went to evaluate some election projections I put together in April that we used internally at Acronym to help evaluate where we might want to spend. I pulled in the <a href="https://www.nytimes.com/live/2020/presidential-polls-trump-biden" target="_blank" rel="noopener">NYTimes polling averages</a> and compared them with the latest state-level presidential results from the AP. I then did the same for the April projections. Turns out the projections were significantly more accurate than the polling averages:</p>
<p>
<figure >
<div class="d-flex justify-content-center">
<div class="w-100" ><img src="https://solmessing.netlify.app/img/PollingVSPastVoteProj.png" alt="normal" loading="lazy" data-zoomable /></div>
</div></figure>
</p>
<p>We used these projections, and other extant data (including the fact that there are two Senate races in play), when making what turned out to be a very lucky decision to start spending money in Georgia. We were one of the biggest and earliest spenders in that race.</p>
<p>What are these projections? I simply took the last two state-level Presidential and U.S. House election totals, estimated each state&rsquo;s &ldquo;trajectory,&rdquo; and added that to each state&rsquo;s Democratic margin from the previous cycle.</p>
<p>(Note that I also weighted 60-40 toward the Presidential results, and slightly regularized both the latest margin and the trajectory toward zero.)</p>
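<p>A minimal sketch of that projection, assuming my description above: the 60-40 presidential/House blend is as stated, while the exact shrinkage factor here is illustrative:</p>

```python
# Sketch of the past-vote projection described above. Blends the last two
# cycles' presidential and U.S. House margins 60-40, estimates a state
# "trajectory," shrinks both slightly toward zero, and sums them. The
# shrink factor below is illustrative, not the value actually used.

def project_margin(pres_margins: tuple,
                   house_margins: tuple,
                   shrink: float = 0.9) -> float:
    """pres_margins/house_margins: (previous cycle, latest cycle) Democratic
    margins in points. Returns the projected next-cycle Democratic margin."""
    def blend(pres: float, house: float) -> float:
        return 0.6 * pres + 0.4 * house

    prev = blend(pres_margins[0], house_margins[0])
    latest = blend(pres_margins[1], house_margins[1])
    trajectory = latest - prev
    # Regularize the latest margin and the trajectory toward zero, then combine.
    return shrink * latest + shrink * trajectory
```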
<p>Informing this approach is work from <a href="https://catalist.us/yair-ghitza-phd/" target="_blank" rel="noopener">Yair Ghitza</a> describing what went wrong in 2016, which suggested polarization and other state-level trends would continue, in addition to national trends or &ldquo;uniform swing.&rdquo; </p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">This paper from <a href="https://twitter.com/SimonJackman?ref_src=twsrc%5Etfw">@SimonJackman</a> also deserves a big hat tip <a href="https://t.co/CTTpYDPwl2">https://t.co/CTTpYDPwl2</a></p>&mdash; Sol Messing (@SolomonMg) <a href="https://twitter.com/SolomonMg/status/1325564912798752773?ref_src=twsrc%5Etfw">November 8, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>I should note that this may only have worked because of something peculiar about this election cycle&mdash;I haven&rsquo;t gone and back-tested this approach or anything like that.</p>
<p>Seems I was not the only one who noticed this kind of pattern:</p>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">A similar observation from <a href="https://twitter.com/gelliottmorris?ref_src=twsrc%5Etfw">@gelliottmorris</a> <a href="https://t.co/XSUAhGBZfb">https://t.co/XSUAhGBZfb</a></p>&mdash; Sol Messing (@SolomonMg) <a href="https://twitter.com/SolomonMg/status/1325522770890027008?ref_src=twsrc%5Etfw">November 8, 2020</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<h2 id="what-went-wrong-the-usual-suspects">What went wrong: The Usual Suspects</h2>