Lecture 17 COPS Causal Consistency.srt
1
00:00:02,689 --> 00:00:09,319
All right, hello everyone, let's get started.
2
00:00:10,849 --> 00:00:20,910
Today the topic is causal consistency,
and we'll use the COPS system. The COPS
3
00:00:20,910 --> 00:00:26,099
paper that we're reading today is a case
study for causal consistency. So the
4
00:00:26,099 --> 00:00:35,850
setting is actually familiar. We're
talking again about big websites that
5
00:00:35,850 --> 00:00:42,120
have data in multiple data centers, and
they want to replicate all of their data in each
6
00:00:42,120 --> 00:00:45,750
of their data centers, to keep a copy close
7
00:00:45,750 --> 00:00:55,020
to users and perhaps for fault
tolerance. So as usual we'll
8
00:00:55,020 --> 00:00:59,600
have three data centers.
9
00:01:02,630 --> 00:01:05,690
And because we're building big
systems, we're going to shard the data,
10
00:01:05,690 --> 00:01:09,380
and every data center is going to have
multiple servers, with maybe all
11
00:01:09,380 --> 00:01:14,570
the keys that start with A through [inaudible], and
so on for the corresponding shards of
12
00:01:14,570 --> 00:01:31,520
keys. We've seen this before. And
the usual goals people have, you know,
13
00:01:31,520 --> 00:01:34,820
there are many different designs for how
to make this work, but you'd
14
00:01:34,820 --> 00:01:39,080
really like reads to be fast,
because these web
15
00:01:39,080 --> 00:01:45,680
workloads tend to be read-dominated, and
you'd like writes to work, and
16
00:01:45,680 --> 00:01:52,640
you'd like to have as much
consistency as you can. So the fast
17
00:01:52,640 --> 00:01:55,970
reads are interesting because the
clients are typically web servers. So
18
00:01:55,970 --> 00:02:03,020
there's going to be some set of
web servers, which I'll call clients, the
19
00:02:03,020 --> 00:02:07,039
clients of the storage system, but they're
really web servers talking to a user's
20
00:02:07,039 --> 00:02:11,800
browser. So the typical arrangement
is that the reads happen locally
21
00:02:11,800 --> 00:02:18,470
and writes might be a little more
complicated. So one system that fits this
22
00:02:18,470 --> 00:02:24,890
pattern is Spanner. You remember that
Spanner writes involve
23
00:02:24,890 --> 00:02:29,800
Paxos that runs across all the data
centers. So if you do a write in Spanner,
24
00:02:29,800 --> 00:02:35,959
maybe a client in a data center needs to
do a write, the communication involved
25
00:02:35,959 --> 00:02:41,300
actually requires Paxos, maybe
running on one of these servers, to talk
26
00:02:41,300 --> 00:02:45,319
to at least a majority of the other data
centers that have replicas. So the writes
27
00:02:45,319 --> 00:02:52,500
tend to be a little bit slow, but they're
consistent. In addition, Spanner supports
28
00:02:52,500 --> 00:02:57,450
two-phase commit, so we had transactions.
And the reads are much faster because
29
00:02:57,450 --> 00:03:04,709
the reads used the TrueTime scheme that
the Spanner paper described, and really
30
00:03:04,709 --> 00:03:09,959
only consulted local replicas. We also read the
Facebook memcache paper, which is
31
00:03:09,959 --> 00:03:15,269
another design in this pattern. In the
Facebook memcache paper there's a
32
00:03:15,269 --> 00:03:21,060
primary site that has the primary set of
MySQL databases. So if a client wants
33
00:03:21,060 --> 00:03:25,140
to do a write, supposing the primary site
is data center 3, it has to send all
34
00:03:25,140 --> 00:03:30,540
writes to data center 3, and then data
center 3 sends out new information or
35
00:03:30,540 --> 00:03:33,750
invalidations to the other data centers.
So writes are actually a little bit expensive,
36
00:03:33,750 --> 00:03:41,220
not unlike Spanner. On the other hand,
all the reads are local: when a client
37
00:03:41,220 --> 00:03:44,579
needs to do a read, it can consult a
memcached server in the local data
38
00:03:44,579 --> 00:03:51,840
center, and those memcached servers are just
blindingly fast. The people
39
00:03:51,840 --> 00:03:56,519
report that a single memcached
server can serve a million reads per
40
00:03:56,519 --> 00:04:02,700
second, which is very fast. So again, the
Facebook memcached scheme needs to
41
00:04:02,700 --> 00:04:06,450
involve cross-data-center
communication for writes, but the reads
42
00:04:06,450 --> 00:04:11,220
are local. So the question for today, and
the question the COPS paper is
43
00:04:11,220 --> 00:04:16,880
answering, is whether we can have a
system that allows writes to proceed
44
00:04:16,880 --> 00:04:21,060
purely locally, that is, from the client's
point of view, that the client can talk
45
00:04:21,060 --> 00:04:26,729
to... when it wants to write, it can
send the write to a local replica in its own
46
00:04:26,729 --> 00:04:31,050
data center, as well as send reads to
just the local replicas, and never have
47
00:04:31,050 --> 00:04:34,770
to wait for other data centers, never
have to talk to other data centers or
48
00:04:34,770 --> 00:04:40,830
wait for other data centers to do writes.
So what we really want is a system that
49
00:04:40,830 --> 00:04:47,780
can have local reads and local writes.
50
00:04:48,440 --> 00:04:53,580
That's the big goal,
really a performance goal. This would
51
00:04:53,580 --> 00:04:59,040
help performance, of course, because now,
unlike Spanner and the Facebook paper, we'd have
52
00:04:59,040 --> 00:05:04,200
purely local writes, which would be much faster
from the client's point of view. It
53
00:05:04,200 --> 00:05:08,160
might also help with fault tolerance and
robustness: if writes can be done locally,
54
00:05:08,160 --> 00:05:10,980
then we don't have to worry about
whether other data centers are up or
55
00:05:10,980 --> 00:05:15,090
whether we can talk to them quickly,
because the clients don't need to wait
56
00:05:15,090 --> 00:05:20,790
for them. So we're going to be looking for
systems that have this level of
57
00:05:20,790 --> 00:05:27,810
performance. And in the end we're going to
let the consistency model... well,
58
00:05:27,810 --> 00:05:29,880
we're going to be worried about
consistency: if you only do the writes
59
00:05:29,880 --> 00:05:33,690
initially to the local replicas, then
what about the other data centers' replicas of the
60
00:05:33,690 --> 00:05:36,840
data? So we'll certainly be worried about
consistency,
61
00:05:36,840 --> 00:05:40,260
but the attitude for this lecture, at
least, is that we're going to let the
62
00:05:40,260 --> 00:05:44,640
consistency trail along behind the
performance. Once we figure out
63
00:05:44,640 --> 00:05:48,450
how to get good performance, we'll
then figure out how to define
64
00:05:48,450 --> 00:05:54,330
consistency and think about whether it's
good enough. OK, so that's the overall
65
00:05:54,330 --> 00:06:02,040
strategy. I'm actually going to talk about
two straw man designs, two okay but
66
00:06:02,040 --> 00:06:08,160
not great designs, on the way, before
we actually talk about how COPS works. So
67
00:06:08,160 --> 00:06:16,770
first I want to talk about the simplest
design that follows this local write
68
00:06:16,770 --> 00:06:27,560
strategy that I can think of. I'll call
this straw man one. So in straw man one
69
00:06:27,560 --> 00:06:33,680
we're going to have three data centers,
70
00:06:34,669 --> 00:06:41,490
and let's just assume that the data is
sharded two ways in each of them, so keys
71
00:06:41,490 --> 00:06:45,300
from maybe A to M and keys from N to Z,
sharded the same way in each of the
72
00:06:45,300 --> 00:06:59,370
data centers. The clients will read
locally, and if a client writes, so
73
00:06:59,370 --> 00:07:03,660
supposing a client needs to write a key
that starts with M, the client is going to
74
00:07:03,660 --> 00:07:10,610
send a write of key M to the shard
server, the local shard server that
75
00:07:10,610 --> 00:07:15,840
is responsible for keys starting with M.
That shard server would return a reply to
76
00:07:15,840 --> 00:07:22,200
the client immediately, saying "oh yes, I
did your write." But in addition, each
77
00:07:22,200 --> 00:07:29,010
server will maintain a queue of
outstanding writes that have been sent
78
00:07:29,010 --> 00:07:32,190
to it recently by clients, that it needs
to send to other data centers, and it
79
00:07:32,190 --> 00:07:38,100
will stream these writes asynchronously
in the background to the corresponding
80
00:07:38,100 --> 00:07:42,960
servers in the other data centers. So
after replying to the client, our shard
81
00:07:42,960 --> 00:07:48,960
server here will send a copy of the
client's write to each of the other data
82
00:07:48,960 --> 00:07:53,250
centers, and those writes go
through the network, maybe they take a
83
00:07:53,250 --> 00:07:57,120
long time. Eventually they're going to
arrive at the target, the other
84
00:07:57,120 --> 00:08:00,930
data centers, and each of those shard
servers will then apply the write to its
85
00:08:00,930 --> 00:08:10,560
local table of data. So this is a design
that has very good performance, right? The
86
00:08:10,560 --> 00:08:13,979
reads and writes are all done locally;
clients never have to
87
00:08:13,979 --> 00:08:18,990
wait. There's a lot of parallelism,
because this shard server for
88
00:08:18,990 --> 00:08:22,710
A and the shard server for M
operate independently. If the shard
89
00:08:22,710 --> 00:08:27,120
server for A gets a write, it has
to push its data to the corresponding
90
00:08:27,120 --> 00:08:30,300
shard servers in other data centers, but
it can do those pushes
91
00:08:30,300 --> 00:08:34,799
independently of other shard servers'
pushes. So there's parallelism both in
92
00:08:34,799 --> 00:08:41,429
serving and in pushing the writes
around. If you think about it a little
93
00:08:41,429 --> 00:08:48,250
bit, this design also
effectively favors reads. The reads
94
00:08:48,250 --> 00:08:52,150
really never have any impact beyond the
local data center. The writes, though, do a
95
00:08:52,150 --> 00:08:54,820
bit of work: whenever you do a write,
the client doesn't have to wait for
96
00:08:54,820 --> 00:08:59,320
it, but the shard server then has to push
the writes out to the other data centers.
97
00:08:59,320 --> 00:09:03,220
That means that reads at the
other data centers can then proceed very
98
00:09:03,220 --> 00:09:09,040
quickly. So reads involve less work than
writes, and that's appropriate for a read
99
00:09:09,040 --> 00:09:14,470
heavy workload. If you were more worried
about write performance, you could imagine
100
00:09:14,470 --> 00:09:18,220
other designs. For example, you can
imagine a design in which reads actually
101
00:09:18,220 --> 00:09:22,360
have to consult multiple data centers
and writes are purely local. So you can
102
00:09:22,360 --> 00:09:25,690
imagine a scheme in which, when
you do a read, you actually read the data,
103
00:09:25,690 --> 00:09:30,970
the current
copy of the key you want, from each of
104
00:09:30,970 --> 00:09:34,990
the other data centers and choose the
one that's most recent, perhaps. Then
105
00:09:34,990 --> 00:09:38,430
writes are very cheap and reads are
expensive. Or you can imagine
106
00:09:38,430 --> 00:09:44,860
combinations of these two strategies,
some sort of quorum overlap scheme where
107
00:09:44,860 --> 00:09:48,490
you write to
only a majority of data
108
00:09:48,490 --> 00:09:53,440
centers and read from a majority of data
centers, and rely on the overlap. And in
109
00:09:53,440 --> 00:10:00,040
fact there are real live systems that
people use commercially in real
110
00:10:00,040 --> 00:10:04,120
websites that follow much this design. So
if you're interested in a real
111
00:10:04,120 --> 00:10:11,230
world version of this, you can look up
Amazon's Dynamo system or the open
112
00:10:11,230 --> 00:10:14,410
source
Cassandra system.
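The quorum-overlap idea mentioned a moment ago can be sketched as follows. This is a minimal illustration, not how Dynamo or Cassandra are implemented: the `Replica`, `quorum_write`, and `quorum_read` names are made up for the example, and the integer version numbers are an assumption (the lecture has not yet said how to decide which copy is most recent).

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    version: int = 0        # assumed: higher version means more recent
    value: object = None


class Replica:
    """One data center's copy of the key-value table."""

    def __init__(self):
        self.table = {}

    def write(self, key, version, value):
        current = self.table.get(key, Versioned())
        if version > current.version:          # keep only the newest version
            self.table[key] = Versioned(version, value)

    def read(self, key):
        return self.table.get(key, Versioned())


def quorum_write(replicas, key, version, value):
    # Write to a majority only; here, the first majority in the list.
    majority = len(replicas) // 2 + 1
    for replica in replicas[:majority]:
        replica.write(key, version, value)


def quorum_read(replicas, key):
    # Read from a different majority (the last ones) to exercise the overlap.
    majority = len(replicas) // 2 + 1
    answers = [replica.read(key) for replica in replicas[-majority:]]
    return max(answers, key=lambda v: v.version).value


replicas = [Replica() for _ in range(3)]
quorum_write(replicas, "m", 1, "old")
quorum_write(replicas, "m", 2, "new")
latest = quorum_read(replicas, "m")   # replicas[1] is in both majorities
```

Any two majorities of the same replica set share at least one member, so the read always reaches a replica that saw the latest write, even though the third replica here was never written.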
113
00:10:14,410 --> 00:10:19,760
They're much more elaborate than what
I've sketched out here, but they follow
114
00:10:19,760 --> 00:10:26,960
the same basic pattern. So the usual name
for this kind of scheme is eventual
115
00:10:26,960 --> 00:10:40,910
consistency, and the reason for that is
that, at least initially, if you do a
116
00:10:40,910 --> 00:10:46,360
write, other readers in other data
centers are not guaranteed to see your write,
117
00:10:46,360 --> 00:10:50,180
but they will someday, because you're
pushing out the writes. So they'll
118
00:10:50,180 --> 00:10:56,120
eventually see your data. There's no
guarantee about order. So for example, if
119
00:10:56,120 --> 00:11:00,020
I'm a client and I write a key starting
with M, and then I write a key
120
00:11:00,020 --> 00:11:07,100
starting with A, then the shard server for M
121
00:11:07,100 --> 00:11:12,560
sends out one write and the server for A
sends out my write for A, but
122
00:11:12,560 --> 00:11:15,830
these may travel at different speeds or
by different routes on the wide area
123
00:11:15,830 --> 00:11:19,880
network. Maybe the
client wrote M first and then A, but
124
00:11:19,880 --> 00:11:24,260
maybe the update for A arrives first and
then the update for M, and maybe they
125
00:11:24,260 --> 00:11:28,850
arrive in the opposite order at the
other data center. So different clients
126
00:11:28,850 --> 00:11:36,470
are going to observe updates in different
orders. So there's no order
127
00:11:36,470 --> 00:11:39,040
guarantee.
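The first straw man and its lack of an ordering guarantee can be sketched as below. This is an illustrative toy, not code from COPS: `ShardServer`, `new_datacenter`, and `shard_for` are hypothetical names, only two data centers are modeled (so one outbox per shard is enough), and the keys "mango" and "zebra" are chosen simply so that the two writes land on different shards.

```python
from collections import deque


class ShardServer:
    """One shard in one data center."""

    def __init__(self):
        self.table = {}
        self.outbox = deque()   # writes waiting to be pushed to the remote site

    def put(self, key, value):
        self.table[key] = value            # apply locally
        self.outbox.append((key, value))   # replicate later, in the background
        return "ok"                        # the client does not wait for remote sites

    def get(self, key):
        return self.table.get(key)

    def push_to(self, remote):
        """Deliver queued writes to the matching shard in the other data center."""
        while self.outbox:
            key, value = self.outbox.popleft()
            remote.table[key] = value


def new_datacenter():
    # Two shards per data center: keys a-m and keys n-z.
    return {"a-m": ShardServer(), "n-z": ShardServer()}


def shard_for(datacenter, key):
    return datacenter["a-m"] if key[0].lower() <= "m" else datacenter["n-z"]


dc1, dc2 = new_datacenter(), new_datacenter()

# A client in dc1 writes two keys, in this order.
shard_for(dc1, "mango").put("mango", "first write")
shard_for(dc1, "zebra").put("zebra", "second write")

# Each shard pushes independently; suppose the n-z push arrives first.
dc1["n-z"].push_to(dc2["n-z"])
midway = (shard_for(dc2, "mango").get("mango"),
          shard_for(dc2, "zebra").get("zebra"))

dc1["a-m"].push_to(dc2["a-m"])
settled = (shard_for(dc2, "mango").get("mango"),
           shard_for(dc2, "zebra").get("zebra"))
```

A reader at `dc2` at the `midway` point sees the second write but not the first. Once both pushes have been delivered, `dc2` holds the same values as `dc1`.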
128
00:11:40,230 --> 00:11:46,209
The ultimate meaning of
eventual consistency is that if things
129
00:11:46,209 --> 00:11:51,880
settle down and people stop writing, and
all of these write messages finally
130
00:11:51,880 --> 00:11:56,800
arrive at their destinations and are
processed, then an
131
00:11:56,800 --> 00:12:03,370
eventually consistent system ought to
end up with the same value stored at all
132
00:12:03,370 --> 00:12:12,880
of the replicas. That's the
sense in which it's eventually
133
00:12:12,880 --> 00:12:16,870
consistent: if you wait for the dust to
settle, you're going to end up with
134
00:12:16,870 --> 00:12:20,889
everybody having the same data. And
that's a pretty weak spec, a very
135
00:12:20,889 --> 00:12:26,620
weak spec, but because it's a
loose spec there's a lot of freedom in
136
00:12:26,620 --> 00:12:30,250
the implementation and a lot of
opportunities to get good performance,
137
00:12:30,250 --> 00:12:34,449
because the system basically doesn't
require you to instantly do anything or
138
00:12:34,449 --> 00:12:40,029
to observe any ordering rules. It's quite
different from most of the consistency
139
00:12:40,029 --> 00:12:45,240
schemes we've seen so far. Again, as I
mentioned, it's used in deployed systems,
140
00:12:45,240 --> 00:12:50,319
eventual consistency is, but it can be
quite tricky for application programmers.
141
00:12:50,319 --> 00:12:54,910
So let me sketch out an example of
something you might want to do in a web
142
00:12:54,910 --> 00:13:04,149
site, where you would have to
be pretty careful and might be surprised
143
00:13:04,149 --> 00:13:13,569
if it's eventually consistent. As an
example, suppose we're building a website
144
00:13:13,569 --> 00:13:19,959
that stores photos, and every user has a
set of photos stored as
145
00:13:19,959 --> 00:13:26,170
key-value pairs, with some sort
of unique ID as the key, and every user
146
00:13:26,170 --> 00:13:31,689
maintains a list of their
public photos that they allow other
147
00:13:31,689 --> 00:13:37,899
people to see. So supposing I take a
photograph and I want to insert it into
148
00:13:37,899 --> 00:13:43,329
this system: I, a human, contact
the web server, and the web server runs
149
00:13:43,329 --> 00:13:47,589
code that's going to insert my photo into
the storage system and then add a
150
00:13:47,589 --> 00:13:51,540
reference to my photo to my photo list.
So maybe
151
00:13:51,540 --> 00:13:57,240
this happens, we'll say it
happens on client C1, which is the web
152
00:13:57,240 --> 00:14:03,899
server I'm talking to, and maybe what the
code looks like is: the code calls
153
00:14:03,899 --> 00:14:09,290
the put operation for my photo,
and it really should be a key and a value;
154
00:14:09,290 --> 00:14:15,500
I'm just going to abbreviate it to just the
value. So I insert my photograph, and
155
00:14:15,500 --> 00:14:22,770
then when this put finishes, I add
the photo to my list.
156
00:14:22,770 --> 00:14:28,020
Right, that's what my client's code
looks like. Somebody else who is looking at
157
00:14:28,020 --> 00:14:34,080
my photographs is going to fetch
a copy of my list of photos, and then
158
00:14:34,080 --> 00:14:39,120
they're going to look at the photos that
are on the list. So client 2 maybe calls
159
00:14:39,120 --> 00:14:48,570
get for my list, and then looks down the
list, and then calls get on that photo.
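The two code paths just described can be sketched as follows. This is only an illustration: the `Store` class, the function names, and the key format (`"alice:list"`, `"photo:17"`) are made up for the example, and `Store` is a single in-memory table with no replication, so it stands in for the storage system's put/get interface rather than for any particular design.

```python
class Store:
    """Stand-in for the storage system: one in-memory table, no replication."""

    def __init__(self):
        self.table = {}

    def put(self, key, value):
        self.table[key] = value

    def get(self, key):
        return self.table.get(key)


def upload_photo(store, user, photo_id, photo_bytes):
    # Client C1: store the photo first, then add a reference to the user's list.
    store.put(photo_id, photo_bytes)
    photo_list = store.get(f"{user}:list") or []
    store.put(f"{user}:list", photo_list + [photo_id])


def view_photos(store, user):
    # Client C2: fetch the list, then fetch each photo named on it.
    photo_list = store.get(f"{user}:list") or []
    return [store.get(photo_id) for photo_id in photo_list]


store = Store()
upload_photo(store, "alice", "photo:17", b"...jpeg bytes...")
photos = view_photos(store, "alice")
```

With a single table, the reader always finds every photo on the list. A `None` entry in the result could only appear if the list update became visible to the reader before the photo itself did.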
160
00:14:48,570 --> 00:14:51,209
Maybe they see the photo I just uploaded
on the list,
161
00:14:51,209 --> 00:14:57,089
and they're going to do a get for the
key for that photo. Yeah, so this
162
00:14:57,089 --> 00:15:02,910
is totally straightforward code; it
looks like it ought to work, but in an
163
00:15:02,910 --> 00:15:07,589
eventually consistent system it's not
necessarily going to work, and the
164
00:15:07,589 --> 00:15:12,899
problem is that these two puts, even
though the client did them in such an
165
00:15:12,899 --> 00:15:17,670
obvious order, first insert the photo and
then add a reference to that photo to my
166
00:15:17,670 --> 00:15:22,380
list of photos, the fact is that in this
eventually consistent scheme that I
167
00:15:22,380 --> 00:15:29,570
outlined, this second put could arrive at
other data centers before the first put.
168
00:15:29,570 --> 00:15:34,770
因此,如果另一个客户端正在其他数据中心进行读取,则可能会看到