My notes on the Pregel paper:
- Vertices are stored on different machines, and each one is assigned a globally unique ID
- Directed graph
- Goal: find connected components, using the smallest vertex ID in each component as the component ID
- There are two connected components in this example (picture below): in the lower component every vertex will record `1`, and in the upper one every vertex will record `5`
- All vertices are set to `active`
- Each vertex records its own ID (the value shown on top of the node)
- Each vertex sends its value in a `message` to the vertices it is connected to
- Each vertex uses its edges to send the value it just recorded to its neighbors
- A node does not know which other nodes have edges pointing to it
- Deactivate: this round stops
- Reactivate: nodes that are still active stay active, and nodes that receive a message become active
- Once the messages have been delivered, each vertex records the minimum of the node IDs sent to it (keeping its own value if that is smaller)
- For example, `6` receives the value from `5`, which is smaller, so it records `5`
- Send messages both along the edges and against them
- Why along the edge: it gives a vertex the chance to propagate the smallest vertex ID it found in this round. For example, `6` changed its value to a smaller one, so it needs to send the value `5` to `7`
- Why against the edge: take `2` -> `1` as an example. At first `1` does not know that `2` exists until `2` sends it a message. Because `2` has a larger number than `1`, `1` sends a message against the edge, back to `2`
- Only the vertices that received a message are activated. The messages they receive are the ones sent out in the previous round.
- Even though both `3` and `4` have already changed their value to `2`, neither knows about the change on the other side (they are on different machines), so they still need to send `2` to each other.
- The computation stops when every vertex is inactive.
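
The rounds described in these notes can be sketched as a small single-process simulation. This is only a rough illustration, not Pregel's actual API. The `(sender, value)` message format, the `known` set of neighbors a vertex has learned about, and the name `connected_components` are assumptions made for this sketch. The example edge list is also a guess based on the vertices mentioned above, since the picture is not reproduced here.

```python
from collections import defaultdict

def connected_components(vertices, edges):
    """Simulate the rounds; return {vertex: smallest ID in its component}."""
    vertices = list(vertices)
    out_edges = defaultdict(set)
    for src, dst in edges:
        out_edges[src].add(dst)

    value = {v: v for v in vertices}                  # each vertex records its own ID
    known = {v: set(out_edges[v]) for v in vertices}  # at first only out-neighbors are known
    active = set(vertices)                            # all vertices start active
    inbox = {}                                        # messages delivered at the start of a round
    first_round = True

    while active:                                     # stop when everyone is inactive
        outbox = defaultdict(list)
        for v in active:
            received = inbox.get(v, [])
            old = value[v]
            for sender, val in received:
                known[v].add(sender)                  # v learns who points at it
                value[v] = min(value[v], val)         # keep the smallest ID seen
            if first_round or value[v] < old:
                # Value is new or got smaller: tell every neighbor we know,
                # both along the edges and against them.
                targets = known[v]
            else:
                # Value unchanged: only answer senders that hold a larger value.
                targets = {s for s, val in received if val > value[v]}
            for t in targets:
                outbox[t].append((v, value[v]))
        # End of round: everyone deactivates; only message receivers wake up.
        inbox = outbox
        active = set(outbox)
        first_round = False
    return value

# Hypothetical edges loosely matching the example in the notes.
EXAMPLE_EDGES = [(2, 1), (2, 3), (2, 4), (3, 4), (5, 6), (6, 7)]

if __name__ == "__main__":
    print(connected_components(range(1, 8), EXAMPLE_EDGES))
    # prints {1: 1, 2: 1, 3: 1, 4: 1, 5: 5, 6: 5, 7: 5}
```

In this sketch a vertex remembers every sender it has heard from, so a later drop in its value also reaches the vertices that point at it. Without that memory, a vertex whose value shrinks in a later round could leave an upstream neighbor holding a stale, larger ID.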