Skip to content

Latest commit

 

History

History
68 lines (36 loc) · 3.01 KB

pregel_connected_component_example.md

File metadata and controls

68 lines (36 loc) · 3.01 KB

Calculate connected component by Pregel

My notes about pregel paper is here

Problem description

  • Vertexes are recorded in different machine, they have global unique ID be assigned
  • Directed graph
  • Find connected components, use the smallest vertex id for component ID
    • There are two connected components in this example(picture below), for lower component every vertex will record 1 and for upper one every vertex will record 5

Step 1

pregel_calc_connected_components_1_1  

  • All Vertex set to active
  • All Vertex record its own id(put that value on top of the node)
  • All Vertex send its value in message to connected Vertexes

pregel_calc_connected_components_1_2  

  • Use edges to send message just recorded to his neighbors
    • Each node don't know who connected to him
  • De-active, stop this round

Step 2

  • Re-active. For nodes are active will still active, for nodes receive message will be active

pregel_calc_connected_components_2_1  

  • Messages has been delivered, all Vertex record min(node id send to him)
    • Like 6 will receive value from 5 which is smaller, so he will record 5

pregel_calc_connected_components_2_2  

  • Send message along side edge and against edge
    • Why along side the edge: give vertex chance to update smallest vertex id found in this round. Like 6, he changed his value to a smaller value, so he need send out value 5 to 7
    • Why against: Take 2->1 as an example. At first 1 don't know there is 2 until 2 send message to him. Because 2 has bigger number compare with 1, so 1 will send message against edge back to 2

Step 3

  • Only active the one received message. Receive message send out in previously round.

pregel_calc_connected_components_3_1  

  • Even both 3 and 4 has already changed there value to 2, but they don't know the change on each other side(they are on different machine), so they still need to send 2 to each other.

pregel_calc_connected_components_3_2  

Step 4

pregel_calc_connected_components_4_1  

pregel_calc_connected_components_4_2  

  • Stops when everyone is inactive.

More info