a primary use of gossip is for information diffusion: some event occurs, and our goal is to spread the world
my problems
- what is gossip protocol
- the stages of nodes in the cluster
- join the cluster
- when a node become dead
- data sync
- initial stage
- in-sync stage (what’s the strategy to decide which data should be synced)
- how did the gossip protocol (implementation) resolves the data conflicts
The gossip protocol is a decentralized peer-to-peer communication technique to transmit messages in an enormous distributed system. The key concept of gossip protocol is that every node periodically sends out a message to a subset of other random nodes. The entire system will receive the particular message eventually with a high probability. In layman’s terms, the gossip protocol is a technique for nodes to build a global map through limited local interactions.
The gossip protocol is typically used to maintain the node membership list, achieve consensus, and fault detection in a distributed system.
three types of gossip
Anti-Entrophy
as the database. The replicated data is compared and the difference between replicas are patched 9. The node with the newest message shares it with other nodes in every gossip round.
The anti-entropy model usually transfers the whole dataset resulting in unnecessary bandwidth usage. The techniques such as checksum, recent update list, and Merkle tree can be used to identify the differences between nodes to avoid transmission of the entire dataset and reduce network bandwidth usage. The anti-entropy gossip protocol will send an unbounded number of messages without termination.
Rumor-Mongering
Aggregation
real world applications
- Apache Cassandra employs the gossip protocol to maintain cluster membership, transfer node metadata (token assignment), repair unread data using Merkle trees, and node failure detection
- Consul utilizes the swim-gossip protocol variant for group membership, leader election, and failure detection of consul agents
- CockroachDB operates the gossip protocol to propagate the node metadata
- Hyperledger Fabric blockchain uses the gossip protocol for group membership and ledger metadata transfer
- Riak utilizes the gossip protocol to transmit consistent hash ring state and node metadata around the cluster
- Amazon S3 uses the gossip protocol to spread server state across the system
- Amazon Dynamo employs the gossip protocol for failure detection, and keeping track of node membership
- Redis cluster uses the gossip protocol to propagate the node metadata
- Bitcoin uses the gossip protocol to spread the nonce value across the mining nodes
advantages
- scalable
- fault tolerant
- robust
- convergent consistency
- decentralized
- simplicity
- integration and interoperability
- bounded load
disadvantage
- eventual consistency
- unawareness of network partitions
- relatively high bandwidth consumption
- The gossip protocol is not known for efficiency as the same message might be retransmitted to the same node multiple times consuming unnecessary bandwidth
- increased latency
- difficulty in debugging and testing
- membership protocol is not scalable
- prone to computational errors
references
- https://systemdesign.one/gossip-protocol/
- https://www.youtube.com/watch?v=FuP1Fvrv6ZQ
- Apple Inc.: Cassandra Internals — Understanding Gossip
- https://managementfromscratch.wordpress.com/2016/04/01/introduction-to-gossip/