etcd in Practice

etcd Basic Architecture

  • client
    • load balancing
    • node failover
  • API network (gRPC)
  • Raft Consensus
    • leader election
    • log replication
    • ReadIndex
  • function
    • KVServer
    • MVCC
    • Auth
    • Lease
    • Compactor
  • storage
    • WAL
    • Snapshot
    • boltdb

architecture

+---------------------+
|       Clients       |
+---------------------+
          |
          v
+---------------------+
|       gRPC API      |
+---------------------+
          |
          v
+---------------------+       +---------------------+
|       Leader        | <-->  |      Followers      |
|   (Raft Consensus)  |       |   (Raft Consensus)  |
+---------------------+       +---------------------+
          |
          v
+---------------------+
|     BoltDB (MVCC)   |
+---------------------+
          |
          v
+---------------------+
|  Authentication &   |
|   Authorization     |
+---------------------+
          |
          v
+---------------------+
|       Leases        |
+---------------------+
          |
          v
+---------------------+
|       Watch         |
+---------------------+

Raft Algorithm

  • leader election
    • leader, follower, candidate (preVote, preCandidate)
    • heart-beat interval, election timeout
  • log replication
  • safety
    • term, nextIndex, commitIndex
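
How commitIndex follows from log replication can be sketched as below. This is a minimal illustration, not etcd's Raft code: `quorum_commit_index` and `match_index` are hypothetical names, and the sketch leaves out the Raft safety rule that a leader only commits entries from its own term.

```python
def quorum_commit_index(match_index):
    """Return the highest log index stored on a majority of nodes.

    match_index holds, for every node in the cluster (leader included),
    the highest log index known to be replicated on that node.
    """
    if not match_index:
        raise ValueError("cluster must have at least one node")
    ordered = sorted(match_index, reverse=True)
    majority = len(ordered) // 2 + 1
    # The majority-th largest value is present on at least `majority` nodes.
    return ordered[majority - 1]


# Three-node cluster: leader at index 7, followers at 5 and 3.
print(quorum_commit_index([7, 5, 3]))
# Five-node cluster with two lagging followers.
print(quorum_commit_index([9, 9, 8, 4, 2]))
```

In the three-node case the leader can commit up to index 5, because only two of the three nodes hold that entry or a later one.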

Authentication Module

  • user:pass + RBAC
    • bcrypt password hashing (Blowfish-based), salt, configurable cost (hash iterations)
    • x.509
    • ACL, ABAC, RBAC
    • JWT
    • interval tree (for range permission checks)
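# create the root user, then enable authentication
etcdctl user add root:root
etcdctl auth enable
etcdctl put hello world --user root:root

# role: grant read/write on the key range [hello, helly)
etcdctl role add admin --user root:root
etcdctl role grant-permission admin readwrite hello helly --user root:root
# alice must exist before she can be granted a role (example password)
etcdctl user add alice:alice --user root:root
etcdctl user grant-role alice admin --user root:root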
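
The two key arguments to `grant-permission` describe a half-open key range, [hello, helly). The check this implies can be sketched as follows. The function names and the `grants` structure are hypothetical, and the linear scan is only for illustration; it is not how etcd stores or looks up permissions.

```python
def key_in_range(key: bytes, start: bytes, end: bytes) -> bool:
    """True if key falls in the half-open range [start, end)."""
    return start <= key < end


def is_permitted(key: bytes, grants) -> bool:
    """Check a key against a list of (start, end) ranges granted to a role."""
    return any(key_in_range(key, start, end) for start, end in grants)


# Ranges granted to the hypothetical "admin" role from the commands above.
admin_grants = [(b"hello", b"helly")]

for key in (b"hello", b"hellp", b"helly", b"help"):
    print(key, is_permitted(key, admin_grants))
```

Keys compare byte by byte, so `hello` and `hellp` fall inside the range while `helly` (the exclusive end) and `help` do not.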

Lease Module

How to detect whether a process is alive
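
One common answer is a lease with a TTL: the process keeps renewing it, and once renewals stop the lease expires and the process is treated as dead. The sketch below models that idea in memory. `LeaseTable` and its methods are hypothetical names, not the etcd client API, and time is passed in explicitly so the behavior is deterministic.

```python
class LeaseTable:
    """In-memory model of TTL leases used for liveness detection."""

    def __init__(self):
        self._deadline = {}  # owner -> time at which the lease expires

    def grant(self, owner, ttl, now):
        """Create (or replace) a lease that expires ttl seconds after now."""
        self._deadline[owner] = now + ttl

    def is_alive(self, owner, now):
        """A process counts as alive while its lease has not expired."""
        deadline = self._deadline.get(owner)
        return deadline is not None and now < deadline

    def keep_alive(self, owner, ttl, now):
        """Renew the lease; fails if it has already expired."""
        if not self.is_alive(owner, now):
            self._deadline.pop(owner, None)
            return False
        self._deadline[owner] = now + ttl
        return True


leases = LeaseTable()
leases.grant("worker-1", ttl=10, now=0)
print(leases.keep_alive("worker-1", ttl=10, now=8))   # renewed in time
print(leases.is_alive("worker-1", now=15))            # still within the TTL
print(leases.is_alive("worker-1", now=18))            # renewals stopped
print(leases.keep_alive("worker-1", ttl=10, now=20))  # too late to renew
```

The TTL trades detection speed against tolerance for pauses: a short TTL notices a dead process sooner, but a brief stall or network hiccup can expire the lease of a healthy one.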

MVCC/Watch Module
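
The link between MVCC and watch is the revision: every write gets a new revision, old versions stay readable, and a watcher can replay events from a chosen revision. The sketch below is a toy model under those assumptions. `MVCCStore` is a hypothetical name, and the flat event list stands in for etcd's real index and storage, which also compact old revisions.

```python
class MVCCStore:
    """Toy multi-version store: every write gets a new global revision."""

    def __init__(self):
        self.revision = 0
        self._events = []  # (revision, key, value); value None marks a delete

    def put(self, key, value):
        self.revision += 1
        self._events.append((self.revision, key, value))
        return self.revision

    def delete(self, key):
        self.revision += 1
        self._events.append((self.revision, key, None))
        return self.revision

    def get(self, key, rev=None):
        """Read the value of key as of revision rev (default: latest)."""
        rev = self.revision if rev is None else rev
        value = None
        for event_rev, event_key, event_value in self._events:
            if event_rev > rev:
                break
            if event_key == key:
                value = event_value
        return value

    def watch(self, key, start_rev):
        """Replay every event on key with revision >= start_rev."""
        return [(r, v) for r, k, v in self._events
                if k == key and r >= start_rev]


store = MVCCStore()
store.put("hello", "world")
store.put("hello", "etcd")
store.delete("hello")
print(store.get("hello", rev=1))
print(store.get("hello"))
print(store.watch("hello", start_rev=2))
```

Because history is kept per revision, a watcher that reconnects can resume from the last revision it saw, as long as that revision has not been compacted away.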

questions

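  • Can the etcd watch mechanism guarantee that no events are lost?
  • What factors can cause the cluster leader to change?
  • Why can etcd, which is built on Raft, still end up with inconsistent data?
  • Why does the db size stay unchanged after a large amount of data is deleted?
  • Why does the etcd community recommend keeping the db under 8 GB?
  • Why can write requests time out even when disk I/O latency is low on every node?
  • Why can the etcd process consume several GB of memory when it stores only a single k/v of a few hundred KB?
  • When tens of thousands of pod/CRD resources exist in one namespace, why do api-server and etcd struggle under frequent label-based queries for specific pod/CRD resources?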

to identify a compromised node

  • monitor node behavior – irregularity
    • heartbeat
    • log replication
    • election behavior
  • audit logs – unusual patterns
    • access logs
    • operation logs
  • data integrity checks
    • checksum verification
    • snapshot comparison
  • security measures
    • authentication and authorization
    • encryption
  • health checks
    • liveness and readiness probes
    • resource monitoring
  • consensus protocol violations
    • protocol adherence
    • quorum verification
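
The checksum verification and snapshot comparison items above can be sketched as a majority comparison of per-node hashes, similar in spirit to comparing the output of `etcdctl endpoint hashkv` across members. The function names and the dict-based snapshots are assumptions made for illustration; the hash here is not the one etcd computes.

```python
import hashlib
from collections import Counter


def kv_hash(kvs):
    """Deterministic SHA-256 over a node's key-value pairs."""
    digest = hashlib.sha256()
    for key in sorted(kvs):
        for part in (key, kvs[key]):
            data = part.encode("utf-8")
            # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()


def divergent_nodes(snapshots):
    """Return the nodes whose data hash differs from the majority hash."""
    if not snapshots:
        raise ValueError("no snapshots to compare")
    hashes = {node: kv_hash(kvs) for node, kvs in snapshots.items()}
    majority_hash, count = Counter(hashes.values()).most_common(1)[0]
    if count <= len(snapshots) // 2:
        raise ValueError("no majority of nodes agrees on the data")
    return sorted(node for node, h in hashes.items() if h != majority_hash)


snapshots = {
    "node-a": {"hello": "world", "foo": "bar"},
    "node-b": {"foo": "bar", "hello": "world"},
    "node-c": {"hello": "tampered", "foo": "bar"},
}
print(divergent_nodes(snapshots))
```

A mismatch only shows that a node's data differs from the majority at the compared point. It does not by itself say whether the cause is compromise, corruption, or a node that is simply lagging, so hashes should be taken at the same revision.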
Licensed under CC BY-NC-SA 4.0