Cassandra is by default an AP (Available, Partition-tolerant) database, hence it is “always on”. However, the consistency level can be tuned on a per-query basis.
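To make the per-query trade-off concrete, here is a minimal sketch of the usual overlap rule: a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The function names are hypothetical helpers for illustration, not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for a quorum: a strict majority of all replicas."""
    return replication_factor // 2 + 1


def read_sees_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)
# Quorum reads combined with quorum writes overlap on at least one replica.
print(read_sees_latest_write(q, q, rf))
# Reading one replica and writing one replica favours availability and latency.
print(read_sees_latest_write(1, 1, rf))
```

Choosing smaller read and write sets keeps the "always on" behaviour; choosing overlapping sets trades some availability for stronger consistency on that query.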
my questions
- how does Cassandra ensure data security
- https://cassandra.apache.org/doc/latest/cassandra/managing/operating/security.html
- TLS/SSL encryption for client and inter-node communication
- Client authentication
- Authorization
challenges
- full multi-primary database replication
- global availability at low latency
- scaling out on commodity hardware
- linear throughput increase with each additional processor
- online load balancing and cluster growth
- partitioned key-oriented queries
- flexible schema
basic knowledge
- keyspace (database)
- table
- partition (primary index)
- row
- column
storage engine
- logging data in the commit log
- writing data to the memtable
- flushing data from memtable
- storing data on disk in SSTables
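The four steps above can be sketched as a toy in-memory model. This is only an illustration under simplifying assumptions: `ToyTable`, `flush_threshold`, and the choice to clear the whole log on flush are hypothetical, and real commit log segments, memtables, and SSTables are far more involved.

```python
class ToyTable:
    """Toy write path: commit log -> memtable -> flush -> immutable SSTable."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only log, written before anything else
        self.memtable = {}     # in-memory writes, key -> latest value
        self.sstables = []     # flushed, immutable snapshots (oldest first)
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log the write for durability
        self.memtable[key] = value            # 2. apply it to the memtable
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.memtable:
            return
        # 3. and 4. persist a sorted, immutable snapshot of the memtable
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Simplification: flushed data no longer needs replay, so drop the log.
        self.commit_log.clear()

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newest SSTable wins, because older ones may hold stale values.
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None
```

Because SSTables are never modified in place, an update to an existing key simply lands in the memtable and, later, in a newer SSTable; the read path resolves the conflict by checking the newest data first.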
commit log
memtable
SSTables
SSTables are the immutable data files that Cassandra uses for persisting data on disk. SSTables are maintained per table.
- data.db - contents of rows
- partitions.db
- rows.db
- index.db
- summary.db
- filter.db - bloom filter
- CompressionInfo.db
- statistics.db
- digest.crc32
- TOC.txt
- SAI*.db
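As an illustration of what the bloom filter component (filter.db above) is for, here is a minimal sketch of a bloom filter: it can answer "definitely not in this SSTable" without touching the data file, at the cost of occasional false positives. The class name, sizes, and the use of SHA-256 are assumptions chosen for a self-contained example; they are not Cassandra's actual hashing scheme or file layout.

```python
import hashlib


class ToyBloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means the key was never added; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))
```

A read would consult the filter of each SSTable first and skip any SSTable whose filter returns False for the partition key.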
summit keynote
messaging summary
- each node starts a gossip round every second
- 1-3 peers per round
- 3 messages passed
- constant amount of network traffic
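The three-message exchange can be sketched as follows. This is a simplified model, assuming each node's state is just a map from endpoint to a version number; the function name is hypothetical, and the SYN/ACK/ACK2 labels are used here only to mark the three messages of one exchange.

```python
def gossip_exchange(initiator: dict, peer: dict) -> int:
    """Run one three-message exchange in place; return the messages sent."""
    # Message 1 (SYN): the initiator sends a digest of what it knows.
    syn = dict(initiator)

    # Message 2 (ACK): the peer replies with entries it holds that are newer,
    # plus the endpoints for which the initiator is ahead.
    ack_updates = {ep: v for ep, v in peer.items() if v > syn.get(ep, -1)}
    ack_requests = [ep for ep, v in syn.items() if v > peer.get(ep, -1)]

    # Message 3 (ACK2): the initiator sends the entries the peer asked for.
    ack2 = {ep: initiator[ep] for ep in ack_requests}

    initiator.update(ack_updates)
    peer.update(ack2)
    return 3


node_a = {"n1": 5, "n2": 1}
node_b = {"n2": 4, "n3": 2}
gossip_exchange(node_a, node_b)
print(node_a == node_b)  # both sides now hold the newest version of each entry
```

Since every node talks to only a handful of peers per round and each exchange is three messages, the traffic per node stays roughly constant regardless of cluster size.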
practical implications
- who is in the cluster?
- gossip with a seed on startup
- learn all peers
- gossip
- lather, rinse, repeat
- how are peers judged UP or DOWN
- what does UP/DOWN mean
- local to each node
- determined via heartbeat
- failure detector
- glorified heartbeat listener
- records timestamp when heartbeat update is received for each peer
- keeps backlog of timestamp intervals between updates
- periodically check all peers to make sure we’ve heard from them recently
- UP/DOWN affects
- stop sending writes (hints)
- sending reads
- gossip
- repair/stream sessions are terminated
- what does UP/DOWN mean
- when does a node stop sending a peer traffic
- when is one peer preferred over another
- dynamic snitch to rank all peers’ latency
- when does a node leave the cluster
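The failure detector described above (record a timestamp per heartbeat, keep a backlog of intervals, periodically check each peer) can be sketched roughly like this. It is a deliberately simplified stand-in: it compares the current silence against a multiple of the mean interval, whereas the real detector is more sophisticated. The class name, `threshold`, and `max_samples` are hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```python
from collections import defaultdict, deque


class ToyFailureDetector:
    """Local, per-node view of which peers look UP, based on heartbeats."""

    def __init__(self, max_samples=1000, threshold=8.0):
        self.threshold = threshold
        self.last_seen = {}  # peer -> timestamp of the latest heartbeat
        # Bounded backlog of intervals between heartbeat updates, per peer.
        self.intervals = defaultdict(lambda: deque(maxlen=max_samples))

    def report(self, peer, now):
        """Record that a heartbeat update for `peer` arrived at time `now`."""
        if peer in self.last_seen:
            self.intervals[peer].append(now - self.last_seen[peer])
        self.last_seen[peer] = now

    def is_up(self, peer, now):
        """Periodic check: have we heard from `peer` recently enough?"""
        samples = self.intervals.get(peer)
        if not samples:
            # Seen at most once: no interval history to judge against yet.
            return peer in self.last_seen
        mean = sum(samples) / len(samples)
        return (now - self.last_seen[peer]) <= self.threshold * mean
```

Each node would hold its own instance, which matches the point that UP/DOWN is a local judgement: two nodes can briefly disagree about the same peer.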
Query Structure
cql
misc
configuration
cassandra.yaml
source code
- to hack on the source code locally, we had to use ant, both to build the project and to generate the IntelliJ IDEA project files, by running
ant generate-idea-files
After this, IDEA can resolve the libraries.