tags : Message Passing, Design Patterns
FAQ
Streaming vs Processing
These terms are vague; on top of that, Redpanda, Pulsar, etc. mix the two together.
Event streaming
- Storage and movement of data in real time
Event processing
- Processing and connectivity of data, i.e. connecting source & destination
Eg. Kafka ecosystem
- Storage and movement: Kafka broker
- Connectivity: Kafka Connect (Flink can also do this)
- Processing: ksqlDB, Flink, Kafka Streams
What Models
Models for thinking about data at rest and data in motion in real-world systems.
State based
- Source of truth: A table that can be mutated
- This is the more traditional model.
- More like a snapshot in time
Events based
- Source of truth: Event log
- Storing your data as events
- Retains more data than a state based system
- Events are immutable (You can’t change the past)
- Event sourcing, Event Streaming, CQRS etc.
- Current view: the whole stream of events is required to derive the current position
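To make the contrast between the two models concrete, here is a minimal sketch in Python. The bank-account domain, the `Event` type, and the event names `deposited` / `withdrawn` are assumptions made up for illustration; they are not from any particular library.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)  # frozen: events are immutable once created
class Event:
    kind: str    # hypothetical event names: "deposited" or "withdrawn"
    amount: int


def apply_event(balance: int, event: Event) -> int:
    """Fold a single event into the running balance."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


# State based: the source of truth is one value that gets mutated in place.
state_balance = 0
state_balance += 100
state_balance -= 30   # the history of how we got here is gone

# Events based: the source of truth is the log; the current view is derived.
event_log = [Event("deposited", 100), Event("withdrawn", 30)]


def current_balance(log: list[Event]) -> int:
    """Derive the current position by reducing over the whole stream."""
    return reduce(apply_event, log, 0)


print(state_balance, current_balance(event_log))
```

Both models end at the same balance, but only the event log can also answer "what was the balance after the first event?" by reducing over a prefix of the stream.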
Advantage
- Immutability
- Recoverability: If we discover a bug in our system, then once the bug is fixed, repairing the data is simply a matter of replaying the event stream from a point before the bug took effect.
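The recoverability point can be sketched as follows. This is an illustrative example only: the sales/refund events, the reducers, and the `replay` helper are all hypothetical names. It assumes the derived state can be rebuilt purely from the log.

```python
def replay(log, reducer, initial=0, start=0):
    """Rebuild derived state by folding events from offset `start` onward."""
    state = initial
    for event in log[start:]:
        state = reducer(state, event)
    return state


def buggy_reducer(total, event):
    kind, amount = event
    return total + amount  # bug: refunds are added instead of subtracted


def fixed_reducer(total, event):
    kind, amount = event
    if kind == "sale":
        return total + amount
    return total - amount  # refunds now reduce the total


# The log itself is untouched by the bug; only the derived state was wrong.
log = [("sale", 50), ("refund", 20), ("sale", 10)]

corrupted = replay(log, buggy_reducer)   # what the buggy system computed
repaired = replay(log, fixed_reducer)    # full replay with the fix

# Replaying from a known-good point: the state after offset 0 was correct (50),
# so we can resume from offset 1 instead of starting over.
repaired_from_checkpoint = replay(log, fixed_reducer, initial=50, start=1)
```

A state-based system in the same situation would only hold the corrupted total, with nothing to recompute it from.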
Disadvantage
- Event log can now contain many different versions of the same schema
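To illustrate what "many different versions of the same schema" looks like in practice, here is a small sketch. It assumes each event carries a `version` field and that readers normalize old events on read (sometimes called upcasting). That is one possible approach, not something these notes prescribe, and the field names are made up.

```python
def upcast(event: dict) -> dict:
    """Convert any known schema version of a hypothetical user event to v2."""
    version = event.get("version", 1)
    if version == 1:
        # v1 stored a single "name" field; v2 splits it into two fields.
        first, _, last = event["name"].partition(" ")
        return {"version": 2, "first_name": first, "last_name": last}
    if version == 2:
        return event
    raise ValueError(f"unsupported schema version: {version}")


# Because events are immutable, old versions stay in the log forever.
log = [
    {"version": 1, "name": "Ada Lovelace"},
    {"version": 2, "first_name": "Alan", "last_name": "Turing"},
]

# Every reader has to cope with all versions that were ever written.
normalized = [upcast(e) for e in log]
```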
Event Sourcing with CQRS
- Command Query Responsibility Segregation
- At its heart is the notion that you can use a different model to update information than the model you use to read information.
- CQRS is not the only way to do this, but it is probably one of the most common.
Writing
We write to an append-only log as events happen.
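A minimal in-memory sketch of the write side, assuming a hypothetical `EventLog` class (a real system would use a durable log such as a Kafka topic). The point is the shape of the interface: there is an `append`, but no update or delete.

```python
class EventLog:
    """Append-only log: events can be added and read, never changed."""

    def __init__(self):
        self._entries = []

    def append(self, event) -> int:
        """Add an event to the end of the log and return its offset."""
        self._entries.append(event)
        return len(self._entries) - 1

    def read(self, start: int = 0) -> tuple:
        """Return events from `start` onward as an immutable tuple."""
        return tuple(self._entries[start:])

    def __len__(self) -> int:
        return len(self._entries)


log = EventLog()
first_offset = log.append({"type": "order_placed", "order_id": 1})
second_offset = log.append({"type": "order_shipped", "order_id": 1})
```

Offsets only ever grow, which is what lets readers resume from "where they left off" later.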
Reading / CQRS
- Usually, to read from an event log, you’ll need to do a chronological reduce over the data. This can take a lot of time, depending on the size of the data.
- Solution is to compute this at write time (async).
- Because of this, the system becomes Eventually consistent 🌟
- Reads might not be immediately available after a write.
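The read side can be sketched like this. It is a simplified, single-process illustration: the `Bank` class and its method names are hypothetical, and the "async" step is simulated by calling `project()` explicitly rather than running a real background worker.

```python
class Bank:
    def __init__(self):
        self.log = []         # write model: append-only event log
        self.balances = {}    # read model: precomputed current view
        self._projected = 0   # offset of the next event to project

    def deposit(self, account: str, amount: int) -> None:
        """Command: only appends to the log, does not touch the read model."""
        self.log.append((account, amount))

    def project(self) -> None:
        """Stand-in for the async projector: fold new events into the view."""
        for account, amount in self.log[self._projected:]:
            self.balances[account] = self.balances.get(account, 0) + amount
        self._projected = len(self.log)

    def balance(self, account: str) -> int:
        """Query: a cheap lookup, no reduce over the whole log."""
        return self.balances.get(account, 0)


bank = Bank()
bank.deposit("alice", 100)
stale = bank.balance("alice")   # projector has not run yet
bank.project()
fresh = bank.balance("alice")   # read model has caught up
```

The gap between `stale` and `fresh` is the eventual consistency window: the write has succeeded, but the read model has not seen it yet.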
CDC
See CDC (Change Data Capture)