Question

Where does the raw data reside, and how do we extract it?

There are three different modes:

  1. Databases
  2. Services
  3. Real-time transport

Databases

Process A writes data to a database, and process B reads from that same database.
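
As a minimal sketch of this mode, the snippet below uses Python's built-in sqlite3 module with an in-memory database. The table name `events`, its columns, and the helper names `write_events` and `read_events` are assumptions made for illustration; in practice the two processes would each open their own connection to a shared database server.

```python
import sqlite3


def write_events(conn, events):
    """Process A: write rows into the shared database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (user_id INTEGER, action TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?)", events)
    conn.commit()


def read_events(conn):
    """Process B: read the rows that process A wrote."""
    cursor = conn.execute(
        "SELECT user_id, action FROM events ORDER BY user_id"
    )
    return cursor.fetchall()


# One in-memory connection stands in for the shared database.
conn = sqlite3.connect(":memory:")
write_events(conn, [(1, "click"), (2, "view")])
rows = read_events(conn)
```

Note that process B never talks to process A here; the database is the only point of contact between them.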

Services

Process A requests data from process B using an API like REST or RPC.
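
A rough sketch of the request-driven mode, using only the Python standard library: process B is played by a small HTTP server running in a background thread, and process A sends it a GET request. The `/data` path, the payload, and the names `DataHandler` and `fetch_from_service` are hypothetical; a real service would run as its own process and would likely use a web framework.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DataHandler(BaseHTTPRequestHandler):
    """Process B: returns its data as JSON on any GET request."""

    def do_GET(self):
        body = json.dumps({"user_id": 1, "clicks": 42}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def fetch_from_service():
    """Process A: request data from process B over HTTP."""
    server = HTTPServer(("127.0.0.1", 0), DataHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/data")  # hypothetical endpoint
        payload = json.loads(conn.getresponse().read())
        conn.close()
        return payload
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_from_service()` returns the dictionary served by `DataHandler`. Process A only gets data when it asks for it, and it depends on process B being up at that moment.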

Real-time transport

Processes pass data to one another through a broker that keeps the data in in-memory storage.
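
The idea can be sketched with a thread-safe queue standing in for the broker. This is only an illustration under simplifying assumptions: `queue.Queue` lives inside a single Python process, whereas a real broker would be a separate system shared by many processes. The function name `run_pipeline` and the sentinel convention are hypothetical.

```python
import queue
import threading


def run_pipeline(events):
    """Pass events from a producer to a consumer through an in-memory broker."""
    broker = queue.Queue()  # stand-in for the in-memory data broker
    received = []
    stop = object()  # sentinel that tells the consumer to finish

    def producer():
        # Process A: publish each event, then signal the end of the stream.
        for event in events:
            broker.put(event)
        broker.put(stop)

    def consumer():
        # Process B: handle events as they arrive, without calling A directly.
        while True:
            event = broker.get()
            if event is stop:
                break
            received.append(event)

    threads = [
        threading.Thread(target=producer),
        threading.Thread(target=consumer),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

For example, `run_pipeline(["click", "view"])` hands both events to the consumer in the order they were published. The producer and consumer know only about the broker, not about each other.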


Comparisons

Type: Databases
  Architecture / use case:
    - Batch
  Pros:
    - Simple
  Cons:
    - Not all processes have access to the same databases (e.g. processes from different organizations)
    - Reading from and writing to the same database can be slow

Type: Services
  Architecture / use case:
    - Service-oriented architecture
    - Request-driven
    - Microservices
    - Batch(?)
  Pros:
    - Different processes from different organizations can access the data
    - Allows services to be decoupled
  Cons (as the number of services, and of dependencies between them, grows):
    - Inter-service data passing becomes the bottleneck, as the same data is sent redundantly to each requester
    - Cascading failures when services go down

Type: Real-time transport
  Architecture / use case:
    - Event-driven
    - Streaming
  Pros:
    - Addresses the shortcomings of services
  Cons:
    - More complex