- Data comes in self-contained documents, and relationships between documents are rare.
- Common document formats include JSON, XML, and BSON
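
As a rough illustration of a self-contained document, here is a minimal sketch in Python. The order, customer, and item fields are hypothetical and not taken from any particular database. Everything a reader of the order needs is embedded in the one document instead of being referenced from other documents.

```python
import json

# Hypothetical order document: the customer and the line items are embedded,
# so reading the order needs no lookups in other documents.
order = {
    "order_id": "A-1001",
    "customer": {"name": "Ada", "city": "London"},
    "items": [
        {"sku": "pen", "qty": 2},
        {"sku": "ink", "qty": 1},
    ],
}

encoded = json.dumps(order)    # serialize the document to JSON text
decoded = json.loads(encoded)  # parse it back into Python objects
```

The trade-off is duplication: if the same customer appears in many orders, their details are copied into each one.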
NoSQL implementation
- Designed for Big Data
- You have to model the data around the queries you intend to run
- As such, data tends to be denormalized and duplicated
- No joins! One table per query
- The primary key
  - has to be unique
  - is made up of one partition key (used to determine which node holds the data)
    - The partition key should ideally distribute rows evenly, so that work is spread across all nodes
    - This reminds me a bit of BigTable schema design
  - and zero or more clustering keys (which control the sort order of rows within a partition, ascending by default)
- The WHERE clause is a critical part of any NoSQL query
  - the partition key and clustering columns have to be referenced in the order in which they are defined in the primary key
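
The primary key rules above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions, not a real database client: `ToyTable`, its methods, the MD5-modulo node placement, and the sample album data are all hypothetical. It shows the partition key picking a node, the clustering keys keeping rows sorted inside a partition, and a query being rejected when it skips a key.

```python
import bisect
import hashlib


class ToyTable:
    """Toy model of a partitioned table (hypothetical, for illustration only)."""

    def __init__(self, partition_key, clustering_keys, num_nodes=3):
        self.partition_key = partition_key
        self.clustering_keys = list(clustering_keys)
        self.num_nodes = num_nodes
        # One dict per node: partition value -> sorted list of (clustering tuple, row).
        self.nodes = [dict() for _ in range(num_nodes)]

    def node_for(self, partition_value):
        # Assumption: hash the partition value and take it modulo the node count.
        digest = hashlib.md5(repr(partition_value).encode()).hexdigest()
        return int(digest, 16) % self.num_nodes

    def insert(self, row):
        pval = row[self.partition_key]
        ckey = tuple(row[c] for c in self.clustering_keys)
        partition = self.nodes[self.node_for(pval)].setdefault(pval, [])
        pos = bisect.bisect_left([k for k, _ in partition], ckey)
        if pos < len(partition) and partition[pos][0] == ckey:
            partition[pos] = (ckey, row)  # same primary key: overwrite, keys stay unique
        else:
            partition.insert(pos, (ckey, row))  # keep rows in ascending clustering order

    def select(self, **where):
        if self.partition_key not in where:
            raise ValueError("WHERE must include the partition key")
        unknown = set(where) - {self.partition_key} - set(self.clustering_keys)
        if unknown:
            raise ValueError(f"not part of the primary key: {sorted(unknown)}")
        # Clustering columns may only be restricted in order, with no gaps.
        given = [c in where for c in self.clustering_keys]
        n = 0
        while n < len(given) and given[n]:
            n += 1
        if any(given[n:]):
            raise ValueError("clustering columns must be restricted in order")
        pval = where[self.partition_key]
        prefix = tuple(where[c] for c in self.clustering_keys[:n])
        partition = self.nodes[self.node_for(pval)].get(pval, [])
        return [row for key, row in partition if key[:n] == prefix]


table = ToyTable("artist", ["year", "album"])
table.insert({"artist": "The Beatles", "year": 1970, "album": "Let It Be"})
table.insert({"artist": "The Beatles", "year": 1965, "album": "Rubber Soul"})
table.insert({"artist": "The Beatles", "year": 1965, "album": "Help!"})

# Rows come back sorted by (year, album): Help!, Rubber Soul, Let It Be.
albums = [row["album"] for row in table.select(artist="The Beatles")]

# Skipping "year" and filtering on "album" alone is rejected.
try:
    table.select(artist="The Beatles", album="Help!")
except ValueError as err:
    skipped_error = str(err)
```

Because a query can only reach into one partition and walk its clustering order, a different access pattern (say, albums by year across all artists) would need a second table keyed for that query, which is where the duplication comes from.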