NoSQL - ilya-khadykin/notes-outdated GitHub Wiki
NoSQL vs relational databases
NoSQL db | Relational dbs |
---|---|
more flexible for schema changes | SQL was designed to be a query language for relational databases |
many NoSQL dbs allow definition of fields on record creation | relation databases are usually table-based, almost like spreadsheets |
nested values are common in NoSQL databases | records stored in rows; columns represent fields in rows |
fields are not standardized between records | SQL queries within or between tables in relation database |
NoSQL databases types
document stores:
- documents are stored in structured format (XML, JSON etc);
- usually organized into "collections" or "databases";
- individual documents have unique structures;
- each document usually has a specific key;
- it is possible to query a document by fields;
key-value stores:
- you have a key you can query by, and the value at that key (you usually can't query by anything other that key)
- some key-value store let you define more than one key;
- sometimes used alongside relational databases for caching
BigTable/tabular:
- named after Google`s proprietary "BigTable" implementation;
- each row can have a different set of columns;
- designed for large number of columns;
- rows are typically versioned
graph databases:
- designed for data best represented as interconnected nodes (a series of road intersections);
object databases:
- tightly integrated with object oriented programming language used;
- act as a persistence layer: store objects directly;
- you can link objects directly through pointers
Popular NoSQL dbs
CouchDB
Document db written in Erlang
MongoDB
Document db which uses JavaScript
Notes:
- querying is not done over HTTP (in comparison with CouchDB)
- native drivers for each language
- does not support CouchDB-style views
- only master/slave replication: only master copies can write data
- consistent, partition-tolerant db
- all users always get the same data back from MongoDB
- documents are partitioned using sharding
- each partion will have a subset of the records
- shards are created based on key you choose (allows you customize how MongoDB partions the db)
structure and querying in MongoDB
- structure: database/collection/record
- JavaScript-based querying somewhat similar to SQL
- still has schema-free structure
- can define MapReduce functions
Cassandra
Originally developed by Facebook
Notes:
- querying not over HTTP
- native driver for each language
- cross between key/value store and tabular database
- available, partition-tolerant db:
- you should always be able to read from and write to Cassandra
- hardware nodes can be added with no downtime
- consistency can be adjusted, although this will affect the availability
structure and querying in Cassandra
- each key maps to one or more columns
- columns can be grouped into column families
- Cassandra Query Language (CQL) is similar to SQL
- CQL specifically designed for column groups and adjusted consistency
Riak
Document db written in Erlang
Notes:
- MapReduce functions can be written in Erlang as well as JavaScript
- designed primarily to work on Mac and Linux
- available, partition-tolerant db:
- you should always be able to read from and write to Riak
- hardware nodes can be added easily
structure and querying in Riak
- structure: bucket/key/value
- query syntax is the same as the Lucene full-text search engine
- can define MapReduce function
- key filters allow you to pick up records with keys matching certain criteria
Redis
key/value store
Notes:
- querying not over HTTP
- native drivers for each language
- designed primarily to work on Mac and Linux (does not have Windows support)
- master/slave replication
- consistent, partition-tolerant db:
- each user should always get the same data back from Redis
- writing directly to a slave is possible, but violates consistency
- data replicated to multiple slaves
structure and querying in Redis
- queries primarily by key
- specific values from hashes within records can be retrieved
- value does not have to be a string, unlike many key/value stores
- lists, sets, and hashes of strings
- lists are lists of strings
- hashes are further key/value pairs
- sets are non-repeating values