
Difference between Document-based and Key/Value-based databases?

I know there are three different, popular types of NoSQL databases.

Key/Value: Redis, Tokyo Cabinet, Memcached

ColumnFamily: Cassandra, HBase

Document: MongoDB, CouchDB

I have read long blog posts about them without understanding much.

I know relational databases and have gotten the hang of document-based databases like MongoDB/CouchDB.

Could someone tell me what the major differences are between these and the first two types on the list?

There are five: (1) Key-Value Stores: Oracle Coherence, Redis, Kyoto Cabinet; (2) BigTable-style Databases: Apache HBase, Apache Cassandra; (3) Document Databases: MongoDB, CouchDB; (4) Full Text Search Engines: Apache Lucene, Apache Solr; (5) Graph Databases: Neo4j, FlockDB. See nosql-data-modeling-techniques.

cs95

The main differences are the data model and the querying capabilities.

Key-value stores

The first type is very simple and probably doesn't need any further explanation.
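
For completeness, here is a minimal sketch of what "key-value" means in practice. It is an in-memory stand-in, not the API of Redis or any other real store; the class name `KeyValueStore` and the key `user:42` are made up for illustration. The point is that the store treats the value as an opaque blob and can only find it by its key.

```python
class KeyValueStore:
    """Hypothetical in-memory stand-in for a key-value store."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never looks inside the value; it is just bytes.
        self._data[key] = value

    def get(self, key: str):
        # Lookup is by exact key only; returns None if the key is absent.
        return self._data.get(key)


store = KeyValueStore()
store.put("user:42", b'{"name": "alice", "posts": 12}')

raw = store.get("user:42")      # the application must decode this itself
missing = store.get("user:99")  # no such key
```

Anything smarter than "give me the value for this key" has to be done by the application after it fetches and decodes the blob.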

Data model: more than key-value stores

There is some debate on the correct name for databases such as Cassandra, but I'd like to call them column-family stores. Key-value pairs are an essential part of Cassandra, but it's not limited to just that. It allows you to nest key-value pairs, so a key can refer to multiple sub-key-value pairs.

You cannot nest key-value pairs indefinitely though. You are limited to three levels (column families) or four levels of nesting (super-column families). In case the term column family doesn't ring a bell, see the WTF is a SuperColumn article, which gives a good explanation of Cassandra's data model.

Document databases, such as CouchDB and MongoDB, store entire documents in the form of JSON objects. You can think of these objects as nested key-value pairs. Unlike Cassandra, you can nest key-value pairs as much as you want. JSON also supports arrays and understands different data types, such as strings, numbers and boolean values.
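
To make the difference in shape concrete, here is a rough sketch using plain Python dicts. It only illustrates the two data models and does not use any real database API. The layout "column family name, then row key, then column name" is one possible reading of the three levels mentioned above, and all names and values (`Users`, `user:42`, and so on) are invented.

```python
import json

# Column-family shape: a fixed number of levels.
# column family -> row key -> column name -> value
column_families = {
    "Users": {
        "user:42": {"name": "alice", "city": "Oslo"},
    }
}

# Document shape: nesting can go as deep as needed, and the
# document can hold arrays and typed values (numbers, booleans).
document = {
    "_id": "user:42",
    "name": "alice",
    "active": True,
    "posts": 12,
    "addresses": [
        {"city": "Oslo", "geo": {"lat": 59.91, "lon": 10.75}},
    ],
}


def depth(value) -> int:
    """Count how many levels of dict/list nesting a value has."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0


# The document survives a round trip through JSON text unchanged.
round_tripped = json.loads(json.dumps(document))
```

In this sketch `depth(column_families)` is fixed by the model, while the document could grow more levels at any time without the database objecting.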

Querying

I believe column-family stores can only be queried by key, or by writing map-reduce functions. You cannot query the values like you would in an SQL database. If your application needs more complex queries, your application will have to create and maintain indexes in order to access the desired data.
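
As a rough sketch of what "create and maintain indexes" means for the application, the example below keeps a second lookup structure next to the primary data and updates both on every write. It uses plain Python dicts, not a real Cassandra or HBase client, and the names `rows`, `users_by_city`, `save_user` and `find_users_in_city` are hypothetical.

```python
from collections import defaultdict

rows = {}                          # primary storage: row key -> columns
users_by_city = defaultdict(set)   # index kept up to date by the application


def save_user(user_id: str, columns: dict) -> None:
    """Write a row and keep the city index consistent with it."""
    old = rows.get(user_id)
    if old is not None:
        # Remove the stale index entry before writing the new one.
        users_by_city[old["city"]].discard(user_id)
    rows[user_id] = columns
    users_by_city[columns["city"]].add(user_id)


def find_users_in_city(city: str) -> list:
    """Answer a by-value question using the index, not a scan."""
    return sorted(users_by_city.get(city, set()))


save_user("u1", {"name": "alice", "city": "Oslo"})
save_user("u2", {"name": "bob", "city": "Oslo"})
save_user("u1", {"name": "alice", "city": "Bergen"})  # alice moves

in_oslo = find_users_in_city("Oslo")
in_bergen = find_users_in_city("Bergen")
```

The cost is that every write path has to remember to update the index, which is exactly the bookkeeping an SQL database would do for you.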

Document databases support queries by key and map-reduce functions as well, but also allow you to do basic queries by value, such as "Give me all users with more than 10 posts". Document databases are more flexible in this way.
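
The "more than 10 posts" query can be sketched in a few lines. This is a toy filter over a Python list, not how a real document database executes queries; the `find` helper and the sample users are made up for illustration. It works only because the store understands the structure of each document and can read the `posts` field.

```python
users = [
    {"_id": "u1", "name": "alice", "posts": 12},
    {"_id": "u2", "name": "bob", "posts": 3},
    {"_id": "u3", "name": "carol", "posts": 27},
]


def find(collection, field, predicate):
    """Return documents that have `field` and whose value satisfies `predicate`."""
    return [doc for doc in collection if field in doc and predicate(doc[field])]


# "Give me all users with more than 10 posts"
prolific = find(users, "posts", lambda n: n > 10)
names = [doc["name"] for doc in prolific]
```

In MongoDB, for comparison, the same condition is written as the filter document `{"posts": {"$gt": 10}}`.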


So key-value stores like Redis don't allow you to store nested key-value pairs? And from your description, storing a whole database (from an RDBMS) in Cassandra doesn't sound very clever because it doesn't allow flexible queries and has limited nesting depth, am I right?
@ajsie: Correct, key-value stores don't support nested key-value pairs. Most of them do support specialized values though, such as lists. Cassandra is very different from an RDBMS, as the two are designed to solve very different problems. An RDBMS is aimed at relational data that needs complex querying, whereas Cassandra is aimed at processing enormous amounts of mostly non-relational data. Of course it's possible to move an RDBMS database to Cassandra, but it's indeed not very clever. Each of them has its own use.
So is every document database also a key-value store where the value is simply a JSON object like { value: base64(val) }?
@GroovyDotCom: Yes, you could use a document database to store simple key/value objects.
Ashraf Alam

Ayende has given a nice explanation of the difference between key-value and document databases:

A document database is, at its core, a key/value store with one major exception. Instead of just storing any blob in it, a document db requires that the data be stored in a format that the database can understand (i.e. JSON, XML etc). In most doc dbs, that means that we can now allow queries on the document data.