Introduction to MongoDB - kollektivesplagiieren/innovative-commercial-market GitHub Wiki

Introduction

Introduction to NoSQL

In a typical relational database, your data is stored in separate tables, often connected through primary-key-to-foreign-key relations. Your program later reconstructs the model with SQL statements that arrange the data into some kind of hierarchical object representation. Document-oriented databases handle data differently: instead of tables, they store hierarchical documents in standard formats such as JSON and XML.

Take a blog post with comments as an example. In a relational application, you'll use an object-relational mapping library or direct SQL statements to select the blog post record and its comment records, then combine them to create your blog post object. In a document-based database, however, the blog post is stored in full as a single document that can later be queried.
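The difference can be sketched in plain JavaScript. This is only an illustration: the `posts` and `comments` arrays stand in for relational tables, and the field names and the `buildPost` helper are hypothetical.

```javascript
// Relational style: the post and its comments live in separate "tables"
// (plain arrays here), linked by a foreign key (postId).
const posts = [{ id: 1, title: "Hello MongoDB" }];
const comments = [
  { id: 10, postId: 1, text: "Nice post" },
  { id: 11, postId: 1, text: "Thanks for sharing" },
];

// The application has to reassemble the object, much like a JOIN would.
function buildPost(postId) {
  const post = posts.find((p) => p.id === postId);
  return {
    ...post,
    comments: comments
      .filter((c) => c.postId === postId)
      .map((c) => ({ text: c.text })),
  };
}

// Document style: the same post is stored as one hierarchical document.
const postDocument = {
  id: 1,
  title: "Hello MongoDB",
  comments: [{ text: "Nice post" }, { text: "Thanks for sharing" }],
};

// Both approaches end up with the same object; only the storage differs.
console.log(JSON.stringify(buildPost(1)) === JSON.stringify(postDocument));
```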

Introduction to MongoDB

MongoDB's main goal was to create a new type of database that combined the robustness of a relational database with the fast throughput of distributed key-value data stores. With a scalable platform in mind, it had to support simple horizontal scaling while sustaining the durability of traditional databases. Another key design goal was to support web application development in the form of standard JSON outputs. These two design goals turned out to be MongoDB's greatest advantages over other solutions, as they aligned perfectly with other trends in web development, such as the almost ubiquitous use of cloud virtualization hosting and the shift towards horizontal, instead of vertical, scaling.

Key features of MongoDB

The BSON format

One of the greatest features of MongoDB is its JSON-like storage format named BSON. Standing for Binary JSON, the BSON format is a binary-encoded serialization of JSON-like documents, designed to be efficient in size and speed, which supports MongoDB's high read/write throughput. Like JSON, BSON documents are a simple data structure representation of objects and arrays in a key-value format. A document consists of a list of elements, each with a string-typed field name and a typed field value. These documents support all of the JSON-specific data types along with other data types, such as the Date type. Every document also uses the _id field as its primary key. The _id field value will usually be a unique identifier type, named ObjectId, that is generated either by the application driver or by the mongod service.
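To give a feel for what "binary-encoded" means, here is a minimal sketch of a BSON encoder in Node.js. It is an assumption-laden illustration, not the driver's implementation: it handles only string and 32-bit integer values, and the function names are made up for this example.

```javascript
// Field names are stored as C strings: UTF-8 bytes followed by a NUL byte.
function encodeCString(text) {
  return Buffer.concat([Buffer.from(text, "utf8"), Buffer.from([0])]);
}

// One element = type byte + field name + typed value.
function encodeElement(name, value) {
  if (typeof value === "string") {
    const body = encodeCString(value);
    const length = Buffer.alloc(4);
    length.writeInt32LE(body.length); // length counts the trailing NUL
    return Buffer.concat([Buffer.from([0x02]), encodeCString(name), length, body]);
  }
  if (Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31) {
    const body = Buffer.alloc(4);
    body.writeInt32LE(value);
    return Buffer.concat([Buffer.from([0x10]), encodeCString(name), body]);
  }
  throw new TypeError(`unsupported value for field "${name}" in this sketch`);
}

// A document = int32 total size + elements + a terminating NUL byte.
function encodeDocument(doc) {
  const elements = Buffer.concat(
    Object.entries(doc).map(([name, value]) => encodeElement(name, value))
  );
  const total = Buffer.alloc(4);
  total.writeInt32LE(4 + elements.length + 1);
  return Buffer.concat([total, elements, Buffer.from([0])]);
}

console.log(encodeDocument({ hello: "world" }).toString("hex"));
```

For `{ hello: "world" }` the sketch produces a 22-byte document. The explicit type bytes and length prefixes are what let a reader skip over fields without parsing text.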

MongoDB indexing

Indexes are a special data structure that enables the database engine to resolve queries efficiently. Without an index, when a query is sent to the database, the engine has to scan through the entire collection of documents to find those that match the query statement. This way, the database engine processes a large amount of unnecessary data, resulting in poor performance.

To avoid the full scan, the database engine can use a predefined index, which maps document field values to documents and can tell the engine which documents match the query statement.
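The idea can be illustrated with a small JavaScript sketch. It uses a `Map` as a stand-in for the index; MongoDB's real indexes are B-tree structures, and the `users` data and function names here are hypothetical.

```javascript
const users = [
  { _id: 1, city: "Berlin" },
  { _id: 2, city: "Paris" },
  { _id: 3, city: "Berlin" },
];

// Without an index: every document is examined.
function fullScan(docs, field, value) {
  let examined = 0;
  const matches = docs.filter((doc) => {
    examined += 1;
    return doc[field] === value;
  });
  return { matches, examined };
}

// Build the index once: field value -> positions of the documents holding it.
function buildIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, position) => {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(position);
  });
  return index;
}

// With an index: only the matching documents are touched.
function indexedFind(docs, index, value) {
  const positions = index.get(value) ?? [];
  return { matches: positions.map((p) => docs[p]), examined: positions.length };
}

const cityIndex = buildIndex(users, "city");
console.log(fullScan(users, "city", "Paris").examined);
console.log(indexedFind(users, cityIndex, "Paris").examined);
```

The trade-off is that the index has to be built and kept up to date on every write, which is why indexes are defined deliberately rather than on every field.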

MongoDB replica set

To provide data redundancy and improved availability, MongoDB uses an architecture called a replica set. Replicating a database helps protect your data against hardware failure and increases read capacity. A replica set is a set of MongoDB services that host the same dataset. One service is used as the primary and the other services are called secondaries. All of the set instances support read operations, but only the primary instance is in charge of write operations. When a write operation occurs, the primary informs the secondaries about the change and they apply it to their own copies of the dataset. Another robust feature of the MongoDB replica set is its automatic failover. When the other set members can't reach the primary instance for more than 10 seconds (the default timeout), the replica set automatically elects a secondary instance and promotes it to be the new primary. When the old primary comes back online, it rejoins the replica set as a secondary instance.
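The write path and failover described above can be modeled with a toy class. This is a deliberately simplified sketch: the `ToyReplicaSet` class and its methods are hypothetical, replication is synchronous, and "election" just picks the first reachable member, whereas a real replica set runs an election protocol among its members.

```javascript
class ToyReplicaSet {
  constructor(memberNames) {
    this.members = memberNames.map((name) => ({ name, up: true, data: [] }));
    this.primary = this.members[0];
  }

  // Only the primary accepts writes; reachable secondaries copy the change.
  write(doc) {
    if (!this.primary) throw new Error("no primary available");
    this.primary.data.push(doc);
    for (const member of this.members) {
      if (member !== this.primary && member.up) member.data.push(doc);
    }
  }

  // Simulate losing the primary: a reachable secondary is promoted.
  failPrimary() {
    if (this.primary) this.primary.up = false;
    this.primary = this.members.find((member) => member.up) ?? null;
  }

  // A recovered member rejoins as a secondary and catches up.
  recover(name) {
    const member = this.members.find((m) => m.name === name);
    member.up = true;
    if (this.primary === null) {
      this.primary = member;
    } else if (member !== this.primary) {
      member.data = [...this.primary.data];
    }
  }
}

const replicaSet = new ToyReplicaSet(["node1", "node2", "node3"]);
replicaSet.write({ title: "first" });
replicaSet.failPrimary();          // node1 goes down, node2 takes over
replicaSet.write({ title: "second" });
replicaSet.recover("node1");       // node1 returns as a secondary
console.log(replicaSet.primary.name);
```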

MongoDB sharding

There are two ways to scale a database: vertically and horizontally. Vertical scaling is easier and consists of increasing a single machine's resources, such as RAM and CPU, in order to handle the load. However, it has two major drawbacks. First, at some level, increasing a single machine's resources becomes disproportionately more expensive compared to splitting the load between several smaller machines. Second, the popular cloud-hosting providers limit the size of the machine instances you can use. So, scaling your application vertically can only be done up to a certain level.

Horizontal scaling is more complicated and is done using several machines. Each machine will handle a part of the load, providing better overall performance. The problem with horizontal database scaling is how to properly divide the data between different machines and how to manage the read/write operations between them.

Luckily, MongoDB supports horizontal scaling, which it refers to as sharding. Sharding is the process of splitting the data between different machines, or shards. Each shard holds a portion of the data and functions as a separate database. Together, the shards form a single logical database. Operations are performed through services called query routers (mongos), which ask the configuration servers how to delegate each operation to the right shard.
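As a rough illustration of routing, the sketch below distributes documents across three in-memory "shards" by hashing a shard key. Everything here is an assumption made for the example: the shard names, the `userId` shard key, and the simple modulo hash. A real query router consults metadata held by the configuration servers instead.

```javascript
const shards = { shardA: [], shardB: [], shardC: [] };
const shardNames = Object.keys(shards);

// Toy string hash; kept as an unsigned 32-bit integer.
function hashKey(value) {
  let hash = 0;
  for (const ch of String(value)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return hash;
}

// The "query router": the shard key value decides which shard is used.
function routeToShard(shardKeyValue) {
  return shardNames[hashKey(shardKeyValue) % shardNames.length];
}

function insert(doc) {
  shards[routeToShard(doc.userId)].push(doc);
}

// A query on the shard key touches a single shard only.
function findByUserId(userId) {
  return shards[routeToShard(userId)].filter((doc) => doc.userId === userId);
}

["alice", "bob", "carol", "dave"].forEach((userId) => insert({ userId }));
console.log(findByUserId("carol"));
```

Queries that do not include the shard key cannot be routed this way and would have to be sent to every shard, which is why the choice of shard key matters.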

Using mongosh

Connecting to MongoDB

To connect to MongoDB, you will need to use the MongoDB connection URI. The MongoDB connection URI is a string URL that tells the MongoDB drivers how to connect to the database instance. The MongoDB URI is usually constructed as follows:

mongodb://username:password@hostname:port/database

To connect to the local database instance, the MongoDB URI looks like this:

mongodb://localhost/<db-name>
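When the URI is assembled in code, the username and password should be percent-encoded so that characters such as `@` or `/` do not break the URI. The helper below is a minimal sketch; `buildMongoUri`, the `blog` database, and the example host and credentials are all hypothetical.

```javascript
// Builds mongodb://[username[:password]@]host[:port]/database
function buildMongoUri({ host = "localhost", port, database = "", username, password } = {}) {
  let credentials = "";
  if (username !== undefined) {
    credentials = encodeURIComponent(username);
    if (password !== undefined) credentials += ":" + encodeURIComponent(password);
    credentials += "@";
  }
  const hostPart = port !== undefined ? `${host}:${port}` : host;
  return `mongodb://${credentials}${hostPart}/${database}`;
}

// Local instance, no credentials.
console.log(buildMongoUri({ database: "blog" }));

// Remote instance; the special characters in the password are escaped.
console.log(
  buildMongoUri({
    username: "app",
    password: "p@ss/word",
    host: "db.example.com",
    port: 27017,
    database: "blog",
  })
);
```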

Basics

  • > use <db-name>: switch database

  • > show dbs: list all databases that contain data

  • > show collections: list all collections within the database

  • > db.<collection-name>.insert(): insert a document (deprecated in mongosh in favor of insertOne() and insertMany())

  • > db.<collection-name>.find(): list all documents

  • > db.<collection-name>.drop(): delete collection

  • > db.<collection-name>.update(...,...,{upsert: true}): update an existing document, or create a new one if no document matches

  • > db.<collection-name>.update(...,...,{multi: true}): update all the documents that match the selection criteria (without it, only the first match is updated)

  • > db.<collection-name>.save(): legacy helper that creates a new document if none exists with the given _id and updates the document if the _id exists

  • > db.<collection-name>.remove({}): remove all documents (won't delete the collection or its indexes); the empty query document is required

  • > db.<collection-name>.remove({ "property1": "value1" }, true): remove only one matching document
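The effect of the upsert and multi options listed above can be sketched with a toy in-memory function. This is not MongoDB code: `toyUpdate` is a hypothetical helper that only supports equality filters and treats the second argument as fields to set (similar in spirit to a `$set` update), and it does not generate an _id for upserted documents.

```javascript
// Equality-only filter: every filter field must equal the document's field.
function matchesFilter(doc, filter) {
  return Object.entries(filter).every(([key, value]) => doc[key] === value);
}

function toyUpdate(collection, filter, fields, { upsert = false, multi = false } = {}) {
  let modified = 0;
  for (const doc of collection) {
    if (!matchesFilter(doc, filter)) continue;
    Object.assign(doc, fields);
    modified += 1;
    if (!multi) break; // default: only the first match is updated
  }
  if (modified === 0 && upsert) {
    // Nothing matched: create a document from the filter plus the new fields.
    collection.push({ ...filter, ...fields });
    return { modified: 0, upserted: 1 };
  }
  return { modified, upserted: 0 };
}

const articles = [
  { slug: "a", status: "draft" },
  { slug: "b", status: "draft" },
];

console.log(toyUpdate(articles, { status: "draft" }, { status: "review" }));
console.log(toyUpdate(articles, { status: "draft" }, { status: "review" }, { multi: true }));
console.log(toyUpdate(articles, { slug: "c" }, { status: "draft" }, { upsert: true }));
```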

Query operators

  • > db.<collection-name>.find({"property1": { $in: ["value1", "value2"]}})

  • > db.<collection-name>.find({ "property1": "value1", "property2": { $gt: value } })

  • > db.<collection-name>.find( { $or: [{ "property1": "value1" }, {"property2": "value2"}] })
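To clarify what these three operators mean, the sketch below evaluates them against plain objects. It is a toy matcher written for this page, not MongoDB's query engine: `matchesQuery` and `matchesCondition` are hypothetical names, and only `$in`, `$gt`, and `$or` are handled.

```javascript
// A condition is either a plain value (equality) or an operator object.
function matchesCondition(value, condition) {
  const isOperatorObject =
    condition !== null &&
    typeof condition === "object" &&
    !Array.isArray(condition) &&
    Object.keys(condition).some((key) => key.startsWith("$"));
  if (!isOperatorObject) return value === condition;

  return Object.entries(condition).every(([operator, operand]) => {
    switch (operator) {
      case "$in": // value equals any element of the list
        return operand.includes(value);
      case "$gt": // value is strictly greater than the operand
        return value > operand;
      default:
        throw new Error(`operator ${operator} is not supported in this sketch`);
    }
  });
}

// Fields listed side by side are combined with AND; $or needs one branch to match.
function matchesQuery(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    if (key === "$or") return condition.some((branch) => matchesQuery(doc, branch));
    return matchesCondition(doc[key], condition);
  });
}

const products = [
  { name: "pen", color: "blue", price: 2 },
  { name: "book", color: "red", price: 15 },
  { name: "lamp", color: "green", price: 40 },
];

const find = (query) => products.filter((doc) => matchesQuery(doc, query)).map((doc) => doc.name);

console.log(find({ color: { $in: ["blue", "red"] } }));
console.log(find({ color: "red", price: { $gt: 10 } }));
console.log(find({ $or: [{ color: "blue" }, { name: "lamp" }] }));
```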
