Data modeling

Data in MongoDB has a flexible schema: by default, a collection does not enforce a structure on the documents it contains. The key factors in a good data model are therefore the structure of the documents and the relationships between the data.

To arrive at a good data model, your design should take the following points into account:

  • How the data will be used
  • How the data will be retrieved
  • What the relationships between the data are
  • How often the data changes
  • Use duplication when you have the choice
  • Performance of queries

Key challenges and factors in data modeling

The key factor in designing data models in MongoDB lies in the structure of the documents and the relationships between those documents. MongoDB provides two ways of dealing with relationships:

  • Embedded documents
  • References: manual or DBRefs

Denormalized models or Embedded documents

With MongoDB, you may embed related data in a single structure or document. These schemas are generally known as “denormalized” models, and take advantage of MongoDB’s rich documents. An example is an article that contains its own comments:

{
    name: <name>,
    content: <content>,
    exerpt: <exerpt>,
    views: <number>,
    creationDate: <date>,
    authorId: <ObjectId>,
    comments: [
    {
        username: <username>, 
        email: <email>,
        comment: <comment>
    },
    {
        username: <username>, 
        email: <email>,
        comment: <comment>
    }
    ]
}

Fetching Denormalized models

To query our data we can use the find method, and we can pass a projection as its second argument to get only the article data or only the comments:

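//All the documents, with every field
db.coll.find()
//Get only the article data (everything except the comments)
db.coll.find({}, {comments: 0})
//Get only the comments (the _id field is also returned unless it is excluded)
db.coll.find({}, {comments: 1})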
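
As a rough illustration of which fields survive the two projections above, here is a minimal in-memory sketch in plain JavaScript. The project helper and the sample article are hypothetical, and the helper only handles top-level fields; it is not how MongoDB implements projections.

```javascript
// Hypothetical helper: mimics a top-level inclusion/exclusion projection.
// Illustration only, not MongoDB's implementation.
function project(doc, spec) {
    const fields = Object.keys(spec).filter((f) => f !== "_id");
    // A projection that sets any field to 1 is an inclusion projection.
    const including = fields.some((f) => spec[f] === 1);
    const result = {};
    for (const key of Object.keys(doc)) {
        if (key === "_id") {
            // _id is kept unless it is explicitly excluded
            if (spec._id !== 0) result._id = doc._id;
        } else if (including ? spec[key] === 1 : spec[key] !== 0) {
            result[key] = doc[key];
        }
    }
    return result;
}

// Sample article (made-up values; the string _id stands in for an ObjectId)
const sampleArticle = {
    _id: "a1",
    name: "Data modeling",
    views: 10,
    comments: [{ username: "yami", comment: "Nice" }]
};

const articleOnly = project(sampleArticle, { comments: 0 });
const commentsOnly = project(sampleArticle, { comments: 1 });
console.log(Object.keys(articleOnly));  // _id, name, views
console.log(Object.keys(commentsOnly)); // _id, comments
```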

Normalized models or References

Normalized data models describe relationships using references between documents. A reference stores the relationship from one document to another, possibly in a different collection, using the _id field or a custom field.

Manual references

A manual reference stores the _id field of one document inside another, like a foreign key.

{
    name: <name>,
    content: <content>,
    exerpt: <exerpt>,
    views: <number>,
    creationDate: <date>,
    authorId: <ObjectId>
}
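
A manual reference needs a second lookup to load the referenced document. The sketch below shows that two-step idea in plain JavaScript, with an in-memory array standing in for the author collection. The sample data, the string ids (standing in for ObjectId values) and the findAuthorOf helper are assumptions made for illustration. In the shell, the second step would typically be a findOne on the author collection filtered by article.authorId.

```javascript
// In-memory stand-in for the author collection (hypothetical data).
const authors = [
    { _id: "u1", name: "yami", email: "yamicode@yamicode.com" },
    { _id: "u2", name: "other", email: "other@example.com" }
];

// Step 1: the article we already loaded; it only stores the author's _id.
const articleWithRef = { name: "Data modeling", views: 10, authorId: "u1" };

// Step 2: use the stored _id to fetch the author document.
function findAuthorOf(article, authorCollection) {
    const found = authorCollection.find((a) => a._id === article.authorId);
    return found === undefined ? null : found;
}

const resolvedAuthor = findAuthorOf(articleWithRef, authors);
console.log(resolvedAuthor.name); // yami
```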

DBRefs references

DBRefs are a convention for representing a document, rather than a specific reference type. They include the name of the collection, and in some cases the database name, in addition to the value from the _id field.

{
    name: <name>,
    content: <content>,
    exerpt: <exerpt>,
    views: <number>,
    creationDate: <date>,
    "authorId" : {
        "$ref" : "authorCollection",
        "$id" : ObjectId("..."), // or <custom_id>
        "$db" : "authorDatabase"
    }
}

The order of fields in the DBRef matters, and you must use the above sequence when using a DBRef.
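
Because the field order matters, one option is to build DBRef-shaped objects through a single helper so the sequence is always the same. The makeDbRef function below is a hypothetical plain JavaScript sketch, not part of any MongoDB driver, and the string id stands in for an ObjectId. It relies on JavaScript objects keeping string keys in insertion order.

```javascript
// Hypothetical helper that always emits the DBRef fields in the required
// order: $ref, then $id, then (optionally) $db.
function makeDbRef(collection, id, database) {
    const ref = { $ref: collection, $id: id };
    if (database !== undefined) {
        ref.$db = database;
    }
    return ref;
}

const authorRef = makeDbRef("authorCollection", "u1", "authorDatabase");
const sameDbRef = makeDbRef("authorCollection", "u1");
console.log(Object.keys(authorRef)); // $ref, $id, $db
console.log(Object.keys(sameDbRef)); // $ref, $id
```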

Example of data modeling and relationship

In our scenario, we imagine that we need to create a schema where we have articles, comments, tags and authors.

First step: structure of our data

The _id will be ignored in the first step because we will not specify the relationship between collections until the second step.

Article
{
    name: <name>,
    content: <content>,
    exerpt: <exerpt>,
    views: <number>,
    creationDate: <date>
}
Comment
{
    username: <username>, 
    email: <email>,
    comment: <comment>
}
Tag
{
    name: <name>
}
Author
{
    name: <name>, 
    email: <email>,
    encryptedPassword: <encryptedPassword>
}

Second step: Relationships

  • Each comment belongs to one article, so it makes sense for the comments to be embedded objects inside the article.
  • Because the user (the author) can change and we want to keep track of the real object, the article should contain a reference to the real user.
  • A tag can be added to multiple articles, but it does not matter whether it changes, because each article will contain its own tags. The best way is to keep a list of tags inside the article document.

So our final documents will look like this:

Author
{
    _id: <ObjectId>,
    name: <name>, 
    email: <email>,
    encryptedPassword: <encryptedPassword>
}
Article
{
    name: <name>,
    content: <content>,
    exerpt: <exerpt>,
    views: <number>,
    creationDate: <date>,
    authorId: <ObjectId>,
    comments: [
    {
        username: <username>, 
        email: <email>,
        comment: <comment>
    },
    {
        username: <username>, 
        email: <email>,
        comment: <comment>
    }
    ],
    tags: [
        {name: <name>},
        {name: <name>}
    ]
}
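
To make the final model concrete, the sketch below builds two articles in plain JavaScript and shows two operations that the embedded layout keeps simple: appending a comment and finding articles by tag. The sample values and the helpers addComment and findByTag are hypothetical. Against a real collection these would typically be an update using the $push operator and a query on the tags.name field.

```javascript
// In-memory stand-in for the article collection (made-up values).
const articles = [
    {
        name: "Data modeling",
        authorId: "u1", // stand-in for an ObjectId reference to the author
        comments: [{ username: "yami", email: "yamicode@yamicode.com", comment: "First" }],
        tags: [{ name: "mongodb" }, { name: "schema" }]
    },
    {
        name: "Indexes",
        authorId: "u1",
        comments: [],
        tags: [{ name: "mongodb" }]
    }
];

// Returns a copy of the article with one more embedded comment.
function addComment(article, comment) {
    return { ...article, comments: [...article.comments, comment] };
}

// Returns the articles whose embedded tag list contains the given name.
function findByTag(list, tagName) {
    return list.filter((a) => a.tags.some((t) => t.name === tagName));
}

const updated = addComment(articles[0], { username: "bob", email: "bob@example.com", comment: "Thanks" });
console.log(updated.comments.length);                          // 2
console.log(findByTag(articles, "schema").map((a) => a.name)); // only "Data modeling"
```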

One to One and One to Many relationships in MongoDB

One To One

In most cases a One to One relationship is best modeled with an embedded document, and you can retrieve just the embedded document using a projection. For example, a client with an address:

Embedded document

{
    name: "yami",
    email: "yamicode@yamicode.com",
    adrress: {
        country: "UK",
        city: "London",
        adrress: "....."
    }
}

References

//Client
{
    _id: ObjectId("5e44043adfa432477f122817"),
    name: "yami",
    email: "yamicode@yamicode.com"
}
//Address (points back to its client through clientId)
{
    clientId: ObjectId("5e44043adfa432477f122817"),
    country: "UK",
    city: "London",
    adrress: "....."
}
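
With the reference layout, reading a client together with its address takes two lookups instead of one, which is the main cost compared with embedding. A minimal plain JavaScript sketch of that read follows. The arrays stand in for the two collections, the ids are plain strings rather than ObjectId values, and the clientWithAddress helper is hypothetical.

```javascript
// In-memory stand-ins for the client and address collections.
const clients = [
    { _id: "5e44043adfa432477f122817", name: "yami", email: "yamicode@yamicode.com" }
];
const addresses = [
    { clientId: "5e44043adfa432477f122817", country: "UK", city: "London" }
];

// Loads the client first, then the address that points back to it.
function clientWithAddress(clientId) {
    const client = clients.find((c) => c._id === clientId);
    if (client === undefined) return null;
    const address = addresses.find((a) => a.clientId === clientId) || null;
    return { client, address };
}

const loaded = clientWithAddress("5e44043adfa432477f122817");
console.log(loaded.client.name, loaded.address.city); // yami London
```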

One to Many

In a One to Many (or Many to One) relationship we have two choices: an embedded document if the data only makes sense within that relationship, like comments inside an article, or references if the data must keep track of the real object as it changes. Here is an example of an article that contains multiple comments:

Embedded document

{
    name: <name>,
    content: <content>,
    exerpt: <exerpt>,
    views: <number>,
    creationDate: <date>,
    authorId: <ObjectId>,
    comments: [
    {
        username: <username>, 
        email: <email>,
        comment: <comment>
    },
    {
        username: <username>, 
        email: <email>,
        comment: <comment>
    }
    ]
}

References

//Comments
{
    _id: ObjectId("5e44043adfa432477f122817"),
    username: <username>, 
    email: <email>,
    comment: <comment>
},
{
    _id: ObjectId("5e44043adfa432477f122818"),
    username: <username>, 
    email: <email>,
    comment: <comment>
}
//Article
{
    name: <name>,
    content: <content>,
    exerpt: <exerpt>,
    views: <number>,
    creationDate: <date>,
    comments:[
        ObjectId("5e44043adfa432477f122817"),
        ObjectId("5e44043adfa432477f122818")
    ]
}
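
When the article only stores an array of comment ids, the comments have to be loaded in a second step. The sketch below shows this in plain JavaScript with an in-memory array standing in for the comment collection. The loadComments helper, the sample values and the third, unrelated comment are made up for illustration, and the ids are plain strings rather than ObjectId values. In the shell, the second step would typically be a find on the comment collection using the $in operator with the article's comments array.

```javascript
// In-memory stand-in for the comment collection (made-up values).
const commentCollection = [
    { _id: "5e44043adfa432477f122817", username: "yami", comment: "First" },
    { _id: "5e44043adfa432477f122818", username: "bob", comment: "Second" },
    { _id: "5e44043adfa432477f122819", username: "eve", comment: "Belongs to another article" }
];

// The article stores only the ids of its comments.
const articleDoc = {
    name: "Data modeling",
    comments: ["5e44043adfa432477f122817", "5e44043adfa432477f122818"]
};

// Resolves the referenced comments, keeping the order stored in the article.
function loadComments(article, collection) {
    const byId = new Map(collection.map((c) => [c._id, c]));
    return article.comments
        .filter((id) => byId.has(id)) // skip ids that no longer exist
        .map((id) => byId.get(id));
}

const articleComments = loadComments(articleDoc, commentCollection);
console.log(articleComments.map((c) => c.username)); // yami, bob
```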