§8.9.

Object-relational and object-document mapping

The shopping list project has, so far, just a single table or collection. It is not time-consuming to write code to read and write to the database.

However, as your database becomes more complex, the persistence layer is likely to become repetitive: the same kind of queries are used to retrieve records from the database.

For example, in a more complex shopping list application, we might add relations for users and stores. This extended application would require many similar select queries:

Retrieve a user

select id, full_name, email
from shopping_user
where email = 'ben@example.com'

Retrieve stores in the area

select id, company, address, city
from shopping_store
where city = 'Sydney'

Get the quantities for a shopping list item

select id, quantity
from shopping_item
where description = 'Pasta'

These queries have many commonalities:

  1. They list the columns, starting with the id

  2. They use the same keywords (select, from, where)

  3. They have a simple single-column query (where …​ = …​)

Many programmers notice these commonalities. They build modules that automatically construct such queries, hoping to reduce errors when accessing a database.

The idea of building such a library is so popular it has a name: object-relational mapping (ORM) in the case of SQL databases, and object-document mapping (ODM) in the case of document databases.

You could write your own ORM/ODM. However, this is likely to introduce errors and security risks. It is better to use an established and thoroughly tested ORM / ODM.

ORM, ODM, Sequelize and Mongoose

ORM and ODM libraries provide an automatic translation layer between an underlying database and the objects in a programming language. A good ORM or ODM library will replace all of (or at least the vast majority of) the code in your persistence layer.

These libraries can even be used as a replacement of your database schemas: the mapping tools can automatically generate the create table statements of your schema.

Two popular choices in Node.js development are Sequelize for SQL databases and Mongoose for MongoDB:

Sequelize

Mongoose

Technology:

Object-relational mapping

Object-document mapping

Good for use with:

PostgreSQL and other SQL databases

MongoDB

Installation:

npm install sequelize

npm install mongoose

Declaring a mapping:

const ShoppingItem =
  sequelize.define('shopping_item', {
    id: {
      type: Sequelize.INTEGER,
      autoIncrement: true,
      primaryKey: true
    },
    description: Sequelize.TEXT,
    quantity: Sequelize.NUMERIC
  });
const ShoppingItem = mongoose.model(
  'item',
  {
    description: {type: String, index: true},
    quantity: Number
  }
);

Find all records:

ShoppingItem.findAll();
ShoppingItem.find();

Find one record:

ShoppingItem.findOne(
  { where: {description: 'Pasta'} }
)
ShoppingItem.findOne(
  { description: {$eq: 'Pasta'} }
)

Inserting:

ShoppingItem.create(
  { description: 'Pasta', quantity: 4}
)
new ShoppingItem(
  { description: 'Pasta', quantity: 4}
).save()

Updating:

item.description = 'Chocolate bar';
item.save();
item.description = 'Chocolate bar';
item.save();

Deleting:

item.destroy();
item.delete()

The chapter’s chapter08_sql_orm and chapter08_docstore_odm projects contain examples of using ORM and ODM to read from a database.

Reflection: Raw drivers versus mapping

Compare ORM/ODM code to the direct use of a database client. Consider the following questions:

  1. Do you prefer the code that uses the database client directly or the code that uses an ORM/ODM? Why?

  2. Do you prefer Sequelize or Mongoose? Why?

  3. In what situations would ORM/ODM be preferred to using the database client directly?

  4. In what situations would the direct database client be preferred to an ORM/ODM?

Challenges of ORM/ODM

Using an ORM/ODM is almost as easy as having objects in memory. ORM/ODM hides most of the complexity of creating and saving those objects to a database.

If the ORM/ODM code is simple, developers can image there is no underlying database. The ORM/ODM hides all the details of persistence.

However, there are fundamental differences between objects in memory and records in a database:

Object in memory

Records in database

Lost during server restarts

Saved to disk and accessible after rebooting

Identified by reference (i.e., by memory location)

Identified by a primary key or ‘id’

Referenced directly by memory location

Referenced indirectly by foreign key

Stored in one place in memory on one computer

Can be loaded into memory multiple times and on multiple computers simultaneously

Accessed and updated directly by updating memory

Accessed and updated indirectly via a database client

Changes are instantaneous

Changes can be delayed, reversed or subject to advanced data management features such as transactions and versioning

These differences result in an ‘impedance mismatch’. The ORM/ODM layer imperfectly hides many of the details of persisting data to a database. In most code, the imperfect layering does not matter, but subtle problems arise when you build more complex systems:

  1. It can be challenging to create complex queries

  2. Performance can be poor

  3. It can be difficult to understand error messages generated by the ODM/ORM

  4. It may make easy problems easier but complex problems more complex

ORM/ODM is not necessarily a poor choice. ORM/ODM offers real advantages, not only in hiding the persistence layer but also in providing a consistent approach to common problems:

  • Data synchronization

  • Version-tracking

  • Supporting different underlying databases

Ultimately, the choice to use an ORM/ODM is a design trade-off. You should consider the decision in light of your design goals and the objective of creating a simple, clear, well-layered system.

Reflection: Object-orientation versus databases

Object-oriented programmers may be familiar with terms such as the following:

  1. Identity

  2. References to other objects

  3. Methods and message-passing

  4. Classes

  5. string, number, Array

  6. Inheritance

What are the equivalents of these concepts in database management systems?