12.7: Message queueing (Advanced Internet Programming)

Message queueing is the extension of a publish/subscribe architecture to a distributed system. Message queueing architectures physically decouple the publisher and subscriber (they may be running on separate servers) and transmit messages over the network. The name message queueing refers to asynchronous communication between publisher and subscriber. Because the subscriber may fail or operate at a different rate to the publisher, messages are be stored (or queued) until they are processed.

Problem

Many JavaScript functions do not return a value. The functions are called for their side-effects (i.e., what they do), rather than their results.

console.log is perhaps the most common example. In the team shopping list application of Chapter 10, the notificationService.notifyNewItem, notificationService.notifyDeletedItem and expenseService.createExpense functions did not return any value.

A single server runs these functions synchronously (i.e., one after the other). The calling code waits for the function to finish.

Even though the calling code could postpone execution to improve response times, synchronous execution is most common in computer programming due to its efficiency. There are no overheads associated with delayed execution, and reordering the order of execution does not offer compelling benefits in terms of throughput (the CPU must eventually do all the work irrespective of the order of execution).

Synchronous processing has a considerable cost in distributed systems. The caller spends a lot of time waiting: for the network to send the parameters, for the remote server to process the result, and for the network to send back a confirmation that the method has completed. If there is no result or the result is not needed, there is a lot of time wasted in waiting, for no benefit.

Solution

Message queuing replaces a remote message call with a message sent asynchronously over the network (i.e., without waiting for the results). As with publish/subscribe architectures, message queuing decouples the sender and receiver: topic names or subscriptions link the sender and receiver, rather than direct references.

Message-oriented middleware is a common name for the technology that manages the delivery of messages in a distributed system. ^[1] The middleware typically brings the following additional capabilities:

The middleware saves (or queues) messages when senders or receivers run at different speeds or handle temporary failures.
The middleware guarantees reliable message delivery. The middleware will retransmit messages if they are lost or sent to a failed server
The middleware resends messages if a subscriber fails. If the subscriber crashes during processing, the middleware can resend the message to a different subscriber (or the same subscriber after it reboots).
Some middleware cooperates with database transactions to ensure that messages are processed atomically. The middleware participates in the commit/rollback of database transactions. If the database transaction aborts, then the middleware will also roll back and resend the message.
The middleware supports messages delivery under various quality of service guarantees. Examples include at-least-once delivery, at-most-once delivery, exactly-once delivery, high-priority delivery, low-priority delivery, optional delivery, maximum queue sizes, timeouts.

Implementation

Message-oriented middleware products include NATS, RabbitMQ, ZeroMQ, Apache ActiveMQ, IBM MQ, ejabberd, Apache Kafka and DDS. Your Node.js code can communicate with the middleware using client packages from NPM.

You can also implement a simple message queue with a database table. To publish a message, insert data into a table. Subscribers query the table for new messages and remove the message once read.

In a basic implementation, a client publishes a message by inserting it into a table. The subscriber routinely queries the table, searching for new rows. ^[2]

PostgreSQL and MongoDB provide mechanisms for more advanced message queueing.

Message queues with PostgreSQL

In PostgreSQL, the NOTIFY and LISTEN commands can avoid the need for subscribers to frequently query a table.

A publisher can insert a message into a shopping_item_event table (for example) and then call NOTIFY new_shopping_item_event. The database scans its list of connections that have subscribed to the event using LISTEN new_shopping_item_event. The database generates an event for each subscriber, that notifies the subscriber that it should scan the shopping_item_event table to retrieve the latest message.

PostgreSQL also provides a SKIP LOCKED option in an SQL query. The query SELECT id FROM shopping_item_event FOR UPDATE SKIP LOCKED LIMIT 1 will return the first available unprocessed record. This feature allows at-most-once delivery when multiple subscribers read from a single message queue.

The PostgreSQL documentation has more information about NOTIFY, LISTEN and the SKIP LOCKED option of SELECT.

Message queues with MongoDB

In MongoDB, the findAndModify operation atomically finds an unprocessed record in a collection. The operation allows a retrieved record to be simultaneously deleted (or modified with a processing/processed status).

MongoDB does not include notify/listen functionality. However, tailable cursors provide a way to run a query where the results stay ‘open’ and update with new values. When publishers add new records to a collection, the tailable query results continue to accumulate new rows. A subscriber can use a tailable cursor to detect the publication of new messages to a collection.

The MongoDB documentation includes sample code for message queuing scenarios.

1. Middleware is a more general term describing technologies that connect servers on a distributed system. Because message queues are so popular in distributed systems, middleware is sometimes synonymous with message-oriented middleware.

2. This technique of repeatedly scanning the same data to detect new changes is known as polling. It is simple but inefficient.