Running Background Jobs With Express

In a recent project I needed to run background jobs from an Express application. I looked at RabbitMQ, Resque, and custom Redis setups, but each fell short of what I needed. My first attempt used the Redis BRPOP command, which blocks until a list has an item to return. The Express app would push a new item onto a list, and a custom daemon I wrote in Go would use BRPOP to grab the next item and do some processing before storing the result in our database. The catch was that the Go daemon could only talk back to the Express app by writing to the database, and we didn’t want to give the Express app a special endpoint just for this daemon and tie up the event loop for our customers.

After some research I came across the Agenda module, which lets you set up recurring or on-demand background jobs in an Express application. Here’s how we set it up.

How it works

Agenda connects to a MongoDB instance and stores the data and schedules for tasks in a collection. This way you can pick up where you left off if you ever need to reboot the server. Unlike the Redis-based solutions I tried, Agenda is a good fit for recurring tasks because it stores its schedule and loads it back up when you start the process. Redis-based queues aren’t bad though! Agenda simply addresses a different need, and for our use case, which was running on-demand, scheduled, and recurring background jobs, it worked just as well as something like Resque and then some.

With Agenda set up, your Express app runs as its own process and Agenda runs as a separate one. Using the Agenda package within an Express route, you store the task in Mongo, and then the separate Agenda process picks it up and runs it.

Use cases

In case you’re wondering why or for what you’d use this, here were some of our tasks:

  • Subscribing users to mailing lists on signup
  • Sending bulk emails through the Mandrill API on behalf of our users at set dates and times (part of an email marketing feature that let users email a list of addresses on a schedule)
  • Sending users a welcome email on signup
  • Looking up an insurance license (we didn’t need immediate feedback so we did an API call in the background)
  • Creating a Mandrill webhook and setting up subaccounts for our users

Some of these tasks ran once, immediately. Others, like our email marketing tasks, ran on a specific schedule.

The setup in your Node app

The Agenda docs have a great example of how to structure your Express application for Agenda to run in a separate process. For us, we had a lib folder where we’d add our private modules. That’s where we had a jobs.js file and a jobs/ directory which contained a file for each type of task (email, sms, user-setup).

The main job file (the main Agenda setup)

Our main jobs.js file imports all of the tasks, sets up Agenda, and exports the Agenda instance. The same file runs as a separate worker process and is required within the Express app to run and schedule tasks. Ours looked like this:

// Jobs
// ====
// The scheduled tasks setup
// and startup script.

var env       = process.env.NODE_ENV || 'development',
    _         = require('lodash'),
    config    = _.merge(require(__dirname + '/../config').global, require(__dirname + '/../config')[env]),
    Agenda    = require('agenda'),
    agenda    = new Agenda({db: {address: 'mongodb://localhost:27017/myApp', collection: 'tasks'}}),
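    jobTypes  = process.env.WORKERS ? process.env.WORKERS.split(',') : [],
    logger    = { fatal: console.error, error: console.error }, // Stand-in logger; swap in your app's own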
    db        = require(__dirname + '/mongo'), // Our internal MongoDB module to connect to our app's database
    knex      = require('knex')(require(__dirname + '/../../knexfile')[env]),
    bookshelf = require('bookshelf')(knex),
    models    = require(__dirname + '/../models')(bookshelf);


// Create MongoDB connection pool
db.connect(config.mongodb('myApp'), function(err) {
  if (err) {
    logger.fatal(err);
    process.exit(1);
  }

  // Start each job processor
  jobTypes.forEach(function(type) {
    require('./jobs/' + type)(agenda, db, models, config);
  });
});

if (jobTypes.length) {
  agenda.start();
}

// Handles graceful stopping of jobs
function graceful() {
  agenda.stop(function() {
    db.close(function(e) {
      if (e) logger.error(e);
      process.exit(0);
    });
  });
}

process.on('SIGTERM', graceful);
process.on('SIGINT' , graceful);

module.exports = agenda;

Running that file directly starts Agenda as a separate process. It reads the environment variable called WORKERS and requires the task file for each entry. So if you had a set of Agenda task definitions as modules in jobs/ called sms.js, email.js, and user-setup.js, then you’d start Agenda in production with NODE_ENV=production WORKERS=sms,email,user-setup node path/to/jobs.js. That starts Agenda and requires each job file (which contains the task definitions) so that the jobs are now defined and waiting to be called.
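The WORKERS handling in jobs.js is a plain split on commas, so a stray space or a trailing comma would turn into a bad require path. Here is a minimal sketch of a slightly more forgiving version. Both parseWorkers and jobModulePaths are hypothetical helper names for illustration, not part of Agenda or of our original file.

```javascript
// Hypothetical helper: turn a WORKERS string into a clean list of job types.
// Trims whitespace and drops empty entries such as a trailing comma.
function parseWorkers(value) {
  if (!value) return [];
  return value
    .split(',')
    .map(function (name) { return name.trim(); })
    .filter(function (name) { return name.length > 0; });
}

// Hypothetical helper: the module path jobs.js would require for each type.
function jobModulePaths(types) {
  return types.map(function (type) { return './jobs/' + type; });
}

var types = parseWorkers('sms, email,,user-setup');
console.log(types);
console.log(jobModulePaths(types));
```

In jobs.js you could then replace the inline split with parseWorkers(process.env.WORKERS) and leave the rest of the file as it is.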

The important thing to remember is that jobs.js plays two roles: it defines the tasks and starts the Agenda process, and it’s also the file you’ll require in your Express application to send new jobs to that process.

Defining jobs

Remember how I mentioned that we have a jobs.js file and a jobs/ directory? Now we’ll add at least one file to the jobs/ directory for each type of job we want to run. You could define all of your jobs in one file, but I found it useful to keep all email-related tasks in one file, SMS tasks in another, and so on. Each job file exports a function that takes the Agenda instance along with things like your database connection, models, and app configuration from your main application. That file would look like this:

// Email
// ==========
// Jobs that send emails at
// scheduled times.

var mandrillApi = require('mandrill-api/mandrill'),
    mandrill    = new mandrillApi.Mandrill(process.env.MANDRILL_KEY),
    _           = require('lodash'),
    Hashids     = require('hashids'),
    hashids     = new Hashids(process.env.ENCRYPTION_KEY);

module.exports = function(agenda, db, models, config) {
  // Registration email
  // ------------------
  // Sends a welcome email to a user on signup
  agenda.define('registration email', function(job, done) {
    mandrill.messages.sendTemplate({
      // Define your Mandrill settings in here
    }, function(result) {
      // Do something on success
      done();
    }, function(err) {
      // Log or do something else on error, then pass the error
      // to done() so Agenda records the job as failed
      done(err);
    });
  });

  // Reset email
  // -----------
  // Sends a password reset email to a user
  // when they request it.
  agenda.define('reset email', function(job, done) {
    // Define a task to send password reset emails here,
    // then call done() so Agenda knows the job has finished
    done();
  });

  // Drip email
  // ----------
  // Send a drip marketing email
  // to a user's mailing list.
  agenda.define('drip email', function(job, done) {
    mandrill.messages.sendTemplate({
      // More Mandrill API client stuff here
    }, function(result) {
      // Log or do something on success
      done();
    }, function(err) {
      // Log or do something else on error, then pass the error
      // to done() so Agenda records the job as failed
      done(err);
    });
  });
};

The file above defines a few tasks for sending emails in different scenarios. Two of them, the registration and reset emails, should run right away but don’t need to run in-process. For example, when a user first signs up, you don’t want the signup request to wait for the MailChimp or Mandrill API to respond. Instead you can stick that work in a background job, have the call return immediately, and move on to the rest of the logic in your signup process.

Other tasks are better suited to a schedule. Agenda shines here. Our Drip Email definition is the type of task that should be run at scheduled times.
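To make the difference between immediate, scheduled, and recurring work concrete, here is a rough sketch using Agenda’s now(), schedule(), and every() methods. The queueEmailJobs function, the listId field, the 'nightly cleanup' job name, and the fake agenda object are all made up for illustration. The fake only records calls so the example runs without MongoDB; in the real app you would pass the instance exported from jobs.js.

```javascript
// Sketch: queue three kinds of work through an Agenda-like object.
// `agenda` only needs to expose now(), schedule(), and every().
function queueEmailJobs(agenda, user, sendAt) {
  // Run once, as soon as a worker picks it up.
  agenda.now('registration email', { fname: user.fname, email: user.email });

  // Run once at a specific time. `sendAt` is a Date here; Agenda also
  // accepts human-readable strings such as 'tomorrow at noon'.
  agenda.schedule(sendAt, 'drip email', { listId: user.listId });

  // Run repeatedly on an interval. 'nightly cleanup' is a hypothetical,
  // app-wide job name that is not defined elsewhere in this post.
  agenda.every('1 day', 'nightly cleanup');
}

// Stand-in for the real Agenda instance: records each call it receives.
var calls = [];
var fakeAgenda = {
  now: function (name, data) { calls.push({ method: 'now', name: name, data: data }); },
  schedule: function (when, name, data) { calls.push({ method: 'schedule', when: when, name: name, data: data }); },
  every: function (interval, name, data) { calls.push({ method: 'every', interval: interval, name: name, data: data }); }
};

var sendAt = new Date(Date.now() + 24 * 60 * 60 * 1000); // 24 hours from now
queueEmailJobs(fakeAgenda, { fname: 'Ada', email: 'ada@example.com', listId: 42 }, sendAt);
console.log(calls.map(function (c) { return c.method + ': ' + c.name; }));
```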

Running tasks from within Express

Now that you’ve defined and set up your tasks you just need to run them. This is how you do this within an Express route:

var agenda = require('./path/to/jobs.js');

app.post('/signup', function(req, res, next) {
  // Do user registration stuff here (this is where `User` gets created)
  agenda.now('registration email', { fname: User.get('fname'), email: User.get('email') });

  res.redirect('/dashboard/');
});

The above is a condensed version of what you’d do in your app. You require the main jobs.js file, which exports an instance of Agenda, and then call one of Agenda’s methods (which you can read about in their docs) to start a job now or in the future. You can also pass an object to Agenda so that when your task runs it can use the data that was available when the request was made.
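On the worker side, the object you passed shows up on the job as job.attrs.data. Below is a small sketch of how the 'registration email' handler might read it. buildWelcomeMessage is a hypothetical helper, the subject line is made up, and fakeJob stands in for the job object Agenda hands to your handler.

```javascript
// Hypothetical helper: build the message for the 'registration email' job
// from the data that was passed to agenda.now() in the route.
function buildWelcomeMessage(job) {
  var data = job.attrs.data || {};
  if (!data.email) {
    throw new Error('registration email job is missing an email address');
  }
  return {
    to: data.email,
    subject: 'Welcome, ' + (data.fname || 'there') + '!'
  };
}

// Stand-in for the job object Agenda passes to a handler.
var fakeJob = { attrs: { data: { fname: 'Ada', email: 'ada@example.com' } } };
console.log(buildWelcomeMessage(fakeJob));
```

Inside agenda.define('registration email', ...) you would call something like this first and then hand the result to the Mandrill client.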

Running it

I use PM2 to deploy Node apps. My ecosystem.json5 file contains two apps: the main Express app and the Agenda script. You’ll run these as separate Node processes. In development I have two terminal tabs open. In one I run my Express app with node server.js, and in the other I start the Agenda process with WORKERS=my,worker,list node lib/jobs.js.

So that’s how to set up Agenda in a modular way within your Express app.
