ActiveJobStatus

ActiveJobStatus is a gem I recently wrote
([with help from others](https://github.com/cdale77/active_job_status/graphs/contributors))
that provides simple job status information and batching for
background jobs running with Rails’ ActiveJob abstraction. In this post I’ll go
through the rationale, usage, and creation of the gem.

Why did you do this?

I was working on an application that did a lot of
background job processing. One day I needed to notify the user
of the completion of a data import job. To do this I needed to ask my job
processing engine (Sidekiq) if a given job was complete or not. I was using the
new (at the time) ActiveJob abstraction in Rails, but found that ActiveJob did
not provide a way to get the Sidekiq job id, so I couldn’t actually ask Sidekiq
if the job was done or not (this has since been fixed).

While looking into this I discovered that ActiveJob provides some nice
lifecycle-based callbacks. The callbacks, combined with a simple key-value store,
could provide all the functionality I needed, would be straightforward to implement,
and would be usable regardless of the job processing back-end. So I decided to
write my own gem to keep track of the status of ActiveJob jobs.

How can I use it?

More detailed instructions are available in the Readme in the project’s
repository. Here’s a quick guide to getting up and running.

First, tell ActiveJobStatus about your memory store. By default, use Rails’
built-in memory store:

    # config/initializers/active_job_status.rb
    ActiveJobStatus.store = ActiveSupport::Cache::MemoryStore.new

If you are using Resque or Sidekiq, or already have Redis in your stack for
another reason, it’s a good idea to tell ActiveJobStatus to use Redis for
storing job metadata. To do so, you’ll first need to configure
ActiveSupport::Cache to use Redis for its store
(perhaps by using this gem). Then
use the following initializer to tell ActiveJobStatus to use the proper store.
ActiveJobStatus will detect Redis and use some nice optimizations. You can also
use the Readthis gem to handle Redis.

    # config/initializers/active_job_status.rb
    ActiveJobStatus.store = ActiveSupport::Cache::RedisStore.new
    # or if you are using https://github.com/sorentwo/readthis
    ActiveJobStatus.store = ActiveSupport::Cache::ReadthisStore.new

Have your jobs inherit from ActiveJobStatus::TrackableJob instead of ActiveJob::Base:

    class MyJob < ActiveJobStatus::TrackableJob
    end

That’s about it in terms of setting things up. You can get useful
information out of the gem by looking up job status and batches.

Job Status

Check the status of a job using the ActiveJob job_id. The status of a job is only
available for 72 hours after the job is queued. For now, you can’t change that.

    my_job = MyJob.perform_later
    ActiveJobStatus::JobStatus.get_status(job_id: my_job.job_id)
    # => :queued, :working, :complete

Job Batches

For job batches you can use any key you want (for example, you might use a
primary key or UUID from your database). If another batch with the same key
exists, its jobs will be overwritten with the supplied list.

    my_key = "230923asdlkj230923"
    my_jobs = [my_first_job.job_id, my_second_job.job_id]
    my_batch = ActiveJobStatus::JobBatch.new(batch_id: my_key, job_ids: my_jobs)

Batches expire after 72 hours (259200 seconds).
You can change that by passing the initializer an integer value (in seconds).

    my_key = "230923asdlkj230923"
    my_jobs = [my_first_job.job_id, my_second_job.job_id]
    my_batch = ActiveJobStatus::JobBatch.new(batch_id: my_key,
                                             job_ids: my_jobs,
                                             expire_in: 500000)
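
Since the expiration is given in seconds, it can help to compute the value rather than hard-code it. Here is a small plain-Ruby sketch of the arithmetic. The `hours_to_seconds` helper is a hypothetical name for illustration and is not part of the gem.

```ruby
# Hypothetical helper (not part of ActiveJobStatus): convert hours to seconds.
SECONDS_PER_HOUR = 60 * 60

def hours_to_seconds(hours)
  hours * SECONDS_PER_HOUR
end

hours_to_seconds(72)      # => 259200, the default batch expiration
hours_to_seconds(24 * 7)  # => 604800, one week
```

The result can then be passed as the `expire_in:` value shown above.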

You can easily add jobs to the batch:

    new_jobs = [some_new_job.job_id, another_new_job.job_id]
    my_batch.add_jobs(job_ids: new_jobs)

And you can ask the batch if all the jobs are completed or not:

    my_batch.completed?
    # => true, false

You can ask the batch for other bits of information:

    my_batch.batch_id
    # => "230923asdlkj230923"
    my_batch.job_ids
    # => ["b67af7a0-3ed2-4661-a2d5-ff6b6a254886", "6c0216b9-ea0c-4ee9-a3b2-501faa919a66"]

You can also search for batches:

    ActiveJobStatus::JobBatch.find(batch_id: my_key)

This method will return nil if no associated job ids can be found; otherwise it will
return an ActiveJobStatus::JobBatch object.

What does it actually do?

The implementation is fairly simple. ActiveJob’s
[lifecycle callbacks](http://edgeapi.rubyonrails.org/classes/ActiveJob/Callbacks/ClassMethods.html),
combined with a key-value store like Redis, create a simple way to keep track of
what jobs are queued, running, and (recently) completed.

The details:

The ActiveJobStatus::JobTracker module
provides three simple methods: ::enqueue, ::update, and ::remove. These
methods write, update, or remove data from a key-value store, where the key is
an arbitrary job id provided by the user, and the value is the current status of
the job. ActiveJobStatus::TrackableJob
then uses lifecycle callbacks provided by ActiveJob to call the JobTracker
methods at the right point in the job lifecycle. ActiveJobStatus::JobStatus
exposes a single ::get_status method that queries the key-value store.
This is all we need to keep track of job status.
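
To make the pattern concrete, here is a minimal plain-Ruby sketch of the same idea. It is not the gem’s source: a Hash stands in for the key-value store, the `Sketch*` names are hypothetical, and the `enqueue` and `run` methods play the role that ActiveJob’s lifecycle callbacks play in a real job. Marking a finished job as `:complete` through `update` is an assumption made for illustration.

```ruby
require "securerandom"

# Hypothetical stand-in for the key-value store (a Hash instead of Redis).
STORE = {}

# Illustrative tracker: writes, updates, or removes a job's status.
module SketchJobTracker
  def self.enqueue(job_id:)
    STORE[job_id] = :queued
  end

  def self.update(job_id:, status:)
    STORE[job_id] = status
  end

  def self.remove(job_id:)
    STORE.delete(job_id)
  end
end

# Illustrative base class: calls the tracker at each lifecycle step,
# the way before/after callbacks would in an ActiveJob job.
class SketchTrackableJob
  attr_reader :job_id

  def initialize
    @job_id = SecureRandom.uuid
  end

  def enqueue
    SketchJobTracker.enqueue(job_id: job_id)
    self
  end

  def run
    SketchJobTracker.update(job_id: job_id, status: :working)
    perform
    # Assumption for this sketch: a finished job is marked :complete.
    SketchJobTracker.update(job_id: job_id, status: :complete)
  end

  def perform
    # Subclasses would do the real work here.
  end
end

# Illustrative status lookup: a single read from the store.
module SketchJobStatus
  def self.get_status(job_id:)
    STORE[job_id]
  end
end

job = SketchTrackableJob.new.enqueue
SketchJobStatus.get_status(job_id: job.job_id)  # => :queued
job.run
SketchJobStatus.get_status(job_id: job.job_id)  # => :complete
```

Because the tracker only talks to the store, nothing here depends on which queue back-end actually runs the job.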

Job batches are an added bit of functionality that is a wrapper
around a key-value store. With ActiveJobStatus::JobBatch, a batch is a single
record in a store, where the key is an arbitrary user-supplied ID, and the value
is an array of TrackableJob ids. Using JobStatus we can then keep track
of whether all the jobs in a batch are complete or not, and provide a couple of
other nice semantics. The code is
[here](https://github.com/cdale77/active_job_status/blob/master/lib/active_job_status/job_batch.rb)
if you are interested.
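
As a rough illustration of the batch idea, here is a plain-Ruby sketch that is separate from the real implementation linked above. Two Hashes stand in for the key-value store, and `SketchJobBatch`, `STATUS_STORE`, and `BATCH_STORE` are hypothetical names. Treating a batch as complete only when every job’s status is `:complete` is an assumption for this sketch.

```ruby
# Hypothetical in-memory stand-ins for the key-value store.
STATUS_STORE = {}  # job id   => status symbol
BATCH_STORE  = {}  # batch id => array of job ids

class SketchJobBatch
  attr_reader :batch_id

  def initialize(batch_id:, job_ids:)
    @batch_id = batch_id
    # A batch is a single record; an existing record with this key is overwritten.
    BATCH_STORE[batch_id] = job_ids.dup
  end

  def job_ids
    BATCH_STORE.fetch(batch_id, [])
  end

  def add_jobs(job_ids:)
    BATCH_STORE[batch_id] = (self.job_ids + job_ids).uniq
  end

  # Assumption: complete means every job in the batch reports :complete.
  # Note that an empty batch counts as complete under this definition.
  def completed?
    job_ids.all? { |id| STATUS_STORE[id] == :complete }
  end
end

STATUS_STORE["job-1"] = :complete
STATUS_STORE["job-2"] = :working
batch = SketchJobBatch.new(batch_id: "import-42", job_ids: ["job-1", "job-2"])
batch.completed?  # => false
STATUS_STORE["job-2"] = :complete
batch.completed?  # => true
```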

What could it do?

With ActionCable providing websocket functionality for Rails, it would be
awesome to hook up ActiveJobStatus to ActionCable to provide real-time
notifications of job status to users. This type of functionality would be
perfect for an application that does a lot of background job work, where the
user needs to know the status of a job. For example, when importing a large data
file, the user could be notified via a push notification when the task is
done.

Other ways the gem could be expanded include making the 72-hour expiration limit
a bit more flexible, and I would like to update the gem to work seamlessly with
the new ApplicationJob abstraction in Rails. Pull requests are welcome!

I hope you find this gem useful, and that this post inspires you to build things!