Brandon Rice

Software development with a focus on web technologies.

One Rails App With Many Databases

| Comments

A traditional Rails application uses a single database. Programmers interact with it by writing models that inherit from ActiveRecord::Base. As the application grows, it may be useful to connect to different databases for a variety of reasons. One database might be dedicated to reports. Another may be the result of an entirely different process, and now the Rails application wants to read from it. Using multiple databases helps a Rails application scale, and may be a more manageable first step toward an architecture based on microservices.

Rails needs two things in order to back specific ActiveRecord models from different databases: A connection configuration and an establish_connection directive. First, the configuration.

config/database.yml
1
2
3
4
5
6
7
8
9
10
11
default: &default
  adapter: postgresql
  encoding: unicode
  pool: 5
  host: localhost
  username: postgres
  password: postgres

reporting_db:
  <<: *default
  database: reporting_db

If the new database has different connection or authentication options, make those additions.

Next, instruct Rails to use a different database for a particular model.

app/models/report_user.rb
1
2
3
class ReportUser < ActiveRecord::Base
  establish_connection("reporting_db")
end

When the ReportUser class is loaded, Rails creates an additional connection pool for the new database. All reads and writes involving this model now use the new database.

Those are the basics, but there’s a few more things to think about when working with multiple databases in the same Rails app.

Migrations

The ReportUser model works great if a report_users table already exists in the new database, but what about creating one from scratch? Generated migrations need a little tweaking because the default database is the assumed target.

db/migrate/201604091557_create_report_users.rb
1
2
3
4
5
6
7
8
9
10
11
12
13
class CreateReportUsers < ActiveRecord::Migration
  def change
    ActiveRecord::Base.establish_connection("reporting_db")

    create_table :report_users do |t|
      t.string :first_name
      t.string :last_name
      # ... etc.
    end

    ActiveRecord::Base.establish_connection(Rails.env)
  end
end

This works, but there should be an easy way to create the database before running migrations.

lib/tasks/reporting.rake
1
2
3
4
5
6
7
8
9
10
11
12
namespace :reporting do
  namespace :db do
    desc 'Create the reporting database'
    task create: :environment do
      config = ActiveRecord::Base.configurations['reporting_db']

      # Database is null because it hasn't been created yet.
      ActiveRecord::Base.establish_connection(config.merge('database' => nil))
      ActiveRecord::Base.connection.create_database(config['database'], config)
    end
  end
end

Now we’re getting somewhere, but what about using this database for several additional models?

Just One Connection Pool

Imagine creating two more reporting models, ReportOrder and ReportProduct. They look identical to ReportUser, each with a call to establish_connection. The problem here is that each class creates its own independent connection pool, and each pool has some number of individual TCP connections to the database server. Maybe this doesn’t matter for three models, but what about ten? I previously wrote about the dangers of failing to care about TCP connections. Let’s refactor before this has an opportunity to become a problem.

app/models/reporting/base.rb
1
2
3
4
5
6
7
module Reporting
  class Base < ActiveRecord::Base
    self.abstract_class = true

    establish_connection('reporting_db')
  end
end
app/models/reporting/user.rb
1
2
3
4
module Reporting
  class User < Reporting::Base
  end
end
app/models/reporting/order.rb
1
2
3
4
module Reporting
  class Order < Reporting::Base
  end
end

All subclasses of Reporting::Base now share a single connection pool. This is the same way that ActiveRecord::Base creates a connection pool used by its other subclasses. The abstract_class assignment in the Reporting::Base model means child classes look for database tables using expected Rails-isms (i.e. reporting_users, reporting_orders) instead of following single table inheritance rules.

Architectural Thinking

We’ve nicely namespaced all of the reporting models. This convention can extend to include namespacing of related controllers and views. Good separation of concerns suggests that it makes sense to isolate the reporting concept. In a world where microservices are trendy, this might be the moment when someone suggests making a reporting service. That’s a heavy investment, but there is a reasonable compromise that still accomplishes many of the same design goals: A Rails engine.

An isolated Rails engine with its own database is basically a lightweight service. Generate an engine inside lib/reporting and relocate everything in the existing Reporting namespace into the engine. Make sure the engine is isolated.

lib/reporting/lib/reporting/engine.rb
1
2
3
4
5
module Reporting
  class Engine < ::Rails::Engine
    isolate_namespace Reporting
  end
end

It’s normal when using a Rails engine to copy the engine migrations into the enclosing application using rake reporting:install:migrations. This step is unnecessary when the engine has its own database, and is actually detrimental to the separation of concerns. Instead, add a few helper tasks alongside the earlier one for creating the database.

lib/tasks/reporting.rake
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
namespace :reporting do
  namespace :db do
    migration_path = Rails.root.to_s + '/lib/reporting/db/migrate'

    desc 'Migrate the Reporting database'
    task migrate: :environment do
      ActiveRecord::Base.establish_connection("reporting_db")
      ActiveRecord::Migrator.migrate(migration_path)
    end

    desc 'Rollback the Reporting database'
    task rollback: :environment do
      ActiveRecord::Base.establish_connection("reporting_db")
      ActiveRecord::Migrator.rollback(migration_path)
    end

    # Additional tasks for db:drop, db:seed, db:schema:load
  end
end

Treat the reporting engine as a different project. Develop it separately. Consider moving the code into its own repository and pulling it in as a gem. Strictly adhere to the engine’s isolation by keeping constants from unnecessarily bleeding across module boundaries.

1
2
3
4
5
6
7
8
9
10
11
12
13
class User < ActiveRecord::Base
end

module Reporting
  # Referring to ::User here is adding a dependency on the enclosing
  # application from the engine.

  class Order < Reporting::Base
  end
end

# Referring to Reporting::Order here is adding a dependency on the engine
# from the enclosing application.

Adding the above dependencies couples the engine to the application and vice versa. This is not always bad, but each additional dependency should be an explicit and careful choice.

If and when you decide to take the plunge on a reporting service, the engine is ready to convert into a standalone Rails application. In the meantime, repeat this pattern to grow an existing Rails app using multiple databases in a modularized, scalable manner.

If you enjoyed this post, please consider subscribing.

Comments