Skip to main content

One post tagged with "Active Record"

View All Tags

Fixing an ActiveRecord join constraint bug

ยท 4 min read
Garrett Blehm

This is a more technical dive into an issue I fixed in Rails. If you want a less technical/more personal blog, check out that blog.

Background:โ€‹

All code examples and links are from rails Rails 7.2.1. This was resolved in Rails 7.2.2.

Glossary:

  • Association - A high level declaration of how two models are related in Rails.
  • Reflection - A detailed representation of how an association is generated in Rails.
  • Reflection Chain - A list of reflections that connect two models. Used by Rails to generate SQL joins.

ActiveRecord builds a reflection chain that connects the parent model to it's joined association.

Given the following class:

class Parent < ActiveRecord::Base
has_one :child
has_one :grandchild, through: :child
end

Parent.joins(:child) has a reflection chain of Parent -> Child and Parent.joins(:grandchild) has a reflection chain of Parent -> Child -> Grandchild.

Problem:โ€‹

.left_outer_joins() resulted in incorrect SQL joins when a child association shared the same parent association but not entire association ancestry.

I found this problem while working on our internal record filtering gem. A simplified overview of our set up is below.

class PurchaseOrder < ActiveRecord::Base
belongs_to :created_by, class_name: "User"
has_one :created_by_contact, class_name: "Contact", through: :created_by, source: :scoped_contact

belongs_to :approved_by, class_name: "User"
has_one :approved_by_contact, class_name: "Contact", through: :approved_by, source: :scoped_contact
end
class User < ActiveRecord::Base
has_many :account_memberships, inverse_of: :user, dependent: :destroy
has_one :scoped_contact, class_name: "Contact", through: :scoped_account_membership, source: :contact
end
class AccountMembership < ActiveRecord::Base
belongs_to :contact
end

When we perform an inner join we get:

-- PurchaseOrder.joins(:created_by_contact, :approved_by_contact).to_sql

SELECT
"purchase_orders".*
FROM
"purchase_orders"
INNER JOIN "users" ON "users"."id" = "purchase_orders"."created_by_id"
INNER JOIN "account_memberships" ON "account_memberships"."user_id" = "users"."id"
INNER JOIN "contacts" ON "contacts"."id" = "account_memberships"."contact_id"
INNER JOIN "users" "approved_bies_purchase_orders_join" ON "approved_bies_purchase_orders_join"."id" = "purchase_orders"."approved_by_id"
INNER JOIN "account_memberships" "scoped_account_memberships_purchase_orders_join" ON "scoped_account_memberships_purchase_orders_join"."user_id" = "approved_bies_purchase_orders_join"."id"
INNER JOIN "contacts" "approved_by_contacts_purchase_orders" ON "approved_by_contacts_purchase_orders"."id" = "scoped_account_memberships_purchase_orders_join"."contact_id"

We see that we correctly join "purchase_orders"."created_by_id" to contacts and "purchase_orders"."approved_by_id" to approved_by_contacts_purchase_orders (an alias for contacts).

However, if we want to use .left_joins we see a problem.

-- PurchaseOrder.left_joins(:created_by_contact, :approved_by_contact)

SELECT
"purchase_orders".*
FROM
"purchase_orders"
LEFT OUTER JOIN "users" ON "users"."id" = "purchase_orders"."created_by_id"
LEFT OUTER JOIN "account_memberships" ON "account_memberships"."user_id" = "users"."id"
LEFT OUTER JOIN "contacts" ON "contacts"."id" = "account_memberships"."contact_id"
LEFT OUTER JOIN "contacts" "approved_by_contacts_purchase_orders" ON "approved_by_contacts_purchase_orders"."id" = "account_memberships"."contact_id"

We correctly join "purchase_orders"."created_by_id" to contacts but "purchase_orders"."created_by_id" is also joined to approved_by_contacts_purchase_orders.

We are missing the join to the approved_by_contact ๐Ÿ˜ฑ

Explanation of Problem:โ€‹

Given the following code:

PurchaseOrder.left_joins(:created_by_contact, :approved_by_contact)

The created_by_contact association with the PurchaseOrder model has a reflection chain of PurchaseOrder(created_by) -> User -> AccountMembership -> Contact.

When left joining on created_by_contact, the make_constraints method caches the reflection chain at each level. The cache key is the last reflection item of the reflection chain.

The following cache is generated for the created_by_contact reflection chain.

{
Contact: [PurchaseOrder(created_by), User, AccountMembership, Contact],
AccountMembership: [PurchaseOrder(created_by), User, AccountMembership],
User: [PurchaseOrder(created_by), User],
PurchaseOrder(created_by): [PurchaseOrder(created_by)]
}

Since created_by_contact is the first association joined, every reflection chain item is added to the cache.

The approved_by_contact association with the PurchaseOrder model has a reflection chain of PurchaseOrder(approved_by) -> User -> AccountMembership -> Contact.

When we add :approved_by_contact to the .left_joins() method, the make_constraints method checks the cache and finds a match on the Contact reflection key for the value of [PurchaseOrder(created_by), User, AccountMembership, Contact] and stops building any further join constraints.

Solution:โ€‹

Now that I know the problem, I can fix it. The make_constraints method didn't have enough information to discern if the reflection chain was a full or partial match.

I updated the code within the make_constraints method and passed the entire reflection chain to use as the cache key.

This allowed make_constraints to successfully match on when the reflection chain was the same but not match in my case when the reflection chain was different.

I created a PR to Rails with my fix and it was merged after three days! Two months later, it was released as part of Rails v7.2.2.