Adding a Primary Key To An Existing Table


We had a requirement for our Platform Migration project that every table in the schema needed to have a primary key. This was required so that we could migrate the database using a tool called Bucardo.

I didn’t even bat an eye at that requirement — we run a Rails app! We do things conventionally! We… actually had a table that didn’t have a primary key??

My coworker had done his due diligence and utilized the following query just to check. It returned a single table name as a result:

SELECT
    tbl.table_schema,
    tbl.table_name
FROM
    information_schema.tables tbl
WHERE
    table_type = 'BASE TABLE'
    AND table_schema NOT IN ('pg_catalog', 'information_schema')
    AND NOT EXISTS (
        SELECT
            1
        FROM
            information_schema.key_column_usage kcu
        WHERE
            kcu.table_name = tbl.table_name
            AND kcu.table_schema = tbl.table_schema
    );

Why didn’t it have a primary key? Well, it didn’t need one at the time — it was just a simple join table that only contained 2 foreign keys.

I suppose that was okay and after all, adding a new column to a table isn’t uncommon. Why would this be any different?

Well, it was a bit trickier than I had anticipated.

Step 1

Add the column!

A naive first approach was the following:

class AddPrimaryKeyToPackConfigItems < ActiveRecord::Migration[7.0]
  def change
    add_column :pack_config_items, :id, :uuid, primary_key: true
  end
end

That worked! Push it to prod!

Just kidding, it bombed.

When running the migration locally, it bombed because my local database had existing records in that table which would violate the primary key constraint we were trying to add.

My new plan: add the column without the constraint, backfill it, then add the constraint. Bulletproof.

class AddPrimaryKeyToPackConfigItems < ActiveRecord::Migration[7.0]
  def change
    add_column :pack_config_items, :id, :uuid
  end
end

A simple migration. I deployed it to production and started the backfill. This should be straightforward. We can just rely on SecureRandom.uuid to generate a value for every record:

# Note: this is a **very** small table
PackConfigItem.all.each do |pci|
  pci.update!(id: SecureRandom.uuid)
end

Looks good, right?

🙅🏻

ERROR:  zero-length delimited identifier at or near """"

The error was something like the above (I have since lost my terminal history otherwise I’d have copied it exactly).

What an opaque and unhelpful error message. 😒

I whipped out my shovel and started digging. It turns out this is a failure on ActiveRecord’s part (or so I believe). It got confused — because the table contained a column named id, it assumed this column was the primary key (yes, even though there was no constraint!). As such, it couldn’t determine which record to populate because all rows in the table had the same value for their primary key!

This time I reached for a more unique backfill:

pcis = PackConfigItem.all

pcis.each do |pci|
  sql = <<-SQL
    UPDATE pack_config_items pci
    SET id = '#{SecureRandom.uuid}'
    WHERE pci.pack_config_id = '#{pci.pack_config_id}'
    AND pci.pack_item_id = '#{pci.pack_item_id}'
  SQL

  ActiveRecord::Base.connection.execute(sql)
end

That worked 👍 All records now have values in their id column.

Now to finally add the primary key constraint! Here is the migration:

class AddPrimaryKeyConstraintToPackConfigItems < ActiveRecord::Migration[7.0]
  def up
    safety_assured {
      execute "ALTER TABLE pack_config_items ADD PRIMARY KEY (id);"
    }
  end

  def down
    safety_assured {
      execute "ALTER TABLE pack_config_items DROP CONSTRAINT pack_config_items_pkey;"
    }
  end
end

But I had a feeling that was way too easy. So I ran the tests. 🙂

Indeed, there were failures. The factory for the model couldn’t create a record without its primary key! But wait… we don’t typically specify primary keys in our factories like so:

FactoryBot.define do
  factory :pack_config_item do
    id # <-- we don't usually need to do this!
    pack_config
    pack_item
  end
end

So why was FactoryBot complaining? 🤔 Can you guess?

Generating the Primary Key felt wrong:

FactoryBot.define do
  factory :pack_config_item do
    id { SecureRandom.uuid } # insane
    pack_config
    pack_item
  end
end

It turns out this sort of implementation detail is actually handled for us by Postgres automatically. Or should I say automagically? ✨

So how does that happen?

Let’s look at our schema dump and see if we can find out:

-- table we are concerned with:
CREATE TABLE public.pack_config_items (
    pack_config_id uuid NOT NULL,
    pack_item_id uuid NOT NULL,
    id uuid NOT NULL
);

-- primary key constraint was there:
ALTER TABLE ONLY public.pack_config_items
    ADD CONSTRAINT pack_config_items_pkey PRIMARY KEY (id);

At first glance, this seems legitimate. The primary key is supposed to be on a column named id, of type uuid, and is not allowed to be NULL.

But it’s not exactly what we want! Not yet…

Let’s check another table:

CREATE TABLE public.another_table (
    id uuid DEFAULT public.gen_random_uuid() NOT NULL,
    -- rest of the table
);

Aha! The column is lacking a default! That seems obvious now that we see it, right?

Wait! What the heck kind of default is that?? Why it’s a Postgres function, of course! It appears to have been added in v13 (IIRC we used to rely on a Postgres extension prior to that in order to generate the UUIDs). You can read more about it here.

Let’s add another migration now that we’re aware of the necessary default:

class AddDefaultToPackConfigItemId < ActiveRecord::Migration[7.0]
  def change
    safety_assured {
      change_column_default :pack_config_items, :id, from: nil, to: "public.gen_random_uuid()"
    }
  end
end

And now let’s check the schema dump:

-- perfection 👌
CREATE TABLE public.pack_config_items (
    pack_config_id uuid NOT NULL,
    pack_item_id uuid NOT NULL,
    id uuid DEFAULT public.gen_random_uuid() NOT NULL
);

That did the trick! The tests were passing and there was no need to start generating id values in the factory. The plan was a success!

Now you don’t have to encounter the same pitfalls should you ever need to add a primary key to an existing table!