Adding a Primary Key To An Existing Table
We had a requirement for our Platform Migration project that every table in the schema needed to have a primary key. This was required so that we could migrate the database using a tool called Bucardo.
I didn’t even bat an eye at that requirement — we run a Rails app! We do things conventionally! We… actually had a table that didn’t have a primary key??
My coworker had done his due diligence and utilized the following query just to check. It returned a single table name as a result:
SELECT
tbl.table_schema,
tbl.table_name
FROM
information_schema.tables tbl
WHERE
table_type = 'BASE TABLE'
AND table_schema NOT IN ('pg_catalog', 'information_schema')
AND NOT EXISTS (
SELECT
1
FROM
information_schema.key_column_usage kcu
WHERE
kcu.table_name = tbl.table_name
AND kcu.table_schema = tbl.table_schema
);
Why didn’t it have a primary key? Well, it didn’t need one at the time — it was just a simple join table that only contained 2 foreign keys.
I suppose that was okay and after all, adding a new column to a table isn’t uncommon. Why would this be any different?
Well, it was a bit trickier than I had anticipated.
Step 1
Add the column!
A naive first approach was the following:
class AddPrimaryKeyToPackConfigItems < ActiveRecord::Migration[7.0]
def change
add_column :pack_config_items, :id, :uuid, primary_key: true
end
end
That worked! Push it to prod!
Just kidding, it bombed.
When running the migration locally, it bombed because my local database had existing records in that table which would violate the primary key constraint we were trying to add.
My new plan: add the column without the constraint, backfill it, then add the constraint. Bulletproof.
class AddPrimaryKeyToPackConfigItems < ActiveRecord::Migration[7.0]
def change
add_column :pack_config_items, :id, :uuid
end
end
A simple migration. I deployed it to production and started the backfill. This should be straightforward. We can just rely on SecureRandom.uuid
to generate a value for every record:
# Note: this is a **very** small table
PackConfigItem.all.each do |pci|
pci.update!(id: SecureRandom.uuid)
end
Looks good, right?
🙅🏻
ERROR: zero-length delimited identifier at or near """"
The error was something like the above (I have since lost my terminal history otherwise I’d have copied it exactly).
What an opaque and unhelpful error message. 😒
I whipped out my shovel and started digging. It turns out this is a failure on ActiveRecord’s part (or so I believe). It got confused — because the table contained a column named id
, it assumed this column was the primary key (yes, even though there was no constraint!). As such, it couldn’t determine which record to populate because all rows in the table had the same value for their primary key!
This time I reached for a more unique backfill:
pcis = PackConfigItem.all
pcis.each do |pci|
sql = <<-SQL
UPDATE pack_config_items pci
SET id = '#{SecureRandom.uuid}'
WHERE pci.pack_config_id = '#{pci.pack_config_id}'
AND pci.pack_item_id = '#{pci.pack_item_id}'
SQL
ActiveRecord::Base.connection.execute(sql)
end
That worked 👍 All records now have values in their id
column.
Now to finally add the primary key constraint! Here is the migration:
class AddPrimaryKeyConstraintToPackConfigItems < ActiveRecord::Migration[7.0]
def up
safety_assured {
execute "ALTER TABLE pack_config_items ADD PRIMARY KEY (id);"
}
end
def down
safety_assured {
execute "ALTER TABLE pack_config_items DROP CONSTRAINT pack_config_items_pkey;"
}
end
end
But I had a feeling that was way too easy. So I ran the tests. 🙂
Indeed, there were failures. The factory for the model couldn’t create a record without its primary key! But wait… we don’t typically specify primary keys in our factories like so:
FactoryBot.define do
factory :pack_config_item do
id # <-- we don't usually need to do this!
pack_config
pack_item
end
end
So why was FactoryBot complaining? 🤔 Can you guess?
Generating the Primary Key felt wrong:
FactoryBot.define do
factory :pack_config_item do
id { SecureRandom.uuid } # insane
pack_config
pack_item
end
end
It turns out this sort of implementation detail is actually handled for us by Postgres automatically. Or should I say automagically? ✨
So how does that happen?
Let’s look at our schema dump and see if we can find out:
-- table we are concerned with:
CREATE TABLE public.pack_config_items (
pack_config_id uuid NOT NULL,
pack_item_id uuid NOT NULL,
id uuid NOT NULL
);
-- primary key constraint was there:
ALTER TABLE ONLY public.pack_config_items
ADD CONSTRAINT pack_config_items_pkey PRIMARY KEY (id);
At first glance, this seems legitimate. The primary key is supposed to be on a column named id
, of type uuid, and is not allowed to be NULL
.
But it’s not exactly what we want! Not yet…
Let’s check another table:
CREATE TABLE public.another_table (
id uuid DEFAULT public.gen_random_uuid() NOT NULL,
-- rest of the table
);
Aha! The column is lacking a default! That seems obvious now that we see it, right?
Wait! What the heck kind of default is that?? Why it’s a Postgres function, of course! It appears to have been added in v13 (IIRC we used to rely on a Postgres extension prior to that in order to generate the UUIDs). You can read more about it here.
Let’s add another migration now that we’re aware of the necessary default:
class AddDefaultToPackConfigItemId < ActiveRecord::Migration[7.0]
def change
safety_assured {
change_column_default :pack_config_items, :id, from: nil, to: "public.gen_random_uuid()"
}
end
end
And now let’s check the schema dump:
-- perfection 👌
CREATE TABLE public.pack_config_items (
pack_config_id uuid NOT NULL,
pack_item_id uuid NOT NULL,
id uuid DEFAULT public.gen_random_uuid() NOT NULL
);
That did the trick! The tests were passing and there was no need to start generating id
values in the factory. The plan was a success!
Now you don’t have to encounter the same pitfalls should you ever need to add a primary key to an existing table!