The Complete SQL Migration Checklist: 12 Steps Before Production
At 11:47 AM on a Thursday, a team at a fintech company deployed what looked like a harmless migration: adding a NOT NULL column to a 200-million-row table. The migration locked the table for 14 minutes. Payment processing ground to a halt. The rollback failed because the column already existed. They spent six hours in incident response, lost $40,000 in transaction fees, and earned a permanent spot in the company's post-mortem hall of fame.
This wasn't a complex migration. It was a simple one that skipped a checklist. In this guide, we give you the 12-step checklist that prevents these disasters. Print it. Bookmark it. Run it before every deployment.
Step 1: Diff your schemas visually
Before you even look at the migration SQL, understand what changed at a structural level. A text diff of two schema dumps is noisy and error-prone. A semantic diff tells you exactly which tables, columns, constraints, and indexes changed.
- Export the old schema (pre-migration) and the new schema (post-migration).
- Run both through a semantic diff tool to see added, removed, and modified objects.
- Verify that every change in the diff was intentional.
- Look for surprises: extra indexes, dropped constraints, or renamed columns you didn't expect.
Tools like SchemaLens make this instant. Paste two schemas, get a color-coded visual diff, and generate the migration SQL – all in your browser.
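Under the hood, a semantic diff reduces to set operations over two schema snapshots. A minimal Python sketch of the idea (the snapshot format and the `users`/`audit_log` tables here are invented for illustration):

```python
def schema_diff(old, new):
    """Compare two schema snapshots shaped as {table: {column: type}}."""
    report = {
        "added_tables": sorted(new.keys() - old.keys()),
        "removed_tables": sorted(old.keys() - new.keys()),
        "changed_columns": {},
    }
    for table in old.keys() & new.keys():
        added = sorted(new[table].keys() - old[table].keys())
        removed = sorted(old[table].keys() - new[table].keys())
        retyped = sorted(c for c in old[table].keys() & new[table].keys()
                         if old[table][c] != new[table][c])
        if added or removed or retyped:
            report["changed_columns"][table] = {
                "added": added, "removed": removed, "retyped": retyped}
    return report

old = {"users": {"id": "INTEGER", "name": "TEXT"}}
new = {"users": {"id": "INTEGER", "name": "TEXT", "phone": "VARCHAR(20)"},
       "audit_log": {"id": "INTEGER"}}
report = schema_diff(old, new)
print(report["added_tables"])                       # ['audit_log']
print(report["changed_columns"]["users"]["added"])  # ['phone']
```

Anything in the report you didn't intend is exactly the kind of surprise this checklist step is hunting for.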
Step 2: Score the migration risk
Not all migrations are equal. Some are zero-risk. Others are "page the CEO" territory. Classify every migration before it ships:
- HIGH RISK: Dropping columns or tables, adding NOT NULL without defaults, narrowing column types, dropping indexes used by queries, removing foreign keys.
- MEDIUM RISK: Adding indexes on large tables, adding foreign keys with ON DELETE CASCADE, renaming columns, changing default values.
- LOW RISK: Adding nullable columns, adding tables, creating non-unique indexes on small tables, adding comments or check constraints.
High-risk migrations need a second reviewer, a tested rollback plan, and deployment during a scheduled maintenance window. Never ship high-risk changes in a Friday deploy.
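Some teams make this triage mechanical by linting migration SQL in CI. A toy classifier in that spirit (the regex rules are illustrative and deliberately incomplete; they don't replace a human reviewer):

```python
import re

# Ordered from most to least severe; first match wins.
# These patterns are rough heuristics, not a complete DDL parser.
RISK_RULES = [
    ("HIGH",   r"\bDROP\s+(TABLE|COLUMN|INDEX)\b|\bNOT\s+NULL\b(?!.*\bDEFAULT\b)"),
    ("MEDIUM", r"\bCREATE\s+INDEX\b|\bRENAME\b|\bON\s+DELETE\s+CASCADE\b"),
]

def classify(statement):
    """Rough risk score for a single DDL statement."""
    s = statement.upper()
    for risk, pattern in RISK_RULES:
        if re.search(pattern, s):
            return risk
    return "LOW"

print(classify("ALTER TABLE users DROP COLUMN legacy_flag"))       # HIGH
print(classify("CREATE INDEX idx_users_email ON users(email)"))    # MEDIUM
print(classify("ALTER TABLE users ADD COLUMN phone VARCHAR(20)"))  # LOW
```

A CI job can fail the build (or require an extra approval) whenever a migration file classifies as HIGH.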
Step 3: Verify backward compatibility
Your application code and your database schema rarely deploy in perfect lockstep. During a rolling deployment, old code runs against the new schema, and new code runs against the old schema. Both combinations must work.
- Adding a column? Make it nullable first, backfill data, then add NOT NULL in a follow-up migration.
- Renaming a column? Don't. Add the new column, dual-write to both, migrate reads, then drop the old column.
- Dropping a column? Verify zero queries reference it (check logs, ORM models, and raw SQL).
- Changing a type? Ensure the old code can read and write the new type without casting errors.
The golden rule: never break the version of the code that is currently running in production.
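The rename recipe above can be rehearsed end-to-end on a scratch database. A minimal SQLite sketch of the dual-write phase, using a trigger so that old code writing `full_name` keeps the new column in sync (the `display_name` column and trigger name are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT)")

# Phase 1: add the new column alongside the old one.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# Phase 2: dual-write. Old code still inserts full_name; the trigger
# mirrors the value into display_name.
conn.execute("""
    CREATE TRIGGER users_dual_write AFTER INSERT ON users
    BEGIN
        UPDATE users SET display_name = NEW.full_name WHERE id = NEW.id;
    END
""")

conn.execute("INSERT INTO users (full_name) VALUES ('Ada Lovelace')")
row = conn.execute("SELECT full_name, display_name FROM users").fetchone()
print(row)  # ('Ada Lovelace', 'Ada Lovelace')
# Later migrations backfill old rows, move reads to display_name,
# and only then drop full_name.
```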
Step 4: Check for missing indexes
New foreign keys, new query patterns, and new join conditions often need indexes that the migration author forgot to add. A missing index on a foreign key in a 50-million-row table turns a 5ms query into a 45-second table scan.
- Every new foreign key gets an index on the referencing column.
- Every new column used in WHERE, JOIN, or ORDER BY clauses is evaluated for indexing.
- Composite indexes are considered for multi-column filters.
- Existing indexes are checked for redundancy – don't create a duplicate index.
Use the SQL Index Analyzer to audit your schema for missing indexes before deployment.
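The foreign-key rule is easy to automate. A small sketch against SQLite's catalog pragmas that reports FK columns no index covers as a leading column (the table and index names are invented; PostgreSQL and MySQL have equivalent catalog queries):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id)
    );
""")

def unindexed_fk_columns(conn, table):
    """FK columns on `table` with no index whose leading column covers them."""
    # foreign_key_list row: (id, seq, ref_table, from_col, to_col, ...)
    fk_cols = {row[3] for row in conn.execute(f"PRAGMA foreign_key_list({table})")}
    leading = set()
    for idx in conn.execute(f"PRAGMA index_list({table})"):
        first = conn.execute(f"PRAGMA index_info({idx[1]})").fetchone()
        if first:
            leading.add(first[2])  # name of the index's first column
    return fk_cols - leading

before = unindexed_fk_columns(conn, "orders")  # {'user_id'}: FK with no index
conn.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
after = unindexed_fk_columns(conn, "orders")   # set(): now covered
print(before, after)
```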
Step 5: Estimate lock time and table size
Some migrations are fast on development databases and catastrophically slow in production. The difference is data volume. Before deploying, estimate how long the migration will hold locks:
-- PostgreSQL: check table size
SELECT pg_size_pretty(pg_total_relation_size('users'));
-- MySQL: check table size and row count
SHOW TABLE STATUS LIKE 'users';
-- Estimate: adding a column with a constant DEFAULT on 100M rows
-- PostgreSQL 11+: ~instant (metadata-only change, no table rewrite)
-- PostgreSQL <11: rewrites the entire table, could take hours
If the estimated lock time exceeds your tolerance, restructure the migration:
- Use pt-online-schema-change (Percona Toolkit) for MySQL
- Use pg_repack or pg_squeeze for PostgreSQL
- Split the migration into smaller, online-safe steps
Step 6: Write idempotent SQL
Migrations fail. When they do, you don't want to spend 30 minutes manually cleaning up a half-applied state. Idempotent migrations can be retried safely:
-- Instead of this:
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Do this:
ALTER TABLE users ADD COLUMN IF NOT EXISTS phone VARCHAR(20);
-- For indexes:
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_email ON users(email);
Not all dialects support IF NOT EXISTS for every operation. When they don't, wrap the migration in a transaction (PostgreSQL's DDL is transactional; MySQL's generally is not) or use a migration framework (Flyway, Liquibase) that tracks applied migrations in a metadata table.
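SQLite, for instance, has no `ADD COLUMN IF NOT EXISTS`, so the guard has to live in the migration runner. A minimal sketch of that pattern (the helper name is ours):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")

def add_column_if_missing(conn, table, column, ddl_type):
    """Emulate ADD COLUMN IF NOT EXISTS on dialects that lack it."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in existing:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")

add_column_if_missing(conn, "users", "phone", "VARCHAR(20)")
add_column_if_missing(conn, "users", "phone", "VARCHAR(20)")  # safe retry: no-op

cols = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(cols)  # ['id', 'phone']
```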
Step 7: Test on production-like data
A migration that takes 200ms on a 1,000-row development database might take 40 minutes on a 50-million-row production database. Test on realistic data:
- Restore a recent production backup to a staging environment.
- Run the migration and measure wall-clock time.
- Monitor CPU, disk I/O, and lock contention during the migration.
- Run your application's slowest queries before and after to catch plan regressions.
If you can't restore production data for compliance reasons, generate realistic test data with the SQL Test Data Generator that matches your production row counts and cardinality.
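Whichever route you take, measure rather than guess. A tiny timing harness over an in-memory SQLite copy shows the shape of the measurement (row counts and names are illustrative; point it at your restored staging database in practice):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(100_000)])

# Time the migration on a production-sized copy before trusting it in prod.
start = time.perf_counter()
conn.execute("CREATE INDEX idx_users_email ON users(email)")
elapsed = time.perf_counter() - start
print(f"index build took {elapsed:.3f}s on 100,000 rows")
```

Record the number in the migration's PR description so the reviewer sees it too.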
Step 8: Have a tested rollback plan
Every migration is a one-way door unless you have a rollback plan. Before deploying, ask: "If this breaks production, how do I undo it in under 5 minutes?"
- For additive changes (new columns, new tables): Rollback is usually a simple DROP COLUMN or DROP TABLE. Keep the SQL ready.
- For destructive changes (dropped columns, narrowed types): Rollback may require restoring from backup. Have a backup snapshot taken immediately before deploy.
- For data migrations: Run them in batches with checkpointing. If something fails, resume from the last checkpoint instead of starting over.
"Hope is not a rollback strategy. If you can't undo a migration in five minutes, don't ship it during business hours."
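The checkpointing idea from the list above fits in a few lines. A sketch against SQLite (the `migration_checkpoint` table and batch size are our invention; in production the checkpoint lives in the same database, so each commit makes progress durable and a rerun resumes instead of restarting):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, phone TEXT)")
conn.executemany("INSERT INTO users (id) VALUES (?)", [(i,) for i in range(1, 10001)])
conn.execute("CREATE TABLE IF NOT EXISTS migration_checkpoint (last_id INTEGER)")

def backfill_in_batches(conn, batch_size=1000):
    """Backfill users.phone in batches, committing a checkpoint after each one."""
    row = conn.execute("SELECT last_id FROM migration_checkpoint").fetchone()
    last_id = row[0] if row else 0  # resume where the last run stopped
    while True:
        cur = conn.execute(
            "UPDATE users SET phone = 'unknown' "
            "WHERE id > ? AND id <= ? AND phone IS NULL",
            (last_id, last_id + batch_size))
        last_id += batch_size
        conn.execute("DELETE FROM migration_checkpoint")
        conn.execute("INSERT INTO migration_checkpoint VALUES (?)", (last_id,))
        conn.commit()  # small transactions: short locks, durable progress
        if cur.rowcount == 0:
            break

backfill_in_batches(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE phone IS NULL").fetchone()[0]
print(remaining)  # 0
```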
Step 9: Schedule during a low-traffic window
Even low-risk migrations can cause brief locks. Schedule them when traffic is minimal:
- Identify your lowest-traffic window from monitoring dashboards.
- Avoid deploys on Fridays, before holidays, or during peak business hours.
- Notify stakeholders (customer support, ops, on-call) before the deploy window.
- Make sure the on-call engineer is aware and ready, even for "low-risk" changes.
Step 10: Monitor during and after deployment
The deploy isn't done when the migration finishes. It's done when you've confirmed the system is healthy:
- During: Watch database connection count, query latency, lock waits, and error rate. Set an alert if latency spikes above 2× baseline.
- Immediately after: Run your smoke tests and verify critical user flows.
- 30 minutes after: Check slow query logs for new entries. A new index might have changed query plans in unexpected ways.
- 24 hours after: Review resource utilization. Some migrations have delayed effects (e.g., autovacuum overhead after large table rewrites).
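The 2× rule is simple enough to encode directly in a monitoring check. A toy sketch (the baseline value and sample window are placeholders for what your dashboards actually report):

```python
# Toy health check: compare post-deploy latency samples to the pre-deploy baseline.
baseline_p95_ms = 12.0  # assumed baseline taken from your dashboards

def latency_alert(samples_ms, baseline_ms, factor=2.0):
    """Return True if the 95th-percentile latency breached factor x baseline."""
    samples = sorted(samples_ms)
    p95 = samples[int(0.95 * (len(samples) - 1))]
    return p95 > factor * baseline_ms

print(latency_alert([10, 11, 12, 13, 14], baseline_p95_ms))  # False: healthy
print(latency_alert([10, 11, 12, 48, 55], baseline_p95_ms))  # True: page someone
```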
Step 11: Verify constraint integrity
New constraints can fail if existing data violates them. Always validate before adding:
-- Check for NULLs before adding NOT NULL
SELECT COUNT(*) FROM users WHERE phone IS NULL;
-- Check for duplicates before adding UNIQUE
SELECT phone, COUNT(*) FROM users GROUP BY phone HAVING COUNT(*) > 1;
-- Check for orphans before adding FOREIGN KEY
SELECT COUNT(*) FROM orders WHERE user_id NOT IN (SELECT id FROM users);
-- PostgreSQL: add constraint as NOT VALID, then validate
ALTER TABLE users ADD CONSTRAINT chk_phone_format
CHECK (phone ~ '^\+?[0-9\- ]+$') NOT VALID;
ALTER TABLE users VALIDATE CONSTRAINT chk_phone_format;
The NOT VALID → VALIDATE pattern in PostgreSQL lets you add a constraint without locking the table for validation. The validation step scans the table but doesn't block reads or writes.
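The three data checks above are good candidates for an automated pre-flight script that blocks the deploy on any violation. A small sketch against SQLite (the table, column, and check names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, phone TEXT);
    INSERT INTO users (id, phone) VALUES (1, '555-0100'), (2, NULL);
""")

# Each query must return 0 before the matching constraint is safe to add.
preflight = {
    "NOT NULL on phone": "SELECT COUNT(*) FROM users WHERE phone IS NULL",
    "UNIQUE on phone": ("SELECT COUNT(*) FROM (SELECT phone FROM users "
                        "WHERE phone IS NOT NULL "
                        "GROUP BY phone HAVING COUNT(*) > 1)"),
}
violations = {name: conn.execute(q).fetchone()[0] for name, q in preflight.items()}
print(violations)  # the NOT NULL check fails: one row still has a NULL phone
```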
Step 12: Update documentation and runbooks
The final step is the most commonly skipped – and the one that saves your future self. After every migration:
- Update your schema documentation (ER diagrams, data dictionaries).
- Update runbooks that reference table structures or query patterns.
- Notify the team in Slack/Teams with a summary of what changed and why.
- Archive the old and new schema snapshots for future comparison.
Future you – debugging a production issue at 2 AM – will thank present you for keeping the docs current.
The 12-step checklist (printable)
- 1. Diff schemas visually – confirm every change is intentional
- 2. Score migration risk – high-risk changes need extra review and scheduling
- 3. Verify backward compatibility – old code must work with new schema
- 4. Check for missing indexes – every FK and new filter column
- 5. Estimate lock time – test on production-like data volume
- 6. Write idempotent SQL – migrations should be safely retryable
- 7. Test on realistic data – measure time, CPU, and I/O
- 8. Have a tested rollback plan – know how to undo in under 5 minutes
- 9. Schedule low-traffic deploy – avoid Fridays and peak hours
- 10. Monitor during and after – latency, errors, slow queries
- 11. Verify constraint integrity – existing data must satisfy new rules
- 12. Update documentation – schema docs, runbooks, team notifications
Make it automatic
Running this checklist manually for every migration is valuable but time-consuming. The teams that scale safely automate the checks:
- CI/CD integration: Add schema diff checks to your pull request pipeline. The SchemaLens CLI runs in GitHub Actions, GitLab CI, and Bitbucket Pipelines. It fails builds on breaking changes and generates migration reports as PR comments.
- Pre-deploy diff: Compare staging and production schemas before every deploy. If they don't match expectations, block the deploy.
- Breaking change alerts: Configure Slack or webhook notifications when high-risk migrations are detected.
Run the checklist on your next migration
Compare your schemas, detect breaking changes, and generate migration SQL – all in one tool.
Open SchemaLens →