SQL Schema Design Interview Questions
Practice real-world database schema design questions from top tech companies. Write your solution, reveal the expert answer, and diff them side-by-side.
Design the Database Schema for Twitter
MediumYou are designing the core relational schema for a simplified Twitter-like platform. Focus on the entities and relationships that power the main feed, profiles, and interactions.
Core Entities
- Users with profiles
- Tweets (text, media, timestamps)
- Follow relationships
- Likes and retweets
Functional Requirements
- Users can post tweets
- Users can follow/unfollow others
- Users can like and retweet
- Fetch a user's home timeline
- Fetch a user's profile + tweets
Scale Hints
- 10M+ daily active users
- 500M+ tweets per day
- Follows: average 200 per user
- Reads >> Writes
Constraints
- Tweet text max 280 chars
- Username unique, 15 chars max
- Prevent self-follows
- Handle deleted tweets gracefully
CREATE TABLE users (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
username VARCHAR(15) NOT NULL UNIQUE,
display_name VARCHAR(50) NOT NULL,
bio TEXT,
avatar_url VARCHAR(500),
email VARCHAR(255) NOT NULL UNIQUE,
password_hash VARCHAR(255) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
is_verified BOOLEAN NOT NULL DEFAULT FALSE,
follower_count INT NOT NULL DEFAULT 0,
following_count INT NOT NULL DEFAULT 0,
tweet_count INT NOT NULL DEFAULT 0
);
CREATE TABLE tweets (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
content VARCHAR(280) NOT NULL,
reply_to_tweet_id BIGINT REFERENCES tweets(id) ON DELETE SET NULL,
retweet_of_tweet_id BIGINT REFERENCES tweets(id) ON DELETE SET NULL,
media_urls TEXT[],
like_count INT NOT NULL DEFAULT 0,
retweet_count INT NOT NULL DEFAULT 0,
reply_count INT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
is_deleted BOOLEAN NOT NULL DEFAULT FALSE,
deleted_at TIMESTAMPTZ
);
CREATE TABLE follows (
follower_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
following_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (follower_id, following_id),
CONSTRAINT no_self_follow CHECK (follower_id != following_id)
);
CREATE TABLE likes (
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
tweet_id BIGINT NOT NULL REFERENCES tweets(id) ON DELETE CASCADE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (user_id, tweet_id)
);
CREATE TABLE bookmarks (
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
tweet_id BIGINT NOT NULL REFERENCES tweets(id) ON DELETE CASCADE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (user_id, tweet_id)
);
CREATE INDEX idx_tweets_user_created ON tweets(user_id, created_at DESC) WHERE is_deleted = FALSE;
CREATE INDEX idx_tweets_reply_to ON tweets(reply_to_tweet_id) WHERE reply_to_tweet_id IS NOT NULL;
CREATE INDEX idx_tweets_retweet_of ON tweets(retweet_of_tweet_id) WHERE retweet_of_tweet_id IS NOT NULL;
CREATE INDEX idx_follows_following ON follows(following_id, created_at DESC);
CREATE INDEX idx_likes_tweet ON likes(tweet_id, created_at DESC);
⚠️ Common Mistakes
- Forgetting the
no_self_followCHECK constraint - Storing the full tweet text for retweets instead of referencing the original
- Missing partial indexes for non-deleted tweets
- Using
ON DELETE CASCADEon the reply/retweet self-references (should be SET NULL) - Not indexing
follows.following_idwhich is needed for timeline queries - Storing denormalized counts without a strategy to keep them in sync
✅ What Interviewers Look For
- Understanding of many-to-many relationships (follows, likes)
- Self-referencing tables (replies, retweets)
- Soft deletes vs hard deletes and their trade-offs
- Index design for read-heavy workloads
- Data normalization vs denormalization decisions
Design the Database Schema for Uber
HardDesign the core relational schema for a ride-hailing platform. This includes users (riders & drivers), trips, payments, ratings, and location tracking.
Core Entities
- Riders and Drivers (both users)
- Trips (from request to completion)
- Payments and pricing
- Vehicle types and driver documents
- Ratings and reviews
Functional Requirements
- Rider requests a ride, driver accepts
- Real-time trip status tracking
- Automatic fare calculation
- Post-trip rating for both parties
- Driver onboarding with document verification
- Promo codes and discounts
Scale Hints
- 100M+ monthly active riders
- 5M+ active drivers
- 20M+ trips per day
- Payments in 70+ currencies
Constraints
- Trip fare must be immutable after completion
- Driver can only have one active vehicle at a time
- Ratings are 1-5 stars, optional comment
- Support split fares and refunds
CREATE TABLE users (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
phone VARCHAR(20) NOT NULL UNIQUE,
email VARCHAR(255) UNIQUE,
full_name VARCHAR(100) NOT NULL,
password_hash VARCHAR(255),
avatar_url VARCHAR(500),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
is_active BOOLEAN NOT NULL DEFAULT TRUE,
default_payment_method_id BIGINT
);
CREATE TABLE drivers (
user_id BIGINT PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
license_number VARCHAR(50) NOT NULL UNIQUE,
license_expiry DATE NOT NULL,
background_check_status VARCHAR(20) NOT NULL DEFAULT 'pending',
onboarding_status VARCHAR(20) NOT NULL DEFAULT 'started',
rating DECIMAL(2,1) NOT NULL DEFAULT 5.0,
total_trips INT NOT NULL DEFAULT 0,
lifetime_earnings DECIMAL(12,2) NOT NULL DEFAULT 0.00,
is_online BOOLEAN NOT NULL DEFAULT FALSE,
current_vehicle_id BIGINT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE vehicles (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
driver_id BIGINT NOT NULL REFERENCES drivers(user_id) ON DELETE CASCADE,
make VARCHAR(50) NOT NULL,
model VARCHAR(50) NOT NULL,
year INT NOT NULL,
color VARCHAR(30) NOT NULL,
license_plate VARCHAR(20) NOT NULL UNIQUE,
vehicle_type VARCHAR(20) NOT NULL DEFAULT 'economy',
seat_capacity INT NOT NULL DEFAULT 4,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE vehicle_documents (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
vehicle_id BIGINT NOT NULL REFERENCES vehicles(id) ON DELETE CASCADE,
document_type VARCHAR(30) NOT NULL,
document_url VARCHAR(500) NOT NULL,
verification_status VARCHAR(20) NOT NULL DEFAULT 'pending',
uploaded_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
verified_at TIMESTAMPTZ,
expires_at DATE
);
CREATE TABLE trips (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
rider_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
driver_id BIGINT REFERENCES drivers(user_id) ON DELETE SET NULL,
vehicle_id BIGINT REFERENCES vehicles(id) ON DELETE SET NULL,
status VARCHAR(20) NOT NULL DEFAULT 'requested',
pickup_lat DECIMAL(10,8) NOT NULL,
pickup_lng DECIMAL(11,8) NOT NULL,
pickup_address VARCHAR(300) NOT NULL,
dropoff_lat DECIMAL(10,8) NOT NULL,
dropoff_lng DECIMAL(11,8) NOT NULL,
dropoff_address VARCHAR(300) NOT NULL,
requested_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
accepted_at TIMESTAMPTZ,
started_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
cancelled_at TIMESTAMPTZ,
cancelled_by VARCHAR(10),
cancellation_reason TEXT,
distance_km DECIMAL(8,2),
duration_minutes INT,
base_fare DECIMAL(8,2) NOT NULL,
distance_fare DECIMAL(8,2) NOT NULL DEFAULT 0.00,
time_fare DECIMAL(8,2) NOT NULL DEFAULT 0.00,
surge_multiplier DECIMAL(3,2) NOT NULL DEFAULT 1.00,
promo_discount DECIMAL(8,2) NOT NULL DEFAULT 0.00,
total_fare DECIMAL(8,2) NOT NULL,
currency VARCHAR(3) NOT NULL DEFAULT 'USD',
payment_status VARCHAR(20) NOT NULL DEFAULT 'pending'
);
CREATE TABLE trip_events (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
trip_id BIGINT NOT NULL REFERENCES trips(id) ON DELETE CASCADE,
event_type VARCHAR(30) NOT NULL,
lat DECIMAL(10,8),
lng DECIMAL(11,8),
metadata JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE payments (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
trip_id BIGINT NOT NULL REFERENCES trips(id) ON DELETE CASCADE,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
amount DECIMAL(10,2) NOT NULL,
currency VARCHAR(3) NOT NULL DEFAULT 'USD',
payment_method VARCHAR(20) NOT NULL,
payment_status VARCHAR(20) NOT NULL DEFAULT 'pending',
transaction_id VARCHAR(100) UNIQUE,
processed_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE ratings (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
trip_id BIGINT NOT NULL REFERENCES trips(id) ON DELETE CASCADE,
from_user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
to_user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
rating INT NOT NULL CHECK (rating BETWEEN 1 AND 5),
comment TEXT,
category VARCHAR(20),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE(trip_id, from_user_id)
);
CREATE INDEX idx_trips_rider ON trips(rider_id, requested_at DESC);
CREATE INDEX idx_trips_driver ON trips(driver_id, requested_at DESC);
CREATE INDEX idx_trips_status ON trips(status, requested_at DESC) WHERE status IN ('requested','accepted','started');
CREATE INDEX idx_trip_events_trip ON trip_events(trip_id, created_at DESC);
CREATE INDEX idx_payments_trip ON payments(trip_id);
CREATE INDEX idx_ratings_to_user ON ratings(to_user_id, created_at DESC);
⚠️ Common Mistakes
- Not separating
usersfromdrivers— a driver is also a user who can ride - Storing mutable fare fields without an immutable audit trail
- Forgetting event sourcing for trip status changes (needed for debugging/disputes)
- Using
FLOATfor money — always useDECIMAL - Missing unique constraint on
ratings(trip_id, from_user_id)to prevent duplicate ratings - Not handling driver document expiry and verification states
- Storing location as a single string instead of separate lat/lng with spatial indexing potential
✅ What Interviewers Look For
- Entity inheritance patterns (user → driver/rider)
- Immutability for financial data (fare never changes after trip ends)
- Audit trails and event logging
- State machine design for trip lifecycle
- Handling edge cases: cancellations, no-shows, refunds
Design the Database Schema for a URL Shortener
MediumDesign the database schema for a URL shortening service like bit.ly. Support custom aliases, analytics, expiration, and user accounts.
Core Entities
- Users (optional anonymous)
- Shortened URLs
- Click analytics
- Custom aliases and domains
Functional Requirements
- Shorten a long URL to a short code
- Custom aliases (e.g., bit.ly/my-brand)
- Redirect with 301/302
- Track clicks (timestamp, referrer, country, device)
- URL expiration (optional)
- QR code generation (metadata only)
Scale Hints
- 100M+ new URLs per month
- 10B+ redirects per month
- Short code: 6-8 alphanumeric chars
- Reads >> Writes by 100:1
Constraints
- Short codes must be unique and unguessable
- Custom aliases: 3-30 chars, alphanumeric + hyphen
- Max URL length: 2048 chars
- Soft-delete for analytics retention
CREATE TABLE users (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
email VARCHAR(255) UNIQUE,
password_hash VARCHAR(255),
display_name VARCHAR(50),
api_key VARCHAR(64) UNIQUE,
tier VARCHAR(20) NOT NULL DEFAULT 'free',
monthly_link_limit INT NOT NULL DEFAULT 100,
custom_domain VARCHAR(255),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
last_login_at TIMESTAMPTZ
);
CREATE TABLE short_links (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
user_id BIGINT REFERENCES users(id) ON DELETE SET NULL,
short_code VARCHAR(30) NOT NULL UNIQUE,
original_url TEXT NOT NULL,
is_custom_alias BOOLEAN NOT NULL DEFAULT FALSE,
title VARCHAR(200),
description TEXT,
status VARCHAR(20) NOT NULL DEFAULT 'active',
expires_at TIMESTAMPTZ,
password_hash VARCHAR(255),
total_clicks INT NOT NULL DEFAULT 0,
unique_clicks INT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
deleted_at TIMESTAMPTZ,
CONSTRAINT valid_short_code CHECK (short_code ~ '^[a-zA-Z0-9_-]+$'),
CONSTRAINT valid_alias_length CHECK (
is_custom_alias = FALSE OR LENGTH(short_code) BETWEEN 3 AND 30
)
);
CREATE TABLE link_tags (
link_id BIGINT NOT NULL REFERENCES short_links(id) ON DELETE CASCADE,
tag VARCHAR(30) NOT NULL,
PRIMARY KEY (link_id, tag)
);
CREATE TABLE click_events (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
link_id BIGINT NOT NULL REFERENCES short_links(id) ON DELETE CASCADE,
clicked_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
ip_hash VARCHAR(64) NOT NULL,
user_agent TEXT,
referrer TEXT,
country_code CHAR(2),
city VARCHAR(50),
device_type VARCHAR(20),
browser VARCHAR(30),
os VARCHAR(30),
language VARCHAR(10)
);
CREATE TABLE click_aggregates (
link_id BIGINT NOT NULL REFERENCES short_links(id) ON DELETE CASCADE,
date DATE NOT NULL,
total_clicks INT NOT NULL DEFAULT 0,
unique_clicks INT NOT NULL DEFAULT 0,
country_breakdown JSONB,
device_breakdown JSONB,
referrer_breakdown JSONB,
PRIMARY KEY (link_id, date)
);
CREATE TABLE link_rules (
id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
link_id BIGINT NOT NULL REFERENCES short_links(id) ON DELETE CASCADE,
rule_type VARCHAR(30) NOT NULL,
condition_value VARCHAR(255) NOT NULL,
redirect_url TEXT,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
priority INT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_short_links_code ON short_links(short_code) WHERE deleted_at IS NULL;
CREATE INDEX idx_short_links_user ON short_links(user_id, created_at DESC);
CREATE INDEX idx_click_events_link ON click_events(link_id, clicked_at DESC);
CREATE INDEX idx_click_events_ip ON click_events(link_id, ip_hash, clicked_at);
CREATE INDEX idx_click_aggregates_date ON click_aggregates(date);
⚠️ Common Mistakes
- Storing the original URL as
VARCHARinstead ofTEXT— URLs can exceed 2048 chars - Not using a separate
click_eventstable and instead incrementing a counter — you lose analytics granularity - Missing unique constraint on
short_code— race conditions will create duplicates - Forgetting soft-delete — hard-deleting loses all click history
- Not hashing IPs for privacy compliance (GDPR)
- Storing raw user agents instead of parsed device/browser/os fields
- No expiration or status field — inactive links should still resolve but not count
✅ What Interviewers Look For
- Understanding of write-heavy vs read-heavy workloads
- Separating event data (clicks) from aggregate data for performance
- Privacy considerations (IP hashing, retention policies)
- Flexible routing rules (device-based, geo-based redirects)
- Handling collisions in short code generation
Ready to Diff Real Schemas?
SchemaLens is the fastest way to compare database schemas, spot breaking changes, and generate migration scripts — all in your browser.
Try SchemaLens Free →No signup required. Your data never leaves your browser.