← Home

How SchemaLens Works

SchemaLens is engineered software with a custom SQL parser, semantic diff engine, and migration generator. It runs entirely in your browser and from the command line. Here is exactly how it works.

Prefer the terminal?

The same engine runs locally via npm. No browser required.

npx schemalens-cli diff old.sql new.sql CLI Docs

~400

Lines of parser code

Runtime dependencies

<10ms

Parse 1,000 lines of SQL

Supported dialects

Automated tests

Architecture Overview

SchemaLens has three layers:

Parser — Turns CREATE TABLE text into structured JavaScript objects.
Diff Engine — Compares two object trees and produces a semantic change list.
Migration Generator — Converts changes into dialect-specific ALTER TABLE scripts.

All three layers run client-side in the browser. The same code is packaged as schemalens-cli for terminal use.

Privacy by architecture: Because parsing and diffing happen in your browser, your schema dumps never touch our servers. We cannot see your data even if we wanted to.

Why we wrote a custom parser

Our first instinct was to use node-sql-parser. It is mature and well-tested. Then we fed it a real production schema and watched it choke on:

Composite primary keys with CLUSTERED hints
Inline FOREIGN KEY references with ON DELETE CASCADE
PostgreSQL CREATE TYPE ... AS ENUM
MySQL TINYINT(1) rewritten from BOOLEAN
SQL Server bracket-quoted identifiers like [User Name]

Each failure meant either patching the library (slow, fragile) or preprocessing the SQL (error-prone). Worse, the minified build was ~500 KB. For a tool whose value is "paste and see results in 2 seconds," a 500 KB parser felt absurd.

Constraint: We only need to parse CREATE TABLE, CREATE INDEX, CREATE TYPE, CREATE VIEW, CREATE TRIGGER, and CREATE FUNCTION. We do not need SELECT, JOIN, or DML. A full SQL grammar is overkill.

So we wrote a custom parser. It is ~400 lines of vanilla JavaScript, zero dependencies, and handles everything we need.

The pipeline: from text to objects

Every schema dump goes through four stages:

Strip comments — Remove -- and /* */ blocks so they do not confuse the tokenizer.
Split statements — Break on semicolons, but respect semicolons inside string literals.
Parse each statement — Turn a CREATE TABLE string into a structured table object.
Normalize identifiers — Strip dialect-specific quotes so `users`, "users", and [users] all become users.

Stage 1: Strip comments

A naive regex like /--.*$/gm breaks if a line contains a string with '--' inside it. Our approach is character-by-character:

function stripComments(sql) {
  let out = '', i = 0;
  while (i < sql.length) {
    if (sql[i] === '-' && sql[i+1] === '-') {
      while (i < sql.length && sql[i] !== '\n') i++;
    } else if (sql[i] === '/' && sql[i+1] === '*') {
      i += 2;
      while (i < sql.length && !(sql[i] === '*' && sql[i+1] === '/')) i++;
      i += 2;
    } else {
      out += sql[i++];
    }
  }
  return out;
}

Stage 2: Split statements

Splitting on semicolons is naive—what if a default value is a string containing a semicolon? We track whether we are inside a quoted string and only split when we are not:

function splitStatements(sql) {
  const stmts = [];
  let current = '', inString = false, stringChar = null;
  for (let i = 0; i < sql.length; i++) {
    const ch = sql[i];
    if (!inString && (ch === "'" || ch === '"')) {
      inString = true; stringChar = ch;
    } else if (inString && ch === stringChar) {
      inString = false;
    } else if (!inString && ch === ';') {
      stmts.push(current.trim());
      current = ''; continue;
    }
    current += ch;
  }
  if (current.trim()) stmts.push(current.trim());
  return stmts;
}

Stage 3: Parse CREATE TABLE

The hardest part is splitting the body into individual columns. Commas inside default values or type definitions must not be treated as column separators. Our solution tracks parentheses nesting depth and only splits on commas at depth zero.

function parseCreateTable(stmt, dialect) {
  const match = stmt.match(
    /CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?([\w"`\[\]]+(?:\.[\w"`\[\]]+)?)\s*\((.*)\)\s*[^)]*$/is
  );
  if (!match) return null;

  const tableName = match[1].replace(/["`\[\]]/g, '');
  const body = match[2];
  // ... split body by comma at depth 0, parse columns, parse constraints
}

Handling dialect quirks

Each database has its own syntax for the same concept. Rather than five separate parsers, we use a single parser with dialect-aware branches:

PostgreSQL: SERIAL, "quoted" identifiers, CREATE TYPE ... AS ENUM, EXCLUDE constraints
MySQL: AUTO_INCREMENT, `backtick` quotes, TINYINT(1) as boolean
SQLite: INTEGER PRIMARY KEY AUTOINCREMENT, limited ALTER TABLE (requires table recreation)
SQL Server: IDENTITY(1,1), [bracket] quotes, CLUSTERED hints, NVARCHAR
Oracle: VARCHAR2, NUMBER(p,s), tablespace clauses

The parser stores dialect on every parsed object, so the diff engine and migration generator can produce the correct output for the target database.

Diff engine: from objects to semantics

Once both schemas are parsed into object trees, diffing is straightforward:

Compare table names. Any table in A but not B is "removed." Any table in B but not A is "added."
For tables present in both, compare columns by name. Added columns, removed columns, and modified columns (type change, nullability change, default change) are flagged individually.
Compare constraints structurally. A PRIMARY KEY (id, org_id) is parsed into { type: 'PRIMARY KEY', columns: ['id', 'org_id'] }, so two constraints with different whitespace or ordering are still recognized as identical.
Compare indexes, views, triggers, and functions/procedures using the same structural approach.

This structural diff is why SchemaLens beats a text diff. A line-based diff would scream at you for reformatting whitespace. A semantic diff tells you that role_id changed from INT to BIGINT—the thing that actually matters.

Migration generation

The migration generator is a deterministic switch statement. For every detected change, it outputs the appropriate ALTER TABLE (or table recreation script for SQLite):

// PostgreSQL
ALTER TABLE "users" ALTER COLUMN "role_id" TYPE BIGINT;

// MySQL
ALTER TABLE `users` MODIFY COLUMN `role_id` BIGINT NULL;

// SQL Server
ALTER TABLE [users] ALTER COLUMN [role_id] BIGINT;

// SQLite (not supported — must recreate table)
-- SQLite does not support ALTER COLUMN. Recreate table with new schema.

The CLI: same engine, terminal interface

schemalens-cli packages the exact same parser and diff engine into an npm module. There is no API call. There is no network dependency. Your schemas stay on your machine.

# Install globally
npm install -g schemalens-cli

# Or run without installing
npx schemalens-cli diff old.sql new.sql --dialect postgres --format sql

# Pipe from stdin
cat new.sql | npx schemalens-cli diff old.sql -

# CI mode — fail on breaking changes
SCHEMALENS_STRICT=1 npx schemalens-cli diff old.sql new.sql

Testing strategy

We test the parser headlessly by extracting the script from app.html and running it in Node.js:

const fs = require('fs');
const html = fs.readFileSync('app.html', 'utf8');
const script = html.match(/<script>([\s\S]*)<\/script>/)[1];
const fn = new Function(script + '; return { parseSQL, diffSchemas };');
const { parseSQL, diffSchemas } = fn();

// Now we can unit-test the parser headlessly
const schema = parseSQL('CREATE TABLE users (id INT PRIMARY KEY);', 'postgres');
console.assert(Object.keys(schema.tables).length === 1);

This lets us run CI-style tests without a browser environment. We validate parsing, diffing, and migration generation for all five supported dialects on every commit.

Performance

The parser processes ~1,000 lines of SQL in under 10 milliseconds on a modern laptop. For a tool that parses two schemas and diffs them, the total runtime is typically under 50 ms. There is no perceptible delay between clicking "Compare" and seeing results.

Because there are no network round trips, the entire experience—from paste to diff to migration copy—feels instantaneous.

Honest limitations

Our parser is intentionally narrow. It does not handle:

Partitioned tables
Complex CHECK constraints with subqueries
Stored procedures or PL/pgSQL blocks (basic CREATE FUNCTION signature parsing is supported)
Live database connections — we compare DDL text, not actual database objects

For 95% of schema diff use cases, CREATE TABLE + CREATE INDEX + constraints is enough. When we hit the other 5%, we add support incrementally.

SchemaLens vs Liquibase: Liquibase is a full migration lifecycle framework. SchemaLens is for quick ad-hoc comparisons when you have two DDL dumps and want to see what changed without setting up a toolchain. If you already have Liquibase in your CI pipeline, keep using it. Use SchemaLens when you need answers in 10 seconds.

When should you use SchemaLens?

Scenario	SchemaLens	Liquibase / Flyway	Redgate
Quick ad-hoc comparison of two DDL dumps	✅ Perfect	Possible but heavy	Overkill
Managed migration lifecycle in CI/CD	Complement	✅ Designed for this	✅ Designed for this
No install, browser-based, shareable link	✅ Native	❌ CLI only	❌ Desktop only
Connect to live database and diff objects	❌ Not supported	✅ Supported	✅ Supported
Generate ALTER TABLE scripts	✅ Yes	✅ Yes	✅ Yes
Zero cost for basic use	✅ Free tier	✅ Open source	❌ $369+/user
Command-line usage	✅ `npx schemalens-cli`	✅ Native CLI	✅ Native CLI

See the engine in action

Paste two SQL schemas and watch the diff engine work in real time. No install, no signup, no data leaves your browser.

Try SchemaLens Free

Or run npx schemalens-cli in your terminal.

Built for the $100 AI Startup Race. Read the blog → or explore the open-source engine →