Schema Evolution and Lazy Migration
Schema evolution in FERIN is the ability to version schemas independently of data, migrate lazily when needed, and maintain multiple schema versions simultaneously. This is the major innovation that distinguishes FERIN from traditional data management approaches.
The Traditional Problem
In traditional data management, schema changes force immediate data migration. This creates friction, risk, and often prevents organizations from evolving their data structures.
Traditional: Big Bang Migration
- Must touch every row immediately
- Downtime or complex migration procedures
- Risk of data loss or corruption
- Blocks schema evolution - changes become expensive
- All-or-nothing proposition
FERIN: Lazy Migration
- Data remains valid at its current schema version
- Migrate only when needed
- Multiple schema versions coexist
- Consumers choose when to upgrade
- Schema evolution becomes routine, not exceptional
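The lazy approach can be sketched as a migrate-on-read helper. This is a minimal illustration under stated assumptions, not a FERIN API: the `migrations` table, the `readAt` function, and the `category: 'default'` upgrade step are hypothetical names chosen for the example.

```javascript
// Hypothetical upgrade steps, keyed by the schema version they upgrade FROM.
const migrations = {
  '1.0': (content) => ({ to: '2.0', content: { ...content, category: 'default' } }),
};

// Return the record at the version the consumer asked for. Only this one
// record is upgraded, and only when a newer version is requested.
function readAt(record, wantedVersion) {
  let { schemaVersion, content } = record;
  while (schemaVersion !== wantedVersion) {
    const step = migrations[schemaVersion];
    if (!step) {
      throw new Error(`no migration path from ${schemaVersion} to ${wantedVersion}`);
    }
    const next = step(content);
    schemaVersion = next.to;
    content = next.content;
  }
  return { ...record, schemaVersion, content };
}

const stored = { id: '123', schemaVersion: '1.0', content: { name: 'example item' } };
const asV1 = readAt(stored, '1.0'); // served as stored: still valid at 1.0
const asV2 = readAt(stored, '2.0'); // upgraded on demand, stored record untouched
```

The stored record is never mutated here, so a consumer that still reads version 1.0 is unaffected by another consumer's request for 2.0.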
Concept vs. Concept Version
Understanding the distinction between concepts and concept versions is fundamental to schema evolution in FERIN. This two-level structure enables meaning to evolve while maintaining referential integrity.
Concept Plane (Meaning)
An abstract idea, for example the SI unit of length. The concept represents "what we mean", independent of how we define it at any point in time.
Content Plane (Data)
When to Create a New Concept Version
Create a new version of an existing concept when:
- The definition is clarified but the meaning remains fundamentally the same
- Precision or detail is added without changing what the concept represents
- Editorial improvements are made to the definition text
- Additional context or examples are provided
When to Create a New Concept
Create a new concept (with a new identifier) when:
- The fundamental meaning has changed
- The scope or domain of application has shifted
- The same term now refers to something different
- You need to track both old and new meanings simultaneously
The Test
Would existing references to this concept become confusing or misleading after this change?
- If no: Create a new version of the existing concept
- If yes: Create a new concept with a supersession relation
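The two outcomes of the test can be sketched in code. This is an illustrative sketch only: the in-memory `register` object, the `applyChange` function, and the `supersededBy` field are hypothetical, and the answer to the test itself (`misleadsReferences`) is a human judgment passed in as a flag.

```javascript
// Hypothetical in-memory register: concept id -> { versions, supersededBy }.
const register = {
  'concept-1': { versions: ['original definition'], supersededBy: null },
};

// misleadsReferences is the answer to "would existing references to this
// concept become confusing or misleading after this change?"
function applyChange(reg, conceptId, newDefinition, misleadsReferences, newId) {
  const concept = reg[conceptId];
  if (!concept) throw new Error(`unknown concept ${conceptId}`);
  if (!misleadsReferences) {
    // Same meaning: add a version; the identifier and references stay stable.
    concept.versions.push(newDefinition);
    return { conceptId, version: concept.versions.length };
  }
  // Changed meaning: new identifier, linked from the old one by supersession.
  if (!newId || reg[newId]) throw new Error('a fresh identifier is required');
  reg[newId] = { versions: [newDefinition], supersededBy: null };
  concept.supersededBy = newId;
  return { conceptId: newId, version: 1 };
}

const clarified = applyChange(register, 'concept-1', 'clarified definition', false);
const replaced = applyChange(register, 'concept-1', 'different meaning', true, 'concept-2');
```

After both calls, `concept-1` holds two versions and points to `concept-2` as its successor, so both the old and the new meaning remain addressable.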
The Provider-Consumer Contract Model
FERIN implements a contract model between data providers and consumers, not a strict enforcement model. This is fundamental to enabling lazy migration.
Register (Provider)
Provides
- Schemas - Contracts that define data structure
- Schema-ed data - Instances conforming to schemas
- Version negotiation - Multiple schema versions available
- Sunset notices - Timeline for deprecation
Consumer
Receives
- Stability - Data remains queryable at known schema
- Predictability - Know when versions will sunset
- Control - Choose when to upgrade
- Compatibility - Multiple versions coexist
Analogy: API Versioning for Data
This is similar to REST API versioning, but applied to data schemas:
| API Versioning | FERIN Schema Versioning |
|---|---|
| /api/v1/items | /items?schemaVersion=1 |
| /api/v2/items | /items?schemaVersion=2 |
| Old versions deprecated on timeline | Old schemas deprecated on timeline |
| Consumers choose upgrade timing | Consumers choose migration timing |
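Version negotiation with sunset notices might look like the sketch below. Everything in it is assumed for illustration: the `schemaVersions` catalogue, the `negotiate` function, the response shape, and the sunset date are hypothetical and not taken from FERIN.

```javascript
// Hypothetical catalogue of schema versions the register serves.
const schemaVersions = {
  '1': { sunset: '2030-01-01' }, // deprecated; the date is illustrative
  '2': { sunset: null },         // current; no sunset planned
};

// Resolve a request such as /items?schemaVersion=1.
// `today` is an ISO date string (YYYY-MM-DD), which compares correctly as text.
function negotiate(requestedVersion, today) {
  const entry = schemaVersions[requestedVersion];
  if (!entry) {
    return { status: 'unsupported', available: Object.keys(schemaVersions) };
  }
  if (entry.sunset !== null && today >= entry.sunset) {
    return { status: 'retired', sunset: entry.sunset };
  }
  // Still served: echo the sunset date so the consumer can plan its upgrade.
  return { status: 'ok', schemaVersion: requestedVersion, sunset: entry.sunset };
}

const beforeSunset = negotiate('1', '2029-06-01');
const afterSunset = negotiate('1', '2030-01-01');
const unknown = negotiate('3', '2029-06-01');
```

Returning the sunset date with every successful response is one way to give consumers the predictability the contract promises, without forcing them to upgrade early.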
Implementation Strategies
Version Column Pattern
Store schema version as a field on each record.
-- Single table with version tracking
CREATE TABLE items (
    id UUID PRIMARY KEY,
    schema_version TEXT NOT NULL,
    content JSONB NOT NULL,
    status TEXT NOT NULL,
    valid_from TIMESTAMP,
    valid_until TIMESTAMP
);
-- Query specific schema version
SELECT * FROM items
WHERE schema_version = '1.0'
  AND status = 'valid';
-- Lazy migration: update one item at a time
UPDATE items SET
    schema_version = '2.0',
    content = jsonb_set(content, '{category}', '"default"')
WHERE id = ? AND schema_version = '1.0';
Schema-per-Table Pattern
Maintain separate tables for each schema version.
-- Version 1 schema
CREATE TABLE items_v1 (
    id UUID PRIMARY KEY,
    name TEXT NOT NULL,
    description TEXT
);
-- Version 2 schema (adds category)
CREATE TABLE items_v2 (
    id UUID PRIMARY KEY,
    name TEXT NOT NULL,
    description TEXT,
    category TEXT NOT NULL -- New required field
);
-- Query at version 1
SELECT * FROM items_v1 WHERE id = ?;
-- Migrate on-demand
INSERT INTO items_v2 (id, name, description, category)
SELECT id, name, description, 'default'
FROM items_v1 WHERE id = ?;
Event Sourcing Pattern
Store schema changes as events, reconstruct state on demand.
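// Schema change events
const events = [
  { type: 'schema_defined', version: '1.0',
    fields: ['name', 'description'] },
  { type: 'schema_defined', version: '2.0',
    fields: ['name', 'description', 'category'] },
  { type: 'item_created', id: '123',
    schemaVersion: '1.0', data: { /* ... */ } },
  { type: 'item_migrated', id: '123',
    fromVersion: '1.0', toVersion: '2.0' }
];
// Reconstruct item at any schema version
function getItemAtVersion(id, targetVersion) {
  let item = null;
  // Apply this item's events in order until it reaches the target version
  for (const event of events) {
    if (event.id !== id) continue;
    if (event.type === 'item_created') {
      item = { id, schemaVersion: event.schemaVersion, data: { ...event.data } };
    } else if (event.type === 'item_migrated' && item &&
               item.schemaVersion === event.fromVersion) {
      // The event records only the version change, so only the version label advances
      item = { ...item, schemaVersion: event.toVersion };
    }
    if (item && item.schemaVersion === targetVersion) return item;
  }
  // No creation or migration event produced the item at targetVersion
  return null;
}
Migration Decision Framework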
Not all data needs immediate migration. Consider these factors:
| Factor | Migrate Soon | Can Wait |
|---|---|---|
| Data usage | Frequently accessed | Rarely queried |
| Consumer requirements | Consumer needs new fields | Consumer still uses old schema |
| Validation | Old schema has issues | Old schema still valid |
| Sunset timeline | Old schema deprecating soon | No deprecation planned |
| Volume | Small dataset | Large dataset, batch migration |
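One way to apply the table is to treat each row as a yes/no signal and count the votes. The sketch below is an assumption-laden illustration: the field names, the `shouldMigrateSoon` function, and the default threshold of 3 are hypothetical, and a real register would likely weight the factors differently.

```javascript
// Each signal mirrors one row of the table; true counts toward "migrate soon".
function migrationSignals(dataset) {
  return {
    usage: dataset.frequentlyAccessed,
    consumers: dataset.consumerNeedsNewFields,
    validation: dataset.oldSchemaHasIssues,
    sunset: dataset.oldSchemaDeprecatingSoon,
    volume: dataset.smallDataset,
  };
}

// The threshold is an arbitrary illustrative choice, not a FERIN rule.
function shouldMigrateSoon(dataset, threshold = 3) {
  const votes = Object.values(migrationSignals(dataset)).filter(Boolean).length;
  return votes >= threshold;
}

const hotData = {
  frequentlyAccessed: true,
  consumerNeedsNewFields: true,
  oldSchemaHasIssues: false,
  oldSchemaDeprecatingSoon: true,
  smallDataset: false,
};
const coldArchive = {
  frequentlyAccessed: false,
  consumerNeedsNewFields: false,
  oldSchemaHasIssues: false,
  oldSchemaDeprecatingSoon: false,
  smallDataset: true,
};
const migrateHot = shouldMigrateSoon(hotData);      // three signals set
const migrateCold = shouldMigrateSoon(coldArchive); // one signal set
```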