Database Patterns You Should Know Before Choosing Your Next Database

The choice between Postgres and MongoDB isn't about which is 'better.' It's about understanding the access patterns, consistency requirements, and operational constraints of your system.

Durval Pereira
5 min

The wrong question

"Should I use Postgres or MongoDB?" is the wrong first question. The right first question is: "What are my access patterns?"

Every database is optimized for specific access patterns. Choosing a database without understanding your access patterns is like choosing a data structure without understanding your algorithm — you might get lucky, but you're probably making your life harder.

Access pattern analysis

Before evaluating any database, document your access patterns explicitly:

interface AccessPattern {
  name: string
  type: 'read' | 'write' | 'read-write'
  frequency: 'hot' | 'warm' | 'cold'
  latencyRequirement: 'realtime' | 'fast' | 'batch'
  consistency: 'strong' | 'eventual' | 'causal'
  description: string
}

const patterns: AccessPattern[] = [
  {
    name: 'user-profile-read',
    type: 'read',
    frequency: 'hot',
    latencyRequirement: 'realtime',
    consistency: 'eventual',
    description: 'Fetch user profile by ID on every page load',
  },
  {
    name: 'order-creation',
    type: 'write',
    frequency: 'warm',
    latencyRequirement: 'fast',
    consistency: 'strong',
    description: 'Create an order with line items, update inventory',
  },
  {
    name: 'analytics-aggregation',
    type: 'read',
    frequency: 'cold',
    latencyRequirement: 'batch',
    consistency: 'eventual',
    description: 'Aggregate daily metrics across all users',
  },
]

This exercise alone will clarify 80% of your database decisions. A system dominated by hot reads with eventual consistency is a completely different problem from a system dominated by transactional writes with strong consistency.
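The dominant workload shape can be made visible mechanically. Here is a small sketch that weights each pattern by how hot it is and tallies them by type and consistency; the `AccessPattern` interface is repeated from above for self-containment, and the specific weights are illustrative, not prescriptive.

```typescript
interface AccessPattern {
  name: string
  type: 'read' | 'write' | 'read-write'
  frequency: 'hot' | 'warm' | 'cold'
  latencyRequirement: 'realtime' | 'fast' | 'batch'
  consistency: 'strong' | 'eventual' | 'causal'
  description: string
}

// Tally patterns by (type, consistency), weighting hotter patterns more
// heavily, so the dominant workload shape is visible at a glance.
function summarizeWorkload(patterns: AccessPattern[]): Map<string, number> {
  const counts = new Map<string, number>()
  for (const p of patterns) {
    const weight = p.frequency === 'hot' ? 3 : p.frequency === 'warm' ? 2 : 1
    const key = `${p.type}/${p.consistency}`
    counts.set(key, (counts.get(key) ?? 0) + weight)
  }
  return counts
}
```

Run over the three example patterns above, this reports `read/eventual` as the dominant bucket, which immediately tells you the hot path tolerates replication lag.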

Pattern: Single table, multiple access patterns

This is one of the most underappreciated patterns in relational databases. Instead of normalizing everything into a dozen tables with complex JOINs, design your primary table around your hottest access pattern and use secondary structures for the rest.

-- Primary table: optimized for the hot path (fetch by ID)
CREATE TABLE documents (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  type        TEXT NOT NULL,
  owner_id    UUID NOT NULL,
  data        JSONB NOT NULL,
  metadata    JSONB DEFAULT '{}',
  created_at  TIMESTAMPTZ DEFAULT now(),
  updated_at  TIMESTAMPTZ DEFAULT now()
);

-- Indexes for secondary access patterns
CREATE INDEX idx_documents_owner ON documents(owner_id);
CREATE INDEX idx_documents_type ON documents(type);
CREATE INDEX idx_documents_created ON documents(created_at DESC);

-- GIN index for JSONB queries when needed
CREATE INDEX idx_documents_data ON documents USING GIN (data);

This approach gives you the flexibility of a document store with the transactional guarantees of Postgres. The JSONB column handles schema variation without requiring migrations for every field change.
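The schemaless `data` column doesn't have to mean untyped application code. A sketch of one way to recover type safety over it, assuming two hypothetical document kinds (`article` and `invoice`) and a `kind` discriminator field inside the JSONB payload:

```typescript
// Hypothetical payload shapes stored in the JSONB `data` column.
interface ArticleData {
  kind: 'article'
  title: string
  body: string
}

interface InvoiceData {
  kind: 'invoice'
  amountCents: number
  currency: string
}

type DocumentData = ArticleData | InvoiceData

// Narrow an untyped JSONB payload into the union, rejecting anything else.
// After this check, TypeScript narrows on `kind` for free.
function parseDocumentData(raw: unknown): DocumentData {
  const d = raw as { kind?: string }
  if (d?.kind === 'article' || d?.kind === 'invoice') {
    return raw as DocumentData
  }
  throw new Error(`unknown document kind: ${d?.kind}`)
}
```

The `type` column in the table and the discriminant in the payload play the same role at different layers: one serves the index, the other serves the type checker.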

Pattern: Event-sourced writes, materialized reads

When your write patterns and read patterns are fundamentally different, trying to serve both from the same data structure creates pain. Event sourcing separates them cleanly.

interface DomainEvent {
  id: string
  aggregateId: string
  type: string
  data: Record<string, unknown>
  timestamp: Date
  version: number
}

// Command shape (sketched; adapt to your domain)
interface CreateOrder {
  orderId: string
  customerId: string
  items: Array<{ sku: string; quantity: number }>
}

// Write side: append events. `eventStore` is whatever append-only store
// you use — a plain Postgres table works fine.
async function processCommand(command: CreateOrder): Promise<void> {
  const events: DomainEvent[] = [
    {
      id: generateId(),
      aggregateId: command.orderId,
      type: 'OrderCreated',
      data: { customerId: command.customerId, items: command.items },
      timestamp: new Date(),
      version: 1,
    },
    {
      id: generateId(),
      aggregateId: command.orderId,
      type: 'InventoryReserved',
      data: { items: command.items },
      timestamp: new Date(),
      version: 2,
    },
  ]

  await eventStore.appendEvents(command.orderId, events)
}

// Read side: maintain materialized views as events arrive
async function handleEvent(event: DomainEvent): Promise<void> {
  switch (event.type) {
    case 'OrderCreated':
      await orderReadModel.upsert({
        id: event.aggregateId,
        status: 'created',
        ...event.data,
      })
      break
    case 'InventoryReserved':
      // `data` is untyped JSON, so narrow it before handing it to the model
      await inventoryReadModel.decrementStock(
        event.data.items as Array<{ sku: string; quantity: number }>,
      )
      break
  }
}

This pattern adds complexity, but it shines when you have strict audit requirements, need to rebuild read models from scratch, or have dramatically different read and write scaling needs.
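Rebuilding a read model from scratch is just a left fold over the event stream. A minimal sketch, assuming the `DomainEvent` shape above and a simplified in-memory order view; the `OrderShipped` event type is hypothetical, added to show a state transition:

```typescript
interface DomainEvent {
  id: string
  aggregateId: string
  type: string
  data: Record<string, unknown>
  timestamp: Date
  version: number
}

interface OrderView {
  id: string
  status: string
}

// Fold an ordered event stream into a fresh read model. Because events are
// immutable facts, replay is deterministic: same stream, same view.
function rebuildOrders(events: DomainEvent[]): Map<string, OrderView> {
  const views = new Map<string, OrderView>()
  for (const event of events) {
    switch (event.type) {
      case 'OrderCreated':
        views.set(event.aggregateId, { id: event.aggregateId, status: 'created' })
        break
      case 'OrderShipped': {
        const view = views.get(event.aggregateId)
        if (view) view.status = 'shipped'
        break
      }
    }
  }
  return views
}
```

This determinism is what makes the "rebuild from scratch" property real: fixing a projection bug means deploying new fold logic and replaying, not writing a data migration.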

The Postgres default

For most applications, the pragmatic answer is: start with Postgres.

Not because Postgres is the best at everything. It isn't. But because:

  1. It's good enough at most things. Relational queries, JSON documents, full-text search, geospatial data, time-series with extensions — Postgres handles all of these competently.
  2. Operational maturity. Backup, replication, monitoring, and recovery tooling for Postgres is battle-tested.
  3. Team knowledge. Most engineers know SQL. Fewer know the query languages of specialized databases.
  4. Flexibility. You can always extract a specialized database for a specific access pattern later. Starting specialized is much harder to reverse.

The right time to consider a specialized database is when you can prove — with data, not intuition — that Postgres is the bottleneck for a specific access pattern. And even then, the answer is often "add a specialized data store alongside Postgres" rather than "replace Postgres."

When to look beyond Postgres

There are legitimate cases for specialized databases:

Time-series data at scale. If you're ingesting millions of data points per second with time-based aggregation queries, TimescaleDB (Postgres extension) or a dedicated time-series database like ClickHouse will significantly outperform vanilla Postgres.

Graph traversals. If your core queries involve traversing relationships many levels deep — social networks, recommendation engines, dependency graphs — a graph database like Neo4j will express these queries more naturally and execute them more efficiently.

Real-time search. If full-text search is a primary feature with complex relevance scoring, Elasticsearch or Typesense will provide better search quality than Postgres's built-in tsvector full-text search.

Ultra-high write throughput. If you need to write millions of rows per second with minimal indexing, a log-structured database like ScyllaDB or Apache Cassandra is designed for exactly this.

The data modeling checklist

Before you finalize your schema, answer these questions:

  1. What is your hottest read query? Optimize your primary indexes for this.
  2. What are your consistency requirements? Strong consistency costs performance. Be explicit about where you need it.
  3. How does your data grow? Time-series, append-only, or mutable? This affects storage strategy.
  4. What are your JOIN patterns? If you never JOIN two tables, they might not need to be in the same database.
  5. What are your backup and recovery requirements? Some databases make point-in-time recovery trivial. Others make it painful.

The best database decision is the one that's boring, predictable, and well-understood by your team. Save the interesting technology choices for the problems that actually need them.


This article is part of a series on data infrastructure fundamentals.

Tags: databases, postgres, data-modeling, system-design