
Scaling Database Performance: From MVP to Production

Your database is often the bottleneck as your application grows. What works for an MVP with 100 users won't work for 100,000 users. Here's how to scale your database performance from prototype to production.

Start with the Right Foundation

Choose the Right Database

PostgreSQL: Excellent for complex queries, ACID compliance, and relational data. Great default choice.

MongoDB: Good for flexible schemas, document storage, and rapid iteration. Less ideal for complex joins.

MySQL: Widely supported, good for read-heavy workloads. Less feature-rich than PostgreSQL.

Redis: In-memory cache and key-value store. Perfect for sessions, caching, and real-time data.

Design Your Schema Well

Normalize, but not too much:

  • Normalize to reduce redundancy
  • Denormalize for read performance when needed

Use appropriate data types (a quick sketch follows this list):

  • Don't use VARCHAR(255) for everything
  • Use integers for IDs, not strings
  • Use timestamps, not strings for dates
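
A minimal PostgreSQL sketch of these rules, using a hypothetical orders table:

SQL

CREATE TABLE orders (
  id BIGSERIAL PRIMARY KEY,                     -- integer ID, not a string
  status VARCHAR(20) NOT NULL,                  -- sized to the data, not a reflexive 255
  total_cents INTEGER NOT NULL,                 -- money as whole cents, not a float or string
  created_at TIMESTAMPTZ NOT NULL DEFAULT now() -- a real timestamp, not a string
);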

Index strategically:

  • Index foreign keys
  • Index frequently queried columns
  • Don't over-index (slows writes)

Indexing Strategies

Primary and Foreign Keys

Primary keys get an index automatically; in PostgreSQL, foreign keys do not, so index them yourself:

SQL

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  email VARCHAR(255) UNIQUE NOT NULL
);

CREATE TABLE posts (
  id SERIAL PRIMARY KEY,
  user_id INTEGER REFERENCES users(id),
  title VARCHAR(255)
);

-- Index the foreign key (PostgreSQL does not do this automatically)
CREATE INDEX idx_user_id ON posts(user_id);

Composite Indexes

Create composite indexes for queries that filter on multiple columns:

SQL

-- Query: SELECT * FROM posts WHERE user_id = ? AND status = 'published'
CREATE INDEX idx_user_status ON posts(user_id, status);

Order matters: a composite index serves queries that filter on a leftmost prefix of its columns, so lead with the columns your queries always filter on, and put range conditions last.
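
Roughly how the leftmost-prefix rule plays out for idx_user_status:

SQL

-- Can use idx_user_status (leftmost prefix):
SELECT * FROM posts WHERE user_id = 42;

-- Cannot use idx_user_status efficiently (it skips the leading column):
SELECT * FROM posts WHERE status = 'published';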

Partial Indexes

Index only a subset of rows:

SQL

-- Only index active users
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';

Query Optimization

Avoid N+1 Queries

Bad:

JAVASCRIPT

// N+1: one query for the users, then one more query per user
const users = await db.query('SELECT * FROM users');
for (const user of users) {
  const posts = await db.query('SELECT * FROM posts WHERE user_id = ?', [user.id]);
}

Good:

JAVASCRIPT

// Two queries total, no matter how many users there are
const users = await db.query('SELECT * FROM users');
const userIds = users.map(u => u.id);
const posts = await db.query('SELECT * FROM posts WHERE user_id IN (?)', [userIds]);
// Group posts by user_id in memory if you need them per-user

Or use JOINs:

SQL

SELECT u.*, p.*
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
WHERE u.id IN (?);

Use EXPLAIN

Always use EXPLAIN to understand how a query actually executes; a sample plan follows the checklist below:

SQL

EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';

Look for:

  • Sequential scans on large tables (bad)
  • Index scans (good)
  • High execution times
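
For the email lookup above, a healthy plan looks roughly like this (index name, costs, and timings are illustrative and vary by PostgreSQL version and data):

Index Scan using users_email_key on users  (cost=0.29..8.31 rows=1 width=72) (actual time=0.04..0.04 rows=1 loops=1)
  Index Cond: (email = 'user@example.com'::text)
Planning Time: 0.12 ms
Execution Time: 0.07 ms

A Seq Scan node here instead would mean no usable index exists for the predicate.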

Limit Results

Always use LIMIT:

SQL

SELECT * FROM posts ORDER BY created_at DESC LIMIT 20;
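
For deep pagination, OFFSET gets slower as pages go deeper because the database still reads every skipped row. A minimal sketch of keyset (cursor) pagination instead, where the client passes back the last row it saw (cursor values are illustrative):

SQL

-- First page
SELECT id, title, created_at FROM posts
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page: resume after the last row of the previous page
SELECT id, title, created_at FROM posts
WHERE (created_at, id) < ('2024-01-15 10:00:00', 12345)
ORDER BY created_at DESC, id DESC
LIMIT 20;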

Select Only What You Need

Don't use SELECT *:

SQL

-- Bad
SELECT * FROM users;

-- Good
SELECT id, name, email FROM users;

Connection Pooling

Use connection pooling to reuse database connections:

JAVASCRIPT

// Using pg (PostgreSQL)
const { Pool } = require('pg');

const pool = new Pool({
  connectionString: process.env.DATABASE_URL, // assumes a standard connection URL
  max: 20, // Maximum connections in the pool
  idleTimeoutMillis: 30000, // Close connections idle for 30s
  connectionTimeoutMillis: 2000, // Fail fast when the pool is exhausted
});
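
Queries then borrow a connection from the pool and return it automatically; note that pg uses $1-style placeholders:

JAVASCRIPT

const { rows } = await pool.query('SELECT id, name FROM users WHERE id = $1', [userId]);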

Caching Strategies

Application-Level Caching

Cache frequently accessed data:

JAVASCRIPT

// Using Redis (cache-aside)
const cacheKey = `user:${userId}`;
const cached = await redis.get(cacheKey);
let user = cached ? JSON.parse(cached) : null;
if (!user) {
  user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
  await redis.setex(cacheKey, 3600, JSON.stringify(user)); // Cache for 1 hour
}

Query Result Caching

Cache expensive query results:

JAVASCRIPT

const cacheKey = 'popular_posts';
const cached = await redis.get(cacheKey);
let posts = cached ? JSON.parse(cached) : null;
if (!posts) {
  posts = await db.query('SELECT * FROM posts WHERE views > 1000 ORDER BY views DESC LIMIT 10');
  await redis.setex(cacheKey, 300, JSON.stringify(posts)); // Cache for 5 minutes
}

Cache Invalidation

Invalidate cache when data changes:

JAVASCRIPT

async function updateUser(userId, data) {
  await db.query('UPDATE users SET ... WHERE id = ?', [userId]);
  await redis.del(`user:${userId}`); // Invalidate the cached user
}

Read Replicas

Use read replicas to distribute read load:

JAVASCRIPT

// Writes go to the primary
await primaryDb.query('INSERT INTO users ...');

// Reads can go to a replica
const users = await replicaDb.query('SELECT * FROM users');
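
One caveat: replicas apply changes asynchronously, so a read issued just after a write may not see it. A minimal sketch of one common workaround, pinning a user's reads to the primary for a few seconds after they write (recentWrites is a hypothetical in-process map; a shared store like Redis would be needed across multiple app servers):

JAVASCRIPT

const recentWrites = new Map(); // userId -> timestamp of their last write

async function writeAsUser(userId, sql, params) {
  recentWrites.set(userId, Date.now());
  return primaryDb.query(sql, params);
}

async function readAsUser(userId, sql, params) {
  const lastWrite = recentWrites.get(userId);
  // Route to the primary while the replica may still be behind
  const db = lastWrite && Date.now() - lastWrite < 5000 ? primaryDb : replicaDb;
  return db.query(sql, params);
}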

Partitioning

Partition large tables by date or range:

SQL

-- Partition by month
CREATE TABLE events (
  id SERIAL,
  created_at TIMESTAMP,
  data JSONB
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_01 PARTITION OF events
  FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
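
The payoff is partition pruning: queries that filter on the partition key only touch the relevant partitions. For example:

SQL

-- Scans only events_2024_01, not every partition
SELECT count(*) FROM events
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';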

Monitoring and Profiling

Slow Query Log

Enable slow query logging:

SQL

-- PostgreSQL: set in postgresql.conf, or at runtime:
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries slower than 1 second
SELECT pg_reload_conf();
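
To rank queries by cumulative cost, the pg_stat_statements extension (listed under Tools below) is useful; a sketch, noting the timing columns are named total_time/mean_time on PostgreSQL 12 and earlier:

SQL

-- Top 10 queries by total execution time (PostgreSQL 13+ column names)
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;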

Database Metrics

Monitor:

  • Query execution time
  • Connection pool usage
  • Cache hit rates
  • Disk I/O
  • CPU usage

Use APM Tools

Tools like New Relic, Datadog, or Sentry can help identify slow queries.

Scaling Strategies by Stage

MVP (0-1K users)

  • Single database instance
  • Basic indexing
  • Simple caching (if needed)

Growth (1K-10K users)

  • Connection pooling
  • Query optimization
  • Application-level caching
  • Database monitoring

Scale (10K-100K users)

  • Read replicas
  • Advanced indexing
  • Query result caching
  • Database tuning

Enterprise (100K+ users)

  • Database sharding
  • Partitioning
  • CDN for static data
  • Dedicated database team

Common Mistakes

1. Not using indexes: Causes full table scans

2. Over-fetching data: SELECT * everywhere

3. N+1 queries: Loading related data in loops

4. No connection pooling: Creating new connections per request

5. Ignoring slow queries: Not monitoring performance

6. Premature optimization: Optimizing before measuring

7. No caching strategy: Hitting database for everything

Tools and Resources

  • pgAdmin: PostgreSQL administration
  • Redis Insight: Redis management
  • EXPLAIN ANALYZE: Query analysis
  • pg_stat_statements: PostgreSQL query statistics
  • New Relic / Datadog: Application performance monitoring

Conclusion

Database performance is critical for application scalability. Start with good schema design and indexing, optimize queries, implement caching, and scale horizontally when needed.

Remember: Measure first, optimize second. Use monitoring tools to identify actual bottlenecks before optimizing.

---

*Struggling with database performance? Contact us to discuss optimization strategies for your application.*
