
Scaling Database Performance: From MVP to Production

Your database is often the bottleneck as your application grows. What works for an MVP with 100 users won't work for 100,000 users. Here's how to scale your database performance from prototype to production.

Start with the Right Foundation

Choose the Right Database

PostgreSQL: Excellent for complex queries, ACID compliance, and relational data. Great default choice.

MongoDB: Good for flexible schemas, document storage, and rapid iteration. Less ideal for complex joins.

MySQL: Widely supported, good for read-heavy workloads. Less feature-rich than PostgreSQL.

Redis: In-memory cache and key-value store. Perfect for sessions, caching, and real-time data.

Design Your Schema Well

Normalize, but not too much:

  • Normalize to reduce redundancy
  • Denormalize for read performance when needed

Use appropriate data types (a quick sketch follows this list):

  • Don't use VARCHAR(255) for everything
  • Use integers for IDs, not strings
  • Use timestamps, not strings for dates
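
A minimal PostgreSQL sketch of these rules, using a hypothetical orders table:

SQL

CREATE TABLE orders (
  id BIGSERIAL PRIMARY KEY,                     -- integer ID, not a string
  status VARCHAR(20) NOT NULL,                  -- sized to the data, not a reflexive 255
  total_cents INTEGER NOT NULL,                 -- money as whole cents, not a float or string
  created_at TIMESTAMPTZ NOT NULL DEFAULT now() -- a real timestamp, not a string
);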

Index strategically:

  • Index foreign keys
  • Index frequently queried columns
  • Don't over-index (slows writes)

Indexing Strategies

Primary and Foreign Keys

Primary keys get an index automatically; in PostgreSQL, foreign keys do not, so index them yourself:

SQL

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  email VARCHAR(255) UNIQUE NOT NULL
);

CREATE TABLE posts (
  id SERIAL PRIMARY KEY,
  user_id INTEGER REFERENCES users(id),
  title VARCHAR(255)
);

-- Index the foreign key (PostgreSQL does not do this automatically)
CREATE INDEX idx_user_id ON posts(user_id);

Composite Indexes

Create composite indexes for queries that filter on multiple columns:

SQL

-- Query: SELECT * FROM posts WHERE user_id = ? AND status = 'published'
CREATE INDEX idx_user_status ON posts(user_id, status);

Order matters: a composite index serves queries that filter on a leftmost prefix of its columns, so lead with the columns your queries always filter on, and put range conditions last.
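
Roughly how the leftmost-prefix rule plays out for idx_user_status:

SQL

-- Can use idx_user_status (leftmost prefix):
SELECT * FROM posts WHERE user_id = 42;

-- Cannot use idx_user_status efficiently (it skips the leading column):
SELECT * FROM posts WHERE status = 'published';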

Partial Indexes

Index only a subset of rows:

SQL

-- Only index active users
CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';

Query Optimization

Avoid N+1 Queries

Bad:

JAVASCRIPT

// N+1: one query for the users, then one more query per user
const users = await db.query('SELECT * FROM users');
for (const user of users) {
  const posts = await db.query('SELECT * FROM posts WHERE user_id = ?', [user.id]);
}

Good:

JAVASCRIPT

// Two queries total, no matter how many users there are
const users = await db.query('SELECT * FROM users');
const userIds = users.map(u => u.id);
const posts = await db.query('SELECT * FROM posts WHERE user_id IN (?)', [userIds]);
// Group posts by user_id in memory if you need them per-user

Or use JOINs:

SQL

SELECT u.*, p.*
FROM users u
LEFT JOIN posts p ON u.id = p.user_id
WHERE u.id IN (?);

Use EXPLAIN

Always use EXPLAIN to understand how a query actually executes; a sample plan follows the checklist below:

SQL

EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';

Look for:

  • Sequential scans on large tables (bad)
  • Index scans (good)
  • High execution times
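
For the email lookup above, a healthy plan looks roughly like this (index name, costs, and timings are illustrative and vary by PostgreSQL version and data):

Index Scan using users_email_key on users  (cost=0.29..8.31 rows=1 width=72) (actual time=0.04..0.04 rows=1 loops=1)
  Index Cond: (email = 'user@example.com'::text)
Planning Time: 0.12 ms
Execution Time: 0.07 ms

A Seq Scan node here instead would mean no usable index exists for the predicate.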

Limit Results

Always use LIMIT:

SQL

SELECT * FROM posts ORDER BY created_at DESC LIMIT 20;
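
For deep pagination, OFFSET gets slower as pages go deeper because the database still reads every skipped row. A minimal sketch of keyset (cursor) pagination instead, where the client passes back the last row it saw (cursor values are illustrative):

SQL

-- First page
SELECT id, title, created_at FROM posts
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page: resume after the last row of the previous page
SELECT id, title, created_at FROM posts
WHERE (created_at, id) < ('2024-01-15 10:00:00', 12345)
ORDER BY created_at DESC, id DESC
LIMIT 20;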

Select Only What You Need

Don't use SELECT *:

SQL

-- Bad
SELECT * FROM users;

-- Good
SELECT id, name, email FROM users;

Connection Pooling

Use connection pooling to reuse database connections:

JAVASCRIPT

// Using pg (PostgreSQL)
const { Pool } = require('pg');

const pool = new Pool({
  connectionString: process.env.DATABASE_URL, // assumes a standard connection URL
  max: 20, // Maximum connections in the pool
  idleTimeoutMillis: 30000, // Close connections idle for 30s
  connectionTimeoutMillis: 2000, // Fail fast when the pool is exhausted
});
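
Queries then borrow a connection from the pool and return it automatically; note that pg uses $1-style placeholders:

JAVASCRIPT

const { rows } = await pool.query('SELECT id, name FROM users WHERE id = $1', [userId]);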

Caching Strategies

Application-Level Caching

Cache frequently accessed data:

JAVASCRIPT

// Using Redis (cache-aside)
const cacheKey = `user:${userId}`;
const cached = await redis.get(cacheKey);
let user = cached ? JSON.parse(cached) : null;
if (!user) {
  user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
  await redis.setex(cacheKey, 3600, JSON.stringify(user)); // Cache for 1 hour
}

Query Result Caching

Cache expensive query results:

JAVASCRIPT

const cacheKey = 'popular_posts';
const cached = await redis.get(cacheKey);
let posts = cached ? JSON.parse(cached) : null;
if (!posts) {
  posts = await db.query('SELECT * FROM posts WHERE views > 1000 ORDER BY views DESC LIMIT 10');
  await redis.setex(cacheKey, 300, JSON.stringify(posts)); // Cache for 5 minutes
}

Cache Invalidation

Invalidate cache when data changes:

JAVASCRIPT

async function updateUser(userId, data) {
  await db.query('UPDATE users SET ... WHERE id = ?', [userId]);
  await redis.del(`user:${userId}`); // Invalidate the cached user
}

Read Replicas

Use read replicas to distribute read load:

JAVASCRIPT

// Writes go to the primary
await primaryDb.query('INSERT INTO users ...');

// Reads can go to a replica
const users = await replicaDb.query('SELECT * FROM users');
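
One caveat: replicas apply changes asynchronously, so a read issued just after a write may not see it. A minimal sketch of one common workaround, pinning a user's reads to the primary for a few seconds after they write (recentWrites is a hypothetical in-process map; a shared store like Redis would be needed across multiple app servers):

JAVASCRIPT

const recentWrites = new Map(); // userId -> timestamp of their last write

async function writeAsUser(userId, sql, params) {
  recentWrites.set(userId, Date.now());
  return primaryDb.query(sql, params);
}

async function readAsUser(userId, sql, params) {
  const lastWrite = recentWrites.get(userId);
  // Route to the primary while the replica may still be behind
  const db = lastWrite && Date.now() - lastWrite < 5000 ? primaryDb : replicaDb;
  return db.query(sql, params);
}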

Partitioning

Partition large tables by date or range:

SQL

-- Partition by month
CREATE TABLE events (
  id SERIAL,
  created_at TIMESTAMP,
  data JSONB
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_01 PARTITION OF events
  FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
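
The payoff is partition pruning: queries that filter on the partition key only touch the relevant partitions. For example:

SQL

-- Scans only events_2024_01, not every partition
SELECT count(*) FROM events
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';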

Monitoring and Profiling

Slow Query Log

Enable slow query logging:

SQL

-- PostgreSQL: set in postgresql.conf, or at runtime:
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries slower than 1 second
SELECT pg_reload_conf();
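
To rank queries by cumulative cost, the pg_stat_statements extension (listed under Tools below) is useful; a sketch, noting the timing columns are named total_time/mean_time on PostgreSQL 12 and earlier:

SQL

-- Top 10 queries by total execution time (PostgreSQL 13+ column names)
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;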

Database Metrics

Monitor:

  • Query execution time
  • Connection pool usage
  • Cache hit rates
  • Disk I/O
  • CPU usage

Use APM Tools

Tools like New Relic, Datadog, or Sentry can help identify slow queries.

Scaling Strategies by Stage

MVP (0-1K users)

  • Single database instance
  • Basic indexing
  • Simple caching (if needed)

Growth (1K-10K users)

  • Connection pooling
  • Query optimization
  • Application-level caching
  • Database monitoring

Scale (10K-100K users)

  • Read replicas
  • Advanced indexing
  • Query result caching
  • Database tuning

Enterprise (100K+ users)

  • Database sharding
  • Partitioning
  • CDN for static data
  • Dedicated database team

Common Mistakes

1. Not using indexes: Causes full table scans

2. Over-fetching data: SELECT * everywhere

3. N+1 queries: Loading related data in loops

4. No connection pooling: Creating new connections per request

5. Ignoring slow queries: Not monitoring performance

6. Premature optimization: Optimizing before measuring

7. No caching strategy: Hitting database for everything

Tools and Resources

  • pgAdmin: PostgreSQL administration
  • Redis Insight: Redis management
  • EXPLAIN ANALYZE: Query analysis
  • pg_stat_statements: PostgreSQL query statistics
  • New Relic / Datadog: Application performance monitoring

Conclusion

Database performance is critical for application scalability. Start with good schema design and indexing, optimize queries, implement caching, and scale horizontally when needed.

Remember: Measure first, optimize second. Use monitoring tools to identify actual bottlenecks before optimizing.

---

*Struggling with database performance? Contact us to discuss optimization strategies for your application.*
