Choosing a Database for Your Startup
February 15, 2026
PostgreSQL vs MongoDB vs DynamoDB for startups: managed service comparison, when to use each, and how to plan for scaling inflection points.
PostgreSQL vs MongoDB vs DynamoDB
PostgreSQL is the correct default database for most startups. It is ACID-compliant, supports JSON natively with rich indexing and querying capabilities, includes a full-text search engine that eliminates the need for a separate search service until you have tens of millions of records, and its query planner is mature enough to handle complex joins without expert tuning. The ecosystem around PostgreSQL — extensions like pgvector for vector search, TimescaleDB for time-series data, and PostGIS for geospatial queries — means you can solve a wide variety of data problems without changing databases.
MongoDB trades relational consistency for schema flexibility. For rapid prototyping where your data model changes every two weeks, the lack of migrations is genuinely useful. The limitation appears when you need to query across documents — without foreign keys and joins, queries that are trivial in PostgreSQL require either deeply nested documents (which creates update anomalies) or application-level joins (which creates N+1 query patterns at scale). DynamoDB is AWS's fully managed NoSQL service and is genuinely excellent for specific use cases: single-table designs with known access patterns, serverless applications that need to scale to millions of requests per second, and systems that require sub-10ms latency at any load. Its steep learning curve — the single-table design pattern takes weeks to internalise — makes it wrong for most early-stage startups that are still figuring out their data access patterns.
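The N+1 pattern above can be sketched with in-memory data standing in for two document collections (the `users`/`orders` shape is hypothetical). The first function does what would be one round trip per user against a real database; the second fetches all orders once and joins in the application — still two queries no matter how many users exist.

```typescript
type User = { id: number; name: string };
type Order = { id: number; userId: number; total: number };

const users: User[] = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
];
const orders: Order[] = [
  { id: 10, userId: 1, total: 40 },
  { id: 11, userId: 1, total: 25 },
  { id: 12, userId: 2, total: 60 },
];

// N+1 pattern: one query for users, then one query per user for their orders.
function ordersPerUserNPlusOne(): Map<string, number> {
  const result = new Map<string, number>();
  for (const u of users) {
    // In a real app, this inner filter would be a separate database round trip.
    const own = orders.filter((o) => o.userId === u.id);
    result.set(u.name, own.reduce((sum, o) => sum + o.total, 0));
  }
  return result;
}

// Batched alternative: fetch all orders once, group in memory (2 queries total).
function ordersPerUserBatched(): Map<string, number> {
  const totals = new Map<number, number>();
  for (const o of orders) {
    totals.set(o.userId, (totals.get(o.userId) ?? 0) + o.total);
  }
  const result = new Map<string, number>();
  for (const u of users) result.set(u.name, totals.get(u.id) ?? 0);
  return result;
}
```

Both produce the same totals; the difference is that the N+1 version's query count grows linearly with the number of users, which is exactly what a foreign-key join in PostgreSQL avoids.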
Managed Services Comparison
Supabase is the strongest choice for PostgreSQL as a managed service at early stage. The free tier provides 500 MB of database storage, Row Level Security built directly into the database (eliminating a category of application-level access control bugs), and a REST and realtime API generated automatically from your schema. It's open source, which means you can self-host when vendor lock-in becomes a concern. The developer experience for a Next.js and TypeScript stack is exceptional, with a JavaScript client library that handles auth, storage, and database queries in a consistent API.
PlanetScale offers MySQL-compatible serverless database hosting with a feature that's genuinely innovative: branch-based schema migrations. Each schema change is made on a database branch, can be reviewed and tested in isolation, and then merged with zero-downtime deployment — eliminating the maintenance window that kills availability during schema migrations. Railway provides PostgreSQL, Redis, and MySQL in a single platform starting at $5 per month with no free-tier sleep mode, which makes it preferable to Render for production database services that need to be available continuously. For teams that want minimal configuration overhead in the first year, Railway's PostgreSQL is the best balance of price, reliability, and simplicity.
When to Use a Managed Database
Use a managed database from the first line of code. The operational overhead of managing your own database server — backups, replication, failover, patching, disk space management — is engineering time that is better spent on product. The cost difference between a managed database and a self-hosted one is $20–$50 per month at early stage; the engineering time saved is worth 10 to 50 times that amount. Database administration is a specialised skill that takes years to develop; managed services let you access a professional-grade database without the specialisation.
The exception is cost at significant scale. When your database bill reaches $500–$1,000 per month on a managed service, it's worth evaluating whether running on a dedicated instance on AWS or GCP would be cheaper. At that scale, you can also afford to hire or contract with a DBA who knows how to tune a PostgreSQL instance. Before that scale, the additional configuration work of self-hosting is not offset by the cost savings, and the risk of a data loss event due to improper backup configuration is real.
Scaling Inflection Points
Three scaling inflection points require database architecture decisions. The first is at roughly 100,000 active users or 10 GB of data: at this point, unindexed queries begin to show measurable slowness, and the right response is adding indexes on frequently queried columns rather than changing databases. EXPLAIN ANALYZE in PostgreSQL shows exactly where a query plan falls back to sequential table scans; fix those queries first before assuming you've outgrown the database.
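The difference an index makes can be sketched in application terms (the table shape here is invented for illustration): without an index, a point lookup examines every row until it matches; with one, the database consults a structure keyed on the column. A `Map` stands in for a B-tree index in this sketch.

```typescript
type Row = { id: number; email: string };

// A toy table of 10,000 rows.
const rows: Row[] = Array.from({ length: 10_000 }, (_, i) => ({
  id: i,
  email: `user${i}@example.com`,
}));

// Without an index: a sequential scan touches rows one by one until it matches.
function seqScan(email: string): { row?: Row; rowsExamined: number } {
  let rowsExamined = 0;
  for (const r of rows) {
    rowsExamined++;
    if (r.email === email) return { row: r, rowsExamined };
  }
  return { rowsExamined };
}

// With an index: a structure keyed on the column answers point lookups directly.
// (A real B-tree also supports range scans; a Map is enough for this sketch.)
const emailIndex = new Map(rows.map((r) => [r.email, r]));

function indexLookup(email: string): Row | undefined {
  return emailIndex.get(email);
}
```

Looking up the last row by email examines all 10,000 rows via the scan but is a single lookup via the index — the same shift you see in an EXPLAIN ANALYZE plan when a Seq Scan node becomes an Index Scan after adding the index.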
The second inflection point is at roughly 1 million users or 500 GB of data: at this scale, vertical scaling (a bigger instance) becomes expensive and horizontal scaling becomes necessary. Read replicas handle read-heavy workloads cheaply — most applications read ten times more than they write. The third inflection point is at multiple millions of users, where single-region databases create latency issues for globally distributed users. Neon and CockroachDB offer multi-region options for PostgreSQL-compatible databases, and PlanetScale does the same for MySQL, allowing reads from the region nearest the user. Most startups never reach the third inflection point, and 90% of database performance problems are solved by indexes and query optimisation long before architectural changes become necessary.
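At the second inflection point, the application change is small: a router that sends writes to the primary and spreads reads across replicas. A minimal sketch, with invented hostnames, might look like this:

```typescript
type Endpoint = { name: string; role: "primary" | "replica" };

// Hypothetical endpoints: one primary for writes, replicas for reads.
const primary: Endpoint = { name: "db-primary.internal", role: "primary" };
const replicas: Endpoint[] = [
  { name: "db-replica-1.internal", role: "replica" },
  { name: "db-replica-2.internal", role: "replica" },
];

let next = 0;

// Writes must go to the primary; reads round-robin across replicas,
// falling back to the primary when no replica is configured.
function route(kind: "read" | "write"): Endpoint {
  if (kind === "write" || replicas.length === 0) return primary;
  const chosen = replicas[next % replicas.length];
  next++;
  return chosen;
}
```

One caveat a real router must handle: replicas lag the primary slightly, so reads that must see a just-written row (read-your-writes) should be pinned to the primary.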
Frequently Asked Questions
Should I use a separate search database like Elasticsearch or Meilisearch? Not until PostgreSQL's full-text search is clearly insufficient for your use case. PostgreSQL full-text search handles most product search requirements up to tens of millions of records with proper indexing. Add Meilisearch or Typesense when you need typo-tolerance, faceted search, or instant search with millisecond response times. Elasticsearch is worth its operational overhead only at truly large scale — over 100 million documents or complex analytical search queries.
What is connection pooling and why does it matter? Database connections are expensive to create and limited in number. Connection pooling maintains a pool of open connections that application requests share, rather than opening a new connection for every query. In serverless environments — Vercel, AWS Lambda — connection pooling is essential, because without it each function invocation opens its own database connection and can quickly exhaust the connection limit. Use PgBouncer or Supabase's built-in connection pooler for PostgreSQL in serverless environments.
How do I handle database migrations safely in production? Use an additive-only migration strategy: never delete or rename columns in a single migration. Instead, add the new column, migrate data to it, update the application to use the new column, and then drop the old column in a separate migration after the application change is stable. Tools like Flyway and Liquibase manage migration versioning. Prisma's migration system handles this well for TypeScript-based stacks.
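The versioning half of that strategy can be sketched as a tiny runner: each migration carries a version, applied versions are recorded, and only unapplied migrations run, in order. The table and column names in the SQL strings are hypothetical, and the two steps are deliberately separate migrations, per the expand-then-contract sequence above.

```typescript
type Migration = { version: number; up: string };

// Expand and contract as separate, ordered migrations -- never one step.
const migrations: Migration[] = [
  { version: 1, up: "ALTER TABLE users ADD COLUMN full_name text" },
  // ...deploy app code that writes both columns, backfill, verify, then:
  { version: 2, up: "ALTER TABLE users DROP COLUMN name" },
];

// Given the versions already recorded as applied, return what still needs
// to run, in ascending version order. Tools like Flyway keep this record
// in a schema-history table in the database itself.
function pendingMigrations(applied: number[], all: Migration[]): Migration[] {
  const done = new Set(applied);
  return all
    .filter((m) => !done.has(m.version))
    .sort((a, b) => a.version - b.version);
}
```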
What is Row Level Security and should I use it? Row Level Security (RLS) is a PostgreSQL feature that enforces access control at the database level — specific rows are only visible to users who match defined policies. Supabase makes RLS configurable without raw SQL and builds its client-side auth around it. For multi-tenant SaaS applications, RLS prevents data leakage between customers at the database level rather than relying solely on application logic, which is a more robust security model.
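What RLS enforces can be shown in application terms (the `invoices` data is invented): every row carries a tenant identifier, and a query only ever sees rows for the authenticated tenant, no matter what the application asks for. In PostgreSQL this filter lives in the database itself, via a policy along the lines of `CREATE POLICY ... USING (tenant_id = current_setting('app.tenant_id'))`; here it is simulated in-process.

```typescript
type Invoice = { id: number; tenantId: string; amount: number };

const invoices: Invoice[] = [
  { id: 1, tenantId: "acme", amount: 100 },
  { id: 2, tenantId: "acme", amount: 250 },
  { id: 3, tenantId: "globex", amount: 75 },
];

// Simulates the row filter an RLS policy applies to every query: rows
// belonging to other tenants simply do not exist from this session's view.
function visibleRows(authenticatedTenant: string): Invoice[] {
  return invoices.filter((r) => r.tenantId === authenticatedTenant);
}
```

The point of doing this in the database rather than the application is that a forgotten `WHERE tenant_id = ...` clause in one query path cannot leak another customer's rows — the policy applies unconditionally.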
When should I start thinking about database backups? From day one of storing real user data. Managed services like Supabase, PlanetScale, and Railway all include automated daily backups, but you should verify the backup retention window (typically 7–14 days) and test restoration at least once. Know the recovery time objective for your business — if a database restore takes four hours and you can't afford four hours of downtime, you need continuous backup with point-in-time recovery enabled.
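Point-in-time recovery is worth understanding conceptually: a base backup plus an ordered change log (PostgreSQL's WAL plays this role) can reconstruct state as of any moment, letting you restore to just before a bad write. A simplified sketch, with invented data:

```typescript
type Change = { at: number; key: string; value: number };

// Base backup taken at t=0, plus an ordered log of every change since.
const baseBackup: Record<string, number> = { balance: 100 };
const changeLog: Change[] = [
  { at: 10, key: "balance", value: 150 },
  { at: 20, key: "balance", value: 0 }, // e.g. an accidental destructive write
];

// Restore by replaying only the changes up to the target time.
function restoreTo(t: number): Record<string, number> {
  const state = { ...baseBackup };
  for (const c of changeLog) {
    if (c.at <= t) state[c.key] = c.value;
  }
  return state;
}
```

Restoring to t=15 recovers the pre-incident value; a daily backup alone could only restore to t=0 and would lose the legitimate change at t=10.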