Scalability

[!NOTE] This section describes the current scaling behavior and near-term bottlenecks of the existing Cloud Run + Neon setup.

Current scaling envelope

The current cloudbuild.yaml deploys the Cloud Run service with a cap of 10 instances, each using the resources below.

Resource	Current setting
Max instances	10
CPU	1
Memory	1Gi
Min instances	0 in cloudbuild.yaml

Database scalability

Prisma connects to Neon PostgreSQL through two URLs: a pooled DATABASE_URL for runtime queries and a direct DIRECT_URL for migrations.

This split supports:

efficient runtime connection multiplexing
safer schema migration behavior
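
As a rough illustration of guarding this split, the sketch below checks at startup that the two URLs have not been swapped. It assumes the pooled Neon endpoint carries a "-pooler" suffix in its hostname. The function name checkDatabaseUrls and the example hostnames are hypothetical and do not come from the existing codebase.

```typescript
// Hypothetical startup check, not part of the existing codebase.
// Assumption: the pooled Neon endpoint has "-pooler" in its hostname.
interface DbEnv {
  DATABASE_URL?: string;
  DIRECT_URL?: string;
}

function checkDatabaseUrls(env: DbEnv): string[] {
  const problems: string[] = [];
  if (!env.DATABASE_URL) problems.push("DATABASE_URL is not set");
  if (!env.DIRECT_URL) problems.push("DIRECT_URL is not set");
  if (!env.DATABASE_URL || !env.DIRECT_URL) return problems;

  // The WHATWG URL parser also handles postgresql:// connection strings.
  const pooledHost = new URL(env.DATABASE_URL).hostname;
  const directHost = new URL(env.DIRECT_URL).hostname;

  if (!pooledHost.includes("-pooler")) {
    problems.push("DATABASE_URL does not look like a pooled endpoint");
  }
  if (directHost.includes("-pooler")) {
    problems.push("DIRECT_URL points at the pooler; migrations expect a direct connection");
  }
  return problems;
}
```

Calling checkDatabaseUrls(process.env) during boot and logging any returned messages would surface a misconfigured pool before traffic arrives.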

Caching strategy

Current patterns include:

in-process cache for dashboard stats and selected reads
database aggregation for analytics counters
no dedicated distributed cache layer for general app data yet
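
The in-process pattern can be sketched as a small time-to-live cache. This is an illustrative sketch, not the application's actual cache: the class name TtlCache, the key "dashboard-stats", and the injectable clock are assumptions made for the example.

```typescript
// Minimal in-process TTL cache sketch (hypothetical, not the app's real cache).
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value if it is still fresh; otherwise load and store it.
  getOrLoad(key: string, load: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

Because the cache lives in process memory, each Cloud Run instance holds its own copy. With up to 10 instances, a reader may see values that differ by up to one TTL between instances, which is the gap a distributed cache layer would close.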

What breaks first under load

Likely first stress points:

burst writes on registration and waitlist transitions
DB connection pressure if pooling is misconfigured
Cloud Run cold starts when min instances is 0

Mitigations

keep the PgBouncer-backed pooled URL for runtime traffic
tune hot endpoints with selective caching
raise Cloud Run min instances above 0 to reduce cold-start latency
introduce queue-backed async fan-out for heavy side effects
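
The last mitigation, moving heavy side effects off the request path, might look roughly like the sketch below. The names JobQueue, InMemoryQueue, and completeRegistration are hypothetical, as are the job kinds. The in-memory array only stands in for a durable queue and would lose jobs when an instance shuts down.

```typescript
// Sketch of queue-backed fan-out; all names here are hypothetical.
type Job = { kind: string; payload: Record<string, unknown> };

interface JobQueue {
  enqueue(job: Job): void;
}

// Stand-in for a durable queue. An in-memory array loses jobs on shutdown.
class InMemoryQueue implements JobQueue {
  private jobs: Job[] = [];

  enqueue(job: Job): void {
    this.jobs.push(job);
  }

  // Process queued jobs in FIFO order and return how many were handled.
  drain(handle: (job: Job) => void): number {
    let processed = 0;
    let job = this.jobs.shift();
    while (job !== undefined) {
      handle(job);
      processed += 1;
      job = this.jobs.shift();
    }
    return processed;
  }
}

function completeRegistration(
  queue: JobQueue,
  registrationId: string,
): { registrationId: string; status: "confirmed" } {
  // The transactional registration write would happen here, on the request path.
  // Non-critical side effects are only enqueued, so the response is not delayed.
  queue.enqueue({ kind: "send-confirmation-email", payload: { registrationId } });
  queue.enqueue({ kind: "update-analytics", payload: { registrationId } });
  return { registrationId, status: "confirmed" };
}
```

The request handler depends only on the JobQueue interface, so a durable backend could replace the in-memory stand-in without changing the registration code.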

10,000 concurrent registration target

To move toward 10,000 concurrent registrants safely:

keep registration writes transaction-safe (already in place)
add distributed cache or queue for non-critical side effects
pre-warm service with min instances > 0
load-test registration endpoints with realistic burst profiles
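
A back-of-the-envelope estimate helps size the gap between the current cap and this target. The per-instance concurrency of 80 used below is an assumption (it is Cloud Run's default, and this section does not state the configured value). Concurrent registrants are also not the same as simultaneous in-flight requests, so treat the result as an upper-bound sketch to refine with load tests.

```typescript
// Rough capacity arithmetic. The concurrency value is an assumption, not a measured setting.
function maxInFlightRequests(maxInstances: number, concurrencyPerInstance: number): number {
  return maxInstances * concurrencyPerInstance;
}

function instancesNeeded(targetInFlight: number, concurrencyPerInstance: number): number {
  return Math.ceil(targetInFlight / concurrencyPerInstance);
}

const assumedConcurrency = 80; // assumed Cloud Run default, not read from cloudbuild.yaml
const currentCeiling = maxInFlightRequests(10, assumedConcurrency); // 10 * 80 = 800
const worstCaseInstances = instancesNeeded(10_000, assumedConcurrency); // ceil(10000 / 80) = 125
```

Under these assumptions, the current cap of 10 instances handles about 800 simultaneous requests. Serving 10,000 requests that are truly in flight at once would take on the order of 125 instances, unless per-instance concurrency is raised or work is moved off the request path.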

Future direction

Redis-backed session or queue primitives are a natural next step for:

queueing heavy async jobs
distributed cache coordination
higher-throughput promotion pipelines
