Engineered for 99.99% availability
The CloudIP platform is built layer by layer to stay online when something goes wrong. This page is the engineering posture: targets, the architecture behind them, and the proof we run every day.
What we engineer for
A single source of truth for the numbers we hold ourselves to. Marketing copy reads the same values as the runbooks.
| Metric | Target | What it means |
|---|---|---|
| Uptime | 99.99% | Target uptime across the platform. Roughly 4.4 minutes of monthly downtime budget. |
| Read latency (p95, US) | < 50 ms | Read p95 measured coast-to-coast within the United States, served from the nearest edge replica. |
| Write latency (p95, US) | < 150 ms | Write p95 measured against the regional primary database, including queue acknowledgements. |
| Recovery Point Objective (RPO) | ≤ 60 seconds | Maximum window of data potentially lost in a worst-case region failure for critical tables. |
| Recovery Time Objective (RTO) | ≤ 5 minutes | Maximum time to restore the service for a regional incident affecting a critical workload. |
| Hot backup retention | 30 days | Time-Travel-style point-in-time restore window covering every tenant database. |
| Cold archive retention | 7 years | Object-locked archive in geographically redundant storage with retention enforced at the storage layer. |
| Backup durability | 11 nines | Backups are stored on R2 with eleven nines of annual durability, replicated cross-region. |
| US edge presence | 30+ POPs | Compute and cache run on Cloudflare’s anycast network with more than thirty US points of presence. |
| DDoS / WAF | Always on | Layer 3, 4, and 7 protection plus a managed WAF rule pack are enabled for every customer. |
| Encryption | AES-256 + TLS 1.3 | Customer data is encrypted at rest with AES-256 and in transit with TLS 1.3 on every connection. |
| Admin MFA | Mandatory | Administrators must enroll TOTP or WebAuthn before performing privileged operations. |
| DR drills | Quarterly, public | A scripted disaster-recovery game day is run every quarter and the results are posted publicly. |
| Public post-mortems | ≥ 15 minutes impact | Any incident with customer-visible impact of fifteen minutes or more is documented in a public post-mortem. |
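The downtime budget in the uptime row follows directly from the target. The sketch below shows the arithmetic; the function name is illustrative and the 30-day period is an assumption, not something the platform defines.

```typescript
// Illustrative helper: converts an availability target into a downtime budget.
// Assumption: the caller supplies the period length in days (30 is used below).
function downtimeBudgetMinutes(availabilityPercent: number, periodDays: number): number {
  if (availabilityPercent < 0 || availabilityPercent > 100) {
    throw new RangeError("availabilityPercent must be between 0 and 100");
  }
  const minutesInPeriod = periodDays * 24 * 60;
  const allowedDowntimeFraction = 1 - availabilityPercent / 100;
  return minutesInPeriod * allowedDowntimeFraction;
}

const monthlyBudget = downtimeBudgetMinutes(99.99, 30);
console.log(monthlyBudget.toFixed(2)); // about 4.32 minutes for a 30-day month
```

A 365-day year divided by twelve gives about 4.38 minutes, which is consistent with the roughly 4.4 minutes quoted in the table.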
Six pillars behind the numbers
Every target on this page maps to specific engineering practice. None of these is theoretical; each is a system that runs in production.
Global edge, U.S. focus
Compute and cache run on 30+ U.S. points of presence on Cloudflare’s anycast network. Requests are served from the location closest to the user, with automatic failover between POPs when a region misbehaves.
Five-database isolation
The platform runs on five logically isolated databases — auth, tenant operations, business records, communications, and audit. A schema change or hot table in one cannot stop the others.
Cross-region data replication
Backups, blob storage, and database snapshots are replicated to a second U.S. region within seconds of being written. Object lock prevents tampering even by the customer.
Canary deploys with auto-rollback
Every release ships to 5% of traffic, then 25%, then 100%, with synthetic checks watching the error rate. A bad deploy rolls itself back automatically within 90 seconds.
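The staged rollout described above can be sketched as a small decision function. This is a hypothetical illustration, not CloudIP's deploy tooling: the stage percentages come from the text, while the 1% error-rate threshold and all names are assumptions.

```typescript
// Hypothetical sketch of a canary rollout decision.
type RolloutDecision =
  | { action: "promote"; nextPercent: number }
  | { action: "complete" }
  | { action: "rollback" };

const STAGES: readonly number[] = [5, 25, 100]; // traffic percentages from the text
const MAX_ERROR_RATE = 0.01; // assumed threshold: 1% of synthetic checks failing

function decideNextStep(currentPercent: number, observedErrorRate: number): RolloutDecision {
  const stageIndex = STAGES.indexOf(currentPercent);
  if (stageIndex === -1) {
    throw new Error(`unknown rollout stage: ${currentPercent}%`);
  }
  // A bad error rate at any stage sends the release back to the previous version.
  if (observedErrorRate > MAX_ERROR_RATE) {
    return { action: "rollback" };
  }
  const next = STAGES[stageIndex + 1];
  if (next === undefined) {
    return { action: "complete" }; // already at 100% and healthy
  }
  return { action: "promote", nextPercent: next };
}
```

For example, `decideNextStep(5, 0.001)` promotes to 25%, while `decideNextStep(25, 0.5)` returns a rollback.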
Graceful degradation by design
Each external dependency — cards, email, AI, shipping — sits behind a circuit breaker with a graceful fallback. A vendor outage shows up as a queue, not a broken page.
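A circuit breaker with a fallback can be sketched in a few lines. The version below is a minimal illustration under assumed thresholds; the class, the wrapper function, and their parameters are hypothetical and do not describe the platform's actual code.

```typescript
// Minimal circuit breaker sketch. Thresholds and names are assumptions.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null; // null means the breaker is closed
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // True when the caller may try the vendor; false means use the fallback.
  allowRequest(nowMs: number): boolean {
    if (this.openedAt === null) return true;
    return nowMs - this.openedAt >= this.cooldownMs; // trial call after cooldown
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    // Open on reaching the threshold, or re-open after a failed trial call.
    if (this.openedAt !== null || this.failures >= this.failureThreshold) {
      this.openedAt = nowMs;
    }
  }
}

// Wraps a vendor call so an outage yields the fallback instead of an error.
async function callWithBreaker<T>(
  breaker: CircuitBreaker,
  primary: () => Promise<T>,
  fallback: () => T,
): Promise<T> {
  if (!breaker.allowRequest(Date.now())) return fallback();
  try {
    const result = await primary();
    breaker.recordSuccess();
    return result;
  } catch {
    breaker.recordFailure(Date.now());
    return fallback();
  }
}
```

In this sketch the fallback might enqueue the work for later delivery, which matches the "queue, not a broken page" behavior described above.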
Always-on edge protection
Layer 3, 4, and 7 DDoS protection, a managed WAF, and per-tenant rate limits run in front of every request. A noisy integration cannot drain capacity for the rest of the platform.
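Per-tenant rate limits are commonly implemented as a token bucket keyed by tenant. The sketch below assumes that technique; the source does not say which algorithm the platform uses, and the capacity, refill rate, and names are hypothetical.

```typescript
// Hypothetical per-tenant token bucket; capacity and refill rate are assumed values.
interface Bucket {
  tokens: number;
  lastRefillMs: number;
}

class TenantRateLimiter {
  private readonly buckets = new Map<string, Bucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true and spends one token if the tenant still has budget.
  allow(tenantId: string, nowMs: number): boolean {
    const bucket = this.buckets.get(tenantId) ?? { tokens: this.capacity, lastRefillMs: nowMs };
    const elapsedSeconds = Math.max(0, nowMs - bucket.lastRefillMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.lastRefillMs = nowMs;
    this.buckets.set(tenantId, bucket);
    if (bucket.tokens < 1) return false;
    bucket.tokens -= 1;
    return true;
  }
}
```

Because each tenant has its own bucket, one tenant exhausting its tokens does not affect requests from any other tenant.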
What lives where
Resilience comes from making each layer fail independently. Here is the layered map of the CloudIP platform.
Edge
Cloudflare anycast network with 30+ U.S. POPs. Static pages, marketing sites, and storefronts are cached at the edge so they continue to serve traffic during platform maintenance.
Compute
Workers run as stateless V8 isolates that auto-scale to demand. The main app is split into sibling Workers for OG image generation, billing, AI, and real-time communications, so a bug in one cannot stop the others.
State
Stateful coordination runs on Durable Objects — one per tenant for collaboration, one per resource for booking, and one per provider for circuit-breaker state. Failures stay isolated to a single instance.
Data
Five logically split D1 databases with regional read replicas via the Sessions API. Daily Time-Travel restores are validated against invariants in a sandbox.
Backup
Snapshots land on R2 with eleven nines of durability and object-lock retention enforced at the storage layer. Cross-region replication runs continuously.
Queues
Cloudflare Queues with retries, dead-letter queues, and replay tooling. Asynchronous work survives transient failures without operator action.
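The retry and dead-letter behavior can be illustrated with a small decision function. This is a sketch under assumptions: the message shape is a local stand-in rather than the Cloudflare Queues types, and the attempt limit and backoff values are invented for the example.

```typescript
// Local stand-in for a queued message; not the real Cloudflare Queues type.
interface QueuedMessage<T> {
  body: T;
  attempts: number; // 1 on first delivery
}

type Outcome =
  | { kind: "ack" }
  | { kind: "retry"; delaySeconds: number }
  | { kind: "dead-letter" };

const MAX_ATTEMPTS = 5; // assumed limit before dead-lettering
const BASE_DELAY_SECONDS = 2; // assumed base for exponential backoff

function handleMessage<T>(message: QueuedMessage<T>, handler: (body: T) => void): Outcome {
  try {
    handler(message.body);
    return { kind: "ack" };
  } catch {
    if (message.attempts >= MAX_ATTEMPTS) {
      return { kind: "dead-letter" }; // kept for inspection and replay
    }
    // Backoff doubles with each attempt: 2 s, 4 s, 8 s, 16 s.
    const delaySeconds = BASE_DELAY_SECONDS * Math.pow(2, message.attempts - 1);
    return { kind: "retry", delaySeconds };
  }
}
```

A transient failure therefore produces a delayed retry with no operator involved, and only a message that keeps failing reaches the dead-letter queue.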
Observability
Every request carries a propagated request ID. Logs and metrics feed Workers Analytics Engine. Synthetic health-check probes measure every Worker every 60 seconds from three U.S. regions.
The platform tests itself
A reliability story you cannot verify is marketing. These are the jobs that run on a schedule, and the artifacts they produce.
Synthetic probes every minute
Three U.S. regions hit every Worker’s health endpoint every 60 seconds. Failures roll up into the public status page.
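One way to roll three regional probe results into a single status is sketched below. The rollup rule (any failure is degraded, all failures is an outage), the status names, and the region labels are assumptions for illustration; the source does not specify how the status page aggregates results.

```typescript
// Hypothetical rollup of regional probe results into one status value.
type ProbeResult = { region: string; ok: boolean; latencyMs: number };
type Status = "operational" | "degraded" | "outage";

function rollUp(results: ProbeResult[]): Status {
  if (results.length === 0) {
    throw new Error("no probe results to roll up");
  }
  const failures = results.filter((r) => !r.ok).length;
  if (failures === 0) return "operational";
  // Every region failing suggests the Worker itself is down, not one network path.
  return failures === results.length ? "outage" : "degraded";
}

// Region labels below are placeholders, not the platform's actual probe locations.
const sample: ProbeResult[] = [
  { region: "us-east", ok: true, latencyMs: 31 },
  { region: "us-central", ok: false, latencyMs: 0 },
  { region: "us-west", ok: true, latencyMs: 44 },
];
const sampleStatus = rollUp(sample); // "degraded"
```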
Nightly Time-Travel restore
A randomly chosen tenant is restored to its state from one hour ago, validated against invariants, and discarded. Failures page on-call.
Quarterly DR drill
A full sandbox tenant is restored from cold backup once a quarter. Observed RTO and RPO are published in the changelog.
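Observed RTO and RPO can be derived from three timestamps recorded during a drill. The sketch below assumes the drill is judged against the 60-second RPO and 5-minute RTO targets from the table; the field names and report shape are hypothetical.

```typescript
// Hypothetical drill evaluation; timestamps are milliseconds since the epoch.
interface DrillTimeline {
  lastReplicatedWriteMs: number; // newest write that survived the failure
  failureMs: number;             // moment the simulated failure began
  restoredMs: number;            // moment service was confirmed healthy again
}

interface DrillReport {
  observedRpoSeconds: number;
  observedRtoSeconds: number;
  meetsRpo: boolean;
  meetsRto: boolean;
}

const RPO_TARGET_SECONDS = 60;     // from the targets table
const RTO_TARGET_SECONDS = 5 * 60; // from the targets table

function evaluateDrill(t: DrillTimeline): DrillReport {
  if (t.lastReplicatedWriteMs > t.failureMs || t.restoredMs < t.failureMs) {
    throw new Error("timeline is out of order");
  }
  // RPO: data written after the last replicated write is lost.
  const observedRpoSeconds = (t.failureMs - t.lastReplicatedWriteMs) / 1000;
  // RTO: time from failure until service is restored.
  const observedRtoSeconds = (t.restoredMs - t.failureMs) / 1000;
  return {
    observedRpoSeconds,
    observedRtoSeconds,
    meetsRpo: observedRpoSeconds <= RPO_TARGET_SECONDS,
    meetsRto: observedRtoSeconds <= RTO_TARGET_SECONDS,
  };
}
```

A drill with a 45-second replication gap and a 4-minute recovery would report 45 and 240 seconds and pass both checks.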
Common questions about availability
Specific answers, not marketing fluff.
Is 99.99% a contractual SLA?
During the current development phase, 99.99% is an engineering target rather than a contractual service-level agreement. Customers requiring binding SLAs, custom RPO and RTO guarantees, or dedicated infrastructure can engage CloudIP Professional Services for a custom contract.
Need a custom HA contract?
Dedicated infrastructure, binding SLAs, custom RPO and RTO targets, and cross-cloud cold backup are available through CloudIP Professional Services.