Backup and Disaster-Recovery Policy — Veloxis

Effective date: [to be set on first publication]
Last revised: 2026-05-24
Owner: CA Krishna Gujarathi
Review cadence: Annual + on every material infrastructure change


This policy specifies how the Veloxis Platform is backed up, where backups live, and how the Firm recovers from failure or destruction events. It is read with the Information Security Policy and the Incident Response Plan.


1. Recovery objectives

Metric | Target | Definition
RPO (Recovery Point Objective) | 24 hours | Maximum acceptable data loss measured in time
RTO (Recovery Time Objective) | 4 hours for partial restore from same-region backup; 24 hours for full rebuild on new VM | Maximum acceptable time to restore service after a disaster

These objectives are appropriate for the Firm's current deployment, in which Veloxis is an internal tool used by a single firm. If Veloxis is later offered to external customers, the objectives will be tightened (see §8).
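As an illustration of how the targets above could be checked mechanically, here is a minimal sketch. The function names `rpo_met` and `rto_target` are hypothetical and not part of Veloxis; only the 24-hour and 4-hour figures come from this policy.

```python
from datetime import datetime, timedelta, timezone

RPO = timedelta(hours=24)  # RPO target stated in this policy


def rpo_met(last_backup_at: datetime, now: datetime, rpo: timedelta = RPO) -> bool:
    """Return True if the newest backup is no older than the RPO."""
    return now - last_backup_at <= rpo


def rto_target(full_rebuild: bool) -> timedelta:
    """RTO from this policy: 4 h for a same-region partial restore,
    24 h for a full rebuild on a new VM."""
    return timedelta(hours=24 if full_rebuild else 4)
```

A monitoring job could call `rpo_met` with the timestamp of the newest backup object and raise an alert when it returns False.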

2. What is backed up

Subject | Method | Frequency | Storage | Encryption | Retention
PostgreSQL database (every table, including TokenMap, AuditLog, AIPrivacyLog) | pg_dump --format=custom --compress=9 to local file, then upload to R2 | Nightly at 02:00 IST | Local + Cloudflare R2 (India region where possible) | AES-256-GCM with BACKUP_AES_KEY (separate from firm key) before upload | 30 days rolling
Cloudflare R2 object storage (uploaded client documents + generated outputs) | R2 internal versioning + lifecycle policy | Continuous | R2 native | Server-side encryption | 30 days versioned, then permanent latest
Source code | Git push to github.com:kgujarathi/veloxis.git | Every commit | GitHub (US) | TLS in transit; encryption at rest by GitHub | Permanent (Git history)
Knowledge base + legal docs + checklists | Same as source code (in Git) | Every commit | GitHub | Same | Permanent
.env files containing keys | Not backed up to any external location. Stored mode-600 on production VM only; documented in encrypted Firm password store | Manual | Firm password store | At rest in password store | Permanent / on rotation
Application logs (Nginx, PM2, PostgreSQL, sidecar, cron) | Rotated by logrotate; latest 180 days retained on local disk | Continuous | Local VM disk | Disk encryption (LUKS) | 180 days rolling (CERT-In Directions §IV)
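The database row above names the pg_dump flags but not the full invocation. The sketch below shows one way the command line could be assembled. The dump file naming scheme and the database name passed in are assumptions made for illustration; the directory /var/backups/veloxis/ and the two flags are taken from this policy.

```python
from datetime import date
from pathlib import PurePosixPath

BACKUP_DIR = PurePosixPath("/var/backups/veloxis")  # local backup directory named in this policy


def dump_path(day: date, backup_dir: PurePosixPath = BACKUP_DIR) -> PurePosixPath:
    # Hypothetical naming scheme: one dump file per calendar day.
    return backup_dir / f"veloxis-{day.isoformat()}.dump"


def pg_dump_argv(dbname: str, out: PurePosixPath) -> list[str]:
    """Build a pg_dump argument list using the flags listed in this policy.

    The database name is supplied by the caller; it is not specified here.
    """
    return ["pg_dump", "--format=custom", "--compress=9", f"--file={out}", dbname]
```

The resulting list can be handed to `subprocess.run(argv, check=True)`, which avoids shell quoting problems. Encryption with BACKUP_AES_KEY and the upload to R2 would follow as separate steps.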

3. Where backups live

  • Local working backup on the production VM at /var/backups/veloxis/. Last seven days. Mode-600. Owned by application user.
  • Off-server backup on Cloudflare R2 bucket veloxis-backups. Last thirty days. Encrypted before upload using BACKUP_AES_KEY (AES-256-GCM); R2 also encrypts at rest server-side, giving two layers.
  • Off-site backup of the encryption key. BACKUP_AES_KEY itself lives in the Firm's password store (1Password / Bitwarden / similar) so that a complete VM loss does not destroy the only key.
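The seven-day local window and the thirty-day R2 window imply a pruning step. A minimal sketch of that rule follows, assuming one backup per calendar day and that "last N days" means today plus the N-1 preceding days. The function name `expired` is hypothetical.

```python
from datetime import date, timedelta

LOCAL_RETENTION_DAYS = 7   # local working backups on the VM
R2_RETENTION_DAYS = 30     # off-server backups in the R2 bucket


def expired(backup_days: list[date], today: date, keep_days: int) -> list[date]:
    """Return the backup dates that fall outside the rolling window, oldest first.

    With keep_days=7 the backups for today and the six preceding days are kept.
    """
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in backup_days if d <= cutoff)
```

The same function serves both tiers: call it with `LOCAL_RETENTION_DAYS` for /var/backups/veloxis/ and with `R2_RETENTION_DAYS` for the bucket listing.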

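The Firm does not store database or document backups in any country other than India. The only copies held abroad are the source code and documentation in Git, which GitHub hosts in the US (see §2). R2 is configured to use the India region as primary where available.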

4. Backup integrity

  • Each nightly backup runs a pg_restore --list check after upload to confirm the dump is readable.
  • Once a quarter, the Firm restores the latest backup into a throwaway PostgreSQL instance and runs the Veloxis migration verification scripts. The result is recorded under docs/incidents/backup-restore-drill-{YYYY-QQ}.md.
  • A failed quarterly restore is treated as a SEV-2 incident under the Incident Response Plan.
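The nightly readability check could be wrapped as below. This is a sketch under two assumptions: that the check runs against the unencrypted local dump (the copy uploaded to R2 is encrypted and pg_restore cannot read it directly), and that a non-empty listing with exit status 0 is an adequate signal. The function name and the injectable `run` parameter are hypothetical and exist only to make the check testable.

```python
import subprocess
from typing import Callable


def dump_is_readable(
    dump_file: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True if pg_restore can list the archive's table of contents."""
    result = run(["pg_restore", "--list", dump_file], capture_output=True, text=True)
    # A readable custom-format archive yields exit status 0 and a non-empty listing.
    return result.returncode == 0 and bool(result.stdout.strip())
```

Note that `pg_restore --list` only reads the archive's table of contents, so it confirms the file is a well-formed archive without proving that every row restores. That gap is what the quarterly full restore covers.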

5. Disaster scenarios and recovery

5.1 Single-row corruption

  • Restore the affected row(s) from the latest nightly dump.
  • Record both the corruption and the restore in the application audit log.

5.2 Database corruption

  • Stop veloxis-prod to prevent further writes.
  • Restore the latest clean nightly dump into a fresh PostgreSQL instance.
  • Verify TokenMap + AuditLog + AIPrivacyLog row counts against the previous business-day reconciliation.
  • Restart veloxis-prod pointing at the restored DB.
  • Investigate the corruption trigger; if security-related, escalate to SEV-1.
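The row-count verification step could be expressed as a small comparison. The sketch below assumes that a restored count lower than the reconciled count indicates lost rows, while a higher count is acceptable because the dump may postdate the reconciliation. That rule, and the function name `shortfalls`, are illustrative assumptions; the policy itself only says the counts are verified.

```python
TABLES = ("TokenMap", "AuditLog", "AIPrivacyLog")  # tables named in this policy


def shortfalls(
    restored: dict[str, int],
    reconciled: dict[str, int],
    tables: tuple[str, ...] = TABLES,
) -> dict[str, tuple[int, int]]:
    """Return {table: (restored_count, reconciled_count)} for each table
    whose restored count is below the reconciled count.

    A table missing from either input is treated as having zero rows.
    """
    problems = {}
    for table in tables:
        got = restored.get(table, 0)
        want = reconciled.get(table, 0)
        if got < want:
            problems[table] = (got, want)
    return problems
```

An empty result means the restore can proceed to the restart step; any entry is a reason to try an earlier dump or to escalate.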

5.3 VM compromise

  • Detach the compromised VM from the public internet.
  • Snapshot the disk for forensics.
  • Provision a fresh Linode VM in the India region.
  • Re-run the Linode bootstrap script (scripts/bootstrap-prod-vm.sh).
  • git clone the Veloxis repository; check out the last-known-good commit.
  • Restore the database from the latest pre-compromise R2 backup.
  • Re-issue keys (FIRM_KEY_BYTES, BACKUP_AES_KEY, JWT signing secret, DB password, AI-provider keys); re-encrypt TokenMap rows via the key-rotation script.
  • Restore the R2 documents (already on R2 and untouched if R2 access was not compromised).
  • Repoint DNS to the new VM IP.
  • Verify the public health endpoint returns HTTP 200.
  • The Firm publishes a post-mortem under docs/incidents/ within 30 days.
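Choosing "the latest pre-compromise R2 backup" depends on the estimated time of compromise. A minimal sketch of that selection follows, assuming backup timestamps are available from the bucket listing; the function name is hypothetical.

```python
from datetime import datetime
from typing import Optional


def latest_pre_compromise(
    backup_times: list[datetime], compromised_at: datetime
) -> Optional[datetime]:
    """Return the newest backup taken strictly before the compromise,
    or None if every available backup postdates it."""
    candidates = [t for t in backup_times if t < compromised_at]
    return max(candidates) if candidates else None
```

If the compromise time is uncertain, passing the earliest plausible time errs on the side of restoring an older but clean dump.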

5.4 R2 service outage

  • Application falls back to serving documents already in the on-VM cache.
  • Uploads queue locally and retry on R2 recovery.
  • If R2 outage exceeds 24 hours, the Firm provisions an alternative S3-compatible bucket (e.g., a separate Linode Object Storage) and redirects writes there until R2 recovers.
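The queue-and-retry behaviour and the 24-hour switch-over can be sketched as follows. The class `UploadQueue`, the function `outage_action`, and the `upload` callback are hypothetical names; the sketch keeps the queue in memory, whereas a real implementation would need to persist it to disk to survive a restart.

```python
from collections import deque
from datetime import timedelta
from typing import Callable

FALLBACK_AFTER = timedelta(hours=24)  # threshold stated in this policy


def outage_action(outage: timedelta) -> str:
    """Queue locally during a short outage; redirect writes once it exceeds 24 hours."""
    return "redirect" if outage > FALLBACK_AFTER else "queue"


class UploadQueue:
    """FIFO queue of pending uploads that are retried once storage is reachable."""

    def __init__(self, upload: Callable[[str], bool]):
        self._upload = upload  # returns True when the upload succeeded
        self._pending: deque[str] = deque()

    def submit(self, path: str) -> None:
        self._pending.append(path)

    def flush(self) -> int:
        """Try pending uploads in order; stop at the first failure. Returns the number sent."""
        sent = 0
        while self._pending and self._upload(self._pending[0]):
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Stopping at the first failure preserves upload order and avoids hammering a service that is still down.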

5.5 Linode region failure

  • The Firm provisions a fresh VM in an alternative Linode region (Singapore as nearest), restores from R2 backup, repoints DNS.
  • Note: a region-failure recovery into a non-India region may temporarily breach sectoral data-localisation overlays (RBI, SEBI, IRDAI). The Firm flags affected engagements and disables AI features for them until the Indian region is restored. Affected partners are notified within four hours.

5.6 Catastrophic loss (region + R2 bucket + Firm password store all unavailable simultaneously)

This requires three independent failures. The Firm accepts this residual risk at its current scale: the source code in GitHub remains intact and the Firm can rebuild from scratch, but seven years of audit working papers would be unrecoverable. Customer-facing deployment will require a third backup region to retire this residual risk.

6. Operational procedure

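The nightly backup job is a cron entry on the production VM. Cron evaluates the schedule in the VM's system timezone, so the VM must be set to IST for the job to run at 02:00 IST as stated in §2: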

0 2 * * * /home/kgujrathi/veloxis/scripts/backup-prod.sh >> /var/log/veloxis/backup.log 2>&1

The script scripts/backup-prod.sh (to be authored) performs the steps in §2 above and emits a one-line summary log entry. Daily log monitoring confirms the job ran.
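Because the script is not yet written, the format of the one-line summary is open. The sketch below proposes one possible format and a matching check for the daily log review. Both the line format and the function names are assumptions, not a description of existing behaviour.

```python
from datetime import date


def summary_line(day: date, status: str, dump_bytes: int) -> str:
    # Hypothetical format: ISO date, fixed keyword, then key=value fields.
    return f"{day.isoformat()} backup status={status} bytes={dump_bytes}"


def backup_succeeded_on(log_text: str, day: date) -> bool:
    """Return True if the log contains a successful summary line for the given day."""
    prefix = f"{day.isoformat()} backup status=ok "
    return any(line.startswith(prefix) for line in log_text.splitlines())
```

A fixed, greppable prefix keeps the daily check simple: the absence of a success line for a given date is itself the alert condition, which also catches the case where cron never started the job.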

7. Restore drill schedule

Drill | Frequency | Owner
Quarterly read-only restore | 4 × per year | Krishna
Annual VM-rebuild dry run on a sandbox VM | 1 × per year | Krishna
Tabletop walk-through of §5 scenarios | 1 × per year | Krishna

Each drill produces a dated entry under docs/incidents/. Findings feed remediation tasks.
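For the quarterly restore drill, §4 gives the record path as docs/incidents/backup-restore-drill-{YYYY-QQ}.md. The sketch below derives that path from a date, assuming the placeholder expands to a form such as 2026-Q2. That expansion is an interpretation, and the function name is hypothetical.

```python
from datetime import date


def drill_report_path(day: date) -> str:
    """Return the restore-drill record path for the calendar quarter containing `day`."""
    quarter = (day.month - 1) // 3 + 1  # Jan-Mar -> 1, ..., Oct-Dec -> 4
    # Assumes {YYYY-QQ} is written as, for example, 2026-Q2.
    return f"docs/incidents/backup-restore-drill-{day.year}-Q{quarter}.md"
```

Deriving the name from the drill date keeps the four yearly records distinct and makes a missing quarter easy to spot in the directory listing.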

8. Future-readiness

For external-customer deployment, the Firm will upgrade:

  • RPO to 1 hour (streaming replication or WAL shipping).
  • RTO to 1 hour (warm standby in a second region).
  • Backup retention to 1 year minimum.
  • Quarterly restore drill to monthly.

These upgrades are tracked in the Veloxis backlog under "Customer-readiness".

Veloxis is operated by VKG & Associates, Chartered Accountants. Concerns about this document may be raised with the Grievance Officer at krishna@vkg.co.in.