Backup and Disaster-Recovery Policy — Veloxis

Effective date: [to be set on first publication]
Last revised: 2026-05-24
Owner: CA Krishna Gujarathi
Review cadence: Annual, plus on every material infrastructure change

This policy specifies how the Veloxis Platform is backed up, where backups live, and how the Firm recovers from failure or destruction events. It should be read together with the Information Security Policy and the Incident Response Plan.

1. Recovery objectives

Metric	Target	Definition
RPO (Recovery Point Objective)	24 hours	Maximum acceptable data loss measured in time
RTO (Recovery Time Objective)	4 hours for partial restore from same-region backup; 24 hours for full rebuild on new VM	Maximum acceptable time to restore service after a disaster

These objectives are appropriate for the Firm's current single-firm, internal-tool deployment. If Veloxis is offered to external customers, the objectives will be tightened (see §8).
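
As a minimal sketch of how the RPO target could be checked mechanically, the following compares the age of the most recent successful backup against the 24-hour target. The function name `rpo_breached` and the idea of passing in timestamps are illustrative assumptions, not part of the Veloxis codebase.

```python
from datetime import datetime, timedelta, timezone

RPO = timedelta(hours=24)  # target from the recovery-objectives table


def rpo_breached(last_backup_at: datetime, now: datetime, rpo: timedelta = RPO) -> bool:
    """Return True when the newest backup is older than the RPO allows."""
    return now - last_backup_at > rpo


# Example: a backup taken at 02:00 IST is within target at 18:00 IST the same day,
# but not at 03:00 IST the next day if the following nightly run was missed.
IST = timezone(timedelta(hours=5, minutes=30))
last = datetime(2026, 5, 24, 2, 0, tzinfo=IST)
print(rpo_breached(last, datetime(2026, 5, 24, 18, 0, tzinfo=IST)))  # False
print(rpo_breached(last, datetime(2026, 5, 25, 3, 0, tzinfo=IST)))   # True
```

With a nightly schedule, a single missed run is enough to exceed the 24-hour RPO, which is why the daily log check in §6 matters.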

2. What is backed up

Subject	Method	Frequency	Storage	Encryption	Retention
PostgreSQL database (every table, including TokenMap, AuditLog, AIPrivacyLog)	`pg_dump --format=custom --compress=9` to local file, then upload to R2	Nightly at 02:00 IST	Local + Cloudflare R2 (India region where possible)	AES-256-GCM with `BACKUP_AES_KEY` (separate from firm key) before upload	30 days rolling
Cloudflare R2 object storage (uploaded client documents + generated outputs)	R2 internal versioning + lifecycle policy	Continuous	R2 native	Server-side encryption	30 days versioned, then permanent latest
Source code	Git push to `github.com:kgujarathi/veloxis.git`	Every commit	GitHub (US)	TLS in transit; encryption at rest by GitHub	Permanent (Git history)
Knowledge base + legal docs + checklists	Same as source code (in Git)	Every commit	GitHub	Same	Permanent
`.env` files containing keys	Not included in any automated backup and not uploaded to R2 or GitHub. Stored mode-600 on the production VM only; the key values are recorded manually in the encrypted Firm password store	Manual	Firm password store	At rest in password store	Permanent / on rotation
Application logs (Nginx, PM2, PostgreSQL, sidecar, cron)	Rotated by `logrotate`; latest 180 days retained on local disk	Continuous	Local VM disk	Disk encryption (LUKS)	180 days rolling (CERT-In Directions §IV)
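
The database row above can be sketched as follows. This is an illustration under stated assumptions, not the Firm's script: Python is used only for readability (the production job is planned as a shell script), the file name `veloxis.dump` and the function names are hypothetical, the key is assumed to be supplied as 32 raw bytes, and the encryption step relies on the third-party `cryptography` package. The whole dump is read into memory, which is only reasonable for small databases.

```python
import os
import subprocess
from pathlib import Path


def build_pg_dump_cmd(db_name: str, out_file: Path) -> list[str]:
    """Build the pg_dump invocation named in the table above."""
    return ["pg_dump", "--format=custom", "--compress=9", f"--file={out_file}", db_name]


def encrypt_dump(plain: Path, encrypted: Path, key: bytes) -> None:
    """Encrypt a dump with AES-256-GCM; output layout is nonce || ciphertext+tag."""
    # Third-party dependency (assumption): pip install cryptography.
    # Imported lazily so the command builder works without the package.
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    if len(key) != 32:
        raise ValueError("AES-256 needs a 32-byte key")
    nonce = os.urandom(12)  # 96-bit nonce; must never repeat under the same key
    ciphertext = AESGCM(key).encrypt(nonce, plain.read_bytes(), None)
    encrypted.write_bytes(nonce + ciphertext)


def run_backup(db_name: str, workdir: Path, key: bytes) -> Path:
    """Dump the database, encrypt the dump, and return the encrypted file."""
    dump = workdir / "veloxis.dump"  # hypothetical file name
    subprocess.run(build_pg_dump_cmd(db_name, dump), check=True)
    out = dump.with_name(dump.name + ".enc")
    encrypt_dump(dump, out, key)
    return out  # the caller would upload this file to R2
```

Storing the nonce in front of the ciphertext keeps each backup file self-describing, so a restore needs only the file and `BACKUP_AES_KEY`.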

3. Where backups live

Local working backup on the production VM at /var/backups/veloxis/. Last seven days. Mode-600. Owned by application user.
Off-server backup on Cloudflare R2 bucket veloxis-backups. Last thirty days. Encrypted before upload using BACKUP_AES_KEY (AES-256-GCM); R2 also encrypts at rest server-side, giving two layers.
Off-site backup of the encryption key — BACKUP_AES_KEY itself lives in the Firm's password store (1Password / Bitwarden / similar) so a complete VM loss does not destroy the only key.
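
The two retention windows above (seven days locally, thirty days on R2) could be enforced with a small pruning helper. The sketch below assumes a hypothetical naming scheme of `veloxis-YYYY-MM-DD.dump`; the actual file names are not defined in this policy.

```python
import re
from datetime import date, timedelta

LOCAL_RETENTION_DAYS = 7   # local working backups
R2_RETENTION_DAYS = 30     # off-server backups

# Hypothetical naming scheme: veloxis-2026-05-24.dump
_NAME = re.compile(r"^veloxis-(\d{4})-(\d{2})-(\d{2})\.dump$")


def expired(names: list[str], today: date, keep_days: int) -> list[str]:
    """Return the backup names that fall outside the retention window."""
    cutoff = today - timedelta(days=keep_days)
    out = []
    for name in names:
        m = _NAME.match(name)
        if m is None:
            continue  # never delete files that do not match the scheme
        taken = date(int(m[1]), int(m[2]), int(m[3]))
        if taken <= cutoff:
            out.append(name)
    return out
```

With `keep_days=7` and today's backup present, the seven most recent daily files are kept and anything older is returned for deletion.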

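The Firm does not back up client data to a country other than India. R2 is configured to use the India region as primary where available. Source code and documentation are the exception: they are hosted on GitHub (US), as listed in §2.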

4. Backup integrity

Each nightly backup runs a pg_restore --list check after upload to confirm the dump is readable.
Once a quarter, the Firm restores the latest backup into a throwaway PostgreSQL instance and runs the Veloxis migration verification scripts. The result is recorded under docs/incidents/backup-restore-drill-{YYYY-QQ}.md.
A failed quarterly restore is treated as a SEV-2 incident under the Incident Response Plan.
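
The nightly readability check could look like the sketch below. It assumes the check runs against the local, unencrypted dump; the function name and the injectable `runner` parameter (used so the logic can be tested without PostgreSQL installed) are illustrative. Note that `pg_restore --list` only reads the archive's table of contents, which is why the quarterly full restore remains necessary.

```python
import subprocess
from pathlib import Path
from typing import Callable


def dump_is_readable(
    dump: Path,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Run `pg_restore --list` and report whether it listed any archive entries."""
    result = runner(
        ["pg_restore", "--list", str(dump)],
        capture_output=True,
        text=True,
    )
    # A readable custom-format archive exits 0 and prints a table of contents.
    return result.returncode == 0 and bool(result.stdout.strip())
```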

5. Disaster scenarios and recovery

5.1 Single-row corruption

Restore the affected row(s) from the latest nightly dump.
Document the corruption + the restore in the application audit log.

5.2 Database corruption

Stop veloxis-prod to prevent further writes.
Restore the latest clean nightly dump into a fresh PostgreSQL instance.
Verify TokenMap + AuditLog + AIPrivacyLog row counts against the previous business-day reconciliation.
Restart veloxis-prod pointing at the restored DB.
Investigate the corruption trigger; if security-related, escalate to SEV-1.
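
The row-count verification step above can be sketched as a simple comparison. The function name and the dictionary inputs are assumptions for illustration; how the reconciliation figures are stored is not specified in this policy. The sketch reports every difference and leaves the judgement (for example, whether a higher count after restore is expected) to the operator.

```python
from typing import Optional

TABLES = ("TokenMap", "AuditLog", "AIPrivacyLog")


def row_count_diffs(
    reconciled: dict[str, int], restored: dict[str, int]
) -> dict[str, tuple[int, Optional[int]]]:
    """Map each table whose restored count differs to (reconciled, restored)."""
    diffs = {}
    for table in TABLES:
        want = reconciled[table]
        got = restored.get(table)  # None if the table is missing after restore
        if got != want:
            diffs[table] = (want, got)
    return diffs
```

An empty result means all three tables match the reconciliation and veloxis-prod can be restarted against the restored database.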

5.3 VM compromise

Detach the compromised VM from the public internet.
Snapshot the disk for forensics.
Provision a fresh Linode VM in the India region.
Re-run the Linode bootstrap script (scripts/bootstrap-prod-vm.sh).
git clone the Veloxis repository; check out the last-known-good commit.
Restore the database from the latest pre-compromise R2 backup.
Re-issue keys (FIRM_KEY_BYTES, BACKUP_AES_KEY, JWT signing secret, DB password, AI-provider keys); re-encrypt TokenMap rows via the key-rotation script.
Restore the R2 documents (already on R2 and untouched if R2 access was not compromised).
Repoint DNS to the new VM IP.
Verify the public health endpoint returns HTTP 200.
The Firm publishes a post-mortem under docs/incidents/ within 30 days.
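
The health-endpoint verification near the end of this runbook could be automated along these lines. The function name is hypothetical, and the URL of the health endpoint is not stated in this policy, so it is passed in as a parameter. Because `urlopen` follows redirects, a redirect that ends in a 200 response also counts as healthy in this sketch.

```python
import urllib.request


def is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError (4xx/5xx), and timeouts are all OSError subclasses.
        return False
```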

5.4 R2 service outage

Application falls back to serving documents already in the on-VM cache.
Uploads queue locally and retry on R2 recovery.
If R2 outage exceeds 24 hours, the Firm provisions an alternative S3-compatible bucket (e.g., a separate Linode Object Storage) and redirects writes there until R2 recovers.
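
The queue-and-retry behaviour described above might be structured as in the sketch below. The class and method names are hypothetical, and the queue is held in memory for brevity; a production version would need to persist pending uploads to disk so they survive a restart.

```python
from collections import deque
from typing import Callable


class UploadQueue:
    """Hold uploads that could not be sent and retry them later, oldest first."""

    def __init__(self) -> None:
        self._pending: deque[str] = deque()

    def enqueue(self, path: str) -> None:
        self._pending.append(path)

    def flush(self, upload: Callable[[str], None]) -> int:
        """Try pending uploads in order; stop at the first failure. Returns uploads done."""
        done = 0
        while self._pending:
            path = self._pending[0]
            try:
                upload(path)
            except OSError:
                break  # storage still unavailable; keep the rest queued
            self._pending.popleft()
            done += 1
        return done

    def __len__(self) -> int:
        return len(self._pending)
```

Passing the upload function into `flush` means the same queue can be drained to R2 on recovery or to the alternative S3-compatible bucket if the outage exceeds 24 hours.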

5.5 Linode region failure

The Firm provisions a fresh VM in an alternative Linode region (Singapore being the nearest), restores from the R2 backup, and repoints DNS.
Note: a region-failure recovery into a non-India region may temporarily breach sectoral data-localisation overlays (RBI, SEBI, IRDAI). The Firm flags affected engagements and disables AI features for them until the Indian region is restored. Affected partners are notified within four hours.

5.6 Catastrophic loss (region + R2 bucket + Firm password store all unavailable simultaneously)

This requires three independent failures (the Linode region, the R2 bucket, and the Firm password store). The Firm accepts this residual risk for the current scale; the source code in GitHub remains intact and the Firm can rebuild from scratch, with seven years of audit working papers being the unrecoverable loss. Customer-facing deployment will require a third backup region to retire this residual risk.

6. Operational procedure

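The nightly backup job is a cron entry on the production VM. Cron evaluates the schedule in the VM's local time zone, so the VM must be set to Asia/Kolkata for the job to run at 02:00 IST: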

0 2 * * * /home/kgujrathi/veloxis/scripts/backup-prod.sh >> /var/log/veloxis/backup.log 2>&1

The script scripts/backup-prod.sh (to be authored) performs the steps in §2 above and emits a one-line summary log entry. Daily log monitoring confirms the job ran.
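
Since the script is still to be authored, the summary-line format is open. One possible shape, together with the matching daily check, is sketched below. The field names (`status`, `bytes`, `duration_s`) and both function names are assumptions, and Python is used only for illustration.

```python
from datetime import date


def summary_line(day: date, status: str, size_bytes: int, seconds: float) -> str:
    """Format the one-line entry the backup job appends to backup.log."""
    return f"{day.isoformat()} backup status={status} bytes={size_bytes} duration_s={seconds:.0f}"


def ran_ok(log_text: str, day: date) -> bool:
    """Return True if the log holds a successful summary line for the given day."""
    prefix = f"{day.isoformat()} backup status=ok "
    return any(line.startswith(prefix) for line in log_text.splitlines())


line = summary_line(date(2026, 5, 24), "ok", 1048576, 42.3)
print(line)  # 2026-05-24 backup status=ok bytes=1048576 duration_s=42
```

Keeping the entry on one line with fixed `key=value` fields makes the daily monitoring step a simple text match rather than a manual read of the full job output.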

7. Restore drill schedule

Drill	Frequency	Owner
Quarterly read-only restore	4 × per year	Krishna
Annual VM-rebuild dry run on a sandbox VM	1 × per year	Krishna
Tabletop walk-through of §5 scenarios	1 × per year	Krishna

Each drill produces a dated entry under docs/incidents/. Findings feed remediation tasks.

8. Future-readiness

For external-customer deployment, the Firm will upgrade:

RPO to 1 hour (streaming replication or WAL shipping).
RTO to 1 hour (warm standby in a second region).
Backup retention to 1 year minimum.
Quarterly restore drill to monthly.

These upgrades are tracked in the Veloxis backlog under "Customer-readiness".

Veloxis is operated by VKG & Associates, Chartered Accountants. Concerns about this document may be raised with the Grievance Officer at krishna@vkg.co.in.