Cloud Infrastructure Strategy
Module Purpose: This module defines the underlying cloud architecture required to support Cazo's "Private AI" strategy. It focuses on isolating tenant data and providing GPU-accelerated compute for self-hosted inference.
Strategic Goals
- Data Sovereignty: Ensure customer PII (chat logs, images) never leaves our VPC.
- Cost Predictability: Use reserved instances for steady, 24/7 inference workloads.
- Security: Enforce strict network segmentation between the "Public Web Tier" and the "Private AI Tier".
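
The segmentation goal can be illustrated with a minimal default-deny sketch. The CIDR ranges, the inference port, and the names `IngressRule` and `is_allowed` are all hypothetical; none of them come from Cazo's actual network plan. The sketch shows one way to express "only the Public Web Tier may reach the Private AI Tier, and only on the inference port."

```python
import ipaddress
from dataclasses import dataclass

# All CIDRs and the port below are illustrative assumptions, not real values.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
PUBLIC_WEB_TIER = ipaddress.ip_network("10.0.0.0/24")
PRIVATE_AI_TIER = ipaddress.ip_network("10.0.1.0/24")
INFERENCE_PORT = 8000  # hypothetical internal inference API port


@dataclass(frozen=True)
class IngressRule:
    """One allowlist entry: traffic from `source` may reach `port`."""
    source: ipaddress.IPv4Network
    port: int


# Only the web tier may reach the AI tier, and only on the inference port.
AI_TIER_INGRESS = [IngressRule(PUBLIC_WEB_TIER, INFERENCE_PORT)]


def is_allowed(src_ip: str, port: int, rules=AI_TIER_INGRESS) -> bool:
    """Return True if some rule admits src_ip on port; otherwise deny."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in rule.source and port == rule.port for rule in rules)
```

Under these assumptions, `is_allowed("10.0.0.15", 8000)` admits a web-tier host, while an internet address or an SSH attempt from the web tier is denied because no rule matches.
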
Use Case Matrix
| ID | Capability | Description |
|---|---|---|
| [CLD-001] | VPC Provisioning | Set up an isolated network with public and private subnets. |
| [CLD-002] | Security Groups | Allowlist traffic for internal AI APIs. |
| [CLD-003] | Inference Cluster | Provision GPU nodes for Ollama/vLLM. |
| [CLD-004] | Auto-Scaling | Scale dynamically based on inference queue depth. |
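
As a rough illustration of [CLD-004], the sketch below derives a target GPU node count from inference queue depth. The function name `desired_gpu_nodes`, the per-node capacity, and the minimum and maximum bounds are assumptions chosen for the example, not measured or agreed values.

```python
import math


def desired_gpu_nodes(
    queue_depth: int,
    per_node_capacity: int = 8,  # assumed requests one node can work through
    min_nodes: int = 1,          # assumed floor, e.g. covered by reserved capacity
    max_nodes: int = 10,         # assumed ceiling to cap spend
) -> int:
    """Return a node count sized to the queue, clamped to [min_nodes, max_nodes]."""
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    if queue_depth < 0:
        raise ValueError("queue_depth must be non-negative")
    needed = math.ceil(queue_depth / per_node_capacity)
    return max(min_nodes, min(max_nodes, needed))
```

With these defaults, a queue of 20 pending requests yields 3 nodes, an empty queue stays at the 1-node floor, and a very deep queue is capped at 10. A production policy would likely add a cooldown period to avoid rapid scale-up and scale-down cycles.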