Draft a technology stack document that maps AI/ML accelerator options (Intel Habana Gaudi2, AMD Instinct, Google TPU v5e) against latency-sensitive ad-tech inference workloads at 5M req/s with sub-20ms P99 requirements

Generate a technology stack document that maps AI/ML accelerator options (Intel Habana Gaudi2, AMD Instinct, Google TPU v5e) against latency-sensitive ad-tech inference workloads at 5M req/s with sub-20ms P99 requirements, for the Computer Systems Design and Related Services industry.

Computer Systems Design and Related Services

Agent Configuration

Current Tech Stack File

Allowed: XLSX, PDF, VSDX, PNG, CSV

Max size: 50MB

Upload existing architecture diagrams, current performance data, or baseline documentation in any of the allowed formats (Excel, PDF, Visio, PNG, or CSV)

Request Volume Profile *

Specify the sustained traffic pattern for ad-tech inference workloads

Latency SLA Requirements *

Define the critical latency threshold and percentile requirements for ad auction decisions

Model Complexity Categories *

Select the types of ML models currently deployed or planned for the inference workload

Accelerator Evaluation Criteria *

Define the primary optimization targets for accelerator selection

Compliance Requirements *

Specify regulatory or data residency constraints impacting architecture decisions

Deployment Target Environments *

Indicate primary infrastructure environments for deployment

Budget Parameters *

Specify hardware budget ranges and cost optimization priorities

Current Performance Baseline *

Provide detailed metrics from the existing GPU-based inference deployment, including current bottleneck locations, measured throughput, and known latency spikes

Integration Constraints *

Specify the APIs, frameworks, libraries, and deployment tools that must remain compatible with the new accelerator choice

Stakeholder Primary Concerns *

Identify the key stakeholder who will drive the final accelerator decision
