Platform Comparison Guide
Choose the right Hyperbolic service based on your specific needs. This comparison explains the differences between our three core offerings.
Quick Comparison Table
| Feature | On-Demand GPU | Serverless Inference | Reserved Clusters |
|---|---|---|---|
| Best For | Training, development, experiments | Production APIs, prototypes | Enterprise workloads |
| Setup Time | < 5 minutes | Instant | 24-48 hours |
| Minimum Commitment | None (hourly) | None (pay-per-use) | 3 months |
| Pricing Model | $/hour | $/1M tokens | $/month |
| GPU Access | Full SSH access | API only | Full SSH access |
| Scaling | Manual | Automatic | Pre-configured |
| Support Level | Community | Standard | Priority |
| SLA | 99.5% | 99.9% | 99.99% |
Detailed Service Comparison
On-Demand GPUs
When to Use:
- Training custom models
- Running experiments and notebooks
- Batch processing jobs
- Need for full control over the environment
- Temporary compute needs
Key Features:
- Full root access via SSH
- Custom Docker images
- Persistent storage options
- Choice of specific GPU models
- Multiple GPU configurations
Pricing:
- H100 SXM: starting at $1.39/hour
- H200 SXM: starting at $1.99/hour
- No setup fees or commitments
Limitations:
- No automatic scaling
- Manual instance management
- Availability depends on supply
- No built-in load balancing
Serverless Inference
When to Use:
- Production API endpoints
- Quick prototyping
- Variable or unpredictable traffic
- Using standard models
- Cost-sensitive applications
Key Features:
- Instant deployment
- Auto-scaling
- OpenAI-compatible API
- 25+ pre-loaded models
- Pay only for usage
Pricing:
- Check our pricing page for up-to-date details.
Limitations:
- No custom models (without prior arrangement)
- Rate limits apply
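Because Serverless Inference exposes an OpenAI-compatible API, existing OpenAI client code can typically be pointed at it by swapping the base URL. A minimal sketch — the base URL, model name, and environment variable below are illustrative assumptions, not confirmed values; check the API docs for the real ones:

```python
# Hypothetical values -- confirm the real base URL and model names in the docs.
BASE_URL = "https://api.hyperbolic.xyz/v1"      # assumed endpoint
MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"   # assumed pre-loaded model name

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

payload = build_chat_request("Summarize the three Hyperbolic services.")

# With the official `openai` package, sending the request would look like:
#   from openai import OpenAI
#   client = OpenAI(base_url=BASE_URL, api_key=os.environ["HYPERBOLIC_API_KEY"])
#   response = client.chat.completions.create(**payload)
```

Keeping the payload OpenAI-shaped means the same code can be tested against any compatible endpoint before committing to one.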
Reserved Clusters
When to Use:
- 24/7 production workloads
- Guaranteed availability needed
- High-volume, consistent usage
- Custom configurations required
- Enterprise compliance needs
Key Features:
- Dedicated resources
- Custom configurations
- Priority support
- 99.99% SLA
- Volume discounts up to 40%
Pricing:
- Custom quotes based on:
  - Cluster size (32-100+ GPUs)
  - Contract length (3-24 months)
  - GPU type and configuration
- Typical savings depend on usage and contract length
Limitations:
- Minimum 3-month commitment
- 24-48 hour setup time
- Less flexibility than on-demand
- Capacity planning required
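To see what "volume discounts up to 40%" means in practice, here is a small sketch of the arithmetic. The per-GPU list price is a hypothetical placeholder — actual prices and discount tiers come from a custom quote:

```python
def reserved_monthly_cost(gpus: int, list_price_per_gpu_month: float,
                          discount: float) -> float:
    """Effective monthly cost after a volume discount.

    `discount` is a fraction, e.g. 0.40 for the maximum 40% tier.
    Real tiers and list prices come from a custom quote, not this sketch.
    """
    if not 0.0 <= discount <= 0.40:
        raise ValueError("discounts range from 0% up to 40%")
    return gpus * list_price_per_gpu_month * (1.0 - discount)

# Hypothetical: 32 GPUs at an assumed $1,000/GPU/month list price, 40% tier.
cost = reserved_monthly_cost(32, 1000.0, 0.40)  # 32 * 1000 * 0.6 = 19200.0
```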
Use Case Scenarios
Scenario 1: AI Startup Building an App
- Development Phase
- MVP/Beta Phase
- Production Scale
Recommended: On-Demand GPU
- Experiment with different models
- Train custom models
- Test various configurations
- No commitment while iterating
Scenario 2: Research Team
Recommended: On-Demand GPU
- Access to latest GPU models
- Full control for experiments
- Flexible rental periods
- Multiple concurrent experiments
Scenario 3: Enterprise Integration
Recommended: Reserved Clusters
- Compliance requirements (SOC2, HIPAA)
- Dedicated resources
- Custom security configurations
- SLA guarantees
- Direct support channel
Scenario 4: Hackathon Project
Recommended: Serverless Inference
- Instant API access
- No setup required
- Free tier available
- Focus on building, not infrastructure
Getting Started
1. Assess Your Needs
- Workload type (training vs inference)
- Usage pattern (constant vs variable)
- Budget constraints
- Timeline requirements
2. Start Small
- Try Serverless for quick testing
- Rent On-Demand for a few hours
- Run benchmarks and calculate costs
3. Scale Appropriately
- Monitor usage patterns
- Calculate break-even points
- Contact sales for volume discounts
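The "calculate break-even points" step above can be sketched as a comparison between on-demand hourly billing and a reserved monthly price. The $1.39/hour H100 SXM rate comes from the table at the top of this guide; the $800/month reserved figure is a hypothetical quote used purely for illustration:

```python
def monthly_on_demand_cost(hours_per_month: float, rate_per_hour: float) -> float:
    """Cost of renting on-demand capacity for a given number of hours."""
    return hours_per_month * rate_per_hour

def break_even_hours(reserved_monthly: float, on_demand_rate: float) -> float:
    """Hours per month above which a reserved cluster beats on-demand."""
    return reserved_monthly / on_demand_rate

# H100 SXM on-demand starts at $1.39/hour (from the comparison table);
# the $800/month reserved price is a hypothetical quote for illustration.
hours = break_even_hours(800.0, 1.39)  # roughly 575 hours/month
always_on = 24 * 30                    # 720 hours in a 30-day month
# A 24/7 workload (720 h) exceeds the ~575 h break-even, so in this
# hypothetical, reserved capacity would be the cheaper option.
```

Running this comparison with your own measured usage and an actual quote is a better guide than any rule of thumb.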