Platform Comparison Guide

Choose the right Hyperbolic service based on your specific needs. This comprehensive comparison will help you understand the differences between our three core offerings.

Quick Comparison Table

| Feature | On-Demand GPU | Serverless Inference | Reserved Clusters |
| --- | --- | --- | --- |
| Best For | Training, development, experiments | Production APIs, prototypes | Enterprise workloads |
| Setup Time | < 5 minutes | Instant | 24-48 hours |
| Minimum Commitment | None (hourly) | None (pay-per-use) | 3 months |
| Pricing Model | $/hour | $/1M tokens | $/month |
| GPU Access | Full SSH access | API only | Full SSH access |
| Scaling | Manual | Automatic | Pre-configured |
| Support Level | Community | Standard | Priority |
| SLA | 99.5% | 99.9% | 99.99% |

Detailed Service Comparison

On-Demand GPUs

When to Use:
  • Training custom models
  • Running experiments and notebooks
  • Batch processing jobs
  • Workloads that need full control over the environment
  • Temporary compute needs
Key Features:
  • Full root access via SSH
  • Custom Docker images
  • Persistent storage options
  • Choose specific GPU models
  • Multiple GPU configurations
Pricing Structure:
  • H100 SXM: Starting at $1.39/hour
  • H200 SXM: Starting at $1.99/hour
  • No setup fees or commitments
Limitations:
  • No automatic scaling
  • Manual instance management
  • Availability depends on supply
  • No built-in load balancing
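
Because on-demand billing is purely hourly, cost estimates reduce to simple arithmetic; a minimal sketch using the starting rates above (rates are illustrative starting prices and may change):

```python
# Estimate on-demand GPU cost from the hourly starting rates above.
# Rates are illustrative starting prices and subject to change.
RATES_PER_HOUR = {
    "H100-SXM": 1.39,
    "H200-SXM": 1.99,
}

def estimate_cost(gpu: str, gpus: int, hours: float) -> float:
    """Total cost in USD for `gpus` GPUs rented for `hours` hours."""
    return RATES_PER_HOUR[gpu] * gpus * hours

# Example: an 8x H100 node for a 12-hour training run.
print(f"${estimate_cost('H100-SXM', 8, 12):.2f}")  # $133.44
```

With no setup fees or commitments, the hourly rate times GPU count times hours is the entire bill.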

Serverless Inference

When to Use:
  • Production API endpoints
  • Quick prototyping
  • Variable/unpredictable traffic
  • Using standard models
  • Cost-sensitive applications
Key Features:
  • Instant deployment
  • Auto-scaling
  • OpenAI-compatible API
  • 25+ pre-loaded models
  • Pay only for usage
Limitations:
  • No custom models (without prior arrangement)
  • Rate limits apply
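
Because the endpoint is OpenAI-compatible, any OpenAI-style client works against it; below is a minimal request-building sketch using only the Python standard library. The base URL and model name are placeholders for illustration, not confirmed values — check the Hyperbolic docs for the real endpoint and current model list.

```python
import json
import os
import urllib.request

# Placeholder endpoint and model name for illustration only;
# substitute the real base URL and a model from the 25+ pre-loaded list.
BASE_URL = "https://api.example-hyperbolic-endpoint.com/v1"
MODEL = "example-org/example-chat-model"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Assemble an OpenAI-style /chat/completions request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('API_KEY', '')}",
        },
    )

req = build_chat_request("Hello!")
# urllib.request.urlopen(req) would send it; with pay-per-use
# billing, the call costs only the tokens it consumes.
```

Existing code written against the OpenAI client can usually be pointed at such an endpoint by changing only the base URL and API key.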

Reserved Clusters

When to Use:
  • 24/7 production workloads
  • Guaranteed availability needed
  • High-volume consistent usage
  • Custom configurations required
  • Enterprise compliance needs
Key Features:
  • Dedicated resources
  • Custom configurations
  • Priority support
  • 99.99% SLA
  • Volume discounts up to 40%
Pricing Structure:
  • Custom quotes based on:
    • Cluster size (32-100+ GPUs)
    • Contract length (3-24 months)
    • GPU type and configuration
  • Typical savings: Depends on usage and contract length
Limitations:
  • Minimum 3-month commitment
  • 24-48 hour setup time
  • Less flexibility
  • Capacity planning required

Use Case Scenarios

Scenario 1: AI Startup Building an App

Recommended: On-Demand GPU
  • Experiment with different models
  • Train custom models
  • Test various configurations
  • No commitment while iterating

Scenario 2: Research Team

Recommended: On-Demand GPU
  • Access to latest GPU models
  • Full control for experiments
  • Flexible rental periods
  • Multiple concurrent experiments

Scenario 3: Enterprise Integration

Recommended: Reserved Clusters
  • Compliance requirements (SOC2, HIPAA)
  • Dedicated resources
  • Custom security configurations
  • SLA guarantees
  • Direct support channel

Scenario 4: Hackathon Project

Recommended: Serverless Inference
  • Instant API access
  • No setup required
  • Free tier available
  • Focus on building, not infrastructure

Getting Started

1. Assess Your Needs

  • Workload type (training vs inference)
  • Usage pattern (constant vs variable)
  • Budget constraints
  • Timeline requirements
2. Start Small

  • Try Serverless for quick testing
  • Rent On-Demand for a few hours
  • Run benchmarks and calculate costs
3. Scale Appropriately

  • Monitor usage patterns
  • Calculate break-even points
  • Contact sales for volume discounts
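
One way to find a break-even point is to compare the serverless per-token rate against a dedicated GPU's hourly rate at your sustained throughput; a rough sketch with illustrative numbers (the serverless rate and throughput figure are assumptions, not published rates):

```python
# Rough break-even utilization between serverless and on-demand
# inference. All figures below are illustrative assumptions.
SERVERLESS_PER_M_TOKENS = 0.40    # $ per 1M tokens (assumed)
ON_DEMAND_PER_HOUR = 1.39         # $/hour (H100 starting rate)
PEAK_TOKENS_PER_HOUR = 5_000_000  # GPU throughput at full load (assumed)

def breakeven_utilization() -> float:
    """GPU utilization above which on-demand beats serverless.

    At 100% utilization the GPU serves PEAK_TOKENS_PER_HOUR tokens,
    which would cost serverless_cost_per_hour on the serverless tier;
    on-demand wins once its fixed hourly rate drops below that.
    """
    serverless_cost_per_hour = (
        PEAK_TOKENS_PER_HOUR / 1e6 * SERVERLESS_PER_M_TOKENS
    )
    return ON_DEMAND_PER_HOUR / serverless_cost_per_hour

print(f"{breakeven_utilization():.1%}")  # break-even with these numbers
```

With these assumed numbers the crossover sits near 70% utilization: below it, pay-per-token serverless is cheaper; above it, a dedicated hourly instance is.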

Need Help Deciding?