Access the world’s most powerful AI accelerators instantly through Hyperbolic’s On-Demand Cloud platform. Get enterprise-grade H100 and H200 clusters with guaranteed availability and InfiniBand networking.

Enterprise-Grade GPU Infrastructure

On-Demand Cloud

Professional H100 & H200 GPU Clusters

Access the latest NVIDIA H100 and H200 GPUs, from single instances to massive clusters with InfiniBand networking.
  • Latest Hardware: H100 80GB and H200 141GB GPUs
  • Guaranteed Availability: 99.5% uptime SLA
  • Instant Deployment: Clusters ready in minutes
  • Scalable: Single GPU to 128+ GPU clusters
  • InfiniBand Networking: For maximum multi-GPU performance
  • Flexible Pricing: Pay hourly with no commitments
Perfect for: LLM training, large-scale AI workloads, production inference, and distributed computing

Key Features

Instant Deployment

  • Deploy GPUs in under 5 minutes
  • No sales calls or procurement delays
  • Pre-configured with CUDA, PyTorch, TensorFlow
  • Available in multiple regions worldwide

Flexible Access

  • Full SSH root access to your instances
  • Docker support with pre-built ML images
  • Persistent storage options available (depending on region)

Simple Billing

  • Pay only for what you use (hourly billing)
  • No upfront commitments or contracts
  • Automatic failure detection (no charges for failed instances)
  • Multiple payment methods: credit card or crypto

Developer-Friendly

  • REST API for automation
  • Agent-compatible endpoints
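As a sketch of what automating against the REST API might look like: the snippet below builds an authenticated request to list your instances. The base URL, endpoint path, and response shape are assumptions for illustration; check the official API reference for the real ones.

```python
# Hypothetical sketch of calling the On-Demand Cloud REST API.
# The base URL and endpoint path are assumptions, not the documented API.
import json
import urllib.request

API_BASE = "https://api.hyperbolic.xyz/v1"  # assumed base URL

def build_list_instances_request(api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for the (assumed) instance-list endpoint."""
    return urllib.request.Request(
        f"{API_BASE}/marketplace/instances",  # hypothetical path
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_list_instances_request("YOUR_API_KEY")
# with urllib.request.urlopen(req) as resp:   # uncomment to actually call the API
#     print(json.load(resp))
```

The same pattern (bearer token in the `Authorization` header) applies to any endpoint; swap the path and HTTP method as needed.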

Available Hardware

Current GPU Offerings

| GPU Model | VRAM | Price/Hour | Use Cases |
| --- | --- | --- | --- |
| H100 SXM | 80GB HBM3 | $1.39/hr | Large-scale training, LLM fine-tuning, production inference |
| H100 + InfiniBand | 80GB HBM3 | $1.89/hr | Multi-node distributed training, massive model parallelism |
| H200 | 141GB HBM3e | $1.99/hr | Next-gen AI workloads, ultra-large models, advanced research |
More GPUs Coming Soon: We’re expanding our offerings to include Blackwell (B200) chips.
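With hourly billing, estimating a job's cost is simple multiplication. A quick sketch using the rates listed above (check the live pricing page before budgeting, as rates can change):

```python
# Cost estimate from the hourly rates in the table above.
RATES = {"H100 SXM": 1.39, "H100 + InfiniBand": 1.89, "H200": 1.99}  # $/GPU/hr

def cluster_cost(gpu: str, count: int, hours: float) -> float:
    """Total cost in dollars for `count` GPUs running for `hours` hours."""
    return round(RATES[gpu] * count * hours, 2)

print(cluster_cost("H100 + InfiniBand", 8, 24))  # 8x H100 IB for one day -> 362.88
```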

Regions & Data Centers

GPUs are available across multiple regions for optimal latency and compliance requirements.

Available Regions

  • North America
  • Europe
  • United Kingdom
Select regions closest to your users for inference workloads, or regions with the best GPU availability for training tasks.

Use Cases

Model Training

  • Fine-tune LLMs on custom datasets with multi-GPU support
  • Train computer vision models with high-throughput data pipelines
  • Run distributed training across multiple nodes with InfiniBand
  • Experiment with architectures using automatic checkpointing
Recommended GPUs: H100 for most training tasks, H200 for ultra-large models
Production Inference

  • Host custom inference endpoints with auto-scaling capabilities
  • Deploy production model servers using TorchServe, Triton, or vLLM
  • Run batch inference jobs with optimized throughput
  • A/B test different models with traffic splitting
Recommended GPUs: H100 for high-throughput inference, H200 for serving 70B+ parameter models
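Why the H200 for 70B+ models: in fp16/bf16, model weights alone take roughly 2 bytes per parameter, before the KV cache and activation overhead. A rough sizing heuristic (not an exact requirement):

```python
# Back-of-the-envelope VRAM check: fp16/bf16 weights take ~2 bytes per
# parameter, before KV cache and activations. A rough heuristic only.
def weight_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate VRAM needed for the weights alone, in GB."""
    return round(params_billions * 1e9 * bytes_per_param / 1e9, 1)

print(weight_gb(70))  # 140.0 GB -- near the H200's 141 GB, beyond one H100's 80 GB
print(weight_gb(7))   # 14.0 GB -- comfortable on a single H100
```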
Research & Development

  • Prototype AI applications with Jupyter notebooks
  • Test GPU-accelerated code with full debugging capabilities
  • Build ML pipelines with MLflow or Kubeflow integration
  • Reproduce paper results with exact environment replication
Recommended GPUs: H100 for most research, H200 for cutting-edge experiments
Large-Scale Distributed Computing

  • Large language model training with model parallelism
  • Distributed deep learning with data parallelism
  • High-performance computing workloads
  • Massive batch processing with coordinated jobs
Recommended Setup: 8x or 16x H100 with InfiniBand for optimal performance
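To make the 16x topology concrete: a 2-node x 8-GPU cluster runs 16 processes, each identified by a global rank that maps to a (node, local GPU) pair. Launchers such as torchrun compute this bookkeeping for you; the sketch below just shows the mapping.

```python
# Rank bookkeeping for a 2-node x 8-GPU cluster. Launchers like torchrun
# handle this automatically; shown here to make the topology concrete.
GPUS_PER_NODE = 8
NUM_NODES = 2
WORLD_SIZE = GPUS_PER_NODE * NUM_NODES  # 16 processes total

def locate(global_rank: int) -> tuple[int, int]:
    """Map a global rank to (node index, local GPU index)."""
    return divmod(global_rank, GPUS_PER_NODE)

print(locate(0))   # (0, 0) -- first GPU on the first node
print(locate(11))  # (1, 3) -- fourth GPU on the second node
```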

Security Best Practices

Always follow security best practices when deploying GPU instances with public access.

Instance Security

  • SSH Keys: Use strong SSH keys, never share private keys
  • Updates: Keep your OS and packages updated
  • Monitoring: Set up logging and monitoring for suspicious activity
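For example, a minimal sshd_config hardening fragment (standard OpenSSH options; adjust to your environment) enforces key-only logins:

```
# /etc/ssh/sshd_config — disable password logins on a public GPU instance
PasswordAuthentication no
PermitRootLogin prohibit-password
MaxAuthTries 3
```

After editing, reload the SSH daemon (e.g. `sudo systemctl reload sshd`) from a session you keep open, and verify you can still log in with your key before closing it.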

Data Protection

  • Encryption: Use encrypted storage for sensitive data
  • Backups: Regular backups of important models and datasets
  • Access Control: Implement proper IAM policies
  • Compliance: Ensure compliance with data regulations (GDPR, HIPAA)
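One concrete backup habit: record a SHA-256 checksum when you create a backup and compare it on restore, so a corrupted model checkpoint or dataset archive is caught early. A minimal sketch:

```python
# Sketch: verify backup integrity with SHA-256 checksums. Record the digest
# at backup time and compare it after transfer or on restore.
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk: int = 1 << 20) -> str:
    """Stream a file in 1 MiB chunks and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()
```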

Performance Benchmarks

Training Performance Comparison

| Workload | H100 80GB | H100 + IB (8x) | H200 141GB | H200 + IB (8x) |
| --- | --- | --- | --- | --- |
| Llama 2 7B Fine-tuning | 18 min/epoch | 3 min/epoch | 15 min/epoch | 2 min/epoch |
| Llama 2 70B Fine-tuning | 3.5 hrs/epoch | 30 min/epoch | 2.8 hrs/epoch | 25 min/epoch |
| Mixtral 8x7B Training | 8 hrs | 1.2 hrs | 6.5 hrs | 1 hr |
| GPT-3 175B Fine-tuning | N/A | 48 hrs | 36 hrs | 28 hrs |
Benchmarks are approximate and vary based on batch size, precision, and optimization settings.
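The table also implies the scaling efficiency you can expect (assuming the single-GPU columns use one GPU): going from 18 to 3 min/epoch on 8 GPUs is a 6x speedup, i.e. 75% efficiency.

```python
# Scaling efficiency implied by the benchmark table:
# single-GPU time / (N-GPU time * N).
def scaling_efficiency(t1: float, tn: float, n: int) -> float:
    return round(t1 / (tn * n), 3)

print(scaling_efficiency(18, 3, 8))  # 0.75  -- Llama 2 7B on 8x H100 + IB
print(scaling_efficiency(15, 2, 8))  # 0.938 -- Llama 2 7B on 8x H200 + IB
```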

Getting Started

Step 1: Set Up Your Account

  • Create your account
  • Add your SSH public key in account settings
  • Fund with $10+ to get started (credit card or crypto)
Step 2: Choose Your GPU

Browse available GPUs at app.hyperbolic.ai
Choose H100 for most AI workloads, or H200 for ultra-large models requiring maximum memory.
Step 3: Launch Instance

  • Select GPU type and quantity
  • Configure storage (if needed)
  • Add or configure your SSH key for access
  • Click “Rent” to deploy
Step 4: Connect & Build

```bash
# Connect via SSH
{ssh command from your Active Instances page}

# Verify GPU is available
nvidia-smi

# Start building!
python train.py
```
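Before kicking off training, it can help to sanity-check that the instance reports the GPU count and VRAM you expect. A sketch that parses `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader` output (the sample string below stands in for the real command output):

```python
# Parse `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`
# output to sanity-check the instance. The sample string is illustrative.
def parse_gpus(csv_text: str) -> list[tuple[str, int]]:
    """Return (gpu_name, vram_mib) for each line of nvidia-smi CSV output."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, mem = (field.strip() for field in line.split(","))
        gpus.append((name, int(mem.split()[0])))  # "81559 MiB" -> 81559
    return gpus

sample = "NVIDIA H100 80GB HBM3, 81559 MiB\nNVIDIA H100 80GB HBM3, 81559 MiB"
print(parse_gpus(sample))
```

In a real script you would feed `parse_gpus` the output of `subprocess.run(["nvidia-smi", ...])` and assert on the expected count and memory before launching the job.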


Next Steps

Launch Your First GPU

Ready to deploy? Start with $10 in free credits when you sign up. No credit card required for trial.

Need help? Email [email protected] for support inquiries, or use the in-app chat widget for immediate assistance.