Storage and Port Configuration

Learn how to configure storage volumes and network ports for your H100 and H200 GPU instances to support machine learning workloads, model training, and inference deployments.

Storage Options

Every GPU instance comes with onboard storage included, plus the option to attach persistent storage volumes.

Storage Types Overview

Onboard Storage (Included with Every Instance)
  • Cost: Free (included with instance pricing)
  • Provisioning: Automatically available when the instance starts
  • Persistence: Erased when the instance is terminated
  • Performance: High-speed NVMe SSD (up to 7,000 MB/s)
  • Use Cases: Active training, temporary data, cache, scratch space
Onboard Storage by Configuration:
GPU Type   Region           Onboard Storage Size
H100       us-central-1     18TB
H100       eu-north-4       10TB
H100       uk-southeast-3   24TB
H200       uk-central-3     24TB
Onboard storage is erased when the instance is terminated. Always save important data to persistent storage or external services before terminating an instance.

Working with Storage

Using Onboard Storage

Onboard storage is automatically mounted and ready to use when your instance starts:
# View all available storage
df -h

# Onboard storage is typically mounted at:
# /home/ubuntu (root volume for OS and user files)
# /mnt or /data (additional onboard storage space)

# Check onboard storage usage
du -sh /home/ubuntu/*
du -sh /mnt/*
The exact mount points may vary by instance configuration. Use df -h or lsblk to see all available storage.

Creating and Attaching Persistent Storage

Step 1: Create Persistent Volume

In the Hyperbolic web console:
  1. Navigate to the Storage section
  2. Click “Create Persistent Volume”
  3. Specify size (100GB - 10TB)
  4. Select region (currently us-central-1 only)
  5. Name your volume for easy identification
Persistent storage incurs additional hourly charges. Check current pricing in the console.

Step 2: Attach to Instance

After creating the volume:
  1. Go to your running instance details
  2. Click “Attach Storage”
  3. Select your persistent volume from the list
  4. The volume will be attached as a block device (e.g., /dev/vdb)

Step 3: Mount and Use

SSH into your instance and mount the volume:
# Check if volume is attached
lsblk

# Format if new volume (only do this once!)
sudo mkfs.ext4 /dev/vdb

# Create mount point
sudo mkdir -p /mnt/persistent

# Mount the volume
sudo mount /dev/vdb /mnt/persistent

# Set permissions
sudo chown -R $USER:$USER /mnt/persistent

# Make mount persistent across reboots
echo "/dev/vdb /mnt/persistent ext4 defaults 0 2" | sudo tee -a /etc/fstab

Storage Configuration

Storage Planning by Workload

When launching an instance, plan your storage strategy based on your workload and the options available in your region.
Training Workloads:
  • Use onboard storage for active training data and scratch space
  • If available (us-central-1), attach persistent storage for:
    • Model checkpoints
    • Final trained models
    • Datasets you want to reuse
  • For regions without persistent storage, implement regular backups to S3/GCS/Azure
Inference Workloads:
  • Load models into onboard storage for fastest performance
  • Use persistent storage (if available) for model library
  • Cache frequently accessed data on onboard storage (see the staging sketch below)
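For example, staging a model from the persistent library onto onboard NVMe before serving (a sketch; the model path is illustrative):
# Copy a model from the slower persistent volume to fast onboard NVMe
rsync -aP /mnt/persistent/model-library/my-model/ /home/ubuntu/models/my-model/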
Development/Experimentation:
  • Use onboard storage for active development
  • Save important results to persistent storage or external services
  • Implement git hooks to back up code changes (a sketch follows)
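As a minimal sketch of such a hook (assuming an off-instance remote named backup has been added):
#!/bin/bash
# Save as .git/hooks/post-commit and make it executable:
#   chmod +x .git/hooks/post-commit
# Mirrors every commit to the off-instance remote "backup"
git push --quiet backup HEAD &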
Storage Recommendations by Configuration:
GPU    Region           Onboard   Strategy
H100   us-central-1     18TB      Use onboard for active work + persistent volumes for long-term storage
H100   eu-north-4       10TB      Abundant onboard space, but back up critical data before termination
H100   uk-southeast-3   24TB      Abundant onboard space, but back up critical data before termination
H200   uk-central-3     24TB      High-performance onboard storage; export models before termination

Managing Storage Volumes

Step 1: Check Your Onboard Storage

# View all storage available on your instance
df -h

# Check disk usage by directory
du -sh /*

# Monitor I/O performance
iostat -x 1
Your onboard storage is automatically available and includes:
  • System root volume (OS and applications)
  • Additional data volume (varies by configuration: 2TB - 24TB)

Step 2: Manage Persistent Storage Volumes

If you’ve created persistent storage (us-central-1 only):
# List block devices to find your persistent volume
lsblk

# Persistent volumes appear as /dev/vd* devices
# Mount your persistent volume
sudo mkdir -p /mnt/persistent
sudo mount /dev/vdb /mnt/persistent

# Make mount persistent across reboots
echo "/dev/vdb /mnt/persistent ext4 defaults 0 2" | sudo tee -a /etc/fstab

Step 3: Transfer Data Before Termination

Remember: Onboard storage is erased when the instance is terminated!
Before terminating an instance:
# Option 1: Copy to persistent storage (if available)
rsync -avP /home/ubuntu/important-data/ /mnt/persistent/backup/

# Option 2: Upload to S3
aws s3 sync /home/ubuntu/models/ s3://my-bucket/models/

# Option 3: Upload to Google Cloud Storage
gsutil -m cp -r /home/ubuntu/checkpoints/ gs://my-bucket/checkpoints/

# Option 4: Create tar archive and upload
tar -czf models.tar.gz /home/ubuntu/models/
curl -T models.tar.gz https://transfer.sh/models.tar.gz
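
Before terminating, verify that the upload actually completed; a quick size comparison (a sketch, using the S3 option above):
# Compare what landed in S3 against the local copy
aws s3 ls s3://my-bucket/models/ --recursive --summarize | tail -2
du -sb /home/ubuntu/models/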

Data Management Best Practices

Organizing Your Storage

Using Onboard Storage (All Instances):
# Onboard storage structure (size varies: 2TB - 24TB)
/home/ubuntu/           # User home directory
├── code/              # Your application code
├── data/              # Active datasets
├── models/            # Working models
└── outputs/           # Results and logs

/mnt/data/             # Additional onboard space (if available)
├── cache/             # Temporary files
├── checkpoints/       # Training checkpoints
└── scratch/           # Experimental work
Using Persistent Storage (When Available):
# Persistent volume (created separately, attached to instance)
/mnt/persistent/        # Survives instance termination
├── datasets/          # Reusable datasets
├── model-library/     # Trained models collection
├── checkpoints/       # Important checkpoints
└── shared-resources/  # Team shared data
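Both layouts can be created in one step with brace expansion (a sketch matching the trees above):
# Onboard working directories
mkdir -p /home/ubuntu/{code,data,models,outputs}

# Persistent-volume directories (if a volume is mounted)
sudo mkdir -p /mnt/persistent/{datasets,model-library,checkpoints,shared-resources}
sudo chown -R $USER:$USER /mnt/persistent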

Backup Strategies

Since onboard storage is erased on termination, implement an appropriate backup strategy.
For Instances with Persistent Storage (us-central-1):
# Automated backup from onboard to persistent storage
# Add to crontab: crontab -e
0 */2 * * * rsync -avP /home/ubuntu/models/ /mnt/persistent/models/
0 */4 * * * rsync -avP /home/ubuntu/checkpoints/ /mnt/persistent/checkpoints/
For Instances without Persistent Storage:
# Option 1: Backup to S3
aws s3 sync /home/ubuntu/models/ s3://my-bucket/models/ --delete

# Option 2: Backup to Google Cloud Storage
gsutil -m rsync -r /home/ubuntu/models/ gs://my-bucket/models/

# Option 3: Backup to Azure Blob Storage
az storage blob sync -s /home/ubuntu/models/ -c mycontainer

# Automate with cron (every 6 hours)
0 */6 * * * aws s3 sync /home/ubuntu/important/ s3://my-bucket/backup/
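
If a sync takes longer than the cron interval, runs can overlap; wrapping the job in flock (a sketch) ensures only one copy runs at a time:
# Overlapping runs exit immediately instead of competing for bandwidth
0 */6 * * * flock -n /tmp/backup.lock aws s3 sync /home/ubuntu/important/ s3://my-bucket/backup/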

Optimizing Storage Performance

Maximize Onboard Storage Performance:
  • Onboard NVMe provides up to 7,000 MB/s throughput
  • Use for active datasets and model training
  • Keep frequently accessed files on onboard storage
  • Clean temporary files regularly to maintain performance
Persistent Storage Optimization (if available):
  • Network-attached with up to 1,000 MB/s throughput
  • Best for long-term storage, not active training
  • Use for model archives and dataset libraries
  • Consider compression for infrequently accessed data
Managing Limited Storage (2TB configurations):
# Monitor disk usage closely
watch -n 60 'df -h | grep -v tmpfs'

# Clean package caches
pip cache purge
conda clean --all -y
sudo apt-get clean

# Remove old Docker images if using containers
docker system prune -a -f

# Stream large datasets instead of downloading
# Example with TensorFlow:
dataset = tf.data.TFRecordDataset(["s3://bucket/data.tfrecord"])
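The same streaming idea works at the shell level: pipe data from object storage straight into a consumer so the archive never touches the disk (a sketch, assuming an S3 bucket):
# Stream a tar archive from S3 and unpack it without storing the archive itself
aws s3 cp s3://my-bucket/dataset.tar.gz - | tar -xz -C /home/ubuntu/data/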
Data Lifecycle Management:
# Set up automated cleanup for temporary files
find /home/ubuntu/cache -type f -mtime +1 -delete
find /tmp -type f -mtime +1 -delete

# Compress old checkpoints
find /home/ubuntu/checkpoints -name "*.ckpt" -mtime +7 -exec gzip {} \;

# Archive completed experiments
tar -czf experiment-$(date +%Y%m%d).tar.gz /home/ubuntu/experiments/completed/

Port Configuration

Configure network ports to enable access to services running on your GPU instances.

Exposing Services

SSH Port Forwarding

The most secure method for accessing services:
# Local machine: Create SSH tunnel
ssh -L 8888:localhost:8888 ubuntu@[instance-ip] -i ~/.ssh/hyperbolic_key.pem

# On instance: Launch Jupyter
jupyter notebook --no-browser --port=8888

# Access at: http://localhost:8888

Multiple Port Forwarding

# Forward multiple ports simultaneously
ssh -L 8888:localhost:8888 \
    -L 6006:localhost:6006 \
    -L 5000:localhost:5000 \
    ubuntu@[instance-ip] -i ~/.ssh/hyperbolic_key.pem
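
If you forward the same ports regularly, the flags can live in your SSH config instead (a sketch; the host alias is a placeholder):
# ~/.ssh/config
Host hyperbolic-gpu
    HostName [instance-ip]
    User ubuntu
    IdentityFile ~/.ssh/hyperbolic_key.pem
    LocalForward 8888 localhost:8888
    LocalForward 6006 localhost:6006

# Connect (and forward all listed ports) with: ssh hyperbolic-gpu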

Advanced Networking

SOCKS Proxy Configuration

For full network access through your instance:
# Create SOCKS proxy
ssh -D 8080 ubuntu@[instance-ip] -i ~/.ssh/hyperbolic_key.pem

# Configure applications to use SOCKS proxy at localhost:8080

Persistent Tunnels

Use autossh for maintaining persistent connections:
# Install autossh
sudo apt-get install autossh

# Create persistent tunnel with auto-reconnect
autossh -M 0 -f -N \
  -o "ServerAliveInterval 30" \
  -o "ServerAliveCountMax 3" \
  -L 8888:localhost:8888 \
  ubuntu@[instance-ip] -i ~/.ssh/hyperbolic_key.pem

Security Considerations

Never expose services directly to the internet without proper authentication and encryption. Always use SSH tunnels for development and testing.

Best Practices

  1. Use SSH tunnels for all development services
  2. Implement authentication before exposing any service
  3. Enable HTTPS for production deployments
  4. Monitor access logs regularly
  5. Rotate SSH keys periodically
# Monitor active connections
netstat -tulpn | grep LISTEN

# Check SSH connection attempts
sudo tail -f /var/log/auth.log | grep sshd

# List established connections
ss -tunap | grep ESTABLISHED

Storage and Port Automation

Monitoring and Alerts

Set up monitoring for both storage types:
#!/bin/bash
# storage-monitor.sh

echo "=== Storage Health Check ==="

# Check onboard storage
ONBOARD_USAGE=$(df -h /home/ubuntu | tail -1 | awk '{print $5}' | sed 's/%//')
ONBOARD_SIZE=$(df -h /home/ubuntu | tail -1 | awk '{print $2}')
echo "Onboard Storage: $ONBOARD_SIZE (${ONBOARD_USAGE}% used)"

# Determine alert threshold based on size
if [[ "$ONBOARD_SIZE" == *"2T"* ]]; then
    THRESHOLD=70  # Lower threshold for 2TB configs
else
    THRESHOLD=85  # Standard threshold for larger configs
fi

# Alert if over threshold
if [ $ONBOARD_USAGE -gt $THRESHOLD ]; then
    echo "⚠️  WARNING: Onboard storage ${ONBOARD_USAGE}% full (threshold: ${THRESHOLD}%)"
    echo "   → Clean temporary files: find /tmp -type f -mtime +1 -delete"
    echo "   → Clear package cache: pip cache purge && conda clean --all"
fi

# Check for persistent storage
if mountpoint -q /mnt/persistent 2>/dev/null; then
    PERSISTENT_USAGE=$(df -h /mnt/persistent | tail -1 | awk '{print $5}' | sed 's/%//')
    PERSISTENT_SIZE=$(df -h /mnt/persistent | tail -1 | awk '{print $2}')
    echo "Persistent Storage: $PERSISTENT_SIZE (${PERSISTENT_USAGE}% used)"
    echo "✓ Data on persistent storage survives termination"
else
    echo "⚠️  No persistent storage attached"
    echo "⚠️  ALL DATA WILL BE LOST ON INSTANCE TERMINATION!"
fi

# I/O performance tracking
echo -e "\n=== Storage Performance ==="
iostat -x 1 3 | tail -4 | head -3

# Backup status check
echo -e "\n=== Backup Status ==="
if crontab -l 2>/dev/null | grep -q rsync; then
    echo "✓ Automated backups are configured"
    crontab -l | grep rsync
else
    echo "⚠️  No automated backups configured"
    echo "   → Set up backups to persistent storage or external services"
fi
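
To run the check automatically (a sketch), make the script executable and schedule it with cron, appending output to a log:
chmod +x storage-monitor.sh

# crontab -e, then add an hourly run:
0 * * * * /home/ubuntu/storage-monitor.sh >> /home/ubuntu/storage-monitor.log 2>&1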

Troubleshooting

Common Storage Issues

Symptoms: Persistent volume not visible, or mount fails
Solutions:
# 1. Check if persistent volume is attached
lsblk
# Look for /dev/vdb or similar

# 2. Check if it has a filesystem
sudo file -s /dev/vdb

# 3. If "data" (no filesystem), format it (ONLY for new volumes!)
sudo mkfs.ext4 /dev/vdb

# 4. Create mount point and mount
sudo mkdir -p /mnt/persistent
sudo mount /dev/vdb /mnt/persistent

# 5. Fix permissions
sudo chown -R $USER:$USER /mnt/persistent

# 6. Make persistent across reboots
echo "/dev/vdb /mnt/persistent ext4 defaults 0 2" | sudo tee -a /etc/fstab
Note: Persistent storage must be created in the web console first, then attached to your instance.
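If the device never appears in lsblk, the kernel log can confirm whether the block device was presented to the instance at all (a quick check):
# Look for recent virtio block device events
sudo dmesg | grep -iE "vd[b-z]|virtio" | tail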
Symptoms: Training fails, services crash, unable to save checkpoints
Solutions:
# 1. Check what's using space
du -sh /* 2>/dev/null | sort -rh | head -20
df -h

# 2. Clean temporary files and caches
find /tmp -type f -mtime +1 -delete
find ~/cache -type f -mtime +7 -delete
pip cache purge
conda clean --all -y
sudo apt-get clean

# 3. Compress old checkpoints
find ~/checkpoints -name "*.ckpt" -mtime +3 -exec gzip {} \;

# 4. If you have persistent storage, move data there
if mountpoint -q /mnt/persistent; then
    rsync -avP ~/models/ /mnt/persistent/models/
    rm -rf ~/models/old_versions/
fi

# 5. For limited storage (2TB), use external storage
# Upload to S3, then remove the local copies once the sync succeeds
aws s3 sync ~/outputs/ s3://my-bucket/outputs/ && rm -rf ~/outputs/

# 6. Remove Docker images if using containers
docker image prune -a -f
docker system prune -a -f --volumes
Prevention Tips:
  • Set up automated cleanup in cron
  • Use persistent storage for long-term data (if available)
  • Stream large datasets instead of downloading
  • Implement regular backups to external storage

Common Port Issues

Symptoms: Service fails to start on the specified port
Solutions:
# Find process using port
sudo lsof -i :8888

# Kill process if needed
sudo kill -9 [PID]

# Or use different port
jupyter notebook --port=8889
Symptoms: Service running but not accessible
Solutions:
# Verify service is listening
netstat -tulpn | grep [PORT]

# Check SSH tunnel is active
ps aux | grep ssh

# Restart SSH tunnel
ssh -L [PORT]:localhost:[PORT] ubuntu@[instance-ip] -i ~/.ssh/key.pem
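
If the tunnel still fails, run the SSH client verbosely to see where the forwarding breaks (a sketch; use -vv or -vvv for more detail):
# -v prints channel and port-forwarding setup details
ssh -v -L [PORT]:localhost:[PORT] ubuntu@[instance-ip] -i ~/.ssh/key.pem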

Getting Help

If you encounter issues with storage or port configuration:
  1. Check the instance logs in the web console
  2. Review the troubleshooting section above
  3. Use the Intercom widget in the console for immediate assistance
  4. Contact [email protected] with:
    • Instance ID
    • Error messages
    • Steps to reproduce the issue

Next Steps

Troubleshooting Guide

Find solutions to common issues and error messages