Anyone running A100s on Hetzner's new GPU cloud? Pricing looks insane
Just got quoted €2.49/hour for an A100 on Hetzner's new offering, launched in March. That's significantly cheaper than AWS or GCP for sustained workloads.
Running some benchmarks on CUDA 12.3 with our training jobs (LLaMA 7B finetuning, 1.2TB datasets). Getting ~300 TFLOPS sustained per GPU (close to the A100's BF16 tensor-core peak of 312), no throttling observed.
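If anyone wants to sanity-check their own instance, this is roughly the kind of probe we use. A minimal sketch assuming PyTorch on CUDA; matrix size, dtype, and iteration count are arbitrary choices, nothing Hetzner-specific:

```python
# Rough sustained-matmul TFLOPS probe for a single A100.
# Matrix size / dtype / iteration count are illustrative.
import time
import torch

N = 8192  # each matmul costs 2*N^3 FLOPs
a = torch.randn(N, N, device="cuda", dtype=torch.bfloat16)
b = torch.randn(N, N, device="cuda", dtype=torch.bfloat16)

# Warm up so clocks and cuBLAS heuristics settle before timing.
for _ in range(10):
    a @ b
torch.cuda.synchronize()

iters = 200
start = time.perf_counter()
for _ in range(iters):
    a @ b
torch.cuda.synchronize()
elapsed = time.perf_counter() - start

tflops = 2 * N**3 * iters / elapsed / 1e12
print(f"sustained: {tflops:.0f} TFLOPS bf16")
```

Run it a few times back to back; if the number sags across repeated runs, that's your throttling signal.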
Anyone else testing this? Curious about:
- Network latency to EU backbones (we're in Amsterdam)
- Whether they're offering reserved capacity
- SLA uptime guarantees (their datasheet was vague)
For MLOps teams doing serious training at scale, this could be a game changer vs. the usual suspects. Thoughts?
€2.49/hr is genuinely competitive. Been running A100s across AWS, GCP, and Lambda—Hetzner's pricing is 30-40% cheaper for sustained training. That said, the vagueness on SLA uptime is a real concern for production workloads. Their support response times are also slower than hyperscalers, which matters when you're mid-training on a 1.2TB dataset and hit network issues.
For the latency question: direct fiber from Amsterdam to Frankfurt is ~3ms one-way (call it 6-7ms RTT in practice), but cross-Atlantic paths to US backbones add ~75-90ms RTT and can be inconsistent. If your training data fetching is bandwidth-heavy, factor in their egress costs; they're not as bad as AWS's but not free either.
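If you want a quick read on latency from your Amsterdam box before involving anyone's sales team, TCP connect time approximates one RTT. Minimal sketch; the hostname is a placeholder, point it at a machine in whichever facility you're quoted:

```python
# Quick TCP connect-latency probe (connect time ~ one RTT).
# HOST is a placeholder; aim it at a box in the target facility.
import socket
import statistics
import time

HOST, PORT = "your-test-host.example.com", 443  # placeholder endpoint
samples = []
for _ in range(20):
    t0 = time.perf_counter()
    with socket.create_connection((HOST, PORT), timeout=2):
        samples.append((time.perf_counter() - t0) * 1000)
print(f"median RTT ~{statistics.median(samples):.1f} ms")
```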
Reserved capacity doesn't exist yet afaik, but worth asking their sales team directly. They're actively negotiating with enterprise customers.
Yeah, that 30-40% savings is huge for our budget. The SLA concern is exactly what's holding me back from fully committing—gonna reach out to their sales team this week to get clarity on uptime guarantees before we migrate any production jobs. Thanks for confirming the pricing advantage!
€2.49/hr is solid, but worth checking their network egress costs—that's where they usually get you on sustained training jobs. Also, Hetzner's interconnect to their EU backbone is decent (direct peering with DE-CIX) but latency from Amsterdam will probably be 8-12ms depending on which facility. If you're doing distributed training across multiple nodes, test their intra-rack bandwidth first before scaling up.
Good catch on egress; that's definitely the hidden cost. Also worth testing their NVLink topology if you're doing multi-GPU work. Hetzner's docs on GPU interconnect bandwidth aren't super clear, but from what I've seen in their EU datacenter setups, it's solid for all-to-all comm. Would suggest running nccl-tests before committing your full pipeline.
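+1 on nccl-tests. If you'd rather stay in Python, a torch.distributed all-reduce loop gives you roughly the same bus-bandwidth number. Sketch below assumes PyTorch with the NCCL backend and a torchrun launch; the buffer size is arbitrary:

```python
# Rough NCCL all-reduce bus-bandwidth check. Launch with e.g.
#   torchrun --nproc_per_node=<gpus> this_script.py
# (add --nnodes/--rdzv args for multi-node). Buffer size is arbitrary.
import os
import time
import torch
import torch.distributed as dist

dist.init_process_group("nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

x = torch.randn(64 * 1024 * 1024, device="cuda")  # 256 MB fp32 buffer

for _ in range(5):  # warm-up
    dist.all_reduce(x)
torch.cuda.synchronize()

iters = 20
t0 = time.perf_counter()
for _ in range(iters):
    dist.all_reduce(x)
torch.cuda.synchronize()
dt = time.perf_counter() - t0

world = dist.get_world_size()
# nccl-tests convention: bus bw = alg bw * 2*(n-1)/n for all-reduce.
busbw = x.numel() * 4 * iters / dt * 2 * (world - 1) / world / 1e9
if dist.get_rank() == 0:
    print(f"all_reduce bus bandwidth ~{busbw:.1f} GB/s")
dist.destroy_process_group()
```

If that number lands way under NVLink-class bandwidth inside a single node, the topology question answers itself.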
Haven't tried Hetzner yet but did a similar cost analysis last month. One thing nobody's mentioned: check the GPU interconnect with nvidia-smi nvlink -s for per-link status and nvidia-smi topo -m for the pairwise topology. Multi-GPU training perf can tank hard if the topology is suboptimal. Also worth running a quick data load benchmark with your actual dataset pipeline; network I/O to storage can kill your iteration speed even if compute looks good on paper (quick sketch below).
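For the data load benchmark, something like this with your real dataset swapped in is enough to see whether the input pipeline keeps up. The in-memory TensorDataset here is just a stand-in; the interesting case is your actual network-backed storage:

```python
# Input-pipeline throughput check, independent of GPU compute.
# The in-memory TensorDataset is a stand-in; swap in your real
# dataset/loader so network I/O to storage is actually exercised.
import time
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.randn(10_000, 4096))  # placeholder dataset
dl = DataLoader(ds, batch_size=64, num_workers=8, pin_memory=True)

t0 = time.perf_counter()
nsamples = 0
for (batch,) in dl:
    nsamples += batch.size(0)
dt = time.perf_counter() - t0
print(f"{nsamples / dt:.0f} samples/s from the input pipeline alone")
```

If samples/s here is lower than what your GPUs can consume, fix storage before blaming the compute.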