RTX 4090 vs H100 for LLM fine-tuning—cost vs performance tradeoff?
I'm spinning up a small lab for fine-tuning open-source LLMs and trying to figure out the sweet spot between hardware cost and training throughput.
Current setup options:
- Option A: 4x RTX 4090 ($15K total), linked over PCIe (the 4090 doesn't support NVLink)
- Option B: 2x H100 ($40K total) with full NVLink support
For typical fine-tuning workloads (7B-13B models, batch size 8-16), I'm seeing benchmark estimates of:
- 4x RTX 4090: ~520 TFLOPs mixed precision, ~96GB VRAM total
- 2x H100: ~1,456 TFLOPs mixed precision, ~160GB VRAM total
But the H100s are roughly 2.7x the cost. I'm running this on a Vultr bare metal setup with redundant fiber, so power/cooling isn't a constraint.
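Quick sanity check on cost per unit of compute, using the quoted estimates above (peak numbers, not measured throughput):

```python
# Hardware cost per peak mixed-precision TFLOP, from the estimates
# quoted in the post (not vendor-verified benchmark results).

def dollars_per_tflop(price_usd: float, tflops: float) -> float:
    """Upfront hardware cost per peak TFLOP of mixed-precision compute."""
    return price_usd / tflops

print(f"4x RTX 4090: ${dollars_per_tflop(15_000, 520):.2f}/TFLOP")
print(f"2x H100:     ${dollars_per_tflop(40_000, 1_456):.2f}/TFLOP")
```

On paper the H100s actually come out slightly cheaper per peak TFLOP, so the real question is utilization and scaling efficiency, not raw price per FLOP.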
Has anyone done this comparison in production? Should I factor in:
- Memory bandwidth and interconnect differences (NVLink vs. PCIe bottlenecks)?
- Multi-GPU scaling efficiency above 4 GPUs?
- Actual wall-clock time for typical fine-tuning jobs?
I'd rather invest in 4x 4090s and scale horizontally later if needed, but want to make sure I'm not hitting a practical limit. Thoughts?
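For reference, here's the back-of-envelope wall-clock model I've been using. The 6 x params x tokens FLOPs approximation is a standard rule of thumb for training compute, but the MFU (model FLOPs utilization) and multi-GPU scaling factors below are guesses, not measurements:

```python
# Rough wall-clock estimate for a compute-bound fine-tuning run.
# Uses the common ~6 * params * tokens training-FLOPs approximation.
# mfu and scaling are assumed placeholder values -- measure your own jobs.

def finetune_hours(params_b: float, tokens_b: float, peak_tflops: float,
                   mfu: float = 0.35, scaling: float = 0.85) -> float:
    """Estimated wall-clock hours for one pass over the data."""
    total_flops = 6 * (params_b * 1e9) * (tokens_b * 1e9)
    effective_flops_per_s = peak_tflops * 1e12 * mfu * scaling
    return total_flops / effective_flops_per_s / 3600

# 7B model, 1B tokens, at each setup's quoted peak:
print(f"4x 4090: ~{finetune_hours(7, 1, 520):.1f} h")
print(f"2x H100: ~{finetune_hours(7, 1, 1_456):.1f} h")
```

With identical MFU and scaling assumptions the ratio just tracks peak TFLOPs (~2.8x), so the interesting part is whether those assumptions actually hold on 4x PCIe-connected 4090s.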
Edited at 25 Mar 2026, 19:35
Have you factored in power/cooling costs? 4x 4090s will pull ~1.4kW sustained, H100s closer to 700W total. Over 2 years that's a real delta depending on your datacenter rates. Also consider: the RTX 4090 has no NVLink at all, so peer-to-peer traffic goes over PCIe and you'll hit diminishing returns past 2-3 GPUs in parallel, whereas NVLinked H100s scale close to linearly. For 7B-13B workloads you might be fine with 2x 4090s + cloud overflow, honestly.
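To put a number on that power delta (the 1.4kW vs 0.7kW sustained-draw figures are from my estimate above; the electricity rate is an assumed example, plug in your own):

```python
# Rough 2-year energy cost delta between the two setups.
# Sustained draw: ~1.4 kW (4x 4090) vs ~0.7 kW (2x H100).
# The $/kWh rate is an assumed example -- substitute your actual rate.

HOURS_PER_YEAR = 24 * 365

def energy_cost_usd(kw: float, years: float, usd_per_kwh: float) -> float:
    """Electricity cost of a constant sustained draw over the period."""
    return kw * HOURS_PER_YEAR * years * usd_per_kwh

rate = 0.12  # $/kWh, assumed
delta = energy_cost_usd(1.4, 2, rate) - energy_cost_usd(0.7, 2, rate)
print(f"2-year power cost delta: ~${delta:,.0f}")
```

At typical colo rates that lands around $1.5K over 2 years, so power alone won't close a $25K hardware gap; it matters more the longer you run the boxes.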
Good point on power costs; I hadn't fully mapped out the 2-year TCO. At our colocation rates (~$0.12/kWh), the extra ~0.7kW sustained works out to roughly $1.5K over 2 years, which is modest next to the $25K hardware gap, but taken together with the PCIe scaling ceiling on the 4090s it tips me toward the H100s even at the higher upfront price. Thanks for the reality check!