
Benchmark Results

Jetson AGX Orin 64GB | JetPack 6.2.2 | CUDA 12.6 | MAXN mode | All results roundtrip verified

8,537 MB/s: GPU decompress (nvCOMP LZ4, in-memory)
4,258 MB/s: GPU decompress (nvCOMP LZ4, 10 GB roundtrip)
4.3x: decompress speedup, in-memory vs CPU zstd-1
5.8x: roundtrip decompress speedup vs CPU zstd-1

[Chart: Decompression Throughput (MB/s), in-memory performance]

Roundtrip Results — 10 GB

End-to-end performance including disk I/O.

Method         Processor   Compress MB/s   Decompress MB/s   Ratio   Integrity
nvCOMP LZ4     GPU         517             4,258             1.98x   PASS
zstd-1         CPU         1,094           733               2.00x   PASS
zstd-3         CPU         1,014           741               2.00x   PASS

In-Memory Performance

Raw algorithm throughput without disk I/O bottleneck.

Method         Processor   Compress MB/s   Decompress MB/s   Integrity
nvCOMP LZ4     GPU         705             8,537             PASS
nvCOMP Snappy  GPU         1,615           5,756             PASS
zstd-1         CPU         1,747           2,001             PASS
Note: the GPU crossover point is approximately 10 MB; below that, kernel launch overhead dominates. HammerIO automatically routes small files to the CPU. Real-world compression ratios are typically 1.8x–2.5x on ML datasets and log files.

What these numbers mean

8,537 MB/s means...

A 1 GB model checkpoint restores in ~0.12 seconds in-memory. A forward-deployed AI node that reboots in the field is back to full inference in under a second.
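The arithmetic behind that claim can be checked directly. This is a minimal sketch using only the throughput figures quoted above:

```python
# Restore time = checkpoint size / decompression throughput.
# Throughputs are the in-memory figures from the table above.

def restore_seconds(size_mb: float, throughput_mb_s: float) -> float:
    """Time to decompress a payload at a given sustained throughput."""
    return size_mb / throughput_mb_s

gpu = restore_seconds(1024, 8537)   # 1 GB checkpoint, GPU nvCOMP LZ4
cpu = restore_seconds(1024, 2001)   # same checkpoint, CPU zstd-1

print(f"GPU: {gpu:.2f} s, CPU: {cpu:.2f} s, speedup: {cpu / gpu:.1f}x")
# GPU: 0.12 s, CPU: 0.51 s, speedup: 4.3x
```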

GPU crossover at ~10 MB

Below 10 MB, kernel launch overhead dominates. Above that, GPU decompression is 4.3x faster than CPU. Large ML datasets and model weights see the biggest gains.
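The routing rule this implies is easy to sketch. This is illustrative only; the constant and function names are assumptions, not HammerIO's actual API:

```python
# Size-based dispatch: payloads under the crossover go to the CPU codec,
# where kernel-launch overhead would dominate; larger payloads go to the
# GPU path, where that overhead amortizes.
GPU_CROSSOVER_BYTES = 10 * 1024 * 1024  # ~10 MB, per the benchmark above

def pick_backend(payload_size: int) -> str:
    """Choose a decompression backend for a payload of the given size."""
    return "cpu-zstd" if payload_size < GPU_CROSSOVER_BYTES else "gpu-lz4"

print(pick_backend(4 * 1024 * 1024))    # small file  -> cpu-zstd
print(pick_backend(512 * 1024 * 1024))  # checkpoint  -> gpu-lz4
```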

CPU fallback is still fast

CPU zstd-1 reaches 1,747 MB/s compress and 2,001 MB/s decompress in-memory. HammerIO only uses the GPU when the overhead is worth it.

Run it yourself

pip install hammerio
hammer benchmark --1gb
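For a rough feel of what an in-memory CPU benchmark measures, here is a stand-in using Python's stdlib zlib (not zstd, and not the `hammer benchmark` harness itself, so the numbers will differ from the tables above):

```python
import time
import zlib

# Semi-compressible payload: repeated text blocks, like log data (~10 MB).
data = (b"2025-01-01 INFO request served in 12ms\n" * 4000) * 64

t0 = time.perf_counter()
blob = zlib.compress(data, level=1)
t1 = time.perf_counter()
out = zlib.decompress(blob)
t2 = time.perf_counter()

assert out == data  # roundtrip integrity check, as in the tables above
mb = len(data) / 1e6
print(f"compress:   {mb / (t1 - t0):,.0f} MB/s  (ratio {len(data) / len(blob):.1f}x)")
print(f"decompress: {mb / (t2 - t1):,.0f} MB/s")
```

Timing the compress and decompress legs separately, and verifying the roundtrip, mirrors the Integrity column in the result tables.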

View benchmark source on GitHub →