What is the DGX Spark?

The NVIDIA DGX Spark is a compact desktop AI computer built around the GB10 Grace Blackwell Superchip. It started life as “Project DIGITS” and ships running DGX OS, NVIDIA’s Ubuntu-based distribution, on ARM64.

The trick that makes it interesting

A normal GPU workstation keeps CPU memory and GPU memory in separate pools connected over PCIe. The GPU has a fixed amount of fast VRAM, and the moment your model plus its KV cache exceeds that ceiling you are forced to quantize hard or shard the model across several cards.
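To see how quickly the KV cache eats into a fixed VRAM budget, here is a minimal sketch of the usual size estimate for a transformer with grouped-query attention. The layer count, head count, and head dimension are illustrative values for a hypothetical 70B-class model, not figures for any specific one.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, bytes_per_value=2):
    """Approximate bytes needed to cache keys and values for every layer and position."""
    # The leading 2 covers one tensor for keys and one for values.
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


# Hypothetical configuration (illustrative numbers only), 16-bit cache entries.
size = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, seq_len=32_768)
print(f"{size / 2**30:.1f} GiB")  # prints "10.0 GiB"
```

That is 10 GiB for one 32K-token sequence, on top of the weights, and it scales linearly with both context length and batch size.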

The DGX Spark does it differently. The Grace CPU and the Blackwell GPU sit in one package and share a single 128 GB pool of LPDDR5X memory, kept coherent over NVLink-C2C. The GPU addresses the same memory the CPU sees, with no PCIe copy in the middle. That means you can load models that would normally demand a multi-GPU server and simply run them: the ceiling is the shared 128 GB pool (less whatever the OS and your other processes are using), not a small block of dedicated VRAM.

The tradeoff is bandwidth. At about 273 GB/s, the Spark's LPDDR5X is considerably slower than the GDDR7 on a discrete workstation card or the HBM on a datacenter accelerator, so the Spark is a “fits very large models, runs them at moderate tokens per second” machine rather than a raw-throughput record-setter. For local development, prototyping, and fine-tuning, that is usually a reasonable tradeoff.
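A rough way to reason about “moderate tokens per second”: when a dense model generates one token at a time, every weight is read from memory once per token, so memory bandwidth divided by the weight footprint gives an upper bound on decode speed. The sketch below assumes a bandwidth of 273 GB/s (taken from NVIDIA's published specification; treat it as an assumption and check it for your unit) and ignores KV cache reads, compute time, and batching.

```python
def decode_tokens_per_second(weights_gb, bandwidth_gb_s):
    """Upper bound on single-stream decode speed for a dense model.

    Each generated token reads every weight once, so throughput cannot
    exceed memory bandwidth divided by the size of the weights.
    """
    return bandwidth_gb_s / weights_gb


SPARK_BANDWIDTH_GB_S = 273  # assumed value; verify against the current spec sheet

# Weight footprints are illustrative: parameters (billions) * 4 bits / 8.
for name, weights_gb in [("70B at 4-bit", 35), ("200B at 4-bit", 100)]:
    rate = decode_tokens_per_second(weights_gb, SPARK_BANDWIDTH_GB_S)
    print(f"{name}: at most ~{rate:.1f} tokens/s")
```

Real throughput will be lower than this bound. Mixture-of-experts models can do better than their total size suggests, because only the active experts are read for each token.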

Who it is for

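It is aimed at developers, researchers, and data scientists who want to prototype, fine-tune, and run inference on current-generation reasoning models locally, without renting cloud GPUs or babysitting a noisy rack server. With 128 GB of unified memory you can run inference on models up to roughly 200 billion parameters on a single unit, provided the weights are quantized to 4-bit (FP4) precision. Fine-tuning needs extra memory for gradients and optimizer state, so the practical ceiling for training is well below that.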
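The roughly-200-billion figure follows from simple arithmetic: weight footprint in GB is parameters (in billions) times bits per parameter, divided by 8. The sketch below checks which precisions fit. The 16 GB reserve for the OS, KV cache, and activations is a placeholder assumption, not an NVIDIA number.

```python
def weights_gb(params_billion, bits_per_param):
    """Weight footprint in GB (1 GB = 1e9 bytes) for a given precision."""
    return params_billion * bits_per_param / 8


def fits(params_billion, bits_per_param, memory_gb=128, reserve_gb=16):
    """True if the weights fit after setting aside reserve_gb (an assumed headroom)."""
    return weights_gb(params_billion, bits_per_param) <= memory_gb - reserve_gb


for bits in (16, 8, 4):
    size = weights_gb(200, bits)
    verdict = "fits" if fits(200, bits) else "does not fit"
    print(f"200B at {bits}-bit: {size:.0f} GB, {verdict} in 128 GB")
```

Only the 4-bit case (100 GB) fits on one unit. At 8-bit the same model needs 200 GB, and at 16-bit it needs 400 GB.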

When one box is not enough

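Each Spark has built-in ConnectX-7 200GbE networking. Connect two of them with a direct cable and you can run models up to about 405 billion parameters across the pair, again assuming 4-bit weights. NVIDIA documents stacking configurations up to four units (512 GB of combined memory, a theoretical peak of 4,000 TFLOPS FP4) for the largest open-weight models. Memory is unified only within each unit: across units the model is sharded, and the shards communicate over the network link.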
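As a back-of-the-envelope check on how many units a model needs, the sketch below divides the weight footprint by the usable memory per unit and rounds up. It assumes the weights shard evenly and reserves a hypothetical 16 GB per unit for the OS, KV cache, and activations. Real deployments also pay communication overhead that this ignores.

```python
import math


def units_needed(params_billion, bits_per_param,
                 memory_per_unit_gb=128, reserve_per_unit_gb=16):
    """Minimum number of units whose usable memory can hold the weights."""
    weights_gb = params_billion * bits_per_param / 8
    usable_gb = memory_per_unit_gb - reserve_per_unit_gb  # reserve is an assumption
    return math.ceil(weights_gb / usable_gb)


print(units_needed(200, 4))  # 100 GB of weights
print(units_needed(405, 4))  # 202.5 GB of weights
print(units_needed(405, 8))  # 405 GB of weights
```

Under these assumptions the three cases come out to one, two, and four units respectively, which lines up with the single-unit and paired figures quoted above.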

Where to go next

If the box is physically in front of you, jump to the first boot tutorial. If you want the numbers first, read hardware at a glance. For the deeper “why,” the unified memory explainer is the place to start.