Connect two Sparks

You have two DGX Sparks and want them to act as one larger machine, for models up to ~405B parameters or for distributed fine-tuning.

The idea is a direct point-to-point link between the two ConnectX-7 200GbE ports, running RoCE (RDMA over Converged Ethernet) for high-throughput, low-latency GPU-to-GPU communication.

Two Sparks linked directly over a 200GbE ConnectX-7 (QSFP) cable. Image: FiberMall.

  • Two Sparks, both running DGX OS with NVIDIA drivers.
  • An approved QSFP cable between the two CX-7 ports. NVIDIA lists the Amphenol NJAAKK-N911 (and the 0.5 m NJAAKK0006) and the Luxshare LMTQF022-SD-R.
  • sudo access on both units, and internet access for the initial software setup.

  1. Connect the QSFP cable directly between the CX-7 ports on the two units.

  2. Identify which OS interface maps to the physical port you cabled. Each QSFP port shows up under two interface names; prefer the primary enp1... name. The authoritative tool for the mapping is:

    Terminal window
    ibdev2netdev
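
To pick out the cabled port in a script, the output can be filtered for links reported as up. This is a minimal sketch, assuming the usual ibdev2netdev line shape (for example `mlx5_0 port 1 ==> enp1s0f0np0 (Up)`); the helper name `up_ifaces` and the sample interface names are hypothetical, so check them against what your own units print.

```shell
# Hypothetical helper: read ibdev2netdev output on stdin and print the
# network interface names whose link state is "(Up)".
# Assumes lines shaped like: mlx5_0 port 1 ==> enp1s0f0np0 (Up)
up_ifaces() {
  awk '$NF == "(Up)" { print $(NF-1) }'
}

# On a Spark, feed it the real tool's output (skipped if the tool is absent).
if command -v ibdev2netdev >/dev/null 2>&1; then
  ibdev2netdev | up_ifaces
fi
```

If more than one name is printed for the same port, the preference for the primary name described above still applies.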

The recommended path for a single-cable setup is automatic link-local addressing via netplan. Following NVIDIA’s Connect Two Sparks playbook, on both nodes:

Terminal window
sudo wget -O /etc/netplan/40-cx7.yaml <url-from-the-playbook>
sudo chmod 600 /etc/netplan/40-cx7.yaml
sudo netplan apply

This assigns link-local 169.254.x.x addresses on the fast interface. For a dual-cable full-bandwidth setup you must assign static IPs manually so all four interfaces are addressed.
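
After `netplan apply`, it is worth checking that the fast interface really did pick up a link-local address. The sketch below assumes the one-line format of `ip -o -4 addr show`; the helper name `link_local_addr` and the `CX7_IFACE` variable are hypothetical and stand in for the interface you identified earlier.

```shell
# Hypothetical helper: read `ip -o -4 addr show` output on stdin and print
# any 169.254.x.x address, without the /prefix suffix.
link_local_addr() {
  awk '$3 == "inet" && $4 ~ /^169\.254\./ { sub(/\/.*/, "", $4); print $4 }'
}

# CX7_IFACE is a placeholder; set it to your own interface name first.
if [ -n "${CX7_IFACE:-}" ] && command -v ip >/dev/null 2>&1; then
  ip -o -4 addr show dev "$CX7_IFACE" | link_local_addr
fi
```

An empty result suggests the netplan file was not applied to that interface, or that the link is down.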

The netplan drop-in lives alongside the system’s other network config:

  • /etc/netplan/
    • 00-installer-config.yaml: the stock DGX OS config (leave it)
    • 40-cx7.yaml: the CX-7 fast-link config you just added

Multi-node jobs need passwordless SSH between the two nodes, using the same username on both. NVIDIA’s discover-sparks.sh automates this using mDNS/Avahi.
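
Whichever way the keys get set up, a quick non-interactive login test confirms the result. This is a sketch and not part of NVIDIA's script: the helper name `check_passwordless` and the `PEER` variable are hypothetical, and the example hostname in the comment is a placeholder.

```shell
# Hypothetical helper: succeed only if key-based login works without a prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# PEER is a placeholder such as user@spark-2.local; set it before running.
if [ -n "${PEER:-}" ]; then
  if check_passwordless "$PEER"; then
    echo "passwordless SSH to $PEER works"
  else
    echo "passwordless SSH to $PEER failed" >&2
  fi
fi
```

Run it from each node towards the other, since multi-node launchers typically need the login to work in both directions.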

GPU collective operations go through NCCL, which on the Spark must be built for Blackwell compute capability sm_121. You also have to force NCCL traffic onto the 200GbE interface rather than the 1GbE management network, via environment variables documented in the playbook.
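
As a rough illustration of the interface pinning, the sketch below sets two standard NCCL environment variables. The playbook's own list is the authoritative one and may include more; the default interface name here is an assumed placeholder, and `CX7_IFACE` is a hypothetical variable name.

```shell
# Placeholder default: replace with the interface name reported by ibdev2netdev.
CX7_IFACE="${CX7_IFACE:-enp1s0f0np0}"

# NCCL_SOCKET_IFNAME restricts which interface NCCL uses for socket traffic,
# keeping it off the slower management network.
export NCCL_SOCKET_IFNAME="$CX7_IFACE"

# Optional: have NCCL log its setup at startup so the choice can be confirmed.
export NCCL_DEBUG=INFO
```

Set the same variables on both nodes, in the shell or container that launches the job.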

If you run the workload in Docker, the container needs host networking and the RoCE device mapped in:

Terminal window
docker run --network=host --device=/dev/infiniband --ulimit memlock=-1 ...

Confirm the link with standard network tools and an NCCL communication test. Once it passes, the pair is ready for distributed serving (vLLM/Ray, TensorRT-LLM multi-node) or distributed training.
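
For the basic connectivity part of that check, a ping bound to the fast interface is usually enough. The sketch below assumes the Linux iputils `ping` summary format; `packet_loss`, `CX7_IFACE` and `PEER_IP` are hypothetical names, and the NCCL test itself should follow the playbook.

```shell
# Hypothetical helper: read ping output on stdin and print the packet-loss
# percentage from the summary line (e.g. "0" for "0% packet loss").
packet_loss() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# CX7_IFACE and PEER_IP are placeholders; -I binds ping to the fast
# interface, which link-local addresses need.
if [ -n "${CX7_IFACE:-}" ] && [ -n "${PEER_IP:-}" ]; then
  loss=$(ping -c 3 -I "$CX7_IFACE" "$PEER_IP" | packet_loss)
  if [ "$loss" = "0" ]; then
    echo "link OK"
  else
    echo "packet loss: ${loss:-unknown}%" >&2
  fi
fi
```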

For the conceptual picture of why this works and where the bottlenecks are, read multi-Spark networking.