NVIDIA dropped a significant announcement at GTC this week: NemoClaw, an open-source stack designed to run always-on AI agents locally on NVIDIA hardware, with stronger privacy and security than cloud-hosted agents and no per-token costs.
If you've been following the OpenClaw movement — AI agents that run persistently on your own machine, connected to your apps, files, and workflows — NemoClaw is NVIDIA's play to make that experience dramatically better on their GPUs.
## What NemoClaw actually does
Two things, essentially:
- Local model support via Nemotron. NVIDIA's open Nemotron models (including the new Nemotron 3 Super at 120B parameters and the compact Nano 4B) let you run inference locally. No cloud API calls, no per-token charges, and your data stays on your machine.
- OpenShell — a safer runtime. When an AI agent can browse the web, execute code, and manage files on your behalf, security matters. OpenShell is NVIDIA's runtime designed to execute agent actions with guardrails, reducing the risk of an agent doing something you didn't intend.
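Since everything runs on your own machine, the inference call is just a local HTTP request. NemoClaw's actual API isn't documented in the announcement; as a sketch, assuming it exposes an OpenAI-compatible chat endpoint (a common convention for local inference servers) with an illustrative model name:

```python
# Sketch: querying a local Nemotron model via an assumed OpenAI-compatible
# endpoint. The URL and model name are illustrative, not from the announcement.
import json
import urllib.request

LOCAL_ENDPOINT = "http://localhost:8000/v1/chat/completions"  # assumed local server

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload for a local model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

def ask_local(prompt: str, model: str = "nemotron-3-nano-4b") -> str:
    """POST the request to the local server; no data leaves the machine."""
    payload = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        LOCAL_ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The point is the shape of the architecture: the agent loop talks to `localhost`, so there are no API keys, no metered tokens, and no prompt data leaving the box.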
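OpenShell's policy engine isn't public, but the core guardrail idea is easy to illustrate: vet every action an agent proposes before executing it. A minimal sketch, assuming an allowlist-plus-denylist policy (the specific rules below are hypothetical):

```python
# Sketch of the guardrail pattern behind a safer agent runtime: an agent's
# proposed shell command must pass an explicit policy check before it runs.
# The allowlist and deny rules here are illustrative assumptions, not
# OpenShell's actual policy.
import shlex

ALLOWED_COMMANDS = {"ls", "cat", "grep", "git", "python"}   # assumed allowlist
BLOCKED_SUBSTRINGS = ("rm -rf", "sudo", "> /dev/")          # crude deny rules

def check_action(command: str) -> bool:
    """Return True only if the agent's proposed command passes the policy."""
    if any(bad in command for bad in BLOCKED_SUBSTRINGS):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:
        return False  # unparseable input is rejected, not guessed at
    return bool(argv) and argv[0] in ALLOWED_COMMANDS

# check_action("git status")    -> allowed
# check_action("sudo rm -rf /") -> blocked
```

A real runtime would layer sandboxing and user confirmation on top, but the default-deny posture is the part that reduces the risk of an agent doing something you didn't intend.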
## The new models worth knowing about
GTC also brought a wave of model announcements aimed squarely at local agent use:
- Nemotron 3 Super (120B) — 12B active parameters, scores 85.6% on PinchBench (the top open model in its class for agent tasks). Runs on DGX Spark and RTX PRO workstations.
- Nemotron 3 Nano (4B) — compact enough for GeForce RTX systems. Strong instruction-following and tool use with minimal VRAM.
- Mistral Small 4 (119B) — Mistral's latest, optimized for chat, coding, and agent tasks on DGX Spark.
- Qwen 3.5 optimizations — Alibaba's models with native vision support and 262K context windows, tuned for NVIDIA GPUs.
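A quick way to see why these sizes map to the hardware tiers above is a back-of-envelope VRAM estimate: parameters times bytes per parameter. This only counts the weights (KV cache and runtime overhead add more), so treat it as a lower bound:

```python
# Rough lower-bound VRAM estimate for holding model weights:
# parameters x bits per parameter / 8. Real usage is higher once the
# KV cache and activations are counted.
def weight_vram_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate GB needed just to store the weights."""
    return params_billion * bits_per_param / 8  # 1B params at 8-bit ~ 1 GB

# A 4B model quantized to 4 bits fits easily on consumer GeForce GPUs:
nano_4bit = weight_vram_gb(4, 4)     # ~2 GB
# A 120B model needs workstation-class memory even at 4 bits:
super_4bit = weight_vram_gb(120, 4)  # ~60 GB
```

This is why the 4B Nano targets GeForce RTX cards while the 120B models are pitched at DGX Spark and RTX PRO workstations.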
## Why this matters
The trajectory is clear: AI agents are moving from cloud-dependent services toward locally hosted, always-on systems that run on dedicated hardware. NVIDIA is positioning DGX Spark and high-end RTX machines as "agent computers" — purpose-built for running your personal AI stack.
For anyone building with or around AI tools, the takeaway is practical: locally runnable models are getting good enough that "local-first" is no longer just a privacy preference; it's becoming a viable architecture for real agent workflows.
NemoClaw is open source and available now. More details on NVIDIA's NemoClaw page and the GTC blog post.