Tag

#NVIDIA

51 articles

Validating Distributed LLM Serving Benchmarks with NVIDIA srt-slurm, SLURM Recipes, Parameter Sweeps, and Pareto Analysis

NVIDIA's srt-slurm framework simplifies distributed LLM serving benchmarking using SLURM, enabling reproducible workflows and advanced performance analysis.

Jul 21

NVIDIA Releases Cosmos 3 Edge: A 4B-Parameter Open World Model That Reasons and Generates Robot Actions On-Device

NVIDIA has released Cosmos 3 Edge, a 4-billion-parameter open world model designed to run entirely on-device, enabling robots to reason and act autonomously in real-time environments.

Jul 20

NVIDIA AI Releases Nemotron 3 Embed: An Open Embedding Collection Whose 8B Checkpoint Ranks #1 on RTEB

NVIDIA has released Nemotron 3 Embed, an open embedding collection featuring three models, with the 8B checkpoint ranking #1 on the RTEB benchmark.

Jul 16

tech

Nokia’s AI-RAN platform: a radio comeback that runs on NVIDIA

Learn to build and simulate an AI-powered radio access network platform similar to Nokia's AI-RAN, using NVIDIA's Aerial system and PyTorch for spectrum optimization.

Jul 14

tech

Meet Nemotron Labs 3 Puzzle 75B A9B: A Compressed Hybrid MoE LLM Delivering 2.03x Server Throughput

NVIDIA introduces Nemotron-Labs-3-Puzzle-75B-A9B, a compressed hybrid MoE LLM delivering 2.03x server throughput, leveraging hardware-aware compression and knowledge distillation.

Jul 9

NVIDIA Releases Nemotron-Labs-3-Puzzle-75B-A9B: A Compressed Hybrid MoE LLM Delivering 2.03x Server Throughput at Matched User Throughput

Learn how NVIDIA's Nemotron-Labs-3-Puzzle-75B-A9B uses hardware-aware compression and knowledge distillation to deliver 2.03x server throughput at matched user throughput, without sacrificing quality.

Jul 8

NVIDIA’s Cosmos-Framework Tutorial: Designing a Colab-Friendly Miniature of Cosmos 3 World Models with Omnimodal Mixture-of-Transformers

NVIDIA's Cosmos framework tutorial demonstrates how to build a Colab-friendly, miniature version of Cosmos 3 world models using omnimodal Mixture-of-Transformers. The approach enables developers to experiment with multimodal prediction using synthetic data and shared cross-modal attention.

Jul 7

NVIDIA Releases Audex (Nemotron-Labs-Audex-30B-A3B): A Unified Audio-Text LLM That Preserves the Text Intelligence of Its Backbone

Learn how NVIDIA's Audex combines audio and text processing in a single model, adding speech capabilities while preserving the text intelligence of its backbone.

Jul 7

NVIDIA BioNeMo accelerates Anthropic Claude Science

Anthropic's Claude Science platform now integrates NVIDIA's BioNeMo Agent Toolkit to accelerate computational life sciences research. The platform enables scientists to interact with AI agents using natural language, streamlining research workflows.

Jul 2

tech

Taiwan raids Super Micro offices in probe over Nvidia chip smuggling to China

Taiwanese authorities have raided Super Micro Computer Inc. and its partners in a probe over suspected smuggling of NVIDIA chips to China, reflecting growing tensions in global tech supply chains.

Jun 30

NVIDIA BioNeMo Agent Toolkit Turns Biomolecular Models Into Callable Skills for AI Agents in Drug Discovery

NVIDIA's new BioNeMo Agent Toolkit turns biomolecular models into callable AI skills, significantly boosting task completion rates in drug discovery.

Jun 29

tech

How to Use NVIDIA Canary-1B-v2 for ASR, Translation, and Automatic SRT Subtitle Export in Python

NVIDIA's Canary-1B-v2 model enables developers to build multilingual ASR and translation pipelines with automatic SRT subtitle export, showcasing advancements in AI-powered speech processing.

Jun 23