Tag

#CUDA

5 articles

A Coding Implementation to Master GPU Computing with CuPy, Custom CUDA Kernels, Streams, Sparse Matrices, and Profiling

This article explains CuPy, a GPU-accelerated Python library for high-performance numerical computing, and how it leverages CUDA kernels, streams, and sparse matrices for machine learning workloads.

May 1441

tools

NVIDIA AI Just Released cuda-oxide: An Experimental Rust-to-CUDA Compiler Backend that Compiles SIMT GPU Kernels Directly to PTX

NVIDIA has released cuda-oxide, an experimental Rust-to-CUDA compiler backend that enables direct compilation of SIMT GPU kernels to PTX bytecode, streamlining GPU development for Rust developers.

May 1169

CUDA Proves Nvidia Is a Software Company

Nvidia's CUDA software platform has become a critical competitive advantage, creating a formidable barrier that rivals traditional hardware superiority in the AI industry.

May 1157

Sakana AI and NVIDIA Introduce TwELL with CUDA Kernels for 20.5% Inference and 21.9% Training Speedup in LLMs

Learn to implement sparse matrix operations using CUDA kernels to achieve 20.5% inference and 21.9% training speedup in LLMs, following the TwELL approach by Sakana AI and NVIDIA.

May 1061

A Coding Tutorial for Running PrismML Bonsai 1-Bit LLM on CUDA with GGUF, Benchmarking, Chat, JSON, and RAG

Learn how to run a tiny but powerful AI model called Bonsai 1-bit LLM on your computer using CUDA and GGUF technology.

Apr 1883