PagedAttention emerges as a key answer to the GPU memory bottleneck in serving large language models: by storing the KV cache in fixed-size blocks rather than contiguous buffers, it cuts memory fragmentation, enabling more efficient memory usage and higher concurrency in AI inference systems.
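As a rough illustration of the idea (a minimal sketch, not the actual vLLM implementation; all names here are hypothetical), a paged KV cache maps each sequence's logical token positions to physical blocks through a per-sequence block table, so memory is claimed on demand instead of reserved up front:

```python
# Sketch of a paged KV cache: logical token slots map to fixed-size
# physical blocks via a per-sequence block table, so blocks are
# allocated lazily and returned to a shared pool when a request ends.
BLOCK_SIZE = 16  # tokens per block (illustrative value)

class PagedKVCache:
    def __init__(self, num_blocks):
        self.free_blocks = list(range(num_blocks))  # pool of physical block ids
        self.block_tables = {}  # seq_id -> list of physical block ids

    def append_token(self, seq_id, pos):
        """Reserve a physical slot for token `pos` of sequence `seq_id`."""
        table = self.block_tables.setdefault(seq_id, [])
        if pos // BLOCK_SIZE >= len(table):  # logical block not mapped yet
            table.append(self.free_blocks.pop())
        block = table[pos // BLOCK_SIZE]
        return block * BLOCK_SIZE + pos % BLOCK_SIZE  # physical slot index

    def free(self, seq_id):
        """Return a finished sequence's blocks to the pool for reuse."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))

cache = PagedKVCache(num_blocks=64)
slots = [cache.append_token("req-0", p) for p in range(20)]
# 20 tokens occupy only 2 blocks; nothing else is reserved for this request.
cache.free("req-0")  # blocks immediately available to other requests
```

Because sequences only hold the blocks they actually use, many more concurrent requests fit in the same GPU memory than with contiguous per-request preallocation.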
Researchers from Meta, Cornell, and CMU introduce TinyLoRA, a 13-parameter fine-tuning method that achieves 91.8% accuracy on GSM8K using Qwen2.5-7B.
Yann LeCun has raised $1 billion for his new startup AMI Labs, marking Europe's largest seed funding round ever. Investors are betting on his vision for AI beyond LLMs.