NVIDIA Releases Polar, a Token-Faithful Rollout Framework for GRPO Training Across Codex, Claude Code, and Qwen Code
Source: MarkTechPost Reinforcement learning for language agents is growing more complex. Agents now manage multi-turn tool use, long-running...
Together AI Open-Sources OSCAR: An Attention-Aware 2-Bit KV Cache Quantization System for Long-Context LLM Serving
Source: MarkTechPost Long-context inference makes the KV cache one of the main costs of serving LLMs. During autoregressive...
NVIDIA AI Releases Gated DeltaNet-2: A Linear Attention Layer That Decouples Erase and Write in the Delta Rule
Source: MarkTechPost Linear attention replaces the unbounded KV cache of softmax attention with a fixed-size recurrent state. This...
Nous Research Releases Contrastive Neuron Attribution (CNA): Sparse MLP Circuit Steering Without SAE Training or Weight Modification
Source: MarkTechPost Instruction-tuned language models refuse harmful requests. But which part of the model is actually responsible —...
Perplexity Open-Sources Bumblebee: A Read-Only Supply-Chain Scanner for Developer Endpoints
Source: MarkTechPost Attackers increasingly target the packages, editor extensions, and AI tool configs on developer machines and not...
Microsoft Releases Fara1.5: A Family of Browser Computer-Use Agents (4B/9B/27B) That Outperform OpenAI Operator and Gemini 2.5 Computer Use on Online-Mind2Web
Source: MarkTechPost Microsoft Research’s AI Frontiers lab released Fara1.5. It is a family of computer-use agent (CUA) models...
Qwen Introduces Qwen3.7-Max: A Reasoning Agent Model With a 1M-Token Context Window
Source: MarkTechPost Most AI models today are not designed for sustained, multi-step autonomous execution. Tasks like running hundreds...
Cohere Releases Command A+: A 218B Sparse MoE Model for Agentic Workflows That Runs on as Few as Two H100 GPUs
Source: MarkTechPost Cohere just released Command A+, an open-source model targeting enterprise agentic workflows. Available under an...
Meet Turbovec: A Rust Vector Index with Python Bindings, and Built on Google’s TurboQuant Algorithm
Source: MarkTechPost Vector search underpins most retrieval-augmented generation (RAG) pipelines. At scale, it gets expensive. Storing 10 million...
NVIDIA AI Releases Nemotron-Labs-Diffusion: A Tri-Mode Language Model with 6× Tokens Per Forward Over Qwen3-8B
Source: MarkTechPost NVIDIA researchers have released Nemotron-Labs-Diffusion, a language model family that unifies three decoding modes in one...