Machine Learning – Page 17 – aifuturefront.com

Rethinking Toxic Data in LLM Pretraining: A Co-Design Approach for Improved Steerability and Detoxification

Source: MarkTechPost In the pretraining of LLMs, the quality of training data is crucial in determining model performance....

May 14, 2025

Reinforcement Learning, Not Fine-Tuning: Nemotron-Tool-N1 Trains LLMs to Use Tools with Minimal Supervision and Maximum Generalization

Source: MarkTechPost Equipping LLMs with external tools or functions has become popular, showing great performance across diverse domains....

May 13, 2025

RL^V: Unifying Reasoning and Verification in Language Models through Value-Free Reinforcement Learning

Source: MarkTechPost LLMs have gained outstanding reasoning capabilities through reinforcement learning (RL) on correctness rewards. Modern RL algorithms...

May 13, 2025

PrimeIntellect Releases INTELLECT-2: A 32B Reasoning Model Trained via Distributed Asynchronous Reinforcement Learning

Source: MarkTechPost As language models scale in parameter count and reasoning complexity, traditional centralized training pipelines face increasing...

May 12, 2025

This AI Paper Introduces Effective State-Size (ESS): A Metric to Quantify Memory Utilization in Sequence Models for Performance Optimization

Source: MarkTechPost In machine learning, sequence models are designed to process data with temporal structure, such as language,...

May 11, 2025

LightOn AI Released GTE-ModernColBERT-v1: A Scalable Token-Level Semantic Search Model for Long-Document Retrieval and Benchmark-Leading Performance

Source: MarkTechPost Semantic retrieval focuses on understanding the meaning behind text rather than matching keywords, allowing systems to...

May 11, 2025

ZeroSearch from Alibaba Uses Reinforcement Learning and Simulated Documents to Teach LLMs Retrieval Without Real-Time Search

Source: MarkTechPost Large language models are now central to various applications, from coding to academic tutoring and automated...

May 10, 2025

Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs with Agentic Reasoning and Dynamic Tool Use

Source: MarkTechPost LLMs have made impressive gains in complex reasoning, primarily through innovations in architecture, scale, and training...

May 10, 2025

AI That Teaches Itself: Tsinghua University’s ‘Absolute Zero’ Trains LLMs With Zero External Data

Source: MarkTechPost LLMs have shown advancements in reasoning capabilities through Reinforcement Learning with Verifiable Rewards (RLVR), which relies...

May 9, 2025

NVIDIA Open-Sources Open Code Reasoning Models (32B, 14B, 7B)

Source: MarkTechPost NVIDIA continues to push the boundaries of open AI development by open-sourcing its Open Code Reasoning...

May 8, 2025