Machine Learning – Page 18 – aifuturefront.com

Researchers from Tsinghua and ModelBest Release Ultra-FineWeb: A Trillion-Token Dataset Enhancing LLM Accuracy Across Benchmarks

Source: MarkTechPost. The quality of the data used in pretraining LLMs has become increasingly critical to their success. To build...

May 15, 2025

Meta AI Introduces CATransformers: A Carbon-Aware Machine Learning Framework to Co-Optimize AI Models and Hardware for Sustainable Edge Deployment

Source: MarkTechPost. As machine learning systems become integral to various applications, from recommendation engines to autonomous systems, there’s...

May 14, 2025

Study shows vision-language models can’t handle queries with negation words

Source: MIT News – Artificial intelligence. Imagine a radiologist examining a chest X-ray from a new patient. She...

May 14, 2025

Rethinking Toxic Data in LLM Pretraining: A Co-Design Approach for Improved Steerability and Detoxification

Source: MarkTechPost. In the pretraining of LLMs, the quality of training data is crucial in determining model performance....

May 14, 2025

Reinforcement Learning, Not Fine-Tuning: Nemotron-Tool-N1 Trains LLMs to Use Tools with Minimal Supervision and Maximum Generalization

Source: MarkTechPost. Equipping LLMs with external tools or functions has become popular, showing great performance across diverse domains....

May 13, 2025

RL^V: Unifying Reasoning and Verification in Language Models through Value-Free Reinforcement Learning

Source: MarkTechPost. LLMs have gained outstanding reasoning capabilities through reinforcement learning (RL) on correctness rewards. Modern RL algorithms...

May 13, 2025

PrimeIntellect Releases INTELLECT-2: A 32B Reasoning Model Trained via Distributed Asynchronous Reinforcement Learning

Source: MarkTechPost. As language models scale in parameter count and reasoning complexity, traditional centralized training pipelines face increasing...

May 12, 2025

This AI Paper Introduces Effective State-Size (ESS): A Metric to Quantify Memory Utilization in Sequence Models for Performance Optimization

Source: MarkTechPost. In machine learning, sequence models are designed to process data with temporal structure, such as language,...

May 11, 2025

LightOn AI Released GTE-ModernColBERT-v1: A Scalable Token-Level Semantic Search Model for Long-Document Retrieval and Benchmark-Leading Performance

Source: MarkTechPost. Semantic retrieval focuses on understanding the meaning behind text rather than matching keywords, allowing systems to...

May 11, 2025

ZeroSearch from Alibaba Uses Reinforcement Learning and Simulated Documents to Teach LLMs Retrieval Without Real-Time Search

Source: MarkTechPost. Large language models are now central to various applications, from coding to academic tutoring and automated...

May 10, 2025