Machine Learning – Page 15 – aifuturefront.com

Georgia Tech and Stanford Researchers Introduce MLE-Dojo: A Gym-Style Framework Designed for Training, Evaluating, and Benchmarking Autonomous Machine Learning Engineering (MLE) Agents

Source: MarkTechPost
Machine learning engineering (MLE) involves developing, tuning, and deploying machine learning systems that require iterative experimentation,...

May 15, 2025

Researchers from Tsinghua and ModelBest Release Ultra-FineWeb: A Trillion-Token Dataset Enhancing LLM Accuracy Across Benchmarks

Source: MarkTechPost
The quality of the data used in pretraining LLMs has become increasingly critical to their success. To build...

May 15, 2025

Meta AI Introduces CATransformers: A Carbon-Aware Machine Learning Framework to Co-Optimize AI Models and Hardware for Sustainable Edge Deployment

Source: MarkTechPost
As machine learning systems become integral to various applications, from recommendation engines to autonomous systems, there’s...

May 14, 2025

Study shows vision-language models can’t handle queries with negation words

Source: MIT News – Artificial intelligence
Imagine a radiologist examining a chest X-ray from a new patient. She...

May 14, 2025

Rethinking Toxic Data in LLM Pretraining: A Co-Design Approach for Improved Steerability and Detoxification

Source: MarkTechPost
In the pretraining of LLMs, the quality of training data is crucial in determining model performance....

May 14, 2025

Reinforcement Learning, Not Fine-Tuning: Nemotron-Tool-N1 Trains LLMs to Use Tools with Minimal Supervision and Maximum Generalization

Source: MarkTechPost
Equipping LLMs with external tools or functions has become popular, showing great performance across diverse domains....

May 13, 2025

RL^V: Unifying Reasoning and Verification in Language Models through Value-Free Reinforcement Learning

Source: MarkTechPost
LLMs have gained outstanding reasoning capabilities through reinforcement learning (RL) on correctness rewards. Modern RL algorithms...

May 13, 2025

PrimeIntellect Releases INTELLECT-2: A 32B Reasoning Model Trained via Distributed Asynchronous Reinforcement Learning

Source: MarkTechPost
As language models scale in parameter count and reasoning complexity, traditional centralized training pipelines face increasing...

May 12, 2025

This AI Paper Introduces Effective State-Size (ESS): A Metric to Quantify Memory Utilization in Sequence Models for Performance Optimization

Source: MarkTechPost
In machine learning, sequence models are designed to process data with temporal structure, such as language,...

May 11, 2025

LightOn AI Released GTE-ModernColBERT-v1: A Scalable Token-Level Semantic Search Model for Long-Document Retrieval and Benchmark-Leading Performance

Source: MarkTechPost
Semantic retrieval focuses on understanding the meaning behind text rather than matching keywords, allowing systems to...

May 11, 2025