
From Fine-Tuning to Prompt Engineering: Theory and Practice for Efficient Transformer Adaptation
Source: MarkTechPost The Challenge of Fine-Tuning Large Transformer Models Self-attention enables transformer models to capture long-range dependencies in...

A sounding board for strengthening the student experience
Source: MIT News – Artificial intelligence During his first year at MIT in 2021, Matthew Caren ’25 received...

Unpacking the bias of large language models
Source: MIT News – Artificial intelligence Research has shown that large language models (LLMs) tend to overemphasize information...

EPFL Researchers Introduce MEMOIR: A Scalable Framework for Lifelong Model Editing in LLMs
Source: MarkTechPost The Challenge of Updating LLM Knowledge LLMs have shown outstanding performance for various tasks through extensive...

StepFun Introduces Step-Audio-AQAA: A Fully End-to-End Audio Language Model for Natural Voice Interaction
Source: MarkTechPost Rethinking Audio-Based Human-Computer Interaction Machines that can respond to human speech with equally expressive and natural...

Internal Coherence Maximization (ICM): A Label-Free, Unsupervised Training Framework for LLMs
Source: MarkTechPost Post-training methods for pre-trained language models (LMs) depend on human supervision through demonstrations or preference feedback...

MemOS: A Memory-Centric Operating System for Evolving and Adaptive Large Language Models
Source: MarkTechPost LLMs are increasingly seen as key to achieving Artificial General Intelligence (AGI), but they face major...

Sakana AI Introduces Text-to-LoRA (T2L): A Hypernetwork that Generates Task-Specific LLM Adapters (LoRAs) based on a Text Description of the Task
Source: MarkTechPost Transformer models have significantly influenced how AI systems approach tasks in natural language understanding, translation, and...

OpenThoughts: A Scalable Supervised Fine-Tuning SFT Data Curation Pipeline for Reasoning Models
Source: MarkTechPost The Growing Complexity of Reasoning Data Curation Recent reasoning models, such as DeepSeek-R1 and o3, have...

Bringing meaning into technology deployment
Source: MIT News – Artificial intelligence In 15 TED Talk-style presentations, MIT faculty recently discussed their pioneering research...