Alibaba’s Qwen Team Launches Qwen3.7-Plus, Adding Vision, Deep Reasoning, Tool Invocation, and Autonomous Iteration on the Bailian Platform
Source: MarkTechPost. Alibaba’s Qwen team has released Qwen3.7-Plus. The model is now available through Alibaba Cloud’s Bailian platform....
JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks in Multi-Model AI Pipelines
Source: MarkTechPost. JetBrains released Mellum2, open-sourcing the weights under the Apache 2.0 license. The first version of Mellum...
How to Speed Up Transformer Training Using NVIDIA Apex (FusedAdam, FusedLayerNorm) and Native torch.amp
Source: MarkTechPost. In this tutorial, we work through an implementation of NVIDIA Apex, focusing on the components that...
MiniMax Releases MiniMax M3 with MSA Architecture Supporting 1M-Token Context, Native Multimodality, and Agentic Coding
Source: MarkTechPost. MiniMax officially released MiniMax M3 on June 1, 2026. The model introduces MSA (MiniMax Sparse Attention),...
Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent
Source: MarkTechPost. Hermes Agent already remembers across sessions. The open-source agent from Nous Research ships with curated memory...
Parallax: A Parameterized Local Linear Attention That Keeps Softmax and Adds a Learned Covariance Correction Branch
Source: MarkTechPost. The Transformer’s attention mechanism has barely changed since 2017. Most efficiency work has tried to replace...
An Implementation of the Microsoft Agent Governance Toolkit for Safe AI Agent Tool Use with Policies, Approvals, Audit Logs, and Risk Controls
Source: MarkTechPost. In this tutorial, we build a governed AI-agent workflow using Microsoft’s Agent Governance Toolkit as the...
A Coding Implementation on Loguru for Designing Robust, Structured, Concurrent, and Production-Ready Python Logging Pipelines
Source: MarkTechPost. In this tutorial, we implement a practical use case with Loguru, a powerful, flexible, and production-ready...
Trajectory Releases a Concurrent Multi-LoRA Training Stack for Continual Learning, Reporting a 2.81× Experiment-Throughput Gain
Source: MarkTechPost. Trajectory’s concurrent multi-LoRA stack reports a 2.81× experiment-throughput gain over single-tenant RL, with all code in...
Build Skill-Augmented AI Agents with SkillNet for Search, Evaluation, Graph Analysis, and Task Planning
Source: MarkTechPost. In this tutorial, we implement a SkillNet use case as a practical framework for discovering, installing,...