Qwen Researchers Release Qwen3-TTS: an Open Multilingual TTS Suite with Real-Time Latency and Fine-Grained Voice Control
Source: MarkTechPost Alibaba Cloud’s Qwen team has open-sourced Qwen3-TTS, a family of multilingual text-to-speech models that target three...
Microsoft Releases VibeVoice-ASR: A Unified Speech-to-Text Model Designed to Handle 60-Minute Long-Form Audio in a Single Pass
Source: MarkTechPost Microsoft has released VibeVoice-ASR as part of the VibeVoice family of open-source frontier voice AI...
FlashLabs Researchers Release Chroma 1.0: A 4B Real Time Speech Dialogue Model With Personalized Voice Cloning
Source: MarkTechPost Chroma 1.0 is a real-time speech-to-speech dialogue model that takes audio as input...
Inworld AI Releases TTS-1.5 For Realtime, Production Grade Voice Agents
Source: MarkTechPost Inworld AI has introduced Inworld TTS-1.5, an upgrade to its TTS-1 family that targets realtime voice...
Salesforce AI Introduces FOFPred: A Language-Driven Future Optical Flow Prediction Framework that Enables Improved Robot Control and Video Generation
Source: MarkTechPost The Salesforce AI research team presents FOFPred, a language-driven future optical flow prediction framework that connects...
How AutoGluon Enables Modern AutoML Pipelines for Production-Grade Tabular Models with Ensembling and Distillation
Source: MarkTechPost In this tutorial, we build a production-grade tabular machine learning pipeline using AutoGluon, taking a real-world...
Liquid AI Releases LFM2.5-1.2B-Thinking: a 1.2B Parameter Reasoning Model That Fits Under 1 GB On-Device
Source: MarkTechPost Liquid AI has released LFM2.5-1.2B-Thinking, a 1.2 billion parameter reasoning model that runs fully on device...
Zhipu AI Releases GLM-4.7-Flash: A 30B-A3B MoE Model for Efficient Local Coding and Agents
Source: MarkTechPost GLM-4.7-Flash is a new member of the GLM 4.7 family and targets developers who want strong...
How to Design a Fully Streaming Voice Agent with End-to-End Latency Budgets, Incremental ASR, LLM Streaming, and Real-Time TTS
Source: MarkTechPost In this tutorial, we build an end-to-end streaming voice agent that mirrors how modern low-latency conversational...
Microsoft Research Releases OptiMind: A 20B Parameter Model that Turns Natural Language into Solver Ready Optimization Models
Source: MarkTechPost Microsoft Research has released OptiMind, an AI-based system that converts natural language descriptions of complex...