site:www.marktechpost.com

Alibaba’s Qwen AI Releases Compact Dense Qwen3-VL 4B/8B (Instruct & Thinking) With FP8 Checkpoints

Do you actually need a giant VLM when dense Qwen3-VL 4B/8B (Instruct/Thinking) with FP8 runs in low VRAM yet retains 256K→1M context and the full capability surface? Alibaba’s Qwen team has expanded ...

marktechpost

QeRL: NVFP4-Quantized Reinforcement Learning (RL) Brings 32B LLM Training to a Single H100—While Improving Exploration

What QeRL changes in the Reinforcement Learning (RL) loop? Most RLHF/GRPO/DAPO pipelines spend the bulk of wall-clock time in rollouts (token generation). QeRL shifts the policy’s weight path to NVFP4 ...

marktechpost

Microsoft AI Debuts MAI-Image-1: An In-House Text-to-Image Model that Enters LMArena’s Top-10

Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data ...

marktechpost

Andrej Karpathy Releases ‘nanochat’: A Minimal, End-to-End ChatGPT-Style Pipeline You Can Train in ~4 Hours for ~$100

Andrej Karpathy has open-sourced nanochat, a compact, dependency-light codebase that implements a full ChatGPT-style stack—from tokenizer training to web UI inference—aimed at reproducible, hackable ...

marktechpost

Agentic Context Engineering (ACE): Self-Improving LLMs via Evolving Contexts, Not Fine-Tuning

ACE positions “context engineering” as a first-class alternative to parameter updates. Instead of compressing instructions into short prompts, ACE accumulates and organizes domain-specific tactics ...

marktechpost

Sentient AI Releases ROMA: An Open-Source and AGI Focused Meta-Agent Framework for Building AI Agents with Hierarchical Task Execution

ROMA provides a setup.sh quick start with Docker Setup (Recommended) or Native Setup, plus flags for E2B sandbox integration (--e2b, --test-e2b). The stack lists Backend: Python 3.12+ with ...

marktechpost

ServiceNow AI Research Releases DRBench, a Realistic Enterprise Deep-Research Benchmark

ServiceNow Research has released DRBench, a benchmark and runnable environment to evaluate “deep research” agents on open-ended enterprise tasks that require synthesizing facts from both public web ...

marktechpost

Google Introduces Speech-to-Retrieval (S2R) Approach that Maps a Spoken Query Directly to an Embedding and Retrieves Information without First Converting Speech to Text

In the traditional cascade modeling approach, automatic speech recognition (ASR) first produces a single text string, which is then passed to retrieval. Small transcription errors can change query ...

marktechpost

ByteDance AI Introduces Doubao-1.5-Pro Language Model with a ‘Deep Thinking’ Mode and Matches GPT 4o and Claude 3.5 Sonnet Benchmarks at 50x Cheaper

The artificial intelligence (AI) landscape is evolving rapidly, but this growth is accompanied by significant challenges. High costs of developing and deploying large-scale AI models and the ...

Results that may be inaccessible to you are currently showing.

Hide inaccessible results

Alibaba’s Qwen AI Releases Compact Dense Qwen3-VL 4B/8B (Instruct & Thinking) With FP8 Checkpoints

QeRL: NVFP4-Quantized Reinforcement Learning (RL) Brings 32B LLM Training to a Single H100—While Improving Exploration

Microsoft AI Debuts MAI-Image-1: An In-House Text-to-Image Model that Enters LMArena’s Top-10

Andrej Karpathy Releases ‘nanochat’: A Minimal, End-to-End ChatGPT-Style Pipeline You Can Train in ~4 Hours for ~$100

Agentic Context Engineering (ACE): Self-Improving LLMs via Evolving Contexts, Not Fine-Tuning

Sentient AI Releases ROMA: An Open-Source and AGI Focused Meta-Agent Framework for Building AI Agents with Hierarchical Task Execution

ServiceNow AI Research Releases DRBench, a Realistic Enterprise Deep-Research Benchmark

Google Introduces Speech-to-Retrieval (S2R) Approach that Maps a Spoken Query Directly to an Embedding and Retrieves Information without First Converting Speech to Text

AI Agents

Context Engineering

How to Evaluate Your RAG Pipeline with Synthetic Data?

ByteDance AI Introduces Doubao-1.5-Pro Language Model with a ‘Deep Thinking’ Mode and Matches GPT 4o and Claude 3.5 Sonnet Benchmarks at 50x Cheaper