Everything happening in AI, tuned for you
Models · Research · Tools · Safety — one feed, daily
Boston Children’s uses AI to unlock new diagnoses
How Braintrust turns customer requests into code with Codex
Strengthening societal resilience with Rosalind Biodefense
Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler
A shared playbook for trustworthy third party evaluations
How Endava builds an agentic organization with Codex
OpenAI’s Frontier Governance Framework
MUFG aims to become AI-native with OpenAI
ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks — by Artificial Analysis and IBM
Cisco and OpenAI redefine enterprise engineering with Codex
Building self-improving tax agents with Codex
Reachy Mini goes fully local
Shipping a Trillion Parameters With a Hub Bucket: Delta Weight Sync in TRL
Election information and safeguards in 2026
Warp’s big bet on building open source with GPT-5.5
Claude Sonnet 4.6 tops GPQA Diamond with 81.2%
Anthropic's latest model beats human experts on graduate-level science questions across physics, chemistry and biology. The model achieves state-of-the-art results without sacrificing speed.
Harness, Scaffold, and the AI Agent Terms Worth Getting Right
OpenAI, Grupo Folha and Grupo UOL announce strategic content partnership
Stanford: AI agents now match senior radiologists on CT reads
A multi-agent system evaluated 12,000 chest CTs and achieved 94.3% accuracy, on par with board-certified radiologists. The system uses vision models and domain-specific reasoning chains.
Cursor 1.0 — full codebase awareness + MCP support ships
After 18 months in beta, Cursor hits 1.0 with whole-repo indexing, natural-language refactors, and MCP tool integration. The release includes a redesigned agent mode.
DeepMind: Constitutional RL cuts harmful outputs by 73%
New training method embeds 58 constitutional principles directly into the reward model, reducing policy violations without sacrificing capability. The approach generalises across model sizes.
Mistral 7B v3 — 128k context, fully open weights
Mistral drops their best open model yet: 128k token context, Apache 2.0 license, and scores that rival GPT-4o-mini. Optimised for long-document tasks and RAG applications.
Emergent reasoning appears at 30B params: new scaling law
An MIT CSAIL paper identifies a sharp phase transition in reasoning ability at ~30B parameters, challenging previous compute-optimal training assumptions. The result has implications for resource allocation.
OpenAI launches Realtime API for voice assistants
The Realtime API enables sub-300ms voice-to-voice latency, making responsive voice assistants practical. Supports interruption handling and speaker diarization out of the box.
EU AI Act: first enforcement actions filed against three vendors
European regulators filed the first formal enforcement actions under the EU AI Act against vendors of high-risk AI systems. Cases involve medical AI and recruitment screening tools.