AI Daily — 2026-05-14


Codex debuts in ChatGPT mobile app for cross-device work · Gemini Flash rumored to match GPT-5.5 ...

Covering 34 AI news items

🔥 Top Stories

1. Codex debuts in ChatGPT mobile app for cross-device work

OpenAI previews Codex running in the ChatGPT mobile app, enabling users to start new coding tasks, review outputs, steer execution, and approve next steps directly from a phone while preserving files and project context across devices. This pocket-sized coding workflow could boost developer flexibility and speed, though it also raises considerations for security, offline access, and cross-device governance. Source-x

2. Gemini Flash rumored to match GPT-5.5 with 20x cheaper inference

Rumors circulating online claim Gemini Flash could achieve roughly 92% of GPT-5.5’s coding and reasoning performance while slashing inference costs by 15–20x and delivering sub-200ms latency for most queries, leveraging distillation and sparsity. If validated, this would substantially reshape cost-effective deployment of frontier models and AI services. Source-x

3. MinT: Scalable LoRA Training and Serving Platform

MindLab’s MinT offers a managed LoRA post-training and online-serving stack that keeps the base model resident and streams LoRA revisions through rollout, update, export, evaluation, serving, and rollback, avoiding full checkpoint materialization. The system targets workflows with many trained policies from a small set of base deployments, potentially improving latency and update cycles in production. Source-huggingface
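The resident-base idea described above can be illustrated with a minimal sketch. This is not MinT's actual API: the names `LoRARevision`, `ResidentServer`, `rollout`, and `rollback` are hypothetical, and a single tiny weight matrix stands in for the base model. It only shows why swapping small low-rank factors avoids materializing a full checkpoint for each trained policy.

```python
from dataclasses import dataclass, field

Matrix = list[list[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, enough for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


@dataclass
class LoRARevision:
    """Hypothetical adapter revision: only the low-rank factors are stored."""
    name: str
    A: Matrix          # shape (r, d_in)
    B: Matrix          # shape (d_out, r)
    scale: float = 1.0


@dataclass
class ResidentServer:
    """Hypothetical server: the base weight is loaded once and never modified."""
    base_weight: Matrix                      # shape (d_out, d_in)
    history: list[LoRARevision] = field(default_factory=list)

    def rollout(self, rev: LoRARevision) -> None:
        # Activating a revision only records its small factors.
        self.history.append(rev)

    def rollback(self) -> None:
        # Reverting drops the newest revision; the base is untouched.
        if self.history:
            self.history.pop()

    def forward(self, x: list[float]) -> list[float]:
        col = [[v] for v in x]
        y = matmul(self.base_weight, col)
        if self.history:
            rev = self.history[-1]
            # Low-rank update applied on the fly: y = W x + scale * B (A x)
            delta = matmul(rev.B, matmul(rev.A, col))
            y = [[yi[0] + rev.scale * di[0]] for yi, di in zip(y, delta)]
        return [row[0] for row in y]
```

In this sketch, rolling out or rolling back a revision changes only which pair of small matrices is active, so update cost scales with the adapter rank rather than with the size of the base model.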

Generated by AI News Agent | 2026-05-14

📰 Featured

Open Source & Tools

K-Dense AI launches Scientific Agent Skills with BYOK desktop co-scientist — Claude Scientific Skills are rebranded as Scientific Agent Skills and extended to work with any AI agent that supports the open Agent Skills standard; the BYOK desktop co-scientist runs locally with 40+ models, 100+ databases, and 135 skills, with data staying on-device and optional cloud scaling via Modal. Source-github
PAI v5.0.0: Life Operating System Released — Daniel Miessler’s Personal AI Infrastructure debuts v5.0.0 as a Life Operating System for agentic AI, introducing the Pulse daemon, Life Dashboard, identity layer, Algorithm v6.3.0, ISA primitive, 45 skills, 171 workflows, and 37 hooks, with privacy via containment zones and a simple install/migration path. Source-github

Multimodal & Embeddings

MulTaBench Advances Multimodal Tabular Learning with Tuned Embeddings — Fine-tuning task-specific embeddings (instead of freezing them) improves performance on Multimodal Tabular Learning benchmarks, underscoring the practical value of embedding adaptation in foundation models. Source-huggingface
LVLMs Generalize Beyond 128K Context in Training — Long-context continued pre-training extends a 7B LVLM from 32K to 128K context and offers practical recipes to generalize beyond, providing actionable guidance for long-context LVLM training. Source-huggingface

AI Safety & Partnerships

Claude Code changes trigger massive rate-limit cuts, sparking backlash — A developer reports about 40x rate-limit reductions after Claude Code changes, signaling frustration and the need for policy/product clarity in toolchains. Source-x
Anthropic, Gates Foundation Form $200M Partnership — The philanthropic collaboration provides grants, Claude credits, and technical support to programs in global health, life sciences, education, agriculture, and economic mobility, with the aim of advancing safe, reliable AI applications. Source-x

Hardware & Inference

NVIDIA Releases NVFP4-Quantized Kimi-K2.6 and Kimi-K2.5 — NVIDIA rolls out NVFP4-quantized versions of Moonshot AI’s Kimi models (Kimi-K2.6-NVFP4 and Kimi-K2.5-NVFP4), described as ready for commercial and non-commercial use, with benchmarked accuracy comparable to native INT4 baselines. Source-reddit

⚡ Quick Bites

Anthropic: US Leads Frontier AI, Two 2028 Leadership Scenarios — Snapshot of U.S. leadership scenarios for 2028 in frontier AI policy and strategy. Source-x
US Allows H200 Chip Sales to 10 Chinese Firms, AI Gap Shrinks — Potential shift in competitiveness due to relaxation of certain chip export controls. Source-x
Figure streams 8 hours of autonomous, unsupervised work — Demonstrates long-duration autonomous AI agent operation. Source-x
AnyFlow: Any-Step Video Diffusion with On-Policy Distillation — Introduces on-policy distillation for video diffusion tasks. Source-huggingface
EVA-Bench Launches End-to-End Voice Agent Benchmark — New benchmark for end-to-end voice agent evaluation. Source-huggingface
Supertonic 3 adds 31 languages for on-device TTS — Expands on-device TTS language coverage. Source-github
Ring-2.6-1T: Trillion-Parameter Real-World AI Model — A trillion-parameter model discussed in real-world contexts. Source-reddit
Scenema Audio Releases Zero-shot Expressive Voice Cloning Weights — Weights enabling zero-shot expressive cloning become available. Source-reddit
Qwen3.5 trained to jailbreak via RL, improving defenses — A post discusses training Qwen3.5 with RL to attempt jailbreaks and the defensive improvements that followed. Source-reddit
Multi-Token Prediction boosts Qwen on LLaMA.cpp + TurboQuant — Multi-token prediction enhances Qwen on LLaMA.cpp with TurboQuant. Source-reddit
cyankiwi AWQ 4-bit Quantization: Joint Scales and Ranges Fit — AWQ-4bit with joint scales/ranges fitting approach showcased. Source-reddit
Automated AI Researcher Runs Locally with llama.cpp — Local research automation via llama.cpp highlighted. Source-reddit
MIT RLCR Teaches AI to Say ‘I’m Not Sure’ — RLCR method encourages caution in AI responses. Source-reddit
Grok Build Beta Unveils Agentic CLI for Coding and Automations — New agentic CLI tooling for coding and automations. Source-x
Shape-Rotating Calculator Inside LLM Reveals Neural Geometry — Visualizes geometric properties inside LLMs. Source-x
mattpocock releases Skills toolkit for real-engineer AI agents — Skills toolkit released for building practical AI agents. Source-github
Qwen3.6: Is the Q4 vs Q6 gap big? — Discussion on performance gap between Q4 and Q6 in Qwen3.6. Source-reddit
Llama.cpp ROCm Consumes More VRAM Than Vulkan for KV Cache — VRAM usage comparison between ROCm and Vulkan backends. Source-reddit
Local LLMs Struggle with Fictionalizing Beyond Knowledge Cutoffs — Local LLMs show limits when asked to write fiction about events past their knowledge cutoffs. Source-reddit
Reddit discussion: using local LLMs as daily personal knowledge bases — Local LLMs discussed as daily knowledge bases. Source-reddit
VS Code’s Agents Window Brings Local AI, Still Online with Copilot — VS Code Agents Window enables local AI usage while staying online with Copilot. Source-reddit
Posting a Real Monet as AI Sparks Art Experiment — A real Monet is posted as if it were AI-generated, as an art experiment. Source-x
Can’t paste images in Claude Code over SSH — Claude Code image paste blocked over SSH. Source-x
Relics from the prehistoric era of AI — Early-era AI relics discussed in a post. Source-x
