Student-Centered Data Selection for Reasoning Distillation
DMC suggests that student-model compatibility, rather than data quality alone, may matter most for reasoning distillation.
Coding models differ less in prose quality than in planning, tool use, and the scope of context they can handle.
A look at how mcp-proto-okn connects natural language to scientific knowledge graph queries and reproducible workflows.
Examines limits of RTG-only conditioning and how Q-guided alignment aims to improve controllability and reliability in offline RL.
A look at reducing full-vocabulary search overhead in CFG-constrained decoding for structured output workloads.
TaxDistill argues pretraining data composition and distilled genome representations matter more than model size.
A look at a paper arguing that aggregating full reasoning traces can outperform answer-only consensus in multi-agent systems.
CyberJurors evaluates agent systems on multi-round, multimodal evidence handling and platform rule adaptation in e-commerce disputes.
Why multimodal AI still struggles with charts and scientific figures, and how to verify image-based conclusions in practice.
Examines human-AI collaboration for replicability prediction, balancing speed and consistency against bias, accountability, and privacy risks.
MOV-Bench highlights evaluation gaps in multi-hop audio-visual reasoning and shows consistent gains from agentic search.
A concise look at how PON mitigates input distribution mismatch in heterogeneous FedRL simulation environments.
A look at structuring table QA with guided cell navigation and staged inference to improve accuracy and verify evidence paths.
MOCHA treats agent skills as multi-field artifacts and argues they must be optimized with platform constraints in mind.
Examines how offloading and preemption affect multi-model LLM serving under GPU memory limits and model-specific costs.
COBALT proposes smartphone and cloud teleoperation to reduce data collection bottlenecks in robot imitation learning.
In handwritten math grading, process understanding matters more than OCR, requiring rubric-based review and human checks.
A study on claim verification that proposes ternary decisions and explainable argumentation under incomplete or conflicting evidence.
How prompt-guided image compression for VLMs shifts focus from human visual quality to preserving clues needed for tasks.
A case of wrapping Florence-2 with ROS 2 topics, services, and actions for local inference and reproducible integration.
A look at when entity resolution needs full GNN extensions and when task-specific minimal graph structure is enough.
How serverless gossip learning and carbon-aware orchestration address unreliable connectivity in maritime AI systems.
A curated link roundup from recently collected official updates and tech news.
Anthropic’s 1,250 AI-led interviews show how user research is shaping feature priorities and safety design.