Tag: llm

999 articles · Page 6 / 42

Why Generator Evaluator Consistency Matters In LLM Self-Review

Why LLM self-review should be judged by generator-evaluator consistency, not accuracy alone, in agent workflows.

Source · Jul 1, 2026

How Agents Should Help Users Form Preferences

Why AI agents must move beyond preference elicitation to support preference formation, with evaluation and safety in view.

hardware

Source · Jul 1, 2026

LLM Agents Split Research Ideas And Verification

How a dual-agent LLM pipeline separates proposing tighter relaxations from verification in automated research.

hardware

Community · Jul 1, 2026

Model Evaluation Now Depends on Quotas and Throughput

Model value now depends on performance, quotas, throughput, and pricing, not benchmark scores alone.

agi

Source · Jul 1, 2026

When Public AI Transparency Becomes Mere Paper Compliance

Public-sector AI disclosures can look compliant yet fail users if they lack meaningful, actionable information.

llm

Source · Jul 1, 2026

Rethinking AI Tutors Beyond Cloud Chatbots in Education

In education, AI design matters more than raw performance, with student privacy, data minimization, and teacher control at stake.

llm

Community · Jul 1, 2026

What AI Pricing Hides About Safety Operations

Commercial APIs and open-weight models differ not just in performance, but in who runs blocking, logging, and policy enforcement.

hardware

Source · Jun 30, 2026

AI Paper Review Between Assistance and Official Evaluation

Examines Google PAT's paper-checking results and limits, and where AI should fit in academic review workflows.

k-ai-pulse

Roundup · Jun 30, 2026

AI Resource Roundup (24h) - 2026-06-30

A curated link roundup from recently collected official updates and tech news.

hardware

News · Jun 30, 2026

Claude Science Focuses on Scientific Research Workflows

Anthropic's Claude Science emphasizes integrating tools, data, compute, and review into one scientific workflow.

hardware

Community · Jun 30, 2026

Deploying AI for Rhythm Games by Function

Rhythm game AI works best when API and local inference are split by function, balancing latency, limits, cost, and memory.

agi

Community · Jun 30, 2026

How To Read AI Firms Calling For Regulation

How to assess whether AI firms' calls for regulation signal safety commitments, competitive strategy, or both.

hardware

Source · Jun 30, 2026

Smaller Fast Weights Beat Bigger LSTMs in Traffic Forecasting

A compact fast-weight recurrent model reported lower pooled RMSE than a larger LSTM using only 22.4% of the parameters.

llm

Source · Jun 30, 2026

Why Apple Moved Security Patches Ahead of iOS

A look at Apple’s reported early security patch rollout and why patch timing matters more in an AI-driven threat environment.

llm

Source · Jun 29, 2026

Beyond PR Passes: Governing Repositories for Coding Agents

Autonomous coding agents should be evaluated beyond PR pass rates, with repository-level risk and structural health in view.

hardware

Source · Jun 29, 2026

Class Frequency Guided Noise Schedules in Diffusion Models

Examines how class imbalance affects score learning in diffusion models and why frequency-guided noise schedules matter.

hardware

Community · Jun 29, 2026

Cloud LLM Costs Versus Local Deployment Decisions

Compare cloud token-based LLM pricing with local deployment to assess cost, control, latency, and break-even conditions.

agi

Source · Jun 29, 2026

CoIn Rethinks 3D Scene Editing Without Precise Masks

CoIn links 2D inpainting and 3DGS to reduce reliance on precise multiview masks in 3D scene editing workflows.

llm

Community · Jun 29, 2026

Do Language Models Really Build Stable World Models

Strong language performance may not imply a stable world model. Reassessing LLMs through failures in time, space, and physics.

llm

Source · Jun 29, 2026

GRACE Rethinks VLM Quantization With QAT and Distillation

How GRACE combines QAT and distillation to balance accuracy and deployment cost in vision-language models.

llm

Source · Jun 29, 2026

LLM Data Fusion for Single and Multi Truth

A look at using LLMs for single- and multi-truth data fusion, with implications for RAG, memory, and data quality.

hardware

Source · Jun 29, 2026

Measuring Domain Gaps in Cross-Sensor Diffusion Super-Resolution

Why top satellite SR models on synthetic data may not lead on real cross-sensor imagery, and how to evaluate the gap.

llm

Source · Jun 29, 2026

MMG-Pop Rethinks Social Popularity Prediction Across Platforms

MMG-Pop uses multimodal and temporal graph signals from Bluesky and Reddit to reassess social popularity prediction.

hardware

Community · Jun 29, 2026

Model Distillation, API Control, and Sovereign AI Risks

How model distillation expands from efficiency to API cost, competitive training, and control over data and compute.

Aionda
