Tag: llm


This article uses FineREX to examine why legal-record extraction for smuggling knowledge graphs needs domain-specific schemas and review.

hardware

Community · Jun 19, 2026

Costs And Limits Of AI Automated Sales Workflows

AI sales automation depends less on ideas than on costs, human approval workflows, and policy and channel limits.

hardware

Source · Jun 19, 2026

Evaluating Black-Box Uncertainty in Production LLM APIs

A look at how black-box methods estimate hallucination and error risk in API-only LLMs, and where their limits remain.

agi

Community · Jun 19, 2026

How Close Chinese LLMs Are to Frontier Models

Chinese LLM progress is best judged by benchmarks, independent evaluations, and cost efficiency rather than executive claims.

hardware

Source · Jun 19, 2026

Understanding LLM Failure Modes in RTL Generation

Examines LLM failure modes in RTL generation and why simulation feedback loops matter beyond pass rates.

hardware

Source · Jun 19, 2026

Self-Review Alignment for Safer LLM Reasoning Outputs

Explores combining a conscience step with DPO so LLMs review reasoning during inference while balancing safety and performance.

hardware

Community · Jun 18, 2026

Alignment and Safety Guardrails Shape Model Behavior

Shows with public metrics that alignment and guardrails affect instruction following, harmful output, and hallucination trade-offs.

hardware

Source · Jun 18, 2026

Decentralized Prefix Caching for P2P LLM Serving

Examines decentralized routing for prefix cache reuse in P2P LLM inference, including benefits, limits, and fit.

agi

Source · Jun 18, 2026

LLM Stories Repeat More Than Human Narratives

Research suggests LLM-generated stories resemble each other more than human-written narratives, raising concerns about repetition.

agi

Source · Jun 18, 2026

Designing Open P2P Networks for Distributed AI Agents

Why open P2P agent networks need identity, reputation, permissions, and auditability before performance claims.

hardware

Source · Jun 18, 2026

See First, Answer Later in Multimodal LLM Alignment

A look at a paper on pre-aligning multimodal LLMs to use sufficient visual evidence before answering.

llm

Source · Jun 18, 2026

Tool Abstraction Shapes Optical Network Agent Performance

A study showing domain-specific composite tools improved correctness and cut token use in optical network ReAct agents.

hardware

Community · Jun 18, 2026

Why LLM Reasoning Needs More Than Correct Answers

LLM reasoning should be judged not only by accuracy, but also by consistency, constraint tracking, and self-checking.

hardware

Source · Jun 12, 2026

AI Coding Tools and the Architecture Smell Illusion

AI coding tools lowered ASD, but total smells stayed flat. The gain may reflect LOC growth, not real architecture improvement.

k-ai-pulse

Roundup · Jun 12, 2026

AI Resource Roundup (24h) - 2026-06-12

A curated link roundup from recently collected official updates and tech news.

hardware

Source · Jun 12, 2026

Can Screenshots Alone Evaluate Mobile UX Quality

A look at UXBench, a benchmark that evaluates usability, consistency, and clarity from mobile UI screenshots alone.

llm

Source · Jun 12, 2026

CAPED Reduces Privacy Exposure in Mobile GUI Agents

CAPED filters mobile screenshots before remote agents see them, reducing incidental privacy exposure while preserving task utility.

llm

Source · Jun 12, 2026

Conditional Debate Routing for Efficient Multi-Agent LLM Reasoning

A look at conditional multi-agent reasoning that stops on early agreement and debates only when answers diverge.

agi

Source · Jun 12, 2026

Designing Execution Environments for Autonomous Science Agents

EurekAgent argues execution environment design matters more than prompts for autonomous science agents.

llm

Source · Jun 12, 2026

LLM Agents for Autonomous Variational Quantum Circuit Design

A look at arXiv 2606.13380, which uses a seven-part closed-loop LLM agent system to automate variational quantum circuit design.

hardware

Community · Jun 12, 2026

Open Weight Access Redefines AI Strategy And Deployment

Open-weight AI matters not just for cost, but for weight access, modification, redistribution, and deployment control.

hardware

Source · Jun 12, 2026

Reframing Shielded RL as Design-Time Structure Analysis

A concise look at shielded RL reinterpreted as a design-time tool for structural safety analysis, not runtime blocking.

hardware

Community · Jun 12, 2026

Rethinking AI Job Impacts Beyond Mass Unemployment Fears

Official reports suggest AI is reshaping tasks and productivity before causing broad job losses.

agi

Source · Jun 12, 2026

Rethinking AI Loss of Control Through Operational Definitions

Examines vague AI loss-of-control language and reframes it around goals, audits, interruption, and rollback.

Aionda

Can General Models Extract Legal Networks Reliably