lms · Related Events
Explainable AML Triage with LLMs: Evidence Retrieval and Counterfactual Checks
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Safety is Contextual, LLM-Judges Are Not: Navigating the Rigid Priors of Evaluators
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
The AI Epistemic Deference Index: A Continuous Measure of Sycophancy
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Stress-testing medical large language models reveals latent safety pathology beyond benchmark accuracy
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Shared Latent Structures Enable Unified Backdoor Detection and Mitigation in LLMs
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Zero-Shot Learning in Industrial Scenarios: New Large-Scale Benchmark, Challenges and Baseline
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Efficient Skill Grounding via Code Refactoring with Small Language Models
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Efficient Skill Grounding via Code Refactoring with Small Language Models
2026-06-09 · SHUTDOWN · Impact: LOW
Benchmarking Open-Ended Multi-Agent Coordination in Language Agents
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Beyond Pass Rate: A Multilingual, Execution-Grounded Evaluation of Open Code LLMs
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Capability-Aligned Hierarchical Learning for Tool-Augmented LLMs
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
TheoremBench: Evaluating LLMs on Theorem Proving in Formal Mathematics
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Emergent alignment and the projectability of ethical personas
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
PRISM: Recovering Instruction Sets from Language Model Activations
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Optical Reasoning: Rethinking Images as an Expressive Reasoning Medium Beyond Text
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
CAPruner: Conceptual-Adjacent Scene Graph Pruner for Enhancing 3D Spatial Reasoning of Large Language Models
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Bidirectional Semantic Complementary Tool Retrieval for Remote Sensing Agents
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
AI-Integrated Learning Management System for Middle School: A Longitudinal Study of Learning Outcomes Through High School and Beyond
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Evaluating Advanced Prompting on Gemini Flash for Multi-Hop Biomedical QA
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
MetaEvo: A Meta-Optimization Framework for Experience-Driven Agent Evolution
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
ScaleSweep: Accurate NVFP4 Post-Training Quantization of LLMs via Block Scale Initialization
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Large Language Models Should Learn Personalized Rather Than Aggregated Human Preferences
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Trait-space Monitoring for Emergent Misalignment During Supervised Finetuning
2026-06-09 · OPEN_SOURCE · Impact: MEDIUM
Trait-space Monitoring for Emergent Misalignment During Supervised Finetuning
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
FunctionEvolve: Structure-Guided Symbolic Regression with LLMs
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
WhiFlash: Accelerating Speculative Decoding with Token-Level Cross-Paradigm Routing
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Rosetta Memory: Adaptive Memory for Cross-LLM Agents
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Beyond Pass/Fail: Using Process Mining to Understand How LLMs Resist (and Fail) Red Team Attacks
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Does Persona Make LLMs K-pop Fans? A Pilot Study of LLM-Based Online Concert Audience Agents
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Neutrality Bites: Gender Representation in AI-Generated Animal Stories
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
CLASP: Language-Driven Robot Skill Selection and Composition using Task-Parameterized Learning
2026-06-09 · ACQUISITION · Impact: HIGH
CLASP: Language-Driven Robot Skill Selection and Composition using Task-Parameterized Learning
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
CLASP: Language-Driven Robot Skill Selection and Composition using Task-Parameterized Learning
2026-06-09 · SHUTDOWN · Impact: LOW
Auditing Proprietary Alignment in Large Language Models: A Comparative Framework Without a Ground-Truth Standard
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
When Video Misreads: Closed-Loop Distillation of Reading Heuristics for Exploratory Manipulation Trace QA
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
When Video Misreads: Closed-Loop Distillation of Reading Heuristics for Exploratory Manipulation Trace QA
2026-06-09 · BREAKTHROUGH · Impact: HIGH
Activation Steering Induces Emergent Misalignment: A More Comprehensive Evaluation
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Knowledge Graphs and Reasoning LLMs for Finding Simple Yet Effective Transcriptomic Perturbation Predictors
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
SafeRun: Enabling Determinism in LLM Planning for Running
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
See More, Think Deeper: Query-Expanded Visual Evidence and Answer-Clue Guided Reflection for Long Video Understanding
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
From USD Scenes to Knowledge Graphs: Zero-Shot Ontology Grounding with LLMs
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
SEF-CLGC at SemEval-2026 Task 11: Logical Notation Impact on Language Model Performance
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Unified Energy for Invariant and Independent Decoding in Diffusion Language Models
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
OmniGameArena: A Unified UE5 Benchmark for VLM Game Agents with Improvement Dynamics
2026-06-09 · SHUTDOWN · Impact: LOW
MAR: Multi-Agent Reflexion Improves Reasoning Abilities in LLMs
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM
Payoff scaling shapes cooperation in LLM agents across languages
2026-06-09 · PRODUCT_LAUNCH · Impact: MEDIUM