inference product

Source: GitHub · Open source · Python · Apache-2.0 · Published 2023-06-14

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

Stars: 9346
Forks: 835
Tech stack: 4
Alternatives: 0
Related events: 50

Artificial intelligence · Large language models · Large models / LLM · Deep learning frameworks

inference · Related events

What Should Agents Say? Action-state Communication for Efficient Multi-Agent Systems

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

GITCO: Gated Inference-Time Context Optimization in TSFMs

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

FIDES: Faithful Inference via Deep Evidence Signals for Retrieval-Memory Conflict in RAG

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

Beyond Output Matching: Preserving Internal Geometry in NVFP4 LLM Distillation

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

AdaMEM: Test-Time Adaptive Memory for Language Agents

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

Memory is Reconstructed, Not Retrieved: Graph Memory for LLM Agents

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

Search-Time Contamination in Deep Research Agents: Measuring Performance Inflation in Public Benchmark Evaluation

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents

2026-06-06 · BREAKTHROUGH · Impact: HIGH

Benchmarking Counterfactual Prediction in Epidemic Time Series with Time-Varying Interventions

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

TinyML-Driven Cybersecurity for Autonomous Spacecraft: Latency-Accuracy Analysis for SPARTA RF and Cyber Threat Detection

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

CLEAR: Cognition and Latent Evaluation for Adaptive Routing in End-to-End Autonomous Driving

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

MPCoT: Reward-Guided Multi-Path Latent Reasoning for Test-Time Scalable Vision-Language-Action

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

DAST: A VLM-LLM Framework for Cross-Interface Anomaly Detection in O-RAN

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

2-Step Agent: A Framework for the Interaction of a Decision Maker with AI Decision Support

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

Evaluating CUDA Tile for AI Workloads on Hopper and Blackwell GPUs

2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM

You Only Index Once: Cross-Layer Sparse Attention with Shared Routing

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

IR3DE: A Linear Router for Large Language Models

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Analysis of the Neglect-Zero Effect in Large Language Models

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Improving Heart-Focused Medical Question Answering in LLMs via Variance-Aware Rubric Rewards with GRPO

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Efficient Punctuation Restoration via Weighted Lookahead Scoring Method for Streaming ASR Systems

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Multi-Granularity Reasoning for Natural Language Inference

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

LANTERN: Layered Archival and Temporal Episodic Retrieval Network for Long-Context LLM Conversations

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Multilingual Detection of Alzheimer's Disease from Speech: A Cross-Linguistic Transfer Learning Approach

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

InfoShield: Privacy-Preserving Speech Representations for Mental Health Screening via Information-Theoretic Optimization

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Beyond tokens: a unified framework for latent communication in LLM-based multi-agent systems

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Epistemic Injustice in Language Models: An Audit of Pretraining Filters and Guardrails

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

EGTR-Review: Efficient Evidence-Grounded Scientific Peer Review Generation via Multi-Agent Teacher Distillation

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

NAVIRA: Decoupled Stochastic Remasking for Masked Diffusion Language Models

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

SkillComposer: Learning to Evolve Agent Skills for Specification and Generalization

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

SkillComposer: Learning to Evolve Agent Skills for Specification and Generalization

2026-06-05 · SHUTDOWN · Impact: LOW

Statistically Reliable LLM-Based Ranking Evaluation via Prediction-Powered Inference

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

MIRAI: Prediction and Generation of High-Impact Academic Research

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Membrane: A Self-Evolving Contrastive Safety Memory for LLM Agent Defense

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

On Advantage Estimates for Max@K Policy Gradients

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Unsupervised Skill Discovery for Agentic Data Analysis

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Unsupervised Skill Discovery for Agentic Data Analysis

2026-06-05 · SHUTDOWN · Impact: LOW

Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

A Survey on Diffusion Language Models

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

A Survey on Diffusion Language Models

2026-06-05 · BREAKTHROUGH · Impact: HIGH

DocHop-QA: Towards Multi-Hop Reasoning over Multimodal Document Collections

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM

2026-06-05 · BREAKTHROUGH · Impact: HIGH

Toward Culturally Aligned LLMs through Ontology-Guided Multi-Agent Reasoning

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Aligning Tree-Search Policies with Fixed Token Budgets in Test-Time Scaling of LLMs

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

Correcting Prompt Dependence in LLM Benchmarks: A Bayesian Hierarchical Model with Embedding-Space Clustering

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM

ReTreVal: Reasoning Tree with Validation and Cross-Problem Memory for Large Language Models

2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM