inference · Product
Source: GitHub · OPEN_SOURCE (open source) · Python · Apache-2.0 · Released 2023-06-14
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.
9346
Stars
835
Forks
4
Tech stack
0
Alternatives
50
Related events
inference · Related events
What Should Agents Say? Action-state Communication for Efficient Multi-Agent Systems
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
GITCO: Gated Inference-Time Context Optimization in TSFMs
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
FIDES: Faithful Inference via Deep Evidence Signals for Retrieval-Memory Conflict in RAG
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
Beyond Output Matching: Preserving Internal Geometry in NVFP4 LLM Distillation
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
AdaMEM: Test-Time Adaptive Memory for Language Agents
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
Memory is Reconstructed, Not Retrieved: Graph Memory for LLM Agents
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
Search-Time Contamination in Deep Research Agents: Measuring Performance Inflation in Public Benchmark Evaluation
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents
2026-06-06 · BREAKTHROUGH · Impact: HIGH
Benchmarking Counterfactual Prediction in Epidemic Time Series with Time-Varying Interventions
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
TinyML-Driven Cybersecurity for Autonomous Spacecraft: Latency-Accuracy Analysis for SPARTA RF and Cyber Threat Detection
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
CLEAR: Cognition and Latent Evaluation for Adaptive Routing in End-to-End Autonomous Driving
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
MPCoT: Reward-Guided Multi-Path Latent Reasoning for Test-Time Scalable Vision-Language-Action
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
DAST: A VLM-LLM Framework for Cross-Interface Anomaly Detection in O-RAN
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
2-Step Agent: A Framework for the Interaction of a Decision Maker with AI Decision Support
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
Evaluating CUDA Tile for AI Workloads on Hopper and Blackwell GPUs
2026-06-06 · PRODUCT_LAUNCH · Impact: MEDIUM
You Only Index Once: Cross-Layer Sparse Attention with Shared Routing
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
IR3DE: A Linear Router for Large Language Models
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Analysis of the Neglect-Zero Effect in Large Language Models
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Improving Heart-Focused Medical Question Answering in LLMs via Variance-Aware Rubric Rewards with GRPO
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Efficient Punctuation Restoration via Weighted Lookahead Scoring Method for Streaming ASR Systems
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Multi-Granularity Reasoning for Natural Language Inference
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
LANTERN: Layered Archival and Temporal Episodic Retrieval Network for Long-Context LLM Conversations
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Multilingual Detection of Alzheimer's Disease from Speech: A Cross-Linguistic Transfer Learning Approach
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
InfoShield: Privacy-Preserving Speech Representations for Mental Health Screening via Information-Theoretic Optimization
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Beyond tokens: a unified framework for latent communication in LLM-based multi-agent systems
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Epistemic Injustice in Language Models: An Audit of Pretraining Filters and Guardrails
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
EGTR-Review: Efficient Evidence-Grounded Scientific Peer Review Generation via Multi-Agent Teacher Distillation
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
NAVIRA: Decoupled Stochastic Remasking for Masked Diffusion Language Models
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
SkillComposer: Learning to Evolve Agent Skills for Specification and Generalization
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
SkillComposer: Learning to Evolve Agent Skills for Specification and Generalization
2026-06-05 · SHUTDOWN · Impact: LOW
Statistically Reliable LLM-Based Ranking Evaluation via Prediction-Powered Inference
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
MIRAI: Prediction and Generation of High-Impact Academic Research
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Membrane: A Self-Evolving Contrastive Safety Memory for LLM Agent Defense
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
On Advantage Estimates for Max@K Policy Gradients
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Unsupervised Skill Discovery for Agentic Data Analysis
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Unsupervised Skill Discovery for Agentic Data Analysis
2026-06-05 · SHUTDOWN · Impact: LOW
Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
A Survey on Diffusion Language Models
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
A Survey on Diffusion Language Models
2026-06-05 · BREAKTHROUGH · Impact: HIGH
DocHop-QA: Towards Multi-Hop Reasoning over Multimodal Document Collections
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM
2026-06-05 · BREAKTHROUGH · Impact: HIGH
Toward Culturally Aligned LLMs through Ontology-Guided Multi-Agent Reasoning
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Aligning Tree-Search Policies with Fixed Token Budgets in Test-Time Scaling of LLMs
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
Correcting Prompt Dependence in LLM Benchmarks: A Bayesian Hierarchical Model with Embedding-Space Clustering
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM
ReTreVal: Reasoning Tree with Validation and Cross-Problem Memory for Large Language Models
2026-06-05 · PRODUCT_LAUNCH · Impact: MEDIUM