Publications
Full list of published works by Xuebo Liu
2026
Exposing the Cracks: Vulnerabilities of Retrieval-Augmented LLM-based Machine Translation
Proceedings of AAAI 2026
2025
AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration
Proceedings of ACL 2025
DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization
Proceedings of ACL 2025
SGIC: A Self-Guided Iterative Calibration Framework for RAG
Proceedings of ACL 2025
APT: Improving Specialist LLM Performance with Weakness Case Acquisition and Iterative Preference Training
Proceedings of ACL 2025 Findings
UnrealLLM: Towards Highly Controllable and Interactable 3D Scene Generation by LLM-powered Procedural Content Generation
Proceedings of ACL 2025 Findings
Weight-Aware Activation Sparsity with Constrained Bayesian Optimization Scheduling for Large Language Models
Proceedings of EMNLP 2025
AQuilt: Weaving Logic and Self-Inspection into Low-Cost, High-Relevance Data Synthesis for Specialist LLMs
Proceedings of EMNLP 2025
CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task
Proceedings of EMNLP 2025 Findings
DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context LLMs
Proceedings of EMNLP 2025 Findings
AgentInit: Initializing LLM-based Multi-Agent Systems via Diversity and Expertise Orchestration for Effective and Efficient Collaboration
Proceedings of EMNLP 2025 Findings
SeaPO: Strategic Error Amplification for Robust Preference Optimization of Large Language Models
Proceedings of EMNLP 2025 Findings
Runaway is Ashamed, But Helpful: On the Early-Exit Behavior of Large Language Model-based Agents in Embodied Environments
Proceedings of EMNLP 2025 Findings
DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory
Proceedings of ICLR 2025
Knowledge Editing with Dynamic Knowledge Graphs for Multi-hop Question Answering
Proceedings of AAAI 2025
Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore
Proceedings of COLING 2025
Exploiting Multimodal Knowledge Graph for Multimodal Machine Translation
IEEE Transactions on Multimedia
Orchestrating Prompt Expertise: Enhancing Knowledge Distillation via Expert-Guided Tuning
ACM Transactions on Asian and Low-Resource Language Information Processing
2024
LRQuant: Learnable and Robust Post-Training Quantization for Large Language Models
Proceedings of ACL 2024
Speech Sense Disambiguation: Tackling Homophone Ambiguity in End-to-End Speech Translation
Proceedings of ACL 2024
TasTe: Teaching Large Language Models to Translate through Self-Reflection
Proceedings of ACL 2024
Revisiting Demonstration Selection Strategies in In-Context Learning
Proceedings of ACL 2024
Towards Demonstration-Aware Large Language Models for Machine Translation
Proceedings of ACL 2024 Findings
Domain-Aware k-Nearest-Neighbor Knowledge Distillation for Machine Translation
Proceedings of ACL 2024 Findings
DB-LLM: Accurate Dual-Binarization for Efficient LLMs
Proceedings of ACL 2024 Findings
Improving Attributed Text Generation of Large Language Models via Preference Learning
Proceedings of ACL 2024 Findings
CommonIT: Commonality-aware Instruction Tuning for Large Language Models via Data Partitions
Proceedings of EMNLP 2024
Can LLMs Learn Uncertainty on Their Own? Expressing Uncertainty Effectively in A Self-Training Manner
Proceedings of EMNLP 2024
Curriculum Consistency Learning for Conditional Sentence Generation
Proceedings of EMNLP 2024
Self-Powered LLM Modality Expansion for Large Speech-Text Models
Proceedings of EMNLP 2024
LPZero: Language Model Zero-cost Proxy Search from Zero
Proceedings of EMNLP 2024 Findings
SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection
Proceedings of NeurIPS 2024
NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates
Proceedings of NeurIPS 2024 Datasets and Benchmarks Track
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
Proceedings of CVPR 2024
Pluggable Neural Machine Translation Models via Memory-augmented Adapters
Proceedings of COLING 2024
3AM: An Ambiguity-Aware Multimodal Machine Translation Dataset
Proceedings of COLING 2024
Understanding and Improving Low-Resource Neural Machine Translation with Shallow Features
Proceedings of NLPCC 2024
Holistic Exploration on Universal Decompositional Semantic Parsing: Architecture, Data Augmentation, and LLM Paradigm
Proceedings of ACL 2024 Workshop (SIGHAN-10)
2023
Test-time Adaptation for Machine Translation Evaluation by Uncertainty Minimization
Proceedings of ACL 2023
Revisiting Token Dropping Strategy in Efficient BERT Pretraining
Proceedings of ACL 2023
kNN-TL: k-Nearest-Neighbor Transfer Learning for Low-Resource Neural Machine Translation
Proceedings of ACL 2023
Revisiting Commonsense Reasoning in Machine Translation: Training, Evaluation and Challenge
Proceedings of ACL 2023
TemplateGEC: Improving Grammatical Error Correction with Detection Template
Proceedings of ACL 2023
TransGEC: Improving Grammatical Error Correction with Translationese
Proceedings of ACL 2023 Findings
Clustering Pseudo Language Family in Multilingual Translation Models with Fisher Information Matrix
Proceedings of EMNLP 2023
PromptST: Abstract Prompt Learning for End-to-End Speech Translation
Proceedings of EMNLP 2023
Can LMs Generalize to Future Data? An Empirical Analysis on Text Summarization
Proceedings of EMNLP 2023
Towards Making The Most of ChatGPT for Machine Translation
Proceedings of EMNLP 2023 Findings
Improving Simultaneous Machine Translation with Monolingual Data
Proceedings of AAAI 2023
Parameter-Efficient and Student-Friendly Knowledge Distillation
IEEE Transactions on Multimedia
2022
ODE Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generation
Proceedings of ACL 2022
Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling
Proceedings of EMNLP 2022
Revisiting Grammatical Error Correction Evaluation and Beyond
Proceedings of EMNLP 2022
ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation
Proceedings of EMNLP 2022
2021
Difficulty-Aware Machine Translation Evaluation
Proceedings of ACL 2021
Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation
Proceedings of ACL 2021
On the Copying Behaviors of Pre-Training for Neural Machine Translation
Proceedings of ACL 2021 Findings
Progressive Multi-Granularity Training for Non-Autoregressive Translation
Proceedings of ACL 2021 Findings
On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation
Proceedings of EMNLP 2021 Findings
Exploiting Translation Model for Parallel Corpus Mining
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Variance-Aware Machine Translation Test Sets
Proceedings of NeurIPS 2021 Datasets and Benchmarks Track
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning
Proceedings of ICLR 2021
Understanding and Improving Lexical Choice in Non-Autoregressive Translation
Proceedings of ICLR 2021
Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation
Proceedings of AAAI 2021
Until 2020
Norm-Based Curriculum Learning for Neural Machine Translation
Proceedings of ACL 2020
Shared-Private Bilingual Word Embeddings for Neural Machine Translation
Proceedings of ACL 2019
Latent Attribute Based Hierarchical Decoder for Neural Machine Translation
IEEE/ACM Transactions on Audio, Speech, and Language Processing