(Completed) SIGIR 2020
-
(TKL) Local Self-Attention over Long Text for Efficient Document Retrieval
An improvement to the Transformer-Kernel model aimed at long documents. Earlier Transformer-based rankers usually just truncate long text. Building on Transformer-Kernel, this paper introduces local self-attention, computes a relevance score per region, and then combines the regional scores into a global relevance score with a top-local-max mechanism. Dataset: the MS MARCO document retrieval dataset.
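A minimal sketch (my own code, not the authors') of the two pieces: a banded mask that restricts self-attention to a fixed local window, and a top-k pooling that averages the best-scoring regions into one document score. Window size, k, and all names are placeholders.

```python
import torch

def local_attention_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask where position i may only attend to positions j with |i - j| <= window."""
    idx = torch.arange(seq_len)
    return (idx[None, :] - idx[:, None]).abs() <= window   # [seq_len, seq_len]

def top_local_max(region_scores: torch.Tensor, k: int) -> torch.Tensor:
    """Fold per-region relevance scores into one document score by
    averaging the k highest-scoring regions."""
    k = min(k, region_scores.shape[-1])
    top_k, _ = region_scores.topk(k, dim=-1)
    return top_k.mean(dim=-1)

# toy usage: one document split into 4 regions, keep the best 2
scores = torch.tensor([[0.1, 0.7, 0.3, 0.9]])
print(top_local_max(scores, k=2))   # tensor([0.8000])
```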
-
Training Curricula for Open Domain Answer Re-Ranking
Targets the passage re-ranking stage.
Proposes a curriculum learning strategy to improve the training of BERT and Conv-KNRM: early in training, easy passages get larger weights and hard passages smaller weights; later in training all passages get equal weights.
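A toy sketch of that weighting idea; the difficulty signal, schedule shape, and names are mine, not the paper's exact formulation.

```python
def curriculum_weight(difficulty: float, step: int, total_steps: int) -> float:
    """Loss weight for one training passage.

    difficulty in [0, 1]: 0 = easy, 1 = hard (e.g. derived from a retrieval heuristic).
    Early in training easy passages get weights near 1 and hard passages near 0;
    all weights converge to 1 as training progresses.
    """
    progress = min(step / total_steps, 1.0)   # 0 -> 1 over the course of training
    return (1.0 - difficulty) * (1.0 - progress) + progress

print(curriculum_weight(difficulty=0.9, step=0, total_steps=10_000))       # ~0.1 (hard passage, early)
print(curriculum_weight(difficulty=0.9, step=10_000, total_steps=10_000))  # 1.0  (hard passage, late)
```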
-
ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
-
(PreTTR) Efficient Document Re-Ranking for Transformers by Precomputing Term Representations
-
SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval
-
Open-Retrieval Conversational Question Answering
Introduces OR-QuAC, an open-domain conversational QA dataset.
Provides a baseline consisting of a retriever, a re-ranker, and a reader.
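A schematic of such a pipeline; every component and name here is a hypothetical placeholder, not the paper's implementation.

```python
def answer(question, history, retriever, reranker, reader,
           k_retrieve=100, k_rerank=10):
    """Open-retrieval conversational QA, schematically: retrieve candidates for a
    history-aware query, re-rank them, then let a reader extract the answer."""
    query = " ".join(history + [question])           # naive history handling
    candidates = retriever(query, top_k=k_retrieve)  # e.g. dense or BM25 retrieval
    top = reranker(query, candidates)[:k_rerank]     # e.g. a BERT re-ranker
    return reader(query, top)                        # answer extraction
```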
-
Match$^2$: A Matching over Matching Model for Similar Question Identification
-
MarkedBERT: Integrating Traditional IR Cues in Pre-trained Language Models for Passage Retrieval
We proposed MarkedBERT, which incorporates exact-matching signals via a simple yet effective marking technique that only modifies the model input.
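A rough sketch of the general idea: wrap terms shared by query and passage with marker tokens before feeding the text pair to BERT. The marker strings and whitespace tokenization here are placeholders, not necessarily the paper's exact scheme.

```python
def mark_exact_matches(query: str, passage: str, open_tok: str = "[e]", close_tok: str = "[/e]"):
    """Wrap terms that appear in both query and passage with marker tokens, so a
    BERT-style ranker sees explicit exact-match cues directly in its input text."""
    shared = {t.lower() for t in query.split()} & {t.lower() for t in passage.split()}

    def mark(text: str) -> str:
        return " ".join(f"{open_tok} {tok} {close_tok}" if tok.lower() in shared else tok
                        for tok in text.split())

    return mark(query), mark(passage)

q, p = mark_exact_matches("what causes tides", "tides are caused by the moon")
print(q)   # what causes [e] tides [/e]
print(p)   # [e] tides [/e] are caused by the moon
```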
-
Context-Aware Term Weighting For First-Stage Passage Retrieval
-
Learning Term Discrimination
-
Improving Contextual Language Models for Response Retrieval in Multi-Turn Conversation
Proposes two improvements to BERT-style contextual pre-trained models for the multi-turn conversation task: 1) Speaker Segmentation; 2) Dialogue Augmentation.
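A toy sketch of the speaker-segmentation input idea (alternating speaker ids supplied alongside the tokens); the paper's exact embedding scheme may differ.

```python
def with_speaker_ids(turns):
    """Assign an alternating speaker id (0/1) to every token of a multi-turn dialogue;
    the ids would be fed to the model as an extra speaker/segment embedding."""
    tokens, speaker_ids = [], []
    for i, turn in enumerate(turns):
        for tok in turn.split():
            tokens.append(tok)
            speaker_ids.append(i % 2)   # speaker A -> 0, speaker B -> 1
    return tokens, speaker_ids

toks, ids = with_speaker_ids(["hi there", "hello how are you", "fine thanks"])
print(list(zip(toks, ids)))
```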
-
Large-scale Image Retrieval with Sparse Binary Projections
-
(EPIC) Expansion via Prediction of Importance with Contextualization
Represents the query as a sparse vector and the document as a dense vector, and scores by their dot product.
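A toy illustration of that scoring with made-up weights; in the paper both sides are produced by a contextualized model that predicts term importance.

```python
import numpy as np

def epic_style_score(query_weights: dict, doc_vec: np.ndarray, vocab: dict) -> float:
    """Dot product of a sparse query vector (non-zero only for query terms) with a
    dense, vocabulary-sized document vector."""
    return sum(w * doc_vec[vocab[t]] for t, w in query_weights.items() if t in vocab)

vocab = {"tides": 0, "moon": 1, "cause": 2, "ocean": 3}
doc_vec = np.array([0.9, 0.7, 0.4, 0.2])        # dense document importance over the vocab
query_weights = {"tides": 1.0, "cause": 0.5}    # sparse query-side term importances
print(epic_style_score(query_weights, doc_vec, vocab))   # 1.0*0.9 + 0.5*0.4 ≈ 1.1
```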
-
Efficiency Implications of Term Re-Weighting for Passage Retrieval
-
DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding
-
Improving Matching Models with Hierarchical Contextualized Representations for Multi-turn Response Selection
-
Distilling Knowledge for Fast Retrieval-based Chat-bots
We introduced an enhanced BERT cross-encoder architecture modified for the task of response retrieval. Alongside that, we used knowledge distillation to compress the complex BERT cross-encoder, acting as the teacher model, into a BERT bi-encoder student. This improves the bi-encoder's prediction quality without affecting its inference speed.
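A minimal sketch of a distillation objective of this kind, mixing a hard-label loss with a KL term toward the teacher's softened scores; the loss form and hyper-parameters are generic, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_scores, teacher_scores, labels, alpha=0.5, temperature=2.0):
    """Mix the usual cross-entropy on gold labels with a KL term that pulls the
    bi-encoder student's candidate distribution toward the cross-encoder teacher's."""
    hard = F.cross_entropy(student_scores, labels)
    soft = F.kl_div(F.log_softmax(student_scores / temperature, dim=-1),
                    F.softmax(teacher_scores / temperature, dim=-1),
                    reduction="batchmean") * temperature ** 2
    return alpha * hard + (1.0 - alpha) * soft

# toy batch: 2 contexts, 4 candidate responses each, gold response at index 0
student = torch.randn(2, 4)
teacher = torch.randn(2, 4)
labels = torch.zeros(2, dtype=torch.long)
print(distillation_loss(student, teacher, labels))
```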
-
Having Your Cake and Eating it Too: Training Neural Retrieval for Language Inference without Losing Lexical Match
We presented a simple approach to infuse lexical matching from unsupervised IR methods into a state-of-the-art transformer model, RoBERTa. We show that infusing lexical matching improves performance on simpler retrieval-based questions and on the (justification) retrieval task itself.
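One simple way to combine a lexical score with a neural score (generic score fusion, not necessarily the paper's infusion mechanism):

```python
def fuse_scores(lexical, neural, lam=0.5):
    """Min-max normalize each score list over the candidate set, then linearly
    interpolate lexical and neural relevance scores."""
    def norm(xs):
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]
    return [lam * l + (1.0 - lam) * n for l, n in zip(norm(lexical), norm(neural))]

bm25_scores = [12.3, 8.1, 4.0]      # unsupervised lexical scores for 3 candidates
roberta_scores = [0.2, 0.9, 0.4]    # neural relevance scores for the same candidates
print(fuse_scores(bm25_scores, roberta_scores))   # higher = more relevant
```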
-
An Analysis of BERT in Document Ranking
What Does BERT Look At? An Analysis of BERT’s Attention.
How Contextual are Contextualized Word Representations?
Understanding the Behaviors of BERT in Ranking (Zhiyuan Liu)
-
Read, Attend, and Exclude: Multi-Choice Reading Comprehension by Mimicking Human Reasoning Process
Proposes a model for the multiple-choice reading comprehension task.
-
Unsupervised Text Summarization with Sentence Graph Compression
Generates summaries over multiple documents.