Personal Notes: SIGIR 2020

Posted by cyq on November 29, 2020

(Done) SIGIR 2020

  • (TKL)Local Self-Attention over Long Text for Efficient Document Retrieval

    An improvement of the Transformer-Kernel (TK) model, mainly aimed at applying it to long documents. Previous Transformer-based models usually handle long text by simply truncating it. Building on Transformer-Kernel, this paper introduces local self-attention: relevance is computed per local region, and a top-local-max mechanism then combines the regional scores into a global relevance score (a rough sketch follows). Dataset: the MS MARCO document retrieval dataset.
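
    A minimal sketch of the scoring idea in my own words, not the authors' code; the window size, stride, top-k, and the per-window `local_scorer` (e.g. a TK-style kernel scorer) are placeholders:

```python
import torch

def tkl_style_score(query_emb, doc_emb, local_scorer, window=512, stride=256, top_k=5):
    """Score a long document by scoring fixed-size local regions and pooling the
    top-k regional scores into one global relevance score (the top-local-max idea).

    query_emb: (q_len, dim) tensor; doc_emb: (d_len, dim) tensor.
    local_scorer(query_emb, region_emb) -> scalar relevance tensor.
    """
    region_scores = []
    for start in range(0, doc_emb.size(0), stride):
        region = doc_emb[start:start + window]          # local window of the document
        region_scores.append(local_scorer(query_emb, region))
    region_scores = torch.stack(region_scores)
    k = min(top_k, region_scores.numel())
    return region_scores.topk(k).values.sum()           # combine the k strongest regions
```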

  • Training Curricula for Open Domain Answer Re-Ranking

    For the passage re-ranking stage.

    Proposes a curriculum learning strategy to improve the training of BERT and Conv-KNRM: early in training, easier passages get larger weights and harder passages smaller weights; later in training, all passages are weighted equally (see the sketch below).
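
    A rough sketch of the weighting idea, assuming a difficulty score in [0, 1] and a linear warm-up to uniform weights; the paper's actual difficulty heuristics and schedule may differ:

```python
def curriculum_weight(difficulty, step, warmup_steps=10_000):
    """Per-passage loss weight: easy passages count more early on, and all
    passages count equally once the curriculum phase is over.

    difficulty: value in [0, 1], 0 = easy, 1 = hard (estimated by some heuristic).
    """
    progress = min(step / warmup_steps, 1.0)   # 0 at the start of training, 1 after warm-up
    easy_bias = 1.0 - difficulty               # larger weight for easier passages
    return (1.0 - progress) * easy_bias + progress * 1.0

# loss = curriculum_weight(d, step) * cross_entropy(logits, label)
```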

  • ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT

  • [PreTTR] Efficient Document Re-Ranking for Transformers by Precomputing Term Representations

  • SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval

  • Open-Retrieval Conversational Question Answering

    Introduces an open-retrieval conversational QA dataset: OR-QuAC.

    Also provides a baseline system consisting of a retriever, a re-ranker, and a reader, sketched below.
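
    A schematic of that three-stage pipeline; the component names and methods here are placeholders, not the paper's implementation:

```python
def answer(question, history, retriever, reranker, reader, k=100, k_rerank=10):
    """Open-retrieval conversational QA: retrieve from the full collection,
    re-rank the candidates, then read an answer span from the top passages."""
    query = " ".join(history + [question])          # fold the dialogue history into the query
    candidates = retriever.search(query, top_k=k)   # first-stage retrieval
    top_passages = reranker.rerank(query, candidates)[:k_rerank]
    return reader.extract_answer(query, top_passages)
```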

  • Match$^2$: A Matching over Matching Model for Similar Question Identification

  • MarkedBERT: Integrating Traditional IR Cues in Pre-trained Language Models for Passage Retrieval

    We proposed MarkedBERT, which incorporates exact-matching signals via a simple yet effective marking technique that only modifies the model input.
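
    A small illustration of this kind of input marking; the marker tokens `[e]`/`[/e]` are an assumption for the sketch, not necessarily the ones used in the paper:

```python
def mark_exact_matches(query_tokens, passage_tokens, open_marker="[e]", close_marker="[/e]"):
    """Wrap passage terms that exactly match a query term in marker tokens;
    only the model input changes, the architecture stays untouched."""
    query_terms = {t.lower() for t in query_tokens}
    marked = []
    for tok in passage_tokens:
        if tok.lower() in query_terms:
            marked.extend([open_marker, tok, close_marker])
        else:
            marked.append(tok)
    return marked

# mark_exact_matches(["solar", "power"], ["Solar", "panels", "produce", "power"])
# -> ['[e]', 'Solar', '[/e]', 'panels', 'produce', '[e]', 'power', '[/e]']
```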

  • Context-Aware Term Weighting For First-Stage Passage Retrieval

  • Learning Term Discrimination

  • Improving Contextual Language Models for Response Retrieval in Multi-Turn Conversation

    Proposes two strategies to adapt BERT-style contextualized pre-trained models to multi-turn conversation tasks: 1) Speaker Segmentation; 2) Dialogue Augmentation.

  • Large-scale Image Retrieval with Sparse Binary Projections

  • [EPIC] Expansion via Prediction of Importance with Contextualization

    Represents the query as a sparse vector and the document as a dense vector, and scores them by dot product.
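
    A minimal sketch of that scoring scheme, assuming the query vector has non-zero weights only on its own terms while the document vector lives in the full vocabulary space:

```python
import numpy as np

def epic_style_score(query_weights, doc_vector):
    """Dot product between a sparse query vector and a dense document vector.

    query_weights: dict of vocabulary id -> predicted term importance (sparse).
    doc_vector: np.ndarray of shape (vocab_size,) with contextualized importances (dense).
    """
    return sum(weight * doc_vector[term_id] for term_id, weight in query_weights.items())

# score = epic_style_score({1042: 1.7, 87: 0.9}, np.random.rand(30522))
```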

  • Efficiency Implications of Term Re-Weighting for Passage Retrieval

  • DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding

  • Improving Matching Models with Hierarchical Contextualized Representations for Multi-turn Response Selection

  • Distilling Knowledge for Fast Retrieval-based Chat-bots

    We introduced an enhanced BERT cross-encoder architecture modified for the task of response retrieval. Alongside that, we utilized knowledge distillation to compress the complex BERT cross-encoder network, as the teacher model, into the student BERT bi-encoder model. This increases the BERT bi-encoder's prediction quality without affecting its inference speed.
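
    A sketch of the distillation objective, assuming the student minimizes the MSE between its dot-product score and the teacher's cross-encoder score; the paper's exact loss may differ:

```python
import torch
import torch.nn.functional as F

def distill_step(teacher_cross_encoder, student_bi_encoder, contexts, responses):
    """One distillation step: the fast bi-encoder student mimics the scores
    of the slow but accurate cross-encoder teacher."""
    with torch.no_grad():
        teacher_scores = teacher_cross_encoder(contexts, responses)   # joint encoding of (context, response)
    ctx_emb = student_bi_encoder.encode_context(contexts)             # independent encodings, cacheable offline
    rsp_emb = student_bi_encoder.encode_response(responses)
    student_scores = (ctx_emb * rsp_emb).sum(dim=-1)                  # dot-product scoring at inference speed
    return F.mse_loss(student_scores, teacher_scores)
```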

  • Having Your Cake and Eating it Too: Training Neural Retrieval for Language Inference without Losing Lexical Match

    We presented a simple approach to infuse lexical matching using unsupervised IR methods into a state-of-the-art transformer method, RoBERTa. We show that infusing lexical matching improves the performance on simpler retrieval-based questions and on the (justification) retrieval task itself.
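
    One simple way to realize such an infusion is score interpolation with an unsupervised lexical ranker such as BM25; this is only an illustration under that assumption, not necessarily the paper's exact mechanism:

```python
def hybrid_score(query, passage, neural_model, bm25_index, alpha=0.7):
    """Interpolate a transformer relevance score with an unsupervised lexical-match score.

    Both scores are assumed to be normalized to a comparable range beforehand;
    alpha controls how much the neural score dominates.
    """
    neural = neural_model.score(query, passage)    # e.g. a RoBERTa-based relevance score
    lexical = bm25_index.score(query, passage)     # exact lexical matching (BM25)
    return alpha * neural + (1.0 - alpha) * lexical
```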

  • An Analysis of BERT in Document Ranking

    What Does BERT Look At? An Analysis of BERT’s Attention.

    How Contextual Are Contextualized Word Representations?

    Understanding the Behaviors of BERT in Ranking (Zhiyuan Liu)

  • Read, Attend, and Exclude: Multi-Choice Reading Comprehension by Mimicking Human Reasoning Process

    Proposes a model for the multiple-choice reading comprehension task.

  • Unsupervised Text Summarization with Sentence Graph Compression

    Generates a summary from multiple documents (multi-document summarization).