技术之vscode

Posted by cyq on February 13, 2020

VSCode

远程连接服务器

  1. 先检查本地(~/.ssh/id_rsa_pub)是否有ssh key,如果没有,则创建一个:ssh-keygen -t rsa -b 4096

  2. 添加自己的本地公钥到远程主机:

    ssh-copy-id 用户名@服务器ip

  3. 打开remote的配置文件,添加以下内容

    Host 起个名字
     User 用户名
        HostName 服务器ip
    

服务器上网

export http_proxy=10.61.3.141:8888

export https_proxy=10.61.3.141:8888

  1. 模型训练

    python -m colbert.train –mask-punctuation

    输入:–triples ‘./data_marco/triples.train.small.tsv’

    输出:模型的checkpoint

  2. rerank topK

    python -m colbert.test –checkpoint /data/users/caiyinqiong/colbert/ColBERT_2/ColBERT/experiments/dirty/train.py/2020-12-02_20.02.43/checkpoints/colbert.dnn

    输入:模型的checkpoint

    ​ load_qrels (注意:queries和collection不设值)

    ​ topk

    输出:ranking.tsv、ranking.metrics

  3. 编码collection,建立index

    python -m colbert.index –checkpoint /data/users/caiyinqiong/colbert/ColBERT_2/ColBERT/experiments/dirty/train.py/2020-12-02_20.02.43/checkpoints/colbert.dnn

    输入:模型的checkpoint

    ​ collection ‘./data_marco/collection.tsv’

    ​ index_root

    ​ index_name

    输出:index

  4. 由index得到faiss_index

    python -m colbert.index_faiss

    输入:index_root

    ​ index_name

    输出:faiss建立的index

  5. 第一阶段检索

    python -m colbert.retrieve –checkpoint /data/users/caiyinqiong/colbert/ColBERT_2/ColBERT/experiments/dirty/train.py/2020-12-02_20.02.43/checkpoints/colbert.dnn

    输入:模型的checkpoint

    ​ queries:’./data_marco/queries.dev.small.tsv’

    ​ collection:’./data_marco/collection.small.tsv’ (没用)

    ​ qrels:’./data_marco/qrels.dev.small.tsv’ (没用)

    ​ index_root

    ​ index_name

    输出:ranking.tsv

  6. 重排序topK

    python -m colbert.rerank –checkpoint /data/users/caiyinqiong/colbert/ColBERT_2/ColBERT/experiments/dirty/train.py/2020-12-02_20.02.43/checkpoints/colbert.dnn

    输入:模型的checkpoint

    ​ queries:’./data_marco/queries.dev.small.tsv’

    ​ collection:’./data_marco/collection.small.tsv’ (没用)

    ​ qrels:’./data_marco/qrels.dev.small.tsv’ (没用)

    ​ topk

    ​ index_root (没用)

    ​ index_name (没用)

    输出:ranking.tsv


triples.train.small.tsv:quert \t pos_passage \t neg_passage

collection.tsv:id \t passage

queries.dev.small.tsv:id \t query