NLP (Natural Language Processing)

Word2Vec

2013/01 Word2Vec

2014/01 GloVe

2016/07 FastText

2017/06 Transformer

2018/02 ELMo

2018/10 BERT

2019/01 Transformer-XL

2019/02 GPT-2

2019/04 ERNIE

2019/06 XLNet

2019/07 RoBERTa

2019/09 CTRL

2019/09 ALBERT

2019/10 T5

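★Embedding: common applications

・Bag-of-words models that do not depend on text grammar or word order: one-hot, TF-IDF, TextRank

・Topic models: LSA, pLSA, LDA

・Static (fixed) representations based on word vectors: Word2Vec, FastText, GloVe

・Dynamic (contextual) representations based on word vectors: ELMo, GPT, BERT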
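
As a rough illustration of the first bullet above, the sketch below builds TF-IDF vectors in plain Python and shows that two sentences containing the same words in a different order get identical vectors. The names `tokenize` and `tf_idf_vectors`, the whitespace tokenizer, and the smoothed IDF formula are assumptions made for this example, not taken from any particular library.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        vec = []
        for term in vocab:
            tf = counts[term] / length
            # One common smoothed IDF variant (an assumption of this sketch).
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            vec.append(tf * idf)
        vectors.append(vec)
    return vocab, vectors


docs = ["the dog bites the man", "the man bites the dog", "the cat sleeps"]
vocab, vectors = tf_idf_vectors(docs)

# Same words, different order: the bag-of-words vectors are identical.
print(vectors[0] == vectors[1])
# A document with different words gets a different vector.
print(vectors[0] == vectors[2])
```

The first two documents mean opposite things but are indistinguishable here, which is the sense in which these models ignore grammar and word order.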

★Evaluation of an NLP model (benchmarks)

BLEU (Bilingual Evaluation Understudy)

Known limitations of BLEU:

・It doesn’t consider meaning

・It doesn’t directly consider sentence structure

・It doesn’t handle morphologically rich languages well
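
The limitations listed above for BLEU can be seen in a small sketch. The code below is a simplified, single-reference, sentence-level score using clipped n-gram precision and a brevity penalty. The names `ngrams`, `modified_precision`, and `simple_bleu`, the whitespace tokenization, and the choice of `max_n=2` are assumptions of this example. It is not a replacement for a standard BLEU implementation, which typically uses up to 4-grams, multiple references, and corpus-level statistics.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of candidate tokens against reference tokens."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def simple_bleu(candidate, reference, max_n=2):
    """Geometric mean of 1..max_n precisions times a brevity penalty."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    precisions = [modified_precision(cand, ref, n) for n in range(1, max_n + 1)]
    if not cand or min(precisions) == 0.0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Penalize candidates that are shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(log_mean)


reference = "the cat sat on the mat"
paraphrase = "a feline rested upon the rug"   # similar meaning, different words
shuffled = "mat the on sat cat the"           # same words, scrambled order

print(simple_bleu(reference, reference))
print(simple_bleu(paraphrase, reference))
print(modified_precision(shuffled.split(), reference.split(), 1))
```

The paraphrase shares almost no n-grams with the reference, so it scores zero despite having a similar meaning. The scrambled sentence still reaches a perfect unigram precision, because word order only enters the score through overlap of longer n-grams.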

SQuAD (Stanford Question Answering Dataset)

MS MARCO (Microsoft MAchine Reading COmprehension dataset)

GLUE & SuperGLUE (General Language Understanding Evaluation)