NLP (Natural Language Processing)
Word2Vec
2013/01 Word2Vec
2014/01 GloVe
2016/07 fastText
2017/06 Transformer
2018/02 ELMo
2018/10 BERT
2019/01 Transformer-XL
2019/02 GPT-2
2019/04 ERNIE
2019/06 XLNet
2019/07 RoBERTa
2019/09 CTRL
2019/09 ALBERT
2019/10 T5
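★Common applications of embeddings
・Bag-of-words models that ignore grammar and word order: one-hot, TF-IDF, TextRank
・Topic models: LSA, pLSA, LDA
・Static (fixed) word-vector representations: word2vec, fastText, GloVe
・Dynamic (contextual) word-vector representations: ELMo, GPT, BERT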
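To make the first bullet concrete, here is a minimal sketch of a bag-of-words count vector and a TF-IDF weighting in plain Python. The helper names (`bag_of_words`, `tf_idf`) and the toy sentences are made up for illustration, and the weighting uses one common variant (tf = count / document length, idf = log(N / df)); real libraries usually add smoothing and normalization.

```python
import math
from collections import Counter

def bag_of_words(tokens, vocab):
    """Count vector over a fixed vocabulary; word order is discarded."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

def tf_idf(docs):
    """Return (vocab, vectors) with tf = count/len(doc) and idf = log(N/df).

    This is an unsmoothed textbook variant, chosen here only for brevity.
    """
    vocab = sorted({w for d in docs for w in d})
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    vectors = []
    for d in docs:
        counts = Counter(d)
        vectors.append(
            [counts[w] / len(d) * math.log(n_docs / df[w]) for w in vocab]
        )
    return vocab, vectors

# Toy corpus: the first two sentences use the same words in a different order.
docs = [
    "the dog bites the man".split(),
    "the man bites the dog".split(),
    "a cat sleeps".split(),
]
vocab, vectors = tf_idf(docs)

# Same words, different order: both representations are identical.
print(bag_of_words(docs[0], vocab) == bag_of_words(docs[1], vocab))
print(vectors[0] == vectors[1])
```

Both comparisons come out equal because neither representation keeps any information about position, which is what "ignores grammar and word order" means in practice.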
★Evaluation of an NLP model (benchmarks)
BLEU: BiLingual Evaluation Understudy (n-gram overlap metric for machine translation)
Limitations of BLEU:
・It doesn’t consider meaning
・It doesn’t directly consider sentence structure
・It doesn’t handle morphologically rich languages well
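The limitations above are easier to see with a small computation. The following is a simplified, single-reference, sentence-level sketch of BLEU (clipped n-gram precision up to bigrams, geometric mean, brevity penalty). The function names and example sentences are hypothetical, and this is not a replacement for a standard implementation: real BLEU is typically computed at corpus level with up to 4-grams and multiple references.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram counts at most
    as often as it occurs in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total

def simple_bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of precisions times brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_avg)

reference = "the cat sat on the mat".split()
paraphrase = "a feline rested upon the rug".split()  # similar meaning, new words
swapped = "the mat sat on the cat".split()           # different meaning, same words

print(simple_bleu(paraphrase, reference))
print(simple_bleu(swapped, reference))
```

In this toy case the paraphrase scores far lower than the sentence whose meaning was changed by swapping two nouns, because the metric only counts surface n-gram overlap.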
SQuAD: Stanford Question Answering Dataset
MS MARCO: MAchine Reading COmprehension Dataset
GLUE & SuperGLUE: General Language Understanding Evaluation