Systems & Software
Optical Character Recognition (OCR) (Developed 2020.07)
Deep learning-based Korean/English OCR engine (KOR: 95%, ENG: 95%, General Document: 90%)
Korean Speech Recognition Engine (Developed 2020.01, revised 2021.02)
Deep learning-based Korean speech recognition engine (WER: 19%, CER: 9%)
Accuracy enhancement (WER: 17.1%, CER: 6.5%)
Trained on 1,000+ hours of data
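For readers unfamiliar with the two metrics quoted for the speech engine, the following is a minimal sketch of how WER (word error rate) and CER (character error rate) are commonly computed from an edit distance. It is not the engine's actual evaluation code: the function names are illustrative, and dropping spaces before computing CER is an assumption, since conventions differ between evaluations.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length.

    Assumption: spaces are removed before comparison.
    """
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    reference = "오늘 날씨가 좋다"
    hypothesis = "오늘 날씨 좋다"   # one dropped particle
    print(f"WER: {wer(reference, hypothesis):.3f}")
    print(f"CER: {cer(reference, hypothesis):.3f}")
```

In this toy pair, a single dropped particle counts as one wrong word out of three but only one wrong character out of seven, which is one reason CER tends to be lower than WER for Korean.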
Commercial Crawling System (Developed 2018 to present)
Scheduling of multiple crawlers and management of related processes (admin side)
Apache Solr-based search (user side)
Crawling 200+ websites concurrently
Easy management of 40+ machines and 200+ Scrapy crawling nodes
Integration with NLP-based document management system
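NSCC (National Science & Technology Code Classifier)
(Korea) Automatic document classifier based on the Science and Technology Standard Classification (200 mid-level categories, Top1: 75%, Top2: 85%) [2020]
(Korea) Automatic document classifier based on the Science and Technology Standard Classification (145 mid-level categories, Top1: 78%, Top3: 90%) [2021.11]
Development and fine-tuning of the KorSci-Electra model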
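The Top1/Top2/Top3 figures quoted for NSCC are top-k accuracies. The sketch below shows how such a figure is typically computed, assuming the classifier returns a best-first ranked list of category codes for each document. The function name, the category codes, and the sample data are hypothetical and are not taken from the actual classifier or its test set.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of documents whose true category is among the k highest-ranked predictions."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one document is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


if __name__ == "__main__":
    # Hypothetical ranked outputs (best guess first) for four documents.
    predictions = [
        ["A", "B", "C"],
        ["B", "A", "C"],
        ["C", "B", "A"],
        ["A", "C", "B"],
    ]
    truth = ["A", "A", "A", "B"]
    for k in (1, 2, 3):
        print(f"Top{k}: {top_k_accuracy(predictions, truth, k):.2f}")
```

Because a larger k can only add hits, Top3 is never lower than Top1 on the same test set, so the Top2 figure for 2020 and the Top3 figure for 2021.11 are not directly comparable.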