Publications

Reviewed Articles

Takumi Shibata, Masaki Uto (2025) Cross-Prompt Automated Essay Scoring via Reinforcement Learning-Based Data Valuation. IEEE Access. [Accepted]
Yuto Tomikawa, Ayaka Suzuki, Masaki Uto (2024) Adaptive Question–Answer Generation with Difficulty Control Using Item Response Theory and Pre-trained Transformer Models. IEEE Transactions on Learning Technologies, vol. 17, pp.2240-2252. (link)
Masaki Uto, Jun Tsuruta, Kouji Araki, Maomi Ueno (2024) Item response theory model highlighting rating scale of a rubric and rater-rubric interaction in objective structured clinical examination. PLOS ONE, 19 (9), e0309887, pp.1-23. (link)
Masaki Uto, Kota Aramaki (2024) Linking essay-writing tests using many-facet models and neural automated essay scoring. Behavior Research Methods, Springer, vol. 56, pp. 8450–8479. (link)
Masaki Uto, Itsuki Aomi, Emiko Tsutsumi, Maomi Ueno (2023) Integration of Prediction Scores from Various Automated Essay Scoring Models Using Item Response Theory. IEEE Transactions on Learning Technologies, vol. 16, no. 6, pp. 983-1000. (link)
Masaki Uto (2023) A Bayesian Many-Facet Rasch Model with Markov Modeling for Rater Severity Drift. Behavior Research Methods, Springer, Vol.55, pp.3910-3928. [IF=5.953] (link)
Minoru Nakayama, Filippo Sciarrone, Marco Temperini, Masaki Uto (2022) An Item Response Theory Approach to Enhance Peer Assessment Effectiveness in Massive Open Online Courses. International Journal of Distance Education Technologies, Vol.20, No.1, pp.1-19. (link)
Masaki Uto, Masashi Okano (2021) Learning Automated Essay Scoring Models Using Item Response Theory-Based Scores to Decrease Effects of Rater Biases. IEEE Transactions on Learning Technologies, Vol. 14, Issue 6, pp.763-776. (link) [IF: 3.720]
Masaki Uto (2021) A multidimensional generalized many-facet Rasch model for rubric-based performance assessment. Behaviormetrika, Springer, Vol.48, Issue 2, pp.425-457. (link)
Masaki Uto (2021) A review of deep-neural automated essay scoring models. Behaviormetrika, Springer, Vol.48, Issue 2, pp.459-484. (link)
Masaki Uto (2021) Accuracy of performance-test linking based on a many-facet Rasch model. Behavior Research Methods, Springer, Vol. 53, No. 4, pp. 1440-1454. [IF=6.242]. (link)
Masaki Uto, Maomi Ueno (2020) A generalized many-facet Rasch model and its Bayesian estimation using Hamiltonian Monte Carlo. Behaviormetrika, Springer, Vol. 47, Issue. 2, pp. 469-496. (link)
Masaki Uto, Yoshimitsu Miyazawa, Yoshihiro Kato, Koji Nakajima, Hajime Kuwata (2020) Time- and learner-dependent hidden Markov model for writing process analysis using keystroke log data. International Journal of Artificial Intelligence in Education, Springer, Vol. 30, No.2, pp.271-298. (link)
Masaki Uto, Duc-Thien Nguyen, Maomi Ueno (2020) Group optimization to maximize peer assessment accuracy using item response theory and integer programming, IEEE Transactions on Learning Technologies, IEEE Computer Society, Vol.13, No.1, pp.91-106. (link) [IF: 3.72]
Masaki Uto, Maomi Ueno (2018) Empirical comparison of item response theory models with rater's parameters. Heliyon, Elsevier, Vol.4, No 5, pp.1-32. (link) [IF: 1.857]
Sébastien Louvigné, Masaki Uto, Yoshihiro Kato, Takatoshi Ishii (2018) Social constructivist approach of motivation: social media messages recommendation system. Behaviormetrika, Springer. Vol.45, No.1, pp.133-155.
Masaki Uto, Sébastien Louvigné, Yoshihiro Kato, Takatoshi Ishii, Yoshimitsu Miyazawa (2017) Diverse reports recommendation system based on latent Dirichlet allocation. Behaviormetrika, Springer, Vol.44, No.2, pp.425-444. (link)
Masaki Uto, Maomi Ueno (2016) Item response theory for peer assessment. IEEE Transactions on Learning Technologies, IEEE Computer Society, Vol.9, No.2, pp.157-170. [IF: 3.72] (link)

Reviewed Articles (in Japanese)

Masaki Uto (2025) A Review of Automatic Question Generation based on Deep Neural Networks. The Japan Association for Research on Testing, Vol.21, No.1, pp.97-123.
Takumi Shibata, Masaki Uto (2025) Cross-Prompt Automated Essay Scoring Based on Data Valuation Using Reinforcement Learning. The IEICE transactions on information and systems, Vol.J108-D, No.06, pp.414-427.
Yuto Takahashi, Masaki Uto (2024) Confidence Level Estimation in Neural Automatic Scoring Using Multitask Learning of Regression and Classification. The Japan Association for Research on Testing, Vol.20, No.1, pp.1-22.
Yuto Tomikawa, Ayaka Suzuki, Masaki Uto (2024) Difficulty-Controllable Neural Question Generation for Reading Comprehension based on Item Response Theory. The IEICE transactions on information and systems, Vol.J107-D, No.02, pp.53-66.
Takumi Shibata, Masaki Uto (2022) Trait-based Automated Essay Scoring Using Multidimensional Item Response Theory and Deep Neural Networks. The IEICE transactions on information and systems, Vol.J106-D, No.01. pp.47-56.
Masaki Uto (2022) Multidimensional four facets item response theory model for rubric-based performance assessment. The IEICE transactions on information and systems, Vol.J105-D, No.07, pp.457-469.
Itsuki Aomi, Emiko Tsutsumi, Masaki Uto, Maomi Ueno (2021) Automated Essay Scoring Model Averaging by Item Response Theory. The IEICE transactions on information and systems, Vol.J104-D, No.11, pp.784-795.
Masashi Okano, Masaki Uto (2021) Deep neural network-based automated essay scoring considering rater bias effects in training data. The IEICE transactions on information and systems, Vol.J104-D, No.08, pp.650-662.
Yuto Uchida, Masaki Uto (2021) A deep neural network-based automated short answer grading that considers examinee ability. The Journal of Information and Systems in Education. Vol.38, No.3, pp.218-228.
Masaki Uto, Maomi Ueno (2020) Item response theory for rubric-based assessment. The IEICE transactions on information and systems, Vol.J103, No.05. pp. 459-470.
Shudai Yagi, Masaki Uto (2019) Multidimensional item response theory model for performance assessment, The IEICE transactions on information and systems, Vol 102, No.10, pp.708-720.
Masaki Uto (2019) IRT topic model for essay type tests using rating data and text information. The IEICE transactions on information and systems, Vol 102, No.8, pp.553-566.
Emiko Tsutsumi, Masaki Uto, Maomi Ueno (2019) Item response theory for dynamic assessment. The IEICE transactions on information and systems, Vol. 102, No. 2, pp. 79-92.
Masaki Uto (2018) Accuracy of performance test equating based on item response theory models with rater characteristic parameters. The IEICE transactions on information and systems, Vol. 101, No. 6, pp.895-905.
Yoshimitsu Miyazawa, Masaki Uto, Takatoshi Ishii, Maomi Ueno (2018) A proposal of uniform adaptive testing for reducing measurement error bias. The IEICE transactions on information and systems, Vol. 101, No. 6, pp.909-920.
Kazuki Natori, Masaki Uto, Maomi Ueno (2018) Learning huge Bayesian networks by RAI algorithm using Bayes factor. The IEICE transactions on information and systems, Vol. 101, No. 5, pp.754-768.
Nguyen Duc Thien, Masaki Uto, Maomi Ueno (2018) Group optimization using item response theory for peer assessment. The IEICE transactions on information and systems, Vol. 102, No. 1, pp.431-445.
Masaki Uto, Maomi Ueno (2018) Robust item response theory model for aberrant raters in peer assessment. The IEICE transactions on information and systems, Vol. 101, No. 1, pp.211-224.
Masaki Uto, Maomi Ueno (2016) A Review of item response models for performance assessment. The Japan Association for Research on Testing, Vol. 12, No. 1, pp. 55-75.
Masaki Uto, Maomi Ueno (2015) Item response theory with assessors' lower order parameters of peer assessment. The IEICE transactions on information and systems, Vol. 98, No. 1, pp. 3-16.
Masaki Uto, Hiroaki Suzuki, Maomi Ueno (2013) Toulmin model based argument elaboration support system using Bayesian network representation. The IEICE transactions on information and systems, Vol. 96, No. 4, pp.998-1011
Masaki Uto, Maomi Ueno (2011) Article structure construction support system by Bayes code. The IEICE transactions on information and systems, Vol. 94, No. 12, pp.2069-2081
Maomi Ueno, Masaki Uto (2011) ePortfolio which facilitates learning from others. Japan Society for Educational Technology, Vol. 35, No. 3, pp. 13-26

International Conference

Yusei Nagai, Masaki Uto (2025) Automatic Distractor Generation in Multiple-Choice Questions Using Large Language Models with Expert-Informed Distractor Strategies. International Conference on Computers in Education (ICCE). (to appear)
Taichi Kitajima, Masaki Uto (2025) Multimodal Trait Scoring for Video Interviews Using Neural Models with Handcrafted Features and Trait-Attention. International Conference on Computers in Education (ICCE). (to appear)
Muhammad Reiza Syaifullah, Masaki Uto (2025) Scoring Indonesian Research Proposals via LLM-based Pairwise Comparison and Summarization. The 29th International Conference on Asian Language Processing (IALP). pp. 36-41. (link)
Naoki Shindo, Masaki Uto (2025) Virtual Simulated Patients for Medical Interviews Using Large Language Models with a Self-Refinement Mechanism to Suppress Excessive Responses. International Conference on Artificial Intelligence in Education (AIED), pp 52-59. [Late-Breaking Results Track, CORE-Rank=A] (link)
Machi Shimmei, Masaki Uto, Yuichiroh Matsubayashi, Kentaro Inui, Aditi Mallavarapu, Noboru Matsuda (2025) Tell Me Who Your Students Are: GPT Can Generate Valid Multiple-Choice Questions When Students’ (Mis)Understanding Is Hinted. International Conference on Artificial Intelligence in Education (AIED), pp 126-134. [Late-Breaking Results Track, CORE-Rank=A] (link) <Best LBR Paper Award>
Masaki Uto, Yuma Ito (2025) Leveraging AI Graders for Missing Score Imputation to Achieve Accurate Ability Estimation in Constructed-Response Tests. 2nd Workshop on Automated Evaluation of Learning and Assessment Content (EvalLAC), co-located with the International Conference on Artificial Intelligence in Education (AIED). (link)
Takumi Shibata, Yuki Ito, Yuto Tomikawa, Masaki Uto (2025) Enhancing Neural Automated Essay Scoring Accuracy by Removing Noisy Data Through Data Valuation. 2nd Workshop on Automated Evaluation of Learning and Assessment Content (EvalLAC), co-located with the International Conference on Artificial Intelligence in Education (AIED). (link)
Minoru Nakayama, Satoru Kikuchi, Masaki Uto, Hiroh Yamamoto (2025) Evaluation of Essays and Comments for Developing Critical Thinking Ability during a University course. Psychology Learning Technology (PLS). Communications in Computer and Information Science, vol 2089. Springer, pp 3–17. (link)
Teruyoshi Goto, Yuto Tomikawa, Masaki Uto (2024) Enhancing Diversity in Difficulty-Controllable Question Generation for Reading Comprehension via Extended T5. International Conference on Computers in Education (ICCE). pp. 71-76. (link)
Yuto Tomikawa, Masaki Uto (2024) Difficulty-Controllable Reading Comprehension Question Generation Considering the Difficulty of Reading Passages. International Conference on Computers in Education (ICCE). pp. 151-160. (link)
Minoru Nakayama, Masaki Uto, Marco Temperini, Filippo Sciarrone (2024) Appropriate Number of Raters for IRT based Peer Assessment Evaluation of Programming Skills. 15th International Workshop on Interactive Environments and Emerging Technologies for eLearning (IEETel), IEEE International Conference on IT in Higher Education (ITHET). (link)
Minoru Nakayama, Masaki Uto, Satoru Kikuchi, Hiroh Yamamoto (2024) Predicting Factor Scores of Critical Thinking Ability from Features of Essay Texts. 15th International Workshop on Interactive Environments and Emerging Technologies for eLearning (IEETel), IEEE International Conference on IT in Higher Education (ITHET). (link)
Masaki Uto, Yuto Takahashi (2024) Neural Automated Essay Scoring for Improved Confidence Estimation and Score Prediction through Integrated Classification and Regression. International Conference on Artificial Intelligence in Education (AIED), pp 444-451. [Late-Breaking Results Track, CORE-Rank=A] (link)
Kota Aramaki, Masaki Uto (2024) Collaborative Essay Evaluation with Human and Neural Graders using Item Response Theory under a Nonequivalent Groups Design. International Conference on Artificial Intelligence in Education (AIED), pp 79–87. [Late-Breaking Results Track, CORE-Rank=A] (link)
Yuto Tomikawa, Masaki Uto (2024) Difficulty-Controllable Multiple-Choice Question Generation for Reading Comprehension Using Item Response Theory. International Conference on Artificial Intelligence in Education (AIED), pp 312–320. [Late-Breaking Results Track, CORE-Rank=A] (link)
Naoki Shindo, Masaki Uto (2024) ChatGPT-based Virtual Standardized Patient that Amends Overly Detailed Responses in Objective Structured Clinical Examinations. International Conference on Artificial Intelligence in Education (AIED), pp 263–269. [WideAIED Track, CORE-Rank=A] (link)
Masaki Uto, Ayaka Suzuki, Yuto Tomikawa (2024) Question Difficulty Prediction Based on Virtual Test-Takers and Item Response Theory. Workshop on Automated Evaluation of Learning and Assessment Content (EvalLAC), International Conference on Artificial Intelligence in Education (AIED). (link)
Takumi Shibata, Masaki Uto (2024) Enhancing Cross-prompt Automated Essay Scoring by Selecting Training Data Based on Reinforcement Learning. Workshop on Automated Evaluation of Learning and Assessment Content (EvalLAC), International Conference on Artificial Intelligence in Education (AIED). (link)
Minoru Nakayama, Masaki Uto, Satoru Kikuchi, Hiroh Yamamoto (2024) Estimating scores of critical thinking ability using essay text assessments. 28th International Conference on Information Visualisation (IV).
Minoru Nakayama, Satoru Kikuchi, Masaki Uto, Hiroh Yamamoto (2024) Predicting critical thinking ability scores using student's characteristics and learning performance. ICS Exchange Conference. (link) <Best paper award>
Masaki Uto, Yuto Tomikawa, Ayaka Suzuki (2023) Difficulty-Controllable Neural Question Generation for Reading Comprehension using Item Response Theory. 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA), Association for Computational Linguistics (ACL), pp.119-129. (link)
Misato Yamaura, Itsuki Fukuda, Masaki Uto (2023) Neural automated essay scoring considering logical structure. 24th International Conference on Artificial Intelligence in Education (AIED), pp.267-278. [Accepted as full paper, full paper acceptance rate= 21.1%, CORE-Rank=A] (link)
Masaki Uto (2023) Neural Automated Short-Answer Grading Considering Examinee-Specific Features. 23rd IEEE International Conference on Advanced Learning Technologies (ICALT), pp.336-338. [Accepted as short paper, CORE-Rank=B]
Kota Aramaki, Masaki Uto (2023) Linking method for writing tests using item response theory and automated essay scoring. International Meeting of the Psychometric Society (IMPS).
Minoru Nakayama, Masaki Uto, Satoru Kikuchi, Hiroh Yamamoto (2023) Feasibility of Prediction of Student’s Characteristics using Texts of Essays Written during a Fully Online Course. 27th International Conference on Information Visualisation (IV).
Takumi Shibata, Masaki Uto (2022) Analytic Automated Essay Scoring based on Deep Neural Networks Integrating Multidimensional Item Response Theory, International Conference on Computational Linguistics (COLING), [Accepted as full paper, full paper acceptance rate= 24.2%, CORE Rank=A]
Minoru Nakayama, Filippo Sciarrone, Marco Temperini, Masaki Uto (2022) Evaluation of Programming Skills via Peer Assessment and IRT Estimation Techniques. 20th International Conference on Information Technology Based Higher Education and Training (ITHET), pp.1-8.
Minoru Nakayama, Masaki Uto, Marco Temperini, Filippo Sciarrone (2021) Estimating Ability of Programming Skills using IRT based Peer Assessments. 19th International Conference on Information Technology Based Higher Education and Training (ITHET), pp.1-6. (link)
Masaki Uto (2021) A Multidimensional Item Response Theory Model for Rubric-based Writing Assessment. International Conference on Artificial Intelligence in Education (AIED), Lecture Notes in Computer Science, vol.12748, pp.420–432 @Online [Accepted as full paper, full paper acceptance rate= 24%, CORE Rank=A] <Best paper award nominee>
Itsuki Aomi, Emiko Tsutsumi, Masaki Uto, Maomi Ueno (2021) Integration of Automated Essay Scoring Models using Item Response Theory. International Conference on Artificial Intelligence in Education (AIED), Lecture Notes in Computer Science, vol.12749, pp.54–59 @Online [Accepted as short paper, CORE Rank=A]
Masaki Uto, Yikuan Xie, Maomi Ueno (2020) Neural Automated Essay Scoring Incorporating Handcrafted Features. Proceedings of the 28th International Conference on Computational Linguistics (COLING), pp.6077-6088. [Accepted as full paper, CORE Rank=A] (link)
Masaki Uto, Masashi Okano (2020) Robust neural automated essay scoring using item response theory. International Conference on Artificial Intelligence in Education (AIED), Lecture Notes in Computer Science, vol 12164, pp.549-561. @Cyberspace [Accepted as full paper, full paper acceptance rate= 26.6%, CORE Rank=A] <Best paper runner-up award> (link)
Masaki Uto, Yuto Uchida (2020) Automated short-answer grading using deep neural networks and item response theory. International Conference on Artificial Intelligence in Education (AIED), Lecture Notes in Computer Science, vol 12164, pp.334-339. @Cyberspace [Accepted as short paper, full paper acceptance rate= 26.6%, CORE Rank=A] (link)
Minoru Nakayama, Filippo Sciarrone, Masaki Uto, Marco Temperini (2020) Impact of the number of peers on a mutual assessment as learner's performance in a simulated MOOC environment using the IRT model. 24th International Conference Information Visualization (IV). pp. 483-487. @ Online.
Minoru Nakayama, Filippo Sciarrone, Masaki Uto, Marco Temperini (2020) Estimating student's performance based on item response theory in a MOOC environment with peer assessment. International Conference in Methodologies and Intelligent Systems for Technology Enhanced Learning (MIS4TEL), Advances in Intelligent Systems and Computing, Springer, vol 1236, pp. 25-35. @ Online.
Masaki Uto (2019) Rater-effect IRT model integrating supervised LDA for accurate measurement of essay writing ability. International Conference on Artificial Intelligence in Education (AIED), pp. 494-506, @ Chicago, USA. [Accepted as full paper, full paper acceptance rate= 25%, CORE Rank=A] (link)
Shouta Sugahara, Masaki Uto, Maomi Ueno (2018) Exact learning augmented naive Bayes classifier. International Conference on Probabilistic Graphical Models (PGM), Proceedings of Machine Learning Research, vol 72, pp. 439-450, @ Prague
Masaki Uto, Maomi Ueno (2018) Item response theory without restriction of equal interval scale for rater's score. International Conference on Artificial Intelligence in Education (AIED), pp.363-368. @ London, UK. [Accepted as short paper, full paper acceptance rate=23%, CORE Rank=A] (link)
Kazuki Natori, Masaki Uto, Maomi Ueno (2017) Consistent learning Bayesian networks with thousands of variables. The 3rd Workshop on Advanced Methodologies for Bayesian Networks (AMBN), pp.57-68, @ Tokyo, Japan.
Taiyo Utsuhara, Masaki Uto, Asana Ishihara, Atsushi Yoshikawa, Maomi Ueno (2017) Classification of Japanese graduate schools: in terms of educational practices and the grown globalization competencies by the policies. International Federation of Classification Societies (IFCS). @ Tokyo, Japan.
Masaki Uto, Nguyen Duc Thien, Maomi Ueno (2017) Group optimization to maximize peer assessment accuracy using item response theory. International Conference on Artificial Intelligence in Education (AIED), pp.393-405. @ Wuhan, China. [Accepted as full paper, full paper acceptance rate = 30%, CORE Rank=A] (link)
Taiyo Utsuhara, Masaki Uto, Asana Ishihara, Koichi Ota, Ayako Hirano, Atsushi Yoshikawa, Maomi Ueno (2017) Features of globalization in Japanese graduate schools. International Conference on Education (ICE). pp.392_1-392_10. @ San Diego, USA.
Nguyen Duc Thien, Masaki Uto, Yu Abe, Maomi Ueno (2015) Reliable peer assessment for team-project-based learning using item response theory. International Conference on Computers in Education (ICCE), pp. 144-153. @ Hangzhou, China. [Accepted as full paper, full paper acceptance rate=32%, CORE Rank=B] (link)
Kazuki Natori, Masaki Uto, Yu Nishiyama, Shuichi Kawano, Maomi Ueno (2015) Constraint-based learning Bayesian networks using Bayes factor. The 2nd Workshop on Advanced Methodologies for Bayesian Networks (AMBN), pp. 15-31. @ Tokyo, Japan.
Masaki Uto, Maomi Ueno (2015) Academic writing support system using Bayesian networks. IEEE International Conference on Advanced Learning Technologies (ICALT), pp. 385-387. @ Hualien, Taiwan. [Accepted as full paper, full paper acceptance rate=28.4%, CORE Rank=B] (link)
Masaki Uto, Maomi Ueno (2015) Item response model with lower order rater parameters for peer assessment. International Conference on Artificial Intelligence in Education (AIED) , pp. 800-803. @ Madrid, Spain. [Accepted as short paper, full paper acceptance rate=29%, CORE Rank=A] (link)
Maomi Ueno, Masaki Uto (2012) Non-informative Dirichlet score for learning Bayesian networks. European Workshop on Probabilistic Graphical Models (PGM), pp. 331-338. @Granada, Spain.
Maomi Ueno, Masaki Uto (2011) Learning community using social network service. Web Based Communities and Social Media 2011 Conference (IADIS), pp. 109-119. @ Rome, Italy.

Awards

Best LBR Paper Award, International Conference on Artificial Intelligence in Education (AIED) (2025) Machi Shimmei, Masaki Uto, Yuichiroh Matsubayashi, Kentaro Inui, Aditi Mallavarapu, Noboru Matsuda: "Tell Me Who Your Students Are: GPT Can Generate Valid Multiple-Choice Questions When Students’ (Mis)Understanding Is Hinted." <<Certificate>>
Best paper award, ICS Exchange Conference (2024) Minoru Nakayama, Satoru Kikuchi, Masaki Uto, Hiroh Yamamoto: "Predicting critical thinking ability scores using student's characteristics and learning performance." <<Certificate>>
Best paper runner-up award, International Conference on Artificial Intelligence in Education (AIED) (2020) Masaki Uto, Masashi Okano: Robust neural automated essay scoring using item response theory.

Data and Programs

The real data and the RStan code used in "A Bayesian Many-Facet Rasch Model with Markov Modeling for Rater Severity Drift" by Masaki Uto (2022), published in Behavior Research Methods by Springer, are made available here.
The real data and the RStan code used in "A multidimensional generalized many-facet Rasch model for rubric-based performance assessment" by Masaki Uto (2021), published in Behaviormetrika by Springer, are also made available here.
The rating data used in "Empirical comparison of item response theory models with rater's parameters" by Masaki Uto & Maomi Ueno (2018), published in Heliyon by Elsevier, are made available here.
The parameter estimation program for the IRT model for peer assessment proposed in "Item response theory for peer assessment" by Masaki Uto & Maomi Ueno (2016), published in IEEE Transactions on Learning Technologies by IEEE Computer Society, can be downloaded from https://bitbucket.org/uto/peerassessmentirt.git. The source code is written in Java.
