2025 Spring
Prof. Youjian (Eugene) Liu
Canvas Course Site for quiz/homework submission and solution posting
Course Content (viewable with your CU account)
Lecture notes on Onenote
Lecture notes in pdf
Shared Course Content (viewable) (lecture videos, slides, etc.)
Shared Course Content (editable) (reading materials, project upload, etc.)
The homework problems can be found here.
Non-programming assignments should be submitted to Canvas.
Reading
Create a Zotero account and send me your Zotero ID so that I can share the library with you.
Set up your local Zotero client, add-ins, and browser connector according to the Zotero instructions (in Google Doc) so that you can automatically download references and generate bib files.
Read
Bratko, Ivan. 2018. “AlphaZero – What’s Missing?” Informatica 42 (1). https://informatica.si/index.php/informatica/article/view/2226.
Silver, David, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, et al. 2018. “A General Reinforcement Learning Algorithm That Masters Chess, Shogi, and Go through Self-Play.” Science 362 (6419): 1140–44. https://doi.org/10.1126/science.aar6404.
Silver, David, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, et al. 2017. “Mastering the Game of Go without Human Knowledge.” Nature 550 (7676): 354–59. https://doi.org/10.1038/nature24270.
3.25, 3.27, 3.34, 3.37
At https://d2l.ai/, download and run the Jupyter Notebook of Sections 17.2 and 17.3 by clicking Colab [pytorch] at the upper right corner of the sections.
Read user comments at the bottom of the sections.
Note mistakes in the equations in the text and compare to my notes.
Correct any mistakes in the program. Run the program to get the correct results.
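When checking the notebook equations against the notes, a tiny problem with a known closed-form answer can serve as an independent reference. The following is a minimal sketch of tabular value iteration, assuming that the assigned sections cover value iteration and Q-learning. The two-state MDP, the function name `value_iteration`, and the array layout `P[s][a][t]` are all hypothetical and are not taken from d2l.ai.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Tabular value iteration (illustrative sketch, not the d2l.ai code).

    P[s][a][t] is the probability of moving from state s to state t under action a.
    R[s][a] is the expected immediate reward for taking action a in state s.
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states
    for _ in range(max_iters):
        # Bellman optimality backup:
        # Q(s, a) = R(s, a) + gamma * sum_t P(t | s, a) * V(t)
        Q = [[R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [max(q) for q in Q]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the final Q table (ties go to the lowest index).
    policy = [q.index(max(q)) for q in Q]
    return V, policy


# Hypothetical toy MDP. State 1 is absorbing with zero reward.
# In state 0, action 0 ("stay") earns reward 1 and remains in state 0;
# action 1 ("quit") earns reward 0 and moves to state 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # transitions from state 0 under actions 0 and 1
    [[0.0, 1.0], [0.0, 1.0]],  # transitions from state 1 under actions 0 and 1
]
R = [
    [1.0, 0.0],  # rewards in state 0
    [0.0, 0.0],  # rewards in state 1
]

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

For this toy problem the optimal value of state 0 is the geometric series 1 / (1 - gamma), which is 10 for gamma = 0.9, so the printed value for state 0 should be very close to 10 and the greedy action there should be 0.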
Add papers that you think are interesting to discuss to Zotero.
Modify the pseudocode of AlphaGo Zero
Create an Overleaf.com account using your school email.
Send me your ID so that I can share the pseudocode, in a LaTeX file, with you.
Choose Visual Editor in Overleaf if you are not familiar with LaTeX.
Compare the pseudocode with the AlphaGo Zero paper, then correct any mistakes, add missing details, and add comments to make it easier to understand.
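One detail worth checking in the pseudocode is the action-selection rule used during tree search. The sketch below follows the PUCT rule as stated in the AlphaGo Zero paper, where the action maximizing Q(s, a) + U(s, a) is selected and U(s, a) = c_puct * P(s, a) * sqrt(sum_b N(s, b)) / (1 + N(s, a)). The function name `puct_select`, the list-based statistics, and the example numbers are hypothetical, and the shared pseudocode may organize this step differently.

```python
import math


def puct_select(N, Q, P, c_puct=1.0):
    """Return the index of the action maximizing Q + U at one tree node.

    N[a]: visit count of action a
    Q[a]: mean action value of action a
    P[a]: prior probability of action a from the policy head
    c_puct: exploration constant (the value 1.0 is only an illustrative default)
    """
    sqrt_total = math.sqrt(sum(N))
    best_action, best_score = None, -math.inf
    for a in range(len(N)):
        # Exploration bonus: large for high-prior, rarely visited actions.
        u = c_puct * P[a] * sqrt_total / (1 + N[a])
        score = Q[a] + u
        if score > best_score:
            best_action, best_score = a, score
    return best_action


# Hypothetical node statistics for three legal actions.
N = [10, 0, 5]
Q = [0.2, 0.0, 0.5]
P = [0.5, 0.3, 0.2]

chosen = puct_select(N, Q, P, c_puct=1.0)
greedy = puct_select(N, Q, P, c_puct=0.0)
print(chosen, greedy)
```

With these numbers the unvisited action 1 receives the largest bonus and is selected, while setting `c_puct` to 0 reduces the rule to picking the highest Q, which is action 2. Note that when no action has been visited yet the bonus is zero for every action under this formula, so implementations differ in how they handle a fresh node.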
Read the MuZero paper, J. Schrittwieser et al., “Mastering Atari, Go, chess and shogi by planning with a learned model,” Nature, vol. 588, no. 7839, pp. 604–609, Dec. 2020, in Zotero.
Read and understand the muzero_pseudocode.py from Google.
It is located in \Shared_with_students_edit\ReadingMaterials and also in Zotero under the paper.
Find state-of-the-art papers on reinforcement learning with a learned model, add them to Zotero\AI\Reinforcement Learning\Learned Model, and read them.
When MuZero algorithms are applied to board games, the dynamics network acts as an opponent. What prevents the dynamics network from becoming a bad opponent that always lets the player win?
Read the following.
DeepSeek-AI et al., “DeepSeek-V3 Technical Report,” Dec. 27, 2024, arXiv: arXiv:2412.19437. doi: 10.48550/arXiv.2412.19437.
List the key innovations and submit the list to Canvas.
Read the assigned paper and prepare for the presentation on CoT, ToT, GoT.
If you have slides, etc., please upload them to \Shared_with_students_edit\ReadingMaterials
Read and run the code of Section 11.5 of d2l.ai.
Read "Shakarian, Paulo, Chitta Baral, Gerardo I. Simari, Bowen Xi, and Lahari Pokala. Neuro Symbolic Reasoning and Learning. SpringerBriefs in Computer Science. Cham: Springer Nature Switzerland, 2023." in \ReadingMaterials\
---------------------------------------------------------------
Two project assignments have been added to \Shared_with_students_edit\Projects