Computer Answering Guide

If you think you have a robust QA system that can beat humans on these questions, please submit it to the Dynabench submission website (https://dynabench.org/tasks/RQA). The questions used in the OQL will serve as test sets for your model.

Your QA system should return a predicted answer and a confidence score for each question. More details can be found below in 5-2. Code Guidelines (the expected format is visible in the single_evaluation function in the default submission folder).
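
For reference, the result for a single question is a dictionary with an “answer” string and a “conf” score; the values below are made up purely for illustration:

    {"answer": "Albert Einstein", "conf": 0.92}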

Steps to Submission


The resources directory is where you will put your model files and tokenizer, including the configuration file. Next, you will need to write code that generates predictions from the model in the resources folder.
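
As an illustration only, for a typical Hugging Face model saved with save_pretrained, the resources folder might contain files like the ones below; your exact file names may differ depending on the model and tokenizer you use:

    resources/
        config.json
        pytorch_model.bin
        tokenizer_config.json
        tokenizer.json
        vocab.txt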


To do that, open the file called model.py in the app/domain directory. It contains two functions. Because the batch_evaluation function calls the single_evaluation function, you only need to add code to single_evaluation. This function takes one input question at a time and returns the prediction and the confidence score.


Be mindful of the argument types: both arguments should be strings, and the predicted answer should also be a string. Since we used a model imported from Hugging Face, we used their inference code to implement the function. Make sure that your model path is set to the resources folder path. Otherwise, your predictions will not come from your QA model, and the function will simply return the question, which is the default return value.
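
As a rough sketch of what this can look like with a Hugging Face question-answering pipeline (the resources path, the argument order, and the use of the pipeline’s score as the confidence are assumptions here, not the official starter code; in some versions of the starter code these are methods on a class rather than standalone functions):

    from transformers import pipeline

    # Load the model and tokenizer once from the resources folder.
    # "app/resources" is an assumed path; point it at wherever your files actually live.
    qa_pipeline = pipeline(
        "question-answering",
        model="app/resources",
        tokenizer="app/resources",
    )

    def single_evaluation(context: str, question: str) -> dict:
        # Both arguments are plain strings, and the answer must be a string as well.
        result = qa_pipeline(question=question, context=context)
        # The pipeline returns an answer span and a score we reuse as the confidence.
        return {"answer": result["answer"], "conf": float(result["score"])}

Because batch_evaluation calls single_evaluation, filling in this one function is enough.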


Your model may need some specific packages. To add these dependencies, open the requirements.txt file in the root folder and list the packages you need. For example, if your model uses the transformers and torch packages, add “transformers==4.26.1” and “torch” alongside the other requirements. Make sure to pin the specific version of a package if it is not the most recent one.
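
For illustration, a minimal requirements.txt for a transformers-based model might look like this (the first two lines are typically already in the starter file, and the version numbers are only examples; pin the versions you actually tested with):

    fastapi
    uvicorn
    transformers==4.26.1
    torch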


If you need to include any other files that are not code, you may add them to the root folder.




To test your submission, launch the uvicorn app, which serves a webpage at localhost:8000. This app lets you check whether your model will run successfully on the server after submission. It is designed so that you can manually input questions and inspect the return values of each function that you wrote in model.py. Let’s run through the commands. Go to your root folder and run:
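
The exact command may vary with your setup, but installing the dependencies from the requirements file is typically done with pip:

    pip install -r requirements.txt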


This command installs all the dependencies needed to run the uvicorn app and your model. Then run:
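
The exact entry point depends on the starter code; a typical way to start a FastAPI app with uvicorn looks like this (app.main:app is an assumption here, so use whatever module path your folder provides):

    uvicorn app.main:app --port 8000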

This command starts the app. It should run without errors and print the message “Application startup complete.”

If the app started successfully, continue with the next steps; if you see an error, you know that your model will not run on the server either.

Once you are redirected to the FastAPI page, you will see two sections, one for the single_evaluation function and one for the batch_evaluation function. Both sections have a Try it out button. When you click it, you can manually input example questions and contexts. As the pre-populated example shows, the values must be a dictionary of strings; for batch_evaluation the format is the same, but wrapped in a list. A successful server response returns status 200 and a dictionary with the keys “answer” and “conf”, where answer is the predicted output and conf is the confidence score. Try several pairs of context and question in the given format before proceeding, and repeat the process for batch_evaluation. Again, watch out for formatting. If you see any other server response, the corresponding error should be visible in the terminal window from which you launched the app.

Make sure that you use the correct input format for each function; otherwise you will run into errors and incorrect predictions. For example, in batch_evaluation, if you pass a list of strings instead of a list of dictionaries, the server will respond with a 422 error.
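
To make the formats concrete, the request bodies look roughly like the examples below. The “context” and “question” key names and the texts are illustrative only; follow the pre-populated example on the page for the exact schema.

    single_evaluation request body:
        {"context": "Marie Curie won two Nobel Prizes.", "question": "Who won two Nobel Prizes?"}

    batch_evaluation request body:
        [{"context": "Marie Curie won two Nobel Prizes.", "question": "Who won two Nobel Prizes?"},
         {"context": "The Nile flows through Egypt.", "question": "Which river flows through Egypt?"}]

    successful response for a single question (status 200):
        {"answer": "Marie Curie", "conf": 0.95}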


Now that you have tested your model, we’ll go over how to upload, submit, and publish it on the website. Compress the folder that contains all your code and resources into a zip file. You can create the zip file from your terminal with the command “zip destination_folder_path dynlab-base-qa/*”.
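
Note that, depending on how your folders are nested, you may need zip’s recursive flag so that the contents of subfolders such as resources are included; for example (the archive name here is just a placeholder):

    zip -r my_submission.zip dynlab-base-qa/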


On the instructions page, click the “Upload Model” button and fill in the fields as you see fit. After that, click the box that says “drag and drop your zip model” and select the model you want to upload. Note that you must click on the box rather than drag and drop the file. You will then see a message that your submission is complete and receive an email confirmation.


To check this, click on your user profile and look at the deployment status, which should say “uploaded.” To evaluate your model, you will have to wait a few hours for the website to return the scores. Since this takes a long time, you will want to start early and keep a local development set. In the model card, you can see the returned scores next to each dataset. When your model is successfully published, the deployment status changes to a green “published.” To confirm this, log in with your user ID, go back to the leaderboard, and look for your model.


Video Version of the Submission Tutorial

Below is a video that guides you through submitting your QA model.

dynabenchtutorial_final.mp4