Computer Answering Guide

If you think you have a robust QA system that can beat humans on these questions, please submit it to the Dynabench submission website (https://dynabench.org/tasks/RQA). The questions used in the OQL will serve as test sets for your model.

Your QA system should return a predicted answer and a confidence score for each question. More details can be found below in 5-2. Code Guidelines (the expected format is visible in the single_evaluation function in the default submission folder).
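
For reference, the result for a single question is a dictionary with an “answer” string and a “conf” score; the values below are made up purely for illustration:

    {"answer": "Albert Einstein", "conf": 0.92}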

Steps to Submission


The resources directory is where you will put your model files and tokenizer, including the configuration file. Next, you will need to write code that generates predictions from the model in the resources folder.
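
As an illustration only, for a typical Hugging Face model saved with save_pretrained, the resources folder might contain files like the ones below; your exact file names may differ depending on the model and tokenizer you use:

    resources/
        config.json
        pytorch_model.bin
        tokenizer_config.json
        tokenizer.json
        vocab.txt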


To do that, open the file called model.py in the app/domain directory. It contains two functions. Because the batch_evaluation function calls the single_evaluation function, you only need to add code to single_evaluation. This function takes one input question at a time and returns the prediction and the confidence score.


Be mindful of the argument types: both arguments should be strings, and the predicted answer should also be a string. Since we used a model imported from Hugging Face, we used their inference code to implement the function. Make sure that your model path is set to the resources folder path. Otherwise, your predictions will not come from your QA model, and the function will simply return the question, which is the default return value.
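
As a rough sketch of what this can look like with a Hugging Face question-answering pipeline (the resources path, the argument order, and the use of the pipeline’s score as the confidence are assumptions here, not the official starter code; in some versions of the starter code these are methods on a class rather than standalone functions):

    from transformers import pipeline

    # Load the model and tokenizer once from the resources folder.
    # "app/resources" is an assumed path; point it at wherever your files actually live.
    qa_pipeline = pipeline(
        "question-answering",
        model="app/resources",
        tokenizer="app/resources",
    )

    def single_evaluation(context: str, question: str) -> dict:
        # Both arguments are plain strings, and the answer must be a string as well.
        result = qa_pipeline(question=question, context=context)
        # The pipeline returns an answer span and a score we reuse as the confidence.
        return {"answer": result["answer"], "conf": float(result["score"])}

Because batch_evaluation calls single_evaluation, filling in this one function is enough.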


Your model may need some specific packages. To add these dependencies, open the requirements.txt file in the root folder and list the packages you need. For example, if your model uses the transformers and torch packages, add “transformers==4.26.1” and “torch” alongside the other requirements. Make sure to pin the specific version of a package if it is not the most recent one.
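
For illustration, a minimal requirements.txt for a transformers-based model might look like this (the first two lines are typically already in the starter file, and the version numbers are only examples; pin the versions you actually tested with):

    fastapi
    uvicorn
    transformers==4.26.1
    torch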


If you need to include any other files that are not code, you may add them to the root folder.




To test your submission, launch the uvicorn app, which serves a webpage at localhost:8000. This app lets you check whether your model will run successfully on the server after submission. It is designed so that you can manually input questions and inspect the return values of each function that you wrote in model.py. Let’s run through the commands. Go to your root folder and run:
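
The exact command may vary with your setup, but installing the dependencies from the requirements file is typically done with pip:

    pip install -r requirements.txt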


This command installs all the dependencies needed to run the uvicorn app and your model. Then run:
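
The exact entry point depends on the starter code; a typical way to start a FastAPI app with uvicorn looks like this (app.main:app is an assumption here, so use whatever module path your folder provides):

    uvicorn app.main:app --port 8000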

This command starts the app. It should run without errors and print the message “Application startup complete.”

If the app started successfully, continue with the next steps; if you see an error, you know that your model will not run on the server either.

Once you are redirected to the FastAPI page, you will see two sections, one for the single_evaluation function and one for the batch_evaluation function. Both sections have a Try it out button. When you click it, you can manually input example questions and contexts. As the pre-populated example shows, the values must be a dictionary of strings; for batch_evaluation the format is the same, but wrapped in a list. A successful server response returns status 200 and a dictionary with the keys “answer” and “conf”, where answer is the predicted output and conf is the confidence score. Try several pairs of context and question in the given format before proceeding, and repeat the process for batch_evaluation. Again, watch out for formatting. If you see any other server response, the corresponding error should be visible in the terminal window from which you launched the app.

Make sure that you use the correct input format for each function; otherwise you will run into errors and incorrect predictions. For example, in batch_evaluation, if you pass a list of strings instead of a list of dictionaries, the server will respond with a 422 error.
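
To make the formats concrete, the request bodies look roughly like the examples below. The “context” and “question” key names and the texts are illustrative only; follow the pre-populated example on the page for the exact schema.

    single_evaluation request body:
        {"context": "Marie Curie won two Nobel Prizes.", "question": "Who won two Nobel Prizes?"}

    batch_evaluation request body:
        [{"context": "Marie Curie won two Nobel Prizes.", "question": "Who won two Nobel Prizes?"},
         {"context": "The Nile flows through Egypt.", "question": "Which river flows through Egypt?"}]

    successful response for a single question (status 200):
        {"answer": "Marie Curie", "conf": 0.95}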


Now that you have tested your model, we’ll go over how to upload, submit, and publish it on the website. Compress the folder that contains all your code and resources into a zip file. You can create the zip file from your terminal with the command “zip destination_folder_path dynlab-base-qa/*”.
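
Note that, depending on how your folders are nested, you may need zip’s recursive flag so that the contents of subfolders such as resources are included; for example (the archive name here is just a placeholder):

    zip -r my_submission.zip dynlab-base-qa/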


On the instructions page, click the “Upload Model” button and fill in the fields as you see fit. After that, click the box that says “drag and drop your zip model” and select the model you want to upload. Note that you must click on the box rather than drag and drop the file. You will then see a message that your submission is complete and receive an email confirmation.


To check this, click on your user profile and look at the deployment status, which should say “uploaded.” To evaluate your model, you will have to wait a few hours for the website to return the scores. Since this takes a long time, you will want to start early and keep a local development set. In the model card, you can see the returned scores next to each dataset. When your model is successfully published, the deployment status changes to a green “published.” To confirm this, log in with your user ID, go back to the leaderboard, and look for your model.


Video Version of the Submission Tutorial

Below is a video that guides you through submitting your QA model.

dynabenchtutorial_final.mp4