Dec. 15, 2018
Human-Computer Question Answering Competition at UMD
On Dec. 15th, the University of Maryland will host a series of human-computer question answering competitions. All competitions are based on the quiz bowl format. The event will recognize the best human quiz bowl teams, the best computer quiz bowl systems that can defeat other computer systems and the top human teams, and the best quiz bowl question writers who can craft high-quality quiz bowl questions that entertain and challenge humans while stumping existing computer systems.
Human Quiz Bowl Teams
We invite human quiz bowl teams to join us on Dec. 15th to compete against other human teams and computer systems. We accept teams at the high school, collegiate, and open levels.
- Sign up here to get on the waiting list. If we get two additional registrations, we will expand the field.
Computer Quiz Bowl Systems
Computer teams are invited to submit question answering systems for the quiz bowl task. Systems will compete against each other, and top systems will get the unique opportunity to compete against top human teams.
Quiz Bowl Question Writers
We invite quiz bowl question writers to participate in crafting high-quality quiz bowl questions that will be used to challenge human teams and computer systems.
October 15th, 2018:
- Deadline for registering for question writing competition
November 1, 2018:
- Deadline for submitting first half of questions
November 28, 2018 (Extended from November 15, 2018):
- Deadline for registering computer systems
December 1, 2018:
- Deadline for registering human teams (currently full, but sign up to get on waiting list ... we will extend schedule if we get enough registrations)
- Deadline for submitting second half of questions
- Field announcement for tournament
- Preliminary results released for computer teams that have system submissions
December 10, 2018 (Midnight, Eastern Time):
- Computer teams must have systems finalized and submitted
December 15, 2018:
- Inaugural QANTA Competition
Event schedule and setup
Venue / Amenities
The event will take place at the University of Maryland, College Park. Location details will be announced early in December. Breakfast will be provided.
The December 15 event will have four components:
- A morning round-robin tournament to determine the best human teams
- Concurrently in the morning, a workshop for the creators of computer systems to discuss their systems
- In the afternoon, an eight-way final to decide the overall champion
- At 16:00, an award ceremony that will additionally recognize the best question authors
Because of limited equipment for Human-Computer matchups, we will have a limited field (initially a cap of seven teams, but could be raised to fourteen). Teams will be selected (final field announcement on December 1) on a first-come, first-served basis with the following exceptions:
- Teams writing a packet will be prioritized over teams writing half packets or no packet
- Teams writing a half packet will be prioritized over teams writing no packet
- We want to make sure we have at least 25% of the field for each of the following three categories: HS, open, and college teams. We may prioritize later registrations from groups that aren't represented in early signups.
However, if you commit to writing a packet or half-packet, we will let you know after the October 15 registration deadline if we have space for you to play.
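The prioritization above can be sketched as a simple sort. This is an illustration under stated assumptions, not the organizers' actual selection procedure: the `Registration` type, the `select_field` function, and the packet labels are hypothetical names, and the 25% per-category goal is deliberately left out because the text does not say how it is enforced.

```python
from typing import List, NamedTuple


class Registration(NamedTuple):
    team: str
    order: int   # position in the sign-up queue (1 = first to register)
    packet: str  # "full", "half", or "none" (hypothetical labels)


# Lower rank is selected first: full packet, then half packet, then no packet.
PACKET_RANK = {"full": 0, "half": 1, "none": 2}


def select_field(registrations: List[Registration], cap: int = 7) -> List[str]:
    """Pick up to `cap` teams by packet priority, then first-come, first-served."""
    ranked = sorted(registrations, key=lambda r: (PACKET_RANK[r.packet], r.order))
    return [r.team for r in ranked[:cap]]


signups = [
    Registration("Team A", 1, "none"),
    Registration("Team B", 2, "half"),
    Registration("Team C", 3, "full"),
]
print(select_field(signups, cap=2))  # ['Team C', 'Team B']
```

The default `cap=7` mirrors the initial cap stated above; passing `cap=14` corresponds to the expanded field.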
Fees and prizes
Our goal is to bootstrap a new competition framework. If you help write questions, the tournament will be free. This is not a normal tournament.
- You are charged a $50 entrance fee (for the whole team) only if you play as a human team without writing a packet.
- The competition is free for computer teams
Rewards for question writers
- If you only submit a half packet without playing, we'll pay you $75
- If you submit a full packet without playing, we'll pay you $150
- If you submit a full packet and play as a team, we'll pay you $100
- If you submit a half packet and play as a team, we'll pay you $50
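The fee and reward rules above can be summarized as a small lookup. This is a minimal sketch rather than an official calculator: the function name `net_payment` and the packet labels `"none"`, `"half"`, and `"full"` are illustrative assumptions, and it covers human teams and question writers only (computer teams play for free).

```python
# Dollar amounts are taken from the fee and reward lists above;
# all names here are hypothetical.
ENTRANCE_FEE = 50  # per team, charged only when playing without a packet

# (packet, plays) -> amount paid to the writers
REWARDS = {
    ("half", False): 75,
    ("full", False): 150,
    ("full", True): 100,
    ("half", True): 50,
}


def net_payment(packet: str, plays: bool) -> int:
    """Return dollars received (positive) or owed (negative)."""
    if packet not in ("none", "half", "full"):
        raise ValueError(f"unknown packet type: {packet!r}")
    if packet == "none":
        # Playing without writing costs the entrance fee; doing neither costs nothing.
        return -ENTRANCE_FEE if plays else 0
    return REWARDS[(packet, plays)]


print(net_payment("none", True))  # -50
print(net_payment("full", True))  # 100
```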
Prizes for winning teams
- Best Human Teams
- $250 for first place team
- $100 for second place team
- $50 for top college team (can combine with above prizes)
- $50 for top open team (can combine with above prizes)
- $50 for top high school team (can combine with above prizes)
- Best Computer Teams
- $250 for first place team
- $150 for second place team
- $100 for third place team
- Best Writers
- $250 for best packet
- $100 for second-best packet or half packet
- $50 for third-best packet or half packet
- $50 for best question (x2, can combine with above prizes)
Organizing the tournament is Jordan Boyd-Graber with logistical support from MAQT; Eric Wallace is supporting the question writing competition; Chen Zhao and Ahmed Elgohary are supporting the computer competition; and Shi Feng and Pedro Rodriguez are helping in the background. Daniel Jensen is leading question compilation, and Kurtis Droge is providing freelance questions.
Contact us and stay informed
- Please send your questions to firstname.lastname@example.org.
- You can get announcements by joining our human-computer question answering mailing list. We will also post on HSQB.
- Explore other ways for getting involved.
Q: Why pyramidal questions? You're the only idiot in the machine learning / natural language processing community using these questions.
A: The primary goal of a dataset/task in machine learning is to distinguish good systems from bad systems. Single-clue datasets (such as TriviaQA or SQuAD) require many more questions to discriminate between top question answerers. While it's true that quiz bowl questions are longer, it is easier to write a good pyramidal question on a single topic than five good single-clue questions on five different topics.
Moreover, because quiz bowl questions are revealed word by word and can be answered at any point, they offer far more opportunities to discriminate between question answerers. Because the questions are pyramidal, the answerer with the deeper knowledge can answer first.
Q: Why have computers and humans play against each other? Is this a gimmick?
A: Quiz bowl is designed to be interactive. Humans play against each other in real time (for fun). While machine learning tasks need not be interactive with humans, interruptible questions allow easy comparison against human performance and enable opportunities to teach humans about machine learning, natural language processing, and question answering.
But hopefully it will be fun!
Q: What's the format of the games? Why is it structured like that?
A: We'll have 40 questions, tossups only. If there's a tie after 40 questions, we'll break the tie with three tie-breaker questions. The team with the higher score after those three questions will win. If the score is still tied after the tie-breaker questions, we'll read questions until the score changes. The first change in the score will decide the game.
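The tie-breaking procedure can be sketched as follows. This is an illustration under stated assumptions, not the official scoring code: `decide_game` is a hypothetical name, each extra question is represented as a pair of point changes for the two teams, and a question that leaves the teams tied is treated as not deciding the game.

```python
from itertools import islice
from typing import Iterable, Optional, Tuple

Score = Tuple[int, int]  # points for (team A, team B)


def leader(a: int, b: int) -> Optional[str]:
    """Return the team currently ahead, or None if the score is tied."""
    if a == b:
        return None
    return "A" if a > b else "B"


def decide_game(regulation: Score, extra_questions: Iterable[Score]) -> Optional[str]:
    """Return "A" or "B", or None if the extra questions run out while tied."""
    a, b = regulation  # totals after the 40 tossups
    if leader(a, b):
        return leader(a, b)

    extras = iter(extra_questions)

    # Stage 1: three tie-breaker questions, compared only after all three.
    for da, db in islice(extras, 3):
        a, b = a + da, b + db
    if leader(a, b):
        return leader(a, b)

    # Stage 2: keep reading; the first question that breaks the tie decides.
    for da, db in extras:
        a, b = a + da, b + db
        if leader(a, b):
            return leader(a, b)
    return None


print(decide_game((100, 100), [(10, 0), (0, 10), (0, 10)]))  # B
```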
We are not using bonus questions because tossup questions are more interesting (based on their pyramidal structure) to decide whether humans or computers are smarter. What would make bonus questions interesting would be to emphasize collaboration in human-computer hybrid teams. We're working on figuring out how to make that both interesting and fun, but we're not quite there yet.
Q: How will the questions be judged?
A: The questions will be judged by the following criteria:
- Questions should be accurate and only contain true information.
- Questions should be interesting; even if you know nothing about the subject, the question should be engaging and leave the listener wanting to find out more about the topic.
- Questions should contain a variety of clues / information that reflect study and knowledge of a subject.
- Questions should be appropriately pyramidal for humans, effectively separating skill levels.
- Questions should be appropriately pyramidal for computers, effectively separating knowledge / comprehension.
Q: Computers involved in a trivia tournament ... will the questions suck?
A: The goal of having computers in the loop is to improve the questions from both a scientific perspective and a quiz bowl perspective. Here's how computers can help you write better questions:
- Avoid stock clues in the lead-in
- Automatically find similar questions (to find other interesting clues or avoid repetition)
- Avoid hoses (if the computer thinks the answer might be X, so might a human ... perhaps you can rephrase)
- Automate tasks like pronunciation guides and alternate answer lines (don't know if this will make it in for this iteration, but it's on our todo list!)
Again, we hope that the resulting questions are high quality from a human perspective. We hope that we'll attract good question writers, and that a reasonable reader will recognize them as good questions. We're not going for gimmicks and quirks in terms of the questions.
Q: Can we blend subcategories/categories?
A: Feel free to blend subcategories and categories. Our distribution requirements aren't super strict. But don't use blending to avoid writing about topics (e.g., "real" science) that should be covered in the set somewhere.
Q: Should the computer be able to answer all of the questions by the end?
A: No. While your answers have to be in our answer set, it's fine if the system on write.qanta.org cannot answer the question! It probably means that you're doing something unique and interesting. If humans will like and convert the question, then you're doing everything exactly right.
Q: Is this harder than one-line, single sentence QA?
A: From Dwight Wynne:
The pyramidal question is not inherently an easy question. Like any other question, its difficulty is determined by both the answer and clues selected. However, a one-line question is never easier than a pyramidal question containing the same one line. A question will be answered by the union of the sets of [answerers] who recognize and buzz correctly from each clue contained in the question. Since the pyramidal question contains all clues already present in the one-line question, plus additional clues, it must follow that at least as many [answerers] can answer the pyramidal question as the one-line question.
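The argument above can be restated in set notation. The symbols are introduced here for illustration only: let $A_i$ be the set of answerers who recognize and buzz correctly from clue $i$ of a pyramidal question with clues $1, \dots, n$, where clue $n$ is the single clue of the corresponding one-line question.

```latex
A_{\text{pyramidal}} \;=\; \bigcup_{i=1}^{n} A_i \;\supseteq\; A_n \;=\; A_{\text{one-line}}
\quad\Longrightarrow\quad
\lvert A_{\text{pyramidal}} \rvert \;\ge\; \lvert A_{\text{one-line}} \rvert
```

Equality holds only when the earlier clues add no answerers beyond those who already know the final clue.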
Q: What if we're interested only in complete sentences? (Or our systems can only answer complete sentences ...)
A: It's possible to only look at individual sentences; i.e., only provide answers after you have a complete sentence. If you're only interested in single sentences, you can only answer after the first sentence (concrete questions must uniquely identify the answer immediately). If you can always answer the question after the first sentence, you'll likely do quite well at the overall task.
Thus, quiz bowl is a superset of single-sentence QA. While some questions do require reasoning across sentences, the vast majority of the time it's possible to answer based only on individual, complete sentences in isolation (with each sentence in a question getting easier).
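One way to restrict a system to complete sentences is to present it with the question text only at sentence boundaries. The sketch below assumes a naive regular-expression sentence splitter; `sentence_prefixes` is a hypothetical helper and is not part of any competition interface.

```python
import re
from typing import List


def sentence_prefixes(question: str) -> List[str]:
    """Return the question text as it stands after each complete sentence."""
    # Naive split after ., !, or ? followed by whitespace. Abbreviations and
    # initials (e.g. "William W. Belknap") are split incorrectly, so real
    # questions need a proper sentence tokenizer.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", question.strip()) if s]
    return [" ".join(sentences[: i + 1]) for i in range(len(sentences))]


for prefix in sentence_prefixes("Hard clue. Easier clue. Name this author."):
    print(prefix)
```

A system limited to single sentences would answer from the first element alone; a sentence-level system would produce one guess per element and buzz once it is confident.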
Q: I can't find my favorite answer in the system. Why is this?
A: This is a design decision that we made. The answer set contains the answers that have come up at least three times in mainstream quiz bowl tournaments. This is a tradeoff that we made to keep things relatively fair. We want questions that are challenging for computers not because they lack data but because they cannot understand English. By excluding rare answers and only focusing on frequently asked answers, if a computer gets it wrong, it's not because it lacks information to work off of ... it's because it didn't understand the question. We realize it's a little frustrating, so it's useful to check whether the answer is in bounds before writing the question. In many cases, you can tweak the question to ask about something more general (instead of asking about "William W. Belknap", ask about Grant, focusing on members of his cabinet).
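The frequency cutoff can be pictured as a simple count over past answer lines. This sketch assumes the threshold means an answer has appeared in at least three past questions; `build_answer_set` and its input format are hypothetical, and the actual answer set on write.qanta.org may be built differently.

```python
from collections import Counter
from typing import Iterable, Set


def build_answer_set(past_answers: Iterable[str], min_count: int = 3) -> Set[str]:
    """Keep only answers that occur at least `min_count` times."""
    counts = Counter(past_answers)
    return {answer for answer, n in counts.items() if n >= min_count}


# Toy data: one frequent answer and one rare answer.
history = ["Ulysses S. Grant"] * 3 + ["William W. Belknap"]
answer_set = build_answer_set(history)
print("William W. Belknap" in answer_set)  # False
print("Ulysses S. Grant" in answer_set)    # True
```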
Q: What if I already have some questions written that aren’t in the interface?
A: Due to the limitations of this competition, it will be necessary to restrict answer lines to the ones already in the answer set at write.qanta.org. It isn’t recommended, but if you have already written questions with answers outside of the set, you can email them to email@example.com to submit them. Otherwise, I would recommend that you check if an answer is available at write.qanta.org before writing a question.
Q: What is the best way to write questions?
A: We suggest the following procedure for writing questions. First, as you’re deciding on answer lines, check write.qanta.org to make sure they’re in bounds. Then, draft the question in whatever editor you’re most comfortable with. When you have a first draft, copy and paste it into the interface, edit to make sure the lead-in (at least) isn’t trivially answerable by our baseline system, and then submit. We also suggest keeping your own backup just in case something goes wrong (we hope it doesn’t, but better safe than sorry). We realize this is slightly more hassle than normal question writing, but this will hopefully lead to better questions and also advance the state of the art in natural language processing.
Q: How do I submit questions? Why this craziness?
A: You submit your questions through a web form. We describe the system in this paper, and a tutorial video is below.
Q: How do I make an account on the interface?
A: Just log in with a new email and password. This will create a new account. There can only be one account per email.
Q: What if I forget my password?
A: Send us an email at firstname.lastname@example.org.
Q: What are the strange highlighted colors in the interface?
A: Highlighted words are "important" for our quiz bowl AI system to make its predictions. If you modify those words (e.g., rephrase that sentence), there is a high chance the system will become more confused.