Shared task

Please check Important dates for the deadlines of shared task registration, the preliminary run, and the formal run.

Background

We hold the AI werewolf contests annually under the AI Werewolf project. The contest has two divisions: the protocol division and the natural language division. The protocol division asks participants to implement an AI werewolf player agent that communicates in an intermediate language called the AI werewolf protocol. The natural language division asks participants to implement an AI werewolf agent that communicates in natural language. We follow the configurations of our previous AIWolfDial 2019 shared task and other annual runs, with a few modifications.

The Werewolf Game in this Shared Task

Agents in the AIWolfDial 2024 shared task play a five-player werewolf game with the following roles: one seer, one werewolf, one possessed, and two villagers. Players do not know the other players' roles. All players other than the werewolf are humans.

A game lasts several days and continues until it is decided whether the human team or the werewolf team survives. At the end of the day, the werewolf chooses another player to attack; the attacked player is eliminated from the game. All players are required to vote for another player, and the player who receives the most votes is eliminated from the game. When the humans survive, the villager team wins. When the werewolf survives, the werewolf team wins. A possessed is a human but belongs to the werewolf team. At the end of the day, the seer chooses another player and is told whether that player is a human or a werewolf.
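
The end-of-day rules above can be sketched in Python. This is a minimal illustration, not the organizers' server code. The names `ROLES`, `most_voted`, and `winner` and the role strings are hypothetical. Tie handling is not specified above, so the sketch simply reports every tied player. The werewolf-team win condition used here (the werewolf is at least as numerous as the remaining humans) is an assumption borrowed from common werewolf rules.

```python
from collections import Counter
from typing import Dict, List, Optional

# Role composition of the five-player game (role strings are illustrative).
ROLES = ["SEER", "WEREWOLF", "POSSESSED", "VILLAGER", "VILLAGER"]


def most_voted(votes: Dict[str, str]) -> List[str]:
    """Return the player(s) who received the most votes.

    `votes` maps each voter to the player they voted for.
    More than one name in the result means the vote was tied.
    """
    tally = Counter(votes.values())
    top = max(tally.values())
    return sorted(name for name, count in tally.items() if count == top)


def winner(alive_roles: List[str]) -> Optional[str]:
    """Return the winning team, or None while the game is still undecided.

    The possessed counts as a human here, although it sides with the werewolf.
    Assumption: the werewolf team wins once werewolves >= humans.
    """
    wolves = alive_roles.count("WEREWOLF")
    humans = len(alive_roles) - wolves
    if wolves == 0:
        return "VILLAGER"
    if wolves >= humans:
        return "WEREWOLF"
    return None
```

For example, `winner(ROLES)` returns `None` at the start of a game, because one werewolf faces four humans.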

Agent Technical Specifications

Language Requirement

A shared task participant of AIWolfDial 2024 is required to implement an AI werewolf agent that communicates in either English or Japanese. Teams whose agents communicate in Japanese are required to provide an English version as well, at least by using machine translation internally.

Execution of Agents via Network Battle Connection System

Each agent (automated player) will be placed in a standby state, waiting for connections at a fixed IP/port using the game connection system provided by the organizers. During the main competition, the organizers will specify five fixed IPs/ports corresponding to the five agents using the battle connection system for automatic execution. During the preliminary rounds, participants will specify these fixed IPs/ports themselves for automatic execution.
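
The standby state described above can be pictured as an ordinary TCP listener. The sketch below is only an illustration of that idea and is not the organizers' remote wrapper: the function names `open_listener` and `serve_once`, the newline-delimited text framing, and the `handle` callback are all assumptions. The actual message format is defined by the game connection system.

```python
import socket
from typing import Callable


def open_listener(host: str, port: int) -> socket.socket:
    """Bind to a fixed IP/port and start waiting for a connection."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def serve_once(server: socket.socket, handle: Callable[[str], str]) -> None:
    """Accept one connection and answer each request line with one reply line.

    Assumption: requests and replies are newline-terminated UTF-8 text.
    """
    conn, _addr = server.accept()
    with conn:
        reader = conn.makefile("r", encoding="utf-8", newline="\n")
        writer = conn.makefile("w", encoding="utf-8", newline="\n")
        for line in reader:
            writer.write(handle(line.rstrip("\n")) + "\n")
            writer.flush()
```

A participant would call `open_listener` with the IP and port registered for their agent, then pass the returned socket to `serve_once` together with a function that turns each request into the agent's reply.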

Note: The system has been changed starting from 2024.

AIWolf Agent APIs

Creating Agents for the Natural Language Division of the AI Werewolf Competition

Sample agent code using Python, as well as remote wrapper code to achieve remote listening with fixed IP for agents, can be found at https://github.com/aiwolfdial/AIWolfNLAgentPython/blob/main/README.md

The code for the game connection system, which calls and executes remote matches with the agent set to the listening state using the above remote wrapper code, is available here: https://github.com/aiwolfdial/AIWolfNLGameServer https://github.com/aiwolfdial/AIWolfNLPServer/blob/main/README.md

Agent Specification

Day 0 consists of greetings only.
At the end of Day 0, the seer makes an inspection, and the game starts from Day 1.
From Day 1 onward, each day ends with a vote by all players and an attack by the werewolf. Votes, attacks, and inspections are made via specific APIs (network communications).
A day consists of several turns. In each turn, every agent can make a talk, having received the talks of the previous turns.
An agent is required to talk in each turn of the day, but the speaking order within a turn is random. An agent could be asked to talk immediately after its previous talk, or after 8 talks by other agents (2 talks from each of the 4 other agents).
An agent should complete an action (a talk, a vote, etc.) within a specified period (1 minute at maximum, ideally 5 seconds on average) after an action request is sent.
During the day, agents can communicate anything in natural language. A talk should consist of normal letters and punctuation only. An agent returns "Skip" when it has nothing to say in that turn, and returns "Over" when it has nothing more to say that day.
Use Agent[0x] (x is 1-5, e.g. Agent[05]) to mention other agents.
An anchor, e.g. ">>Agent[0x]", can be inserted at the beginning of a talk to address another agent to whom your agent wishes to talk. The addressed agent is expected to respond to your agent, also using an anchor.
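
The talk conventions above (agent mentions, anchors, "Skip", and "Over") can be sketched with a few helper functions. These helpers are hypothetical and are not part of the sample agent code. The names `agent_name`, `with_anchor`, `split_anchor`, and `next_talk` are assumptions, as is the single space placed between an anchor and the rest of the talk.

```python
import re
from typing import List, Optional, Tuple

# An anchor at the start of a talk, e.g. ">>Agent[02] ...".
ANCHOR = re.compile(r"^>>(Agent\[0[1-5]\])\s*(.*)$", re.DOTALL)


def agent_name(index: int) -> str:
    """Format an agent index (1-5) as a mention such as Agent[05]."""
    if not 1 <= index <= 5:
        raise ValueError("agent index must be between 1 and 5")
    return f"Agent[{index:02d}]"


def with_anchor(target: int, text: str) -> str:
    """Prefix a talk with an anchor addressed to the target agent."""
    return f">>{agent_name(target)} {text}"


def split_anchor(talk: str) -> Tuple[Optional[str], str]:
    """Split a talk into (addressed agent or None, remaining text)."""
    match = ANCHOR.match(talk)
    if match:
        return match.group(1), match.group(2)
    return None, talk


def next_talk(pending: List[str], done_for_today: bool) -> str:
    """Return the next queued talk, or "Skip" / "Over" when the queue is empty."""
    if pending:
        return pending.pop(0)
    return "Over" if done_for_today else "Skip"
```

With these helpers, an agent can detect that it was addressed by checking whether `split_anchor` returns its own name, and then reply using `with_anchor` so that the other agent sees the response as directed at it.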

Resources

Corpus and logs
- Previous game logs https://kanolab.net/aiwolf/
- Mafiascum https://www.mafiascum.net/ An online Mafia game forum
- WolfBBS annotated corpus (in Japanese) https://github.com/aiwolf/wolfbbs_annotations

From Registration to Formal Run

Registration

To register for the shared task, a team should send an e-mail to aiwolf at kanolab.net (replace "at" with @) stating the team name, a contact e-mail address, the names and affiliations of the members (please mark a contact person when a team consists of multiple members), the communication language of the agent (English and/or Japanese), an ssh public key, and a preferred user name for connecting to our game server. There is no fee to register for or participate in the shared task. You will be notified of the location of our preliminary and formal run servers after registration. Please note that your game logs will be made public without any usage restriction.

Testing Your Agent System Beforehand

A shared task participant is required to implement an AI werewolf agent that connects to our AIWolf server. We will provide a running AIWolf server, which participants can connect to with (dummy) agents to check their system's behavior. Participants are required to verify that their systems work correctly before the shared task run.

Preliminary Run (Self-match game)

Participants should run five of their own agents connected to our server, play at least 10 games, and then submit the game logs to the organizers. If there are too many participants for the formal run, the organizers might select the formal run teams based on these logs. These logs will be used in the final evaluations.

Formal Run (Multi-agent game)

A participant team is required to keep their agent ready and connected to our server throughout the formal run days in order to play games with other participants.

System Evaluation

Participants should submit a system design description document to the organizers. This document and the game logs might be used for research purposes, and might be included and published in our overview paper without any further permission. Participants are encouraged to submit a paper to the workshop.

Games will be played between the same agents, different agents, and/or human players. In addition to the win rates, reviewers will perform subjective evaluations of the game logs using the following criteria:

A Natural utterance expressions
B Contextually natural conversation
C Coherent (not contradictory) conversation
D Coherent game actions (vote, attack, divine) with conversation contents
E Diverse utterance expressions, including coherent characterization

Please note that vague utterances that can be used regardless of context are not always natural in the werewolf game.

The top-ranking teams will be awarded prizes and gifts from SpiralAI, a company developing its own LLM for colloquial multi-turn conversations.

Paper Submission

Participants are strongly encouraged to submit a paper to this AIWolfDial 2024 workshop. It is mandatory, at a minimum, to submit a system description to the organizers by the same deadline as the workshop paper submission.
