Award Winners are announced!
Given a Qur'anic passage (a sequence of consecutive verses from a single surah of the Holy Qur'an) and a question posed in Modern Standard Arabic (MSA) over that passage, the system is required to extract an answer to the question. An answer is a span of text taken from the given passage, and it must be an exact substring of that passage. The question may be factoid or non-factoid. Examples are shown below.
The task is to find any correct answer, even if the question has more than one answer in the accompanying passage.
The system is required to return up to 5 potential answers from the accompanying passage for the given question, ranked from 1 (top/best) to 5 (lowest). The evaluation therefore rewards the system for returning any of the correct answers as high as possible in the returned list of answers.
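The two constraints above (at most 5 ranked answers, each an exact substring of the passage) can be checked before submitting a run. The following is only a minimal sketch: the function name validate_answers and its return format are hypothetical and are not part of the official tooling.

```python
from typing import List


def validate_answers(passage: str, answers: List[str], max_answers: int = 5) -> List[str]:
    """Return a list of problems found in a ranked answer list (empty if none).

    Hypothetical helper: it only checks the two constraints stated in the
    task description, not the official submission format.
    """
    problems = []
    if len(answers) > max_answers:
        problems.append(
            f"{len(answers)} answers returned; at most {max_answers} are allowed"
        )
    for rank, answer in enumerate(answers, start=1):
        if not answer:
            problems.append(f"rank {rank}: empty answer")
        elif answer not in passage:
            # The span must appear verbatim, including diacritics and spacing.
            problems.append(f"rank {rank}: not an exact substring of the passage")
    return problems
```

Because the check is a plain string comparison, any normalization applied to a predicted answer (for example, stripping diacritics) would make it fail unless the same normalization is applied to the passage.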
Qur’anic Passage (38:41-44) الفقرة القرآنية
وَٱذْكُرْ عَبْدَنَآ أَيُّوبَ إِذْ نَادَىٰ رَبَّهُۥٓ أَنِّى مَسَّنِىَ ٱلشَّيْطَٰنُ بِنُصْبٍ وَعَذَابٍ. ٱرْكُضْ بِرِجْلِكَ هَٰذَا مُغْتَسَلٌۢ بَارِدٌ وَشَرَابٌ. وَوَهَبْنَا لَهُۥٓ أَهْلَهُۥ وَمِثْلَهُم مَّعَهُمْ رَحْمَةً مِّنَّا وَذِكْرَىٰ لِأُو۟لِى ٱلْأَلْبَٰبِ. وَخُذْ بِيَدِكَ ضِغْثًا فَٱضْرِب بِّهِۦ وَلَا تَحْنَثْ إِنَّا وَجَدْنَٰهُ صَابِرًا نِّعْمَ ٱلْعَبْدُ إِنَّهُۥٓ أَوَّابٌ .
السؤال/ Question: من هو النبي المعروف بالصبر؟
الإجابة الذهبية / Gold Answer:
أَيُّوبَ
Qur’anic Passage (74:32-48) الفقرة القرآنية
كَلَّا وَٱلْقَمَرِ. وَٱلَّيْلِ إِذْ أَدْبَرَ. وَٱلصُّبْحِ إِذَآ أَسْفَرَ. إِنَّهَا لَإِحْدَى ٱلْكُبَرِ. نَذِيرًا لِّلْبَشَرِ. لِمَن شَآءَ مِنكُمْ أَن يَتَقَدَّمَ أَوْ يَتَأَخَّرَ. كُلُّ نَفْسٍۭ بِمَا كَسَبَتْ رَهِينَةٌ. إِلَّآ أَصْحَٰبَ ٱلْيَمِينِ. فِى جَنَّٰتٍ يَتَسَآءَلُونَ. عَنِ ٱلْمُجْرِمِينَ. مَا سَلَكَكُمْ فِى سَقَرَ. قَالُوا۟ لَمْ نَكُ مِنَ ٱلْمُصَلِّينَ. وَلَمْ نَكُ نُطْعِمُ ٱلْمِسْكِينَ. وَكُنَّا نَخُوضُ مَعَ ٱلْخَآئِضِينَ. وَكُنَّا نُكَذِّبُ بِيَوْمِ ٱلدِّينِ. حَتَّىٰٓ أَتَىٰنَا ٱلْيَقِينُ. فَمَا تَنفَعُهُمْ شَفَٰعَةُ ٱلشَّٰفِعِينَ.
السؤال/ Question: ما هي الدلائل التي تشير بأن الانسان مخير؟
الإجابات الذهبية / Gold Answers:
لِمَن شَآءَ مِنكُمْ أَن يَتَقَدَّمَ أَوْ يَتَأَخَّرَ
كُلُّ نَفْسٍۭ بِمَا كَسَبَتْ رَهِينَةٌ
This task is evaluated as a ranking task. A QA system may retrieve an answer, not necessarily at the first rank, that only partially matches one of the gold answers. To give credit for such answers, we use partial Reciprocal Rank (pRR) [1], a variant of the traditional Reciprocal Rank evaluation metric that considers partial matching. pRR is the official evaluation measure of our shared task.
We will also report Exact Match (EM) and F1@1, which are evaluation metrics applied to the top predicted answer only. The EM metric is a binary measure that rewards a system only if the top predicted answer matches one of the gold answers exactly. The F1@1 metric, in contrast, measures the token overlap between the top predicted answer and the best matching gold answer [2].
To get an overall evaluation score, each of the above measures is averaged over all questions.
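The measures described above can be sketched as follows. This is an illustration under stated assumptions, not the official evaluation script: tokens are obtained by plain whitespace splitting with no normalization, and pRR is assumed to be the partial matching score of the first answer that overlaps a gold answer, divided by its rank, which is our reading of [1]. All function names are hypothetical; the released script remains the authoritative reference.

```python
from collections import Counter
from typing import Dict, List


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between two strings (whitespace tokens; an assumption)."""
    pred_tokens, gold_tokens = prediction.split(), gold.split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_match(answer: str, gold_answers: List[str]) -> float:
    """Score of an answer against its best matching gold answer."""
    return max((token_f1(answer, gold) for gold in gold_answers), default=0.0)


def exact_match(ranked: List[str], gold_answers: List[str]) -> float:
    """1.0 if the top answer equals one of the gold answers exactly, else 0.0."""
    return float(bool(ranked) and ranked[0] in gold_answers)


def f1_at_1(ranked: List[str], gold_answers: List[str]) -> float:
    """Token F1 of the top answer against its best matching gold answer."""
    return best_match(ranked[0], gold_answers) if ranked else 0.0


def partial_reciprocal_rank(ranked: List[str], gold_answers: List[str]) -> float:
    """Assumed pRR: matching score of the first overlapping answer over its rank."""
    for rank, answer in enumerate(ranked, start=1):
        score = best_match(answer, gold_answers)
        if score > 0:
            return score / rank
    return 0.0


def evaluate(runs: Dict[str, List[str]], golds: Dict[str, List[str]]) -> Dict[str, float]:
    """Average each measure over all questions; unanswered questions score 0."""
    totals = {"pRR": 0.0, "EM": 0.0, "F1@1": 0.0}
    for question_id, gold_answers in golds.items():
        ranked = runs.get(question_id, [])
        totals["pRR"] += partial_reciprocal_rank(ranked, gold_answers)
        totals["EM"] += exact_match(ranked, gold_answers)
        totals["F1@1"] += f1_at_1(ranked, gold_answers)
    return {name: total / len(golds) for name, total in totals.items()}
```

For example, with placeholder tokens, a gold answer "a b c d" and a run that returns "x y" at rank 1 and "a b" at rank 2 gets EM = 0 and F1@1 = 0, but an assumed pRR of (2/3)/2 = 1/3, since the second answer partially matches the gold answer.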
The evaluation script is released in our main repository.
[1] Malhas, R. and Elsayed, T. AyaTEC: Building a Reusable Verse-Based Test Collection for Arabic Question Answering on the Holy Qur’an. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 19(6), pp.1-21, 2020.
[2] Rajpurkar, P., Zhang, J., Lopyrev, K. and Liang, P. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In EMNLP 2016.