Voz Project

Voz is a system that explores techniques for automatic extraction of narrative information from text. Voz combines off-the-shelf NLP tools, common sense knowledge databases and domain knowledge to extract a symbolic representation of a text and compute features related to narrative information.

Architecture

Voz implements an NLP pipeline that reuses components from readily available open source NLP toolkits and knowledge bases. Voz is implemented in Java and Python (a preview deployment of an old, unstable version is available). It accesses several of these toolkits through a web service, Parser Services (preview deployment of a limited, old version), which is also available for download as a turnkey solution for Google App Engine (Webapp2).

Publications

For additional information, or if you use any component of the system, please cite any of the papers below.
  • J. Valls-Vargas, J. Zhu, S. Ontañón (2015) Narrative Hermeneutic Circle: Improving Character Role Identification from Natural Language Text via Feedback Loops. IJCAI 2015. [PDF]
    @inproceedings{valls2015ijcai,
    Author = {Valls-Vargas, Josep and Onta{\~n}{\'o}n, Santiago and Zhu, Jichen},
    Title = {Narrative Hermeneutic Circle: Improving Character Role Identification from Natural Language Text via Feedback Loops},
    Year = {2015},
    Booktitle = {Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence},
    Pages = {2517--2523},
    }
  • J. Valls-Vargas, J. Zhu, S. Ontañón (2014) Toward Automatic Role Identification in Unannotated Folk Tales. AIIDE 2014. [PDF]
    @inproceedings{valls2014aiide,
    Author = {Valls-Vargas, Josep and Zhu, Jichen and Onta{\~n}{\'o}n, Santiago},
    Booktitle = {Proceedings of the Tenth Artificial Intelligence and Interactive Digital Entertainment Conference},
    Title = {Toward Automatic Role Identification in Unannotated Folk Tales},
    Year = {2014},
    }
  • J. Valls-Vargas, S. Ontañón, J. Zhu (2014) Toward Automatic Character Identification in Unannotated Narrative Text. INT 7 at ELO 2014. [PDF]
    @inproceedings{valls2014int,
    Author = {Valls-Vargas, Josep and Onta{\~n}{\'o}n, Santiago and Zhu, Jichen},
    Booktitle = {Proceedings of the Seventh Workshop in Intelligent Narrative Technologies},
    Title = {Toward Automatic Character Identification in Unannotated Narrative Text},
    Year = {2014},
    }

Downloads

The system is currently under active development. Any updates will be posted on this page.
  • Voz
    • Online demo: soon
    • Source code: soon
  • Weka package implementing the continuous (or generalized) Jaccard distance [ZIP]
    • How to install? In the package manager, select unofficial [PNG]
    • How to use? Select it in an algorithm that uses a distance measure, e.g., IBk [PNG]
  • Parser Services
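
For readers unfamiliar with the continuous (or generalized) Jaccard distance named above, the following is a minimal sketch of one common formulation, assuming non-negative feature vectors: one minus the sum of element-wise minima divided by the sum of element-wise maxima. The function name and the handling of all-zero vectors are illustrative assumptions; the Weka package may define or implement the measure differently.

```python
from typing import Sequence


def generalized_jaccard_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Return 1 - sum(min(x_i, y_i)) / sum(max(x_i, y_i)).

    Illustrative sketch only; assumes non-negative feature values.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("this formulation assumes non-negative values")

    numerator = sum(min(a, b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))

    if denominator == 0:
        # Both vectors are all zeros; treating them as identical is an assumption.
        return 0.0
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    # On binary vectors this reduces to the classic set-based Jaccard distance.
    print(generalized_jaccard_distance([1, 0, 1], [1, 1, 0]))
    # Continuous feature values are handled by the same expression.
    print(generalized_jaccard_distance([0.5, 0.2], [0.1, 0.4]))
```

With this formulation, identical vectors are at distance 0 and vectors with no overlapping non-zero features are at distance 1, which is the behavior a nearest-neighbor learner such as IBk expects from a distance measure.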

Data

The following packages contain the datasets used in our publications.
Please note that the datasets currently do not contain the full text of the stories.
  • Voz-AIIDE2014.zip (137k), Josep Valls, May 17, 2014, 10:07 PM
  • Voz-INT2014.zip (104k), Josep Valls, May 17, 2014, 9:39 PM
  • WCJD.zip (47k), Josep Valls, Apr 29, 2015, 9:01 AM
  • a_an_determiner_exceptions_list.txt (120k), Josep Valls, Dec 14, 2015, 12:53 PM