REST-MEX: Recommendation System for Text Mexican Tourism


The Importance of NLP for Tourism in Mexican Spanish Text Data

Tourism is a social, cultural, and economic phenomenon related to people's movement to places outside their usual residence for personal or business/professional reasons. This activity is vital in many countries, including Mexico, where it represents 8.7% of the national GDP and generates around 4.5 million direct jobs.

Tourism was one of the sectors most affected by the pandemic caused by the SARS-CoV-2 virus, which began in Mexico in mid-March 2020. The sector is now trying to re-establish itself by improving the quality and safety of its products and services.

Natural Language Processing (NLP) is an area of artificial intelligence that can help restore tourism by providing mechanisms that detect problems through the polarity of tourists' opinions on online platforms. Systems can also be developed that use user and destination information to recommend the places where a user is likely to have the best tourist experience. In this way, NLP could support both the tourism sector and the tourists themselves.


Few recommendation systems for tourist sites are based on the affinity between a user's profile and each place's description. The data collections used to train these types of systems come from users and places in English-speaking countries. Considering the importance of Ibero-American countries in tourism, it is vitally important to create Spanish-language resources that enable the development of intelligent systems for tourism.

On the other hand, sentiment analysis of tourist texts has gained relevance in the last decade; however, most of the scientific community's efforts have focused on English. Although some studies have focused on Spanish, few address varieties of Spanish spoken outside Spain. These approaches are typically applied to collections taken from social networks, such as tweets, so tourist texts have not been directly addressed.

For the 2021 edition, the REST-MEX evaluation forum focuses on both tasks. First, for the recommendation system, the problem is defined as:

"Given a TripAdvisor tourist and a Mexican tourist place, the goal is to automatically obtain the degree of satisfaction (between 1 and 5) that the tourist will have when visiting that place."

For the sentiment analysis, the problem is defined as follows:

"Given an opinion about a Mexican tourist place, the goal is to determine the polarity, between 1 and 5, of the text."
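
Both task definitions share the same output contract: an integer between 1 and 5. As a rough illustration only, the sketch below shows how a system might map a raw score onto that range for the sentiment task. The word lists, the neutral starting score of 3, and the function names `to_label` and `predict_polarity` are hypothetical and are not part of the forum's specification or data.

```python
import math

# The only constraint taken from the task definitions: labels lie in 1..5.
MIN_LABEL, MAX_LABEL = 1, 5

# Hypothetical toy lexicons, invented purely for illustration.
POSITIVE = {"excelente", "hermoso", "recomendable"}
NEGATIVE = {"sucio", "caro", "malo"}


def to_label(raw_score: float) -> int:
    """Round a raw score half-up and clamp it to the valid range 1..5."""
    rounded = math.floor(raw_score + 0.5)
    return max(MIN_LABEL, min(MAX_LABEL, rounded))


def predict_polarity(opinion: str) -> int:
    """Toy predictor: start at a neutral 3 and shift by lexicon matches."""
    tokens = opinion.lower().split()
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return to_label(3.0 + positives - negatives)


if __name__ == "__main__":
    print(predict_polarity("Lugar hermoso y excelente"))
    print(predict_polarity("Muy sucio y caro"))
```

A real participant system would replace the lexicon lookup with a trained model; the clamping step is the part that would carry over, since any regression-style output would need to be mapped back into the 1 to 5 range.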