Data Science 101:

Text Mining in Social Media

This workshop is presented at the ArabWIC Conference 2019 in Morocco.

Presented on March 9, 2019, from 11:00 am to 12:45 pm

If you have not yet downloaded Python and the accompanying tools, you can follow the steps below to do so.

You can also watch this video to find out how to download Python

First: Install the Python tools

What is Jupyter Notebook?

An open-source web application that lets you create and share documents that contain live code, equations, visualizations, and narrative text.


The easiest way to install the Jupyter Notebook application is to install a scientific Python distribution, which also includes the common scientific Python packages. The most common distribution is called Anaconda. Download Anaconda (the Python 3.7 version).

If you already have Python 3, you can install Jupyter with the following commands:

python3 -m pip install --upgrade pip
python3 -m pip install jupyter

When the Jupyter installation is complete, you can go to the next step.
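
To confirm that the installation worked, a quick check along the following lines can be run with python3. This is only a minimal sketch: the helper name is_installed is illustrative, and it assumes the Notebook application is importable as the notebook module, which is what a standard pip install of jupyter normally provides.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this Python can find the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Show which Python is being used, then look for the notebook module
    # (assumption: Jupyter Notebook is importable under the name "notebook").
    print("Python version:", sys.version.split()[0])
    print("Jupyter Notebook found:", is_installed("notebook"))
```

If the last line reports False, rerunning the two pip commands above with the same python3 is a reasonable first thing to try.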

Second: Create a folder to contain all the files of the workshop

Create a folder on your desktop and name it DSTutorial, or whatever name you prefer.
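
If you prefer to create the folder from Python rather than by hand, something like the sketch below should work. The function name make_workshop_folder is made up for this example, and the location of your desktop folder is an assumption that depends on your operating system.

```python
from pathlib import Path


def make_workshop_folder(parent, name="DSTutorial"):
    """Create the folder `name` inside `parent` and return its path.

    No error is raised if the folder already exists.
    """
    folder = Path(parent) / name
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

For example, make_workshop_folder(Path.home() / "Desktop") would create DSTutorial on the desktop, assuming the desktop lives in a folder called Desktop inside your home directory.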

Third: Run Jupyter Notebook

In the terminal, type:

cd ~/Desktop/DSTutorial
jupyter notebook


Voila!

Your browser opens a new window (or a new tab) showing a control panel that lets you, among other things, select the notebook you want to open and create new notebooks for your code.

Fourth: Download the required libraries

In the new browser window, use the New menu to create a new Python notebook. This creates a blank page in which you can write and run your own code directly in your browser.

To download the required libraries, copy and paste the following code into the blank page and then run it.

#Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install numpy
!{sys.executable} -m pip install nltk
!{sys.executable} -m pip install pandas
!{sys.executable} -m pip install matplotlib

You are now ready!
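
To double-check that everything installed correctly, a cell like the following can be run in the same notebook after the installation cell has finished. It is a minimal sketch: the helper name missing_packages is illustrative, and the check only confirms that each library can be found by the current kernel, not that it imports without errors.

```python
import importlib.util

# The four libraries installed in the step above.
REQUIRED = ["numpy", "nltk", "pandas", "matplotlib"]


def missing_packages(names):
    """Return the names from `names` that the current kernel cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All workshop libraries are installed.")
```

If any library is reported as missing, rerunning its pip install line from the cell above is the first thing to try.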

Please download these files before coming to the workshop:

  • Data File:
    • The data is an Arabic corpus annotated for sentiment, borrowed from Prof. Saif Mohammad's website. It can be downloaded from here.
  • Code File:
    • Download it from here.