Manually Labeling a Data Set to Train a Model in the Future
As mentioned earlier, I decided to forgo a sentiment analysis because the majority of the data set was inherently positive. Instead, I prepared the data by labeling each tweet with one of two classes: "content" or "celebration".
Content labels cover language that references something participants could use in or apply to digital learning environments. In figure 3, these were "Open eBooks", "Google Keep", "Quill Connect", "Quizalize" and so on. I also classified advice as content because it contributes to the DPLN with more potential for change in practice than a tweet with a shout-out. In figure 3, the example is "Pick 1 #edtech per month to really use." I also included as content any tweets about resources, including links to websites or presentations from the conference. Many presenters included shortened URLs, or web addresses, to their materials as they promoted or recapped their sessions.
Celebrations are worth labeling as their own class because they contribute to the relationship-building of the DPLN. Educators learn how to praise student effort and achievement effectively, with specific feedback that propels momentum and motivation for learning. "This is as true for us teachers as for students" (Routman, 2008, p. 30). Routman includes many sentiments in her definition of celebration, which I adopted for the celebration label. These include any text in a tweet that implies "affirming, congratulating, showcasing, noticing, and making public the positive and specific actions and work" (Routman, 2008, p. 29). In figure 3, some examples are "One of my fave sessions", "Pleasure finally meeting you..." and "Of course he has the answers".
From the random sample data set, I added one more label based on Dey's (2014) description of text that could be coded with multiple labels. One user whose tweet fits this category wrote, "Not only am I a sub folder person but they are color coded as well!! Gold person to the core!!" with additional hashtags besides #NCTIES17. The tip about color-coding folders can be construed as advice, while the sentiment comes across as celebratory because of the exclamatory punctuation.
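The three-way labeling scheme can be sketched as a small data structure. This is only an illustration under assumptions: the labeling itself was done by hand, and the names `LabeledTweet`, `has_content`, and `has_celebration` are hypothetical, not part of the original workflow.

```python
from dataclasses import dataclass


@dataclass
class LabeledTweet:
    """One manually labeled tweet (hypothetical structure, for illustration)."""
    text: str
    has_content: bool       # references a tool, a resource, or advice
    has_celebration: bool   # praise, a shout-out, or other affirmation

    @property
    def label(self) -> str:
        """Collapse the two manual judgments into a single label."""
        if self.has_content and self.has_celebration:
            return "both"
        if self.has_content:
            return "content"
        if self.has_celebration:
            return "celebration"
        return "unlabeled"


# The color-coded folders tweet quoted above, judged by hand as both.
example = LabeledTweet(
    text=("Not only am I a sub folder person but they are color coded "
          "as well!! Gold person to the core!!"),
    has_content=True,
    has_celebration=True,
)
print(example.label)  # both
```

Recording two yes/no judgments per tweet, rather than a single label, keeps the "both" category from needing its own separate definition.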
The resulting sample data set, after cleaning and preparing, consisted of 78 tweets labeled as "celebration", 82 as "content", and 44 as "both".
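A quick tally shows how balanced the labels are before any model is trained on them. The sketch below uses only the counts reported above; it assumes each tweet carries exactly one of the three labels, and the variable names are illustrative.

```python
from collections import Counter

# Counts reported for the labeled sample.
label_counts = Counter({"celebration": 78, "content": 82, "both": 44})

# Assumes each tweet carries exactly one of the three labels.
total = sum(label_counts.values())

# Print each label with its count and share of the sample, largest first.
for label, count in label_counts.most_common():
    print(f"{label:<12}{count:>4}  {count / total:.1%}")
print(f"{'total':<12}{total:>4}")
```

Under that assumption the sample holds 204 tweets, with "content" and "celebration" nearly even at roughly 40% and 38%, and "both" making up the remaining 22% or so.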
My own tweet was not included in the sample data set, but how would I label it if it were?
There is a clear shout-out with mentions by username, a reference to keynote speaker George Couros, and a word that I would use to train my model: "Ready". This keyword and its sentiment appeared several times in my sample, which is discussed in the next section, "Discovering What the Data Said".