One central limitation was that, as visible in the chart above, the articles were not evenly balanced across the categories
The largest category by far was CAT_1 (AAL / AAVL), which suggests that this should perhaps be a classifier for that category alone, or that combining CAT_1 and CAT_2 makes sense for an initial classifier
The next largest categories were CAT_2 (African Americans) and CAT_4 (Native Americans), which makes them strong candidates for the next labels to integrate into the classifier
Then, there are categories which have almost no instances:
The low counts for UNK and CAT_0 (negative categories) make sense, because we excluded most articles that were unclear or did not represent another category; we only included positive instances
CAT_3_1 (Latinx / Hispanic Diaspora) had 1 article
CAT_4_1 (Indigenous Peoples - World) had 2 articles
The remaining categories each had only a small number of articles, too few to support reliable predictions
There were very few articles overall, and some of them were quite short
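A minimal sketch of how the sparse categories could be set aside before training, assuming the labels are available as a plain list of strings. The function name `split_by_support` and the cutoff `MIN_ARTICLES` are hypothetical, and apart from CAT_3_1 (1 article) and CAT_4_1 (2 articles) the counts below are placeholders, not the project's real distribution.

```python
from collections import Counter

# Hypothetical cutoff: categories with fewer articles than this are set aside.
MIN_ARTICLES = 5


def split_by_support(labels, min_articles=MIN_ARTICLES):
    """Return (kept, dropped) dicts mapping category -> article count."""
    counts = Counter(labels)
    kept = {cat: n for cat, n in counts.items() if n >= min_articles}
    dropped = {cat: n for cat, n in counts.items() if n < min_articles}
    return kept, dropped


# Illustrative labels only; apart from CAT_3_1 (1) and CAT_4_1 (2),
# these counts are placeholders, not the project's real distribution.
example_labels = (
    ["CAT_1"] * 12 + ["CAT_2"] * 6 + ["CAT_4"] * 5
    + ["CAT_3_1"] * 1 + ["CAT_4_1"] * 2
)

kept, dropped = split_by_support(example_labels)
print("kept:", kept)
print("dropped:", dropped)
```

Categories that land in `dropped` could either be excluded from a first classifier or merged into a broader parent label; which option is appropriate is a modelling decision this sketch does not make.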
This is a major challenge, because annotating the current articles was already an immense effort, spread over years of manual annotation by Dr. Lanehart and Ayesha; including more annotated articles would certainly improve the quality of the results, but it would also require a large amount of further labor unless clustering techniques were used to automate the process (and even then the output would likely still need review)
Another limitation was that the time I could spend on the project changed dramatically, so recurring challenges (such as setting up a stratified k-fold for cross-validation) did not receive enough time to be resolved and became a major blocker
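For reference, the idea behind a stratified k-fold split can be sketched with the standard library alone. This is an illustration under stated assumptions, not the setup attempted in the project: the helper names `stratified_folds` and `train_test_splits` are hypothetical and the labels are placeholders. scikit-learn's `StratifiedKFold` offers a library version of the same idea. The sketch also shows why the class imbalance matters here: a category with fewer articles than folds cannot appear in every fold.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits=3, seed=0):
    """Assign each sample index to one of n_splits folds, per category.

    Indices of each category are shuffled and dealt round-robin, so every
    fold receives a near-equal share of every category.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    for label, indices in by_label.items():
        if len(indices) < n_splits:
            # A category this small cannot be represented in every fold.
            raise ValueError(
                f"{label} has {len(indices)} articles, fewer than {n_splits} folds"
            )
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_splits].append(idx)
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test)


# Placeholder labels, not the project's data.
labels = ["CAT_1"] * 9 + ["CAT_2"] * 6 + ["CAT_4"] * 3
folds = stratified_folds(labels, n_splits=3)
for train, test in train_test_splits(folds):
    print(len(train), len(test), sorted(labels[i] for i in test))
```

With categories as small as CAT_3_1 and CAT_4_1, the check inside `stratified_folds` would raise for any reasonable number of folds, so those labels would need to be merged or excluded before splitting.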