DOCUMENTATION
Methodology
Figure 2: Architecture diagram of our methodology
Toddler-oriented videos
Because our research required a large dataset of toddler-oriented videos, we built our own. Manually vetting individual videos at that scale would have been impractical, so we instead identified YouTube channels that host content appropriate for young audiences. We manually searched for toddler-oriented channels on YouTube and judged each channel from a random sample of at least 10 of its videos. If every sampled video was suitable for young children, we added all of the channel's videos to our dataset, D1. If even one sampled video was inappropriate or irrelevant for kids, we dropped the channel and added none of its videos to D1.
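The channel-level screening rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tooling used in this work: the function name `screen_channel`, the `is_appropriate` callback (a stand-in for a human reviewer's judgment), and the fixed random seed are all hypothetical.

```python
import random
from typing import Callable, Sequence


def screen_channel(
    video_ids: Sequence[str],
    is_appropriate: Callable[[str], bool],
    sample_size: int = 10,
    seed: int = 0,
) -> bool:
    """Return True if all of the channel's videos should be added to D1.

    `is_appropriate` is a hypothetical stand-in for a manual review of one
    video; the screening itself was done by people, not by code.
    """
    if not video_ids:
        return False  # nothing to review, so nothing to include
    rng = random.Random(seed)
    # Sample without replacement; small channels are reviewed in full.
    k = min(sample_size, len(video_ids))
    sample = rng.sample(list(video_ids), k)
    # A single inappropriate or irrelevant video rejects the whole channel.
    return all(is_appropriate(video_id) for video_id in sample)
```

The rule is deliberately strict: one failing video in the sample is enough to exclude the channel, which trades dataset size for a lower chance of unsuitable videos entering D1.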
Annotation
Our annotators manually review the videos in each selected channel by inspecting the following data descriptors: (1) channel content, (2) video content, (3) channel titles, (4) video titles, (5) thumbnails, and (6) YouTube tags. The collected metadata is presented to five annotators, who inspect the channel content and determine whether the channel is appropriate for young children.
Five authors of the paper each recommended a list of toddler-oriented channels. For every recommended channel, the other four authors independently labeled it, and the author who recommended the channel acted as the fifth vote to resolve discrepancies. The final dataset, D1, has 51 channels and 24.6K videos.
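One way to read the voting scheme above is as a majority over five binary votes, where the recommender's vote only matters when the four peers disagree. The sketch below encodes that reading; the function name `resolve_channel_label` and the majority rule itself are assumptions, since the text does not spell out exactly how discrepancies were settled.

```python
from typing import Sequence


def resolve_channel_label(peer_votes: Sequence[bool], recommender_vote: bool) -> bool:
    """Combine four peer votes with the recommender's vote.

    True means "appropriate for young children". The majority rule is an
    assumed interpretation of "the fifth vote", not a documented procedure.
    """
    if len(peer_votes) != 4:
        raise ValueError("expected exactly four peer votes")
    if all(peer_votes) or not any(peer_votes):
        # Peers are unanimous, so there is no discrepancy to resolve.
        return bool(peer_votes[0])
    # Five binary votes can never tie, so a simple majority always decides.
    votes = list(peer_votes) + [recommender_vote]
    return sum(votes) >= 3
```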
Inter-annotator agreement
We compute inter-rater agreement using Bennett et al.'s S score, a measure of inter-rater reliability. The resulting S score is 0.96, indicating strong agreement among the raters' labels.
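For reference, the S score with q label categories is S = (q * P_o - 1) / (q - 1), where P_o is the observed proportion of agreement. The sketch below computes it for more than two raters by taking P_o as the mean pairwise agreement per item. That multi-rater generalization, the function names, and the toy labels are assumptions made for illustration; the exact computation behind the reported 0.96 is not specified here.

```python
from itertools import combinations
from typing import Hashable, Sequence


def observed_agreement(ratings: Sequence[Sequence[Hashable]]) -> float:
    """Mean fraction of agreeing rater pairs, averaged over items.

    `ratings[i]` holds the labels that all raters gave to item i.
    """
    if not ratings:
        raise ValueError("no rated items supplied")
    per_item = []
    for item in ratings:
        pairs = list(combinations(item, 2))
        if not pairs:
            raise ValueError("each item needs at least two ratings")
        per_item.append(sum(a == b for a, b in pairs) / len(pairs))
    return sum(per_item) / len(per_item)


def s_score(ratings: Sequence[Sequence[Hashable]], num_categories: int) -> float:
    """Bennett et al.'s S: agreement corrected for uniform chance 1/q."""
    if num_categories < 2:
        raise ValueError("need at least two categories")
    p_o = observed_agreement(ratings)
    return (num_categories * p_o - 1) / (num_categories - 1)


# Toy example: two raters, two categories, agreement on 3 of 4 items.
toy = [["ok", "ok"], ["ok", "ok"], ["bad", "bad"], ["ok", "bad"]]
print(s_score(toy, num_categories=2))  # P_o = 0.75, so S = 0.5
```

Unlike Cohen's kappa, S assumes every category is equally likely by chance, so it does not depend on how skewed the label distribution is.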
Video downloading
Every YouTube video and channel has a unique identifier that can be used to retrieve all information about the channel through the YouTube Data API. We therefore extract the video IDs of every channel in dataset D1.
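A possible way to enumerate a channel's video IDs is to page through its uploads playlist with the YouTube Data API v3 `playlistItems` endpoint. The sketch below is an assumption about how this step could be done, not a record of the scripts used here: the helper names, the injected `fetch_page` callable, and the choice of endpoint are illustrative. The uploads playlist ID itself can be read from a `channels` request with `part=contentDetails`.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, List, Optional

API_BASE = "https://www.googleapis.com/youtube/v3"


def make_fetcher(api_key: str) -> Callable[[str, Optional[str]], Dict]:
    """Build a function that fetches one page of playlist items."""

    def fetch_page(playlist_id: str, page_token: Optional[str]) -> Dict:
        params = {
            "part": "contentDetails",
            "playlistId": playlist_id,
            "maxResults": 50,  # largest page size the endpoint accepts
            "key": api_key,
        }
        if page_token:
            params["pageToken"] = page_token
        url = f"{API_BASE}/playlistItems?{urllib.parse.urlencode(params)}"
        with urllib.request.urlopen(url) as response:
            return json.load(response)

    return fetch_page


def collect_video_ids(
    uploads_playlist_id: str,
    fetch_page: Callable[[str, Optional[str]], Dict],
) -> List[str]:
    """Follow nextPageToken until every video ID has been collected."""
    video_ids: List[str] = []
    page_token: Optional[str] = None
    while True:
        page = fetch_page(uploads_playlist_id, page_token)
        for item in page.get("items", []):
            video_ids.append(item["contentDetails"]["videoId"])
        page_token = page.get("nextPageToken")
        if not page_token:
            return video_ids
```

Passing the fetcher in as an argument keeps the pagination logic testable without network access or an API key; in real use one would call `collect_video_ids(playlist_id, make_fetcher(api_key))`.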
Dataset
Definitions for appropriate, inappropriate and irrelevant content for young children
We consult YouTube's guidelines [13] and the Children's Online Privacy Protection Act (COPPA) [13] of the Federal Trade Commission to determine whether certain content is appropriate, inappropriate, or irrelevant for young audiences.
Appropriate Content: It includes suitable content for toddlers and preschoolers (under the age of 6) and content that is relevant to their typical interests such as nursery rhymes, cartoons / animations for kids, educational videos for kids, video games without inappropriate content, kids' toys demonstrations and ratings, children's music or dance performances, reality shows made for young children, and animal videos that are not disturbing. These categories are determined by the use of animated characters, the age of characters, child-oriented activities and incentives, and simple language or content suitable for a wide audience.
Inappropriate Content: It includes content that is not suitable for toddlers and preschoolers (under the age of 6) to consume. Violent, scary, or disturbing videos are some examples of inappropriate content. Such videos may include inappropriate visual content, language, or both.
Irrelevant Content: It is content that is irrelevant or uninteresting to young viewers, such as building construction, tax services, money, repair work, car purchases and services, politics, professional services, etc.
Video ads
We use Selenium [4] scripts to play the appropriate videos from dataset D1 twice, so that we can scrape the video ads from the YouTube pages of appropriate videos. The dynamically rendered HTML content of a video Web page gives us the details about the video ads, if present, that are played on the actual video. Finally, we obtained 3,517 unique video ads after playing every video from dataset D1 twice. We refer to this list of video ads as the dataset, D2.
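The bookkeeping behind "play every video twice and keep the unique ads" can be sketched independently of the browser automation. In the sketch below, `observe_ads` is a hypothetical callable that stands in for one Selenium-driven playback and returns the identifiers of any ads seen; how those identifiers are extracted from the rendered page is not shown and is not assumed here.

```python
from typing import Callable, Dict, Iterable, Set


def collect_unique_ads(
    video_ids: Iterable[str],
    observe_ads: Callable[[str], Iterable[str]],
    plays_per_video: int = 2,
) -> Dict[str, Set[str]]:
    """Map each unique ad identifier to the videos it appeared on.

    `observe_ads(video_id)` is a placeholder for one playback of the video;
    it may return different ads on each call.
    """
    seen: Dict[str, Set[str]] = {}
    for video_id in video_ids:
        for _ in range(plays_per_video):
            for ad_id in observe_ads(video_id):
                seen.setdefault(ad_id, set()).add(video_id)
    return seen
```

Keeping the set of host videos per ad, rather than only a set of ad identifiers, makes it possible to trace a flagged ad back to the pages where it was served.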
Annotating video ads
Five authors of the paper each independently label roughly 703 ads from dataset D2 as appropriate, inappropriate, or irrelevant. In practice, every annotator manually opened the assigned YouTube ad videos and watched each ad in full to classify it as appropriate, inappropriate, or irrelevant (Figures 3 and 4 show example snippets of inappropriate and irrelevant video ads, respectively).
Annotation Validation
Two other authors of the paper each randomly selected 100 video ads from dataset D2 and independently validated the annotations for those ads. The aggregate accuracy of the annotation work, as reported in this validation step, is 97%.
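The aggregate accuracy can be read as the pooled fraction of validated ads whose original label matched the validator's label. The sketch below uses that reading; the function name and the toy counts are assumptions made only to show the arithmetic, and they do not reproduce the actual per-validator breakdown.

```python
from typing import List, Sequence, Tuple

# (original_label, validator_label) for one validated ad
Check = Tuple[str, str]


def aggregate_accuracy(checks: Sequence[Check]) -> float:
    """Fraction of validated items whose two labels agree."""
    if not checks:
        raise ValueError("no validation checks supplied")
    matches = sum(original == validated for original, validated in checks)
    return matches / len(checks)


# Hypothetical samples of 100 ads per validator (illustrative counts only).
validator_a: List[Check] = [("appropriate", "appropriate")] * 98 + [
    ("irrelevant", "inappropriate")
] * 2
validator_b: List[Check] = [("irrelevant", "irrelevant")] * 96 + [
    ("appropriate", "irrelevant")
] * 4

print(aggregate_accuracy(validator_a + validator_b))  # 194 / 200 = 0.97
```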
Figure 3: Snippets of Inappropriate Video ads
Figure 4: Snippet of Irrelevant Video ads
Sidebar ads
We use Selenium [4] scripts to play the appropriate videos from dataset D1 twice, so that we can scrape the sidebar ads from the YouTube pages of appropriate videos. The dynamically rendered HTML content gives the details of the sidebar ads including the ad (1) title, (2) description, and (3) Website URL. Finally, we obtained 1,069 sidebar ads after playing every video from dataset D1 twice. We refer to this list of sidebar ads as the dataset, D3.
Annotating sidebar ads
Five authors of the paper each independently label roughly 213 sidebar ads from dataset D3 as appropriate, inappropriate, or irrelevant. Every annotator manually checked the details of each sidebar ad, including the Website it references. Annotators assign two separate labels to each sidebar ad: (1) one for the ad as visible on the YouTube page, and (2) one for the URL embedded in the ad. Figures 5 through 8 show example snippets of irrelevant and inappropriate sidebar ads together with the Webpages they link to. We assign separate labels because sidebar ads can be deceiving: clicking on a sidebar ad may open a Webpage that contains content disturbing to kids. Although a sidebar ad, as shown on a YouTube page, may appear appropriate from its title and description, the Webpage referenced by the ad can be inappropriate. In this work, we refer to such ads as deceptive sidebar ads (see Figures 9 and 10).
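The two-label scheme and the resulting definition of a deceptive sidebar ad can be written down compactly. The class and field names below are hypothetical; the rule itself follows the definition given above, where an ad looks appropriate on the YouTube page but leads to inappropriate content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SidebarAdLabels:
    """The two labels assigned to one sidebar ad."""

    on_page: str       # label for the ad as visible on the YouTube page
    landing_page: str  # label for the Website behind the embedded URL


def is_deceptive(ad: SidebarAdLabels) -> bool:
    """True when the ad appears appropriate but its target is not."""
    return ad.on_page == "appropriate" and ad.landing_page == "inappropriate"
```

Under this rule, an ad that is already labeled inappropriate or irrelevant on the page is not counted as deceptive, because nothing about its appearance is misleading.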
Annotation Validation
Analogous to the annotation validation step for video ads, two other authors of the paper, who did not participate in the ad annotation work, each randomly selected 100 sidebar ads from dataset D3. They independently validated the annotations of those ads for both their descriptions and their referenced Websites. The aggregate accuracy of the annotation work, as reported in this validation step, is 97.5%.
Figure 5: Snippet of Irrelevant sidebar ads
Figure 6: URL webpage of sidebar ads in Figure 5
Figure 7: Snippet of Inappropriate sidebar ads (contain frightening elements)
Figure 8: URL webpage of sidebar ads in Figure 7
Figure 9: Snippet of a sidebar ad that "looks" appropriate for kids, shown on a toddler-oriented video
Figure 10: Snippet of the website content of the sidebar ad in Figure 9
Findings
Inconsistencies in ad reporting on YouTube: As per YouTube Help [2], if one finds an ad to be inappropriate or in violation of Google’s ad policies, one can report it to YouTube directly using the “Why this ad?” button. On clicking this button, one can select the “Report this ad” option, the “Stop seeing this ad” option, or both. However, we find inconsistencies in the ad reporting feature. We visited 200 different YouTube video pages and found that, in roughly half of our trials, the reporting options were either completely disabled or incomplete (i.e., only the “Report this ad” option was available and not the “Stop seeing this ad” option). See Figure 11 for examples of these inconsistencies.
Figure 11: Inconsistencies in YouTube's "Report this ad" option
Ethics
We collect only data that is publicly available on the Web and do not (1) interact with online users in any way or (2) simulate any logged-in activity on YouTube or other platforms. Therefore, IRB approval was not required.
Appendix
Figure 12: Snippet of an Inappropriate banner ad
References
[1] 2022. About video ad formats. https://support.google.com/googleads/answer/2375464
[2] 2022. Ads on videos you watch. https://support.google.com/youtube/answer/3181017
[3] 2022. Blippi - educational videos for kids. https://www.youtube.com/c/Blippi
[4] 2022. Selenium webdriver (2022). https://tinyurl.com/y6a4czhe
[5] 2022. YouTube Data API. https://developers.google.com/youtube/v3/docs/videos
[6] Camila Souza Araújo, Gabriel Magno, Wagner Meira, Virgilio Almeida, Pedro Hartung, and Danilo Doneda. 2017. Characterizing videos, audience and advertising in Youtube channels for kids. In International Conference on Social Informatics. Springer, 341–359.
[7] Mark Bergen and Lucas Shaw. 2019. Why children don’t like YouTube Kids – the most popular kids’ video site in the world. https://theprint.in/tech/why-children-dont-like-youtubekids-the-most-popular-kids-video-site-in-the-world/251074/
[8] Helen Lee Bouygues. 2019. The importance of critical thinking for students of all ages. https://edublog.scholastic.com/post/importancecritical-thinking-students-all-ages
[9] Common Sense Media and CS Mott Children’s Hospital. 2020 YOUNG KIDS AND YOUTUBE: HOW ADS, TOYS, AND GAMES DOMINATE VIEWING. https://www.commonsensemedia.org/sites/default/files/research/report/2020_youngkidsyoutube-report_finalrelease_forweb.pdf
[10] Brian Dean. 2021. How Many People Use YouTube in 2021. https://backlinko.com/youtube-users/ [Accessed December 2021].
[11] Colin Dixon. 2020. YouTube used by more children than YouTube Kids. https://nscreenmedia.com/more-kids-youtube-versus-youtubekids/
[12] Carsten Eickhoff and Arjen P de Vries. 2010. Identifying suitable YouTube videos for children. 3rd Networked and electronic media summit (NEM) (2010).
[13] Federal Trade Commission. 2020. Complying with COPPA: Frequently Asked Questions. https://www.ftc.gov/business-guidance/resources/complying-coppa-frequently-asked-questions
[14] Kilem L Gwet. 2014. Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters. Advanced Analytics, LLC.
[15] Alex Hern. 2022. YouTube kids shows videos promoting drug culture and firearms to toddlers. https://www.theguardian.com/technology/2022/may/05/youtube-kids-shows-videos-promoting-drug-culturefirearms-toddlers
[16] Akari Ishikawa, Edson Bollis, and Sandra Avila. 2019. Combating the elsagate phenomenon: Deep learning architectures for disturbing cartoons. In 2019 7th International Workshop on Biometrics and Forensics (IWBF). IEEE, 1–6.
[17] Rishabh Kaushal, Srishty Saha, Payal Bajaj, and Ponnurangam Kumaraguru. 2016. KidsTube: Detection, characterization and analysis of child unsafe content & promoters on YouTube. In 2016 14th Annual Conference on Privacy, Security and Trust (PST). IEEE, 157–164.
[18] Dale Kunkel and Brian Wilcox. 2004. Television advertising leads to unhealthy habits in children; says APA Task Force. PsycEXTRA Dataset (2004). https://doi.org/10.1037/e363652004-001
[19] Tarald O Kvålseth. 1989. Note on Cohen’s kappa. Psychological reports 65, 1 (1989), 223–226.
[20] Jeffrey Liu. 2022. YouTube inappropriate ADS project. https://sites.google.com/usc.edu/inappropriate--ads-work/home
[21] Kayla Matthews. 2020. The Complete Guide to Online Video Advertising. https://www.outbrain.com/blog/online-video-advertising-guide/
[22] Kostantinos Papadamou, Antonis Papasavva, Savvas Zannettou, Jeremy Blackburn, Nicolas Kourtellis, Ilias Leontiadis, Gianluca Stringhini, and Michael Sirivianos. 2020. Disturbed YouTube for kids: Characterizing and detecting inappropriate videos targeting young children. In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 14. 522–533.
[23] Jenny Radesky and Alexandria Schaller. 2020. 2020 young kids and YouTube. https://api.commonsensemedia.org/sites/default/files/research/report/2020_youngkidsyoutube-report_finalrelease_forweb.pdf
[24] Shubham Singh, Rishabh Kaushal, Arun Balaji Buduru, and Ponnurangam Kumaraguru. 2019. KidsGUARD: Fine grained approach for child unsafe video representation and detection. In Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing. 2104–2111.
[25] Aaron Smith, Skye Toor, and Patrick van Kessel. 2020. Many turn to YouTube for children’s content, news, how-to lessons. https://www.pewresearch.org/internet/2018/11/07/many-turn-toyoutube-for-childrens-content-news-how-to-lessons/
[26] Anna Sonnenberg. 2020. How to create YouTube in-stream ads: Social Media Examiner. https://www.socialmediaexaminer.com/how-tocreate-youtube-in-stream-ads
[27] Matt G. Southern. 2019. YouTube introduces more ways to buy masthead ads. https://www.searchenginejournal.com/youtube-introducesmore-ways-to-buy-masthead-ads/291625/
[28] Statista. 2021. U.S. children YouTube negative content consumption. https://www.statista.com/statistics/1268656/youtube-videosnegative-content-watched-by-children-usa/
[29] Rashid Tahir, Faizan Ahmed, Hammas Saeed, Shiza Ali, Fareed Zaffar, and Christo Wilson. 2019. Bringing the kid back into youtube kids. In 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. IEEE, 464–469.
[30] Allison Slater Tate. 2019. Pediatrician warns parents after finding ‘scary’ youtube kids content. https://www.today.com/parents/pediatrician-warns-parents-after-finding-scary-content-youtubekids-t149501
[31] Matthijs J Warrens. 2012. The effect of combining categories on Bennett, Alpert and Goldstein’s S. Statistical Methodology 9, 3 (2012), 341–352.
[32] Geoff Weiss. 2020. YouTube is serving young viewers age-inappropriate ads, videos with little educational value, study
[33] Jacqueline Zote. 2020. The essential guide to YouTube ad campaigns. https://sproutsocial.com/insights/youtube-ad-campaigns/