The term algorithm derives from the name of the 9th-century Persian mathematician Mohammad ibn Musa al-Khwarizmi, Latinized as Algorismi. Al-Khwarizmi wrote texts describing arithmetic procedures based on Hindu-Arabic numerals for calculations such as addition, subtraction, multiplication, and division.
Around 820, al-Khwarizmi compiled an Arabic mathematical treatise on algebra known as Al-Jabr, a title later borrowed into Medieval Latin as algebraica. Until the 16th century, Al-Jabr served as the basis for mathematical learning and computation in European universities.
Understanding the fundamental mathematical roots of the algorithm will help explain the process and influence of the famous and mysterious YouTube algorithm.
A Soviet postage stamp issued in 1983 commemorating al-Khwarizmi's 1200th birthday. Image retrieved from Wikimedia Commons.
A statue of al-Khwarizmi stands in Khiva (formerly known as Khawarizm), Uzbekistan, the birthplace of the ancient mathematician. Image retrieved from Flickr: Michael Zaretski.
When the YouTube Partner Program was launched in 2007, a YouTube employee was responsible for manually curating the YouTube homepage. If a video caught the curator's attention, it was featured on the homepage, which generated more views. To diversify which videos received attention, YouTube added tabs such as Most Viewed, Most Discussed, and Top Favorites, providing a more organic way of presenting videos to the public.
Screenshot of the YouTube homepage in 2007. The Featured Videos section was managed by the YouTube curators. Image retrieved from Flickr: arincrumley.
Overview of the pipeline used in recommender systems. Videos are indexed in the catalog using descriptions that are automatically extracted or manually added. Videos are retrieved by ranking based on the user profile describing their preferences. Lubos et al., 2023.
In a 2010 article, Google engineers explained that the recommendation algorithm used metadata (titles, tags, captions, descriptions, location, etc.) to categorize videos and to recommend, rather than predict, what viewers might want to watch. In other words, the algorithm does not know the contents of a video; instead, it builds a model of which audience the video may suit from the information it is given. Through trial and error, the YouTube algorithm uses collective user behavior to refine these models.
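The idea of matching videos to viewers through metadata alone can be illustrated with a small sketch. This is not YouTube's implementation: it assumes metadata is a plain string of words, builds a user profile from the words of watched videos, and ranks the rest by Jaccard overlap. The catalog, the video ids, and the function names are all hypothetical.

```python
def tokenize(text):
    """Lowercase a metadata string and split it into a set of words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Share of words two sets have in common (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(watched, catalog, top_n=2):
    """Rank unwatched videos by metadata overlap with the watch history."""
    # The profile is simply every word seen in the watched videos' metadata.
    profile = set()
    for video_id in watched:
        profile |= tokenize(catalog[video_id])
    scores = {
        video_id: jaccard(profile, tokenize(metadata))
        for video_id, metadata in catalog.items()
        if video_id not in watched
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical catalog: video id -> title and tags flattened into one string.
catalog = {
    "v1": "minecraft survival playthrough episode 1",
    "v2": "minecraft survival playthrough episode 2",
    "v3": "easy pasta recipe dinner",
    "v4": "minecraft building tips",
}

print(recommend(["v1"], catalog))  # ['v2', 'v4']
```

The sketch never looks at what happens inside a video. It only compares descriptions, which is why accurate titles and tags matter in a metadata-driven system.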
Top-performing videos were identified by clicks, high ratings, comments, favorites, and share volume. Despite these intuitive metrics, the algorithm was still vulnerable to abuse: people gamed the system with clickbait headlines and thumbnails that drew clicks regardless of the quality of the video. Watch time, which measures how long people keep watching after clicking and is a major indicator of quality, was introduced as a countermeasure against clickbait videos.
Long videos with an overarching storyline, such as gaming playthroughs, checked every box the YouTube algorithm favored, which in turn influenced the duration and content of the videos being made.
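A small sketch can show why ranking by watch time instead of clicks changes which videos come out on top. The numbers below are invented purely for illustration, and the `Video` class and its fields are hypothetical rather than anything YouTube has published.

```python
from dataclasses import dataclass


@dataclass
class Video:
    title: str
    clicks: int
    avg_minutes_watched: float  # average minutes watched per click

    @property
    def watch_time(self):
        """Total minutes watched across all clicks."""
        return self.clicks * self.avg_minutes_watched


# Hypothetical numbers chosen only to show the effect.
videos = [
    Video("YOU WON'T BELIEVE THIS!!!", clicks=10_000, avg_minutes_watched=0.3),
    Video("Full game playthrough, part 1", clicks=2_000, avg_minutes_watched=25.0),
    Video("Quick tutorial", clicks=4_000, avg_minutes_watched=3.0),
]

by_clicks = sorted(videos, key=lambda v: v.clicks, reverse=True)
by_watch_time = sorted(videos, key=lambda v: v.watch_time, reverse=True)

print([v.title for v in by_clicks])      # clickbait video first
print([v.title for v in by_watch_time])  # playthrough first, clickbait last
```

Under the click ranking, the clickbait title wins. Under the watch-time ranking, it drops to last place and the long playthrough moves to the top, even though it has the fewest clicks.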
The YouTube algorithm system doesn't recommend videos following a single formula. It develops dynamically as user viewing habits change. Image retrieved from YouTube Official Blog.
In 2016, a neural network was introduced to optimize the many factors the YouTube algorithm is constantly tweaking and improving. It uses 80 billion signals to decide what to recommend to users, including watch time, click-through rate, demographics, and freshness.
By blending quantitative and qualitative data into a bigger picture, the YouTube algorithm operates as a prediction machine. According to Google engineers, it predicts how much a user will watch and enjoy a video, then compares the user's actual behavior against that prediction.
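The predict-then-compare loop can be sketched in a few lines. This is not how YouTube's neural network is trained. It is a deliberately simple stand-in that nudges a single predicted value toward what was actually observed, and the function name, the learning rate, and the session values are all assumptions made for illustration.

```python
def update_prediction(predicted, actual, learning_rate=0.5):
    """Move the prediction toward the observed value by a fraction of the error."""
    error = actual - predicted
    return predicted + learning_rate * error


# Hypothetical: predicted minutes a user will watch videos on one topic.
predicted_minutes = 10.0
observed_sessions = [4.0, 5.0, 6.0]  # minutes actually watched (made-up values)

for actual in observed_sessions:
    predicted_minutes = update_prediction(predicted_minutes, actual)
    print(predicted_minutes)  # prints 7.0, then 6.0, then 6.0
```

Each time the user watches less than predicted, the estimate falls. Once the prediction matches the observed behavior, the error is zero and the estimate stops moving.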
YouTube has sought to deprioritize borderline violative content to reduce the spread of misinformation and of material that goes against the platform's Terms of Service.
Today, the YouTube homepage recommends videos with broad general appeal, whereas the videos recommended in the sidebar are geared toward relatedness. Image retrieved from Pexels: Cottonbro Studio.