AI Speak: How Youtube Learns You, part 1

Colin de la Higuera; Jotsna Iyer

Personalising Learning

Models and Recommendation

ACTIVITY

These are the credit card transactions of John Doe and Tom Harry, two men living in Nantes, France. They are looking for things to do this weekend. What will you recommend to them?

List to choose from:

The new Burger King outlet
An olive oil tasting event
An online luggage store
A river-side concert
Baby swimming class

Recommendation systems have been around at least as long as tourist guides and top-ten lists. While The Guardian Best Books of 2022 recommends the same list to everyone, you would likely adapt it when choosing for yourself – pick a few and change the order of reading based on your personal preferences.

How do you recommend options to strangers? In the activity above, you probably tried to imagine their personalities based on the given information: you made judgements and applied stereotypes. Then, once you had an idea of their type, you chose from the list the things that might (or might not) be relevant to them. Recommenders such as Amazon, Netflix and Youtube follow a similar process.

Nowadays, whenever someone is searching for information or looking to discover online content, they use some kind of personalised recommender system¹,². The main function of Youtube is to suggest to its users what to watch amongst all the videos available on the platform. For signed-in users, it uses their past activity to create a model, or a personality type. Once it has a model for John, it can see who else has models similar to his. It then recommends to John videos similar to what he has watched and those similar to what others like him have watched.

What is a model?

Models can be used to mimic anything from users to videos to lessons a child has to learn. A model is a simplified representation of the world, so that a machine can pretend to understand it.

How Youtube learns who you are

All recommendation problems involve asking a surrogate question, because “What to recommend” is too general and vague for an algorithm. Netflix asked developers to predict what rating user A would give video B, given user A's ratings for other videos. Youtube asks what the watch time would be for a given user in a particular context. The choice of what to ask – and predict – has a big impact on which recommendations are shown³. The idea is that a correct prediction will lead to a good recommendation. The prediction itself is based on other users with a history of similar tastes⁴, that is, users whose models are similar.
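The rating question can be illustrated with a small sketch. This is not how Netflix or Youtube actually compute predictions; it is a minimal example, under assumptions of our own, of predicting one user's rating from the ratings of users with similar tastes. The users, videos, ratings and the function names `similarity` and `predict_rating` are all invented for illustration, and cosine similarity is only one of many possible ways to compare users.

```python
from math import sqrt

# Hypothetical ratings (1 to 5 stars). Users and videos are made up.
ratings = {
    "user_a": {"v1": 5, "v2": 4, "v3": 1},
    "user_c": {"v1": 5, "v2": 5, "v3": 1, "video_b": 4},
    "user_d": {"v1": 1, "v2": 2, "v3": 5, "video_b": 1},
}

def similarity(u, v):
    """Cosine similarity over the videos that both users have rated."""
    shared = set(ratings[u]) & set(ratings[v])
    if not shared:
        return 0.0
    dot = sum(ratings[u][x] * ratings[v][x] for x in shared)
    norm_u = sqrt(sum(ratings[u][x] ** 2 for x in shared))
    norm_v = sqrt(sum(ratings[v][x] ** 2 for x in shared))
    return dot / (norm_u * norm_v)

def predict_rating(user, video):
    """Average of other users' ratings for the video, weighted by similarity."""
    weighted_sum, weight_total = 0.0, 0.0
    for other, their_ratings in ratings.items():
        if other == user or video not in their_ratings:
            continue
        weight = similarity(user, other)
        weighted_sum += weight * their_ratings[video]
        weight_total += weight
    # None means nobody comparable has rated this video yet.
    return weighted_sum / weight_total if weight_total else None

print(predict_rating("user_a", "video_b"))
```

Because user_a's past ratings look more like user_c's than user_d's, the prediction is pulled towards user_c's rating of the unseen video. Changing the surrogate question (for example, predicting watch time instead of stars) would change what data goes into `ratings`, and therefore what gets recommended.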

User models

Youtube splits the task of recommendation into two parts and uses different models for each³. We, however, will stick to a simpler explanation here.

To create a user model, developers have to ask what data is relevant to video recommendation. What has the user watched before? What reviews, ratings and explicit preferences have they given so far? What did they search for? Youtube relies more on implicit signals than on explicit ones, since implicit signals are more readily available³. Did a user just click on a video, or did they actually watch it? If so, for how long? How did the user react to previous recommendations¹? Which ones were ignored? Apart from the answers to these questions, demographic information such as gender, language, region and type of device is of great value when the user is new or not signed in³.

Once a model is available for each user, we can compare users and use that information for recommendations.
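As a rough illustration of comparing user models, the sketch below represents each user by a single implicit signal, minutes watched per topic, and recommends what the most similar user has watched. Everything in it is an assumption made for the example: the users, topics, video names and the functions `cosine`, `most_similar` and `recommend` are hypothetical, and a real system would combine far more signals than this.

```python
from math import sqrt

# Hypothetical user models: minutes watched per topic (an implicit signal).
user_models = {
    "john": {"cooking": 120, "football": 10, "travel": 45},
    "tom": {"cooking": 5, "football": 200, "gaming": 90},
    "ana": {"cooking": 100, "travel": 60, "music": 15},
}

# Hypothetical watch histories.
watched = {
    "john": {"pasta_basics", "lisbon_guide"},
    "tom": {"cup_final", "speedrun_tips"},
    "ana": {"pasta_basics", "knife_skills", "rome_on_foot"},
}

def cosine(a, b):
    """Cosine similarity between two topic-to-minutes models."""
    topics = set(a) | set(b)
    dot = sum(a.get(t, 0) * b.get(t, 0) for t in topics)
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(user):
    """Return the other user whose model is closest to this user's model."""
    others = (u for u in user_models if u != user)
    return max(others, key=lambda u: cosine(user_models[user], user_models[u]))

def recommend(user):
    """Suggest videos the most similar user watched that this user has not."""
    neighbour = most_similar(user)
    return sorted(watched[neighbour] - watched[user])

print(most_similar("john"))
print(recommend("john"))
```

In this toy data, John's viewing is dominated by cooking and travel, so his model is much closer to Ana's than to Tom's, and he is offered the videos Ana watched that he has not yet seen.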

Video Models

We could also model the videos themselves, to see which are similar to and which are different from one another. Youtube looks at the information it has on a particular video – its title and description, video quality, how many people have watched it (view count), liked it, favourited it, commented on it or shared it, the time since it was uploaded and the number of users subscribed to the parent channel¹.

What a user watches next will also depend on whether a video is an episode in a series or an item in a playlist. If a user is discovering a new artist, they might move from the most popular songs to smaller niches. Also, a user might not click on a video whose thumbnail image is of poor quality¹,³. All of this information goes into the model too.

One of the building blocks of the recommendation system is to go from one video to a list of related videos. In this context, we define related videos as those that a user is likely to watch next³. The goal is to squeeze the most value out of data to make better recommendations⁴.
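One simple way to approximate "videos a user is likely to watch next" is to count which videos tend to follow which in viewing sessions. The sketch below does only that. The session data, video names and the functions `build_next_counts` and `related_videos` are hypothetical, and counting consecutive views is a simplification chosen for this example rather than a description of Youtube's actual system.

```python
from collections import Counter, defaultdict

# Hypothetical viewing sessions: each list is the order in which
# one user watched videos.
sessions = [
    ["intro_guitar", "chords_101", "strumming"],
    ["intro_guitar", "chords_101", "tuning"],
    ["chords_101", "strumming"],
    ["intro_guitar", "tuning"],
]

def build_next_counts(sessions):
    """For every video, count how often each other video was watched right after it."""
    next_counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            next_counts[current][following] += 1
    return next_counts

def related_videos(video, next_counts, k=2):
    """Return up to k videos most often watched directly after the given one."""
    return [v for v, _ in next_counts[video].most_common(k)]

next_counts = build_next_counts(sessions)
print(related_videos("intro_guitar", next_counts))
```

A video that nobody has watched anything after, such as a brand-new upload, gets an empty list here. This is one reason the other information in the video model, such as title and channel, remains useful.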

¹Davidson, J., Liebald, B., Liu, J., Nandy, P., Van Vleet, T., et al., The Youtube Video Recommendation System, Proceedings of the 4th ACM Conference on Recommender Systems, Barcelona, 2010.

²Spinelli, L., and Crovella, M., How YouTube Leads Privacy-Seeking Users Away from Reliable Information, In Adjunct Publication of the 28th ACM Conference on User Modeling, Adaptation and Personalization (UMAP ’20 Adjunct), Association for Computing Machinery, New York, 244–251, 2020.

³Covington, P., Adams, J., Sargin, E., Deep neural networks for Youtube Recommendations, Proceedings of the 10th ACM Conference on Recommender Systems, ACM, New York, 2016.

⁴Konstan, J., Terveen, L., Human-centered recommender systems: Origins, advances, challenges, and opportunities, AI Magazine, 42(3), 31-42, 2021.

Licence

Creative Commons Attribution 4.0 International License