Personalising Learning
22 AI Speak: How Youtube Learns You, part 1
Models and Recommendation
ACTIVITY
These are the credit card transactions of John Doe and Tom Harry, two men living in Nantes, France. They are looking for things to do this weekend. What will you recommend to them?
List to choose from:
- The new Burger King outlet
- An olive oil tasting event
- An online luggage store
- A river-side concert
- Baby swimming class
Recommendation systems have been around at least as long as tourist guides and top-ten lists. While The Guardian Best Books of 2022 recommends the same list to everyone, you would likely adapt it when choosing for yourself – pick a few and change the order of reading based on your personal preferences.
How do we recommend options to strangers? In the activity above, you probably tried to imagine their personalities based on the given information: you made judgements and applied stereotypes. Then, once you had an idea of their type, you chose from the list the things that might be relevant to them and left out those that might not. Recommenders such as Amazon, Netflix and Youtube follow a similar process.
Nowadays, whenever someone is searching for information or looking to discover online content, they use some kind of personalised recommender system1,2. The main function of Youtube is to suggest to its users what to watch amongst all the videos available on the platform. For signed-in users, it uses their past activity to create a model, or a personality type. Once it has a model for John, it can see who else has models similar to his. It then recommends to John videos similar to what he has watched and those similar to what others like him have watched.
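To make the idea of "users with similar models" concrete, here is a minimal sketch in Python. It is an illustration only, not Youtube's actual method: the user names, video names and watch times are invented, and each user model is assumed to be nothing more than a table of minutes watched per video.

```python
from math import sqrt

# Hypothetical watch histories: user -> {video: minutes watched}.
histories = {
    "john": {"cooking_basics": 12.0, "olive_oil_guide": 8.0},
    "tom": {"burger_review": 15.0, "concert_live": 20.0},
    "ana": {"cooking_basics": 10.0, "olive_oil_guide": 6.0, "pasta_at_home": 9.0},
}

def cosine(a, b):
    """Cosine similarity between two user models stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user, histories, top_n=3):
    """Suggest unseen videos, weighted by how similar their viewers are to `user`."""
    mine = histories[user]
    scores = {}
    for other, theirs in histories.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        if sim <= 0:
            continue  # users with nothing in common contribute nothing
        for video, minutes in theirs.items():
            if video not in mine:
                scores[video] = scores.get(video, 0.0) + sim * minutes
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("john", histories))
```

In this toy data, John and Ana have watched the same cooking videos, so the one video Ana has seen and John has not is suggested to him. Tom shares nothing with the others, so he gets no suggestions from this sketch alone.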
What is a model?
Models can be used to mimic anything from users to videos to lessons a child has to learn. A model is a simplified representation of the world, so that a machine can pretend to understand it.
How Youtube learns who you are
All recommendation problems involve asking a surrogate question: “What to recommend” is too general and vague for an algorithm. Netflix asked developers to predict what rating user A would give video B, given their ratings for other videos. Youtube asks what the watch time would be for a given user in a particular context. The choice of what to ask – and predict – has a big impact on which recommendation is shown3. The idea is that a correct prediction will lead to a good recommendation. The prediction itself is based on other users with a history of similar tastes4, that is, users whose models are similar.
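The rating question can be sketched as a small calculation: predict a user's rating for a video as an average of other users' ratings, where users with similar past ratings count for more. The code below is a hedged illustration, not the method used by Netflix or Youtube. The ratings, the user and video names, and the `agreement` similarity measure are all assumptions made for the example.

```python
# Hypothetical ratings on a 1-5 scale: user -> {video: rating}.
ratings = {
    "user_a": {"v1": 5, "v2": 1},
    "user_c": {"v1": 5, "v2": 1, "video_b": 4},
    "user_d": {"v1": 1, "v2": 5, "video_b": 2},
}

def agreement(a, b):
    """Similarity in (0, 1]: 1 means identical ratings on shared videos.

    Returns 0 when the two users have rated no video in common.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    mean_gap = sum(abs(a[v] - b[v]) for v in shared) / len(shared)
    return 1.0 / (1.0 + mean_gap)

def predict_rating(user, video, ratings):
    """Similarity-weighted average of other users' ratings for `video`."""
    weighted_sum = 0.0
    total_weight = 0.0
    for other, theirs in ratings.items():
        if other == user or video not in theirs:
            continue
        weight = agreement(ratings[user], theirs)
        weighted_sum += weight * theirs[video]
        total_weight += weight
    # None signals that no comparable user has rated this video.
    return weighted_sum / total_weight if total_weight else None

print(predict_rating("user_a", "video_b", ratings))
```

Here user C has rated the shared videos exactly as user A did, while user D disagrees strongly, so the prediction lands much closer to C's rating of 4 than to D's rating of 2.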
User models
Youtube splits the task of recommendation into two parts and uses different models for each3. We, however, will stick to a simpler explanation here.
To create a user model, developers have to ask what data is relevant to video recommendation. What has the user watched before? What reviews, ratings and explicit preferences have they given so far? What did they search for? Youtube relies more on implicit signals than on explicit ones, since the former are more readily available3. Did a user just click on a video, or did they watch it? If they watched it, for how long? How did the user react to previous recommendations1? Which ones were ignored? Apart from direct answers to these questions, demographic information such as gender, language, region and type of device is of great value when the user is new or not signed in3.
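As a rough illustration of how implicit signals might be turned into a user model, the sketch below summarises a log of recommendations shown to one user. Everything in it is an assumption made for the example: the `Event` record, its fields, the topics and the two summary numbers are hypothetical, and a real system would use far richer data.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Event:
    """One recommendation shown to the user (hypothetical record)."""
    topic: str
    clicked: bool           # False means the recommendation was ignored
    watched_seconds: float  # 0 if the video was never opened
    video_seconds: float    # full length of the video, assumed > 0

def build_user_model(events):
    """Summarise implicit signals per topic.

    click_rate: share of shown recommendations the user clicked.
    avg_completion: average fraction watched of the clicked videos.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    completion = defaultdict(float)
    for e in events:
        shown[e.topic] += 1
        if e.clicked:
            clicked[e.topic] += 1
            completion[e.topic] += min(e.watched_seconds / e.video_seconds, 1.0)
    return {
        topic: {
            "click_rate": clicked[topic] / shown[topic],
            "avg_completion": completion[topic] / clicked[topic] if clicked[topic] else 0.0,
        }
        for topic in shown
    }

events = [
    Event("cooking", True, 300, 600),  # clicked, watched half
    Event("cooking", False, 0, 400),   # ignored
    Event("music", True, 200, 200),    # clicked, watched to the end
]
print(build_user_model(events))
```

The point of the sketch is that none of these numbers required the user to rate or review anything: clicks, ignored suggestions and watch time are collected simply by observing behaviour.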
Once a model is available for each user, we can compare users and use that information for recommendations.
Video models
We could also build models of the videos themselves, to find out which ones are similar to, or different from, one another. Youtube looks at the information it has on a particular video – its title and description, video quality, how many people have watched it (view count), liked it, marked it as a favourite, commented on it or shared it, the time since it was uploaded and the number of users subscribed to the parent channel1.
What a user watches next will also depend on whether a video is an episode within a series or an item in a playlist. If a user is discovering a new artist, they might move from the most popular songs to smaller niches. Also, a user might not click on a video whose thumbnail image is of poor quality1,3. All of this information goes into the model too.
One of the building blocks of the recommendation system is to go from one video to a list of related videos. In this context, we define related videos as those that a user is likely to watch next3. The goal is to squeeze the most value out of data to make better recommendations4.
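One simple way to approximate "likely to watch next" is to count how often one video directly follows another in past viewing sessions. The sketch below does only that. It is a simplified assumption made for illustration, not a description of Youtube's system; the session data and function names are invented.

```python
from collections import Counter, defaultdict

# Hypothetical viewing sessions: videos in the order they were watched.
sessions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["d", "b", "c"],
]

def next_video_counts(sessions):
    """Count, for each video, which videos were watched right after it."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts

def related(video, counts, top_n=2):
    """Return the videos most often watched immediately after `video`."""
    return [v for v, _ in counts[video].most_common(top_n)]

counts = next_video_counts(sessions)
print(related("a", counts))
```

In this toy data, video "b" followed video "a" twice and video "c" followed it once, so "b" is ranked first among the videos related to "a". Raw counts like these favour videos that are popular overall, which is one reason a real system would need to do more than count.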
1 Davidson, J., Liebald, B., Liu, J., Nandy, P., Van Vleet, T., et al., The Youtube Video Recommendation System, Proceedings of the 4th ACM Conference on Recommender Systems, Barcelona, 2010.
2 Spinelli, L., and Crovella, M., How YouTube Leads Privacy-Seeking Users Away from Reliable Information, In Adjunct Publication of the 28th ACM Conference on User Modeling, Adaptation and Personalization (UMAP ’20 Adjunct), Association for Computing Machinery, New York, 244–251, 2020.
3 Covington, P., Adams, J., Sargin, E., Deep neural networks for Youtube Recommendations, Proceedings of the 10th ACM Conference on Recommender Systems, ACM, New York, 2016.
4 Konstan, J., Terveen, L., Human-centered recommender systems: Origins, advances, challenges, and opportunities, AI Magazine, 42(3), 31-42, 2021.