Conformal Recommender Systems


Overview

Conformal prediction is a framework that supports predictions with statistical confidence. With conformal prediction, uncertainty estimates can be produced for supervised learning models at a user-specified confidence level. The method is generally used for classification and regression, producing prediction sets for classification and prediction intervals for regression.
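To make the idea of a prediction set concrete, here is a minimal sketch of the inductive (split) conformal approach for classification. It is an illustration under stated assumptions, not the project's code: the function names, the nonconformity scores, and the labels "A", "B", "C" are all hypothetical. The significance level is one minus the confidence level, and a label is kept when its p-value exceeds that level.

```python
def conformal_p_value(calibration_scores, candidate_score):
    """Share of calibration nonconformity scores at least as large as the candidate's.

    The +1 terms count the candidate itself, as in the standard inductive
    conformal p-value.
    """
    at_least_as_strange = sum(1 for s in calibration_scores if s >= candidate_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_scores, candidate_scores, significance):
    """Keep every label whose p-value exceeds the significance level."""
    return {
        label
        for label, score in candidate_scores.items()
        if conformal_p_value(calibration_scores, score) > significance
    }


# Hypothetical nonconformity scores from a held-out calibration set
# (higher means the example looked less typical to the underlying model).
calibration_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.90]

# Hypothetical nonconformity scores for each candidate label of one new example.
candidate_scores = {"A": 0.15, "B": 0.65, "C": 0.95}

narrow_set = prediction_set(calibration_scores, candidate_scores, significance=0.35)
wide_set = prediction_set(calibration_scores, candidate_scores, significance=0.05)
print(sorted(narrow_set), sorted(wide_set))
```

A lower significance level (higher confidence) can only keep more labels, which is why the set size varies with the level chosen.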

The goal of this project was to extend the conformal framework to recommender systems. The recommender system was built to provide course recommendations to students at University College Maastricht (UCM). UCM allows students to select any of the available courses, contingent only on prerequisites. By incorporating conformal prediction into the recommender system, we produce recommended course sets whose size varies with the significance level specified. The models rely on topics extracted from course descriptions using latent Dirichlet allocation (LDA). The same topic model is used to represent students: the topics of the courses a student has previously taken are combined, weighted by the grades obtained. Two variants of the course recommender system were created. The first selects courses based on the similarity between a student's past courses and the courses subsequently available. The second generates a representation of the student and then compares that representation with the available courses.
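The second variant can be sketched roughly as follows. This is a simplified illustration rather than the project's implementation. It makes several assumptions: the topic vectors are made up (in the project they would come from the LDA model), the course codes are hypothetical, grades are taken to be on a 10-point scale, the weighting is a normalized grade-weighted average, and cosine similarity stands in for whatever similarity measure was actually used.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two topic vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def student_profile(course_topics, grades):
    """Grade-weighted average of the topic vectors of the courses already taken."""
    total_grade = sum(grades.values())
    n_topics = len(next(iter(course_topics.values())))
    profile = [0.0] * n_topics
    for course, grade in grades.items():
        for k, topic_weight in enumerate(course_topics[course]):
            profile[k] += grade * topic_weight / total_grade
    return profile


def rank_courses(profile, course_topics, taken):
    """Order the courses not yet taken by similarity to the student profile."""
    scores = {
        course: cosine_similarity(profile, topics)
        for course, topics in course_topics.items()
        if course not in taken
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical topic distributions over three topics (illustrative values only).
course_topics = {
    "STATS101": [0.8, 0.1, 0.1],
    "HIST101": [0.1, 0.8, 0.1],
    "STATS201": [0.7, 0.1, 0.2],
    "HIST201": [0.1, 0.7, 0.2],
}
grades = {"STATS101": 9.0, "HIST101": 6.0}  # hypothetical grades, 10-point scale

profile = student_profile(course_topics, grades)
ranking = rank_courses(profile, course_topics, taken=set(grades))
print(profile, ranking)
```

In a conformal version of this sketch, the similarity scores would be turned into nonconformity scores and compared against a calibration set, so that the number of recommended courses depends on the significance level instead of a fixed top-k cutoff.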

Our results show that our recommender systems are well-calibrated and produce course sets that are drastically smaller than the full set of available courses (around 150). The models were among the first applications of conformal prediction to recommender systems. The project resulted in a short paper accepted at the 2020 Educational Data Mining Conference.

Technical

The models were developed and tested in a Python environment. The course and student models were built using LDA topic modeling, and predictive modeling was performed using scikit-learn. The models were also validated using the conformist library and SHAP values. Visualizations were produced using Plotly.