HT2020

Source code, raw data, and complete results for the paper "Off-line vs. On-line Evaluation of Recommender Systems in Small E-commerce", accepted at the HT2020 conference.

Both the paper and this repository are partially based on a preliminary version (https://arxiv.org/abs/1809.03186) presented at the REVEAL 2018 workshop. For the source code of the recommending algorithms and instructions on running them, please refer to the https://github.com/lpeska/REVEAL2018 repository. This repository contains scripts for processing the on-line and off-line evaluations, regression models that aim to predict the on-line results from the off-line ones, and the results of a second on-line experiment that evaluates these models.

Off-line results can be generated with the offLineResultsConstructor.ipynb notebook; the generated results are stored in the resultsWithNovDiv_32_0dot01Temporal.rar file.
On-line results are processed in the OnlineResultsEvaluation.ipynb notebook in the onlineResults folder.
Several off-line to on-line prediction models are evaluated in the same notebook.
Results of the follow-up on-line experiment are in the onlineResults_secondExperiment folder.

Abstract

In this paper, we present our work towards comparing on-line and off-line evaluation metrics in the context of small e-commerce recommender systems. Recommending for small e-commerce enterprises is rather challenging due to the lower volume of interactions and low user loyalty, which rarely extends beyond a single session. On the other hand, we usually have to deal with lower volumes of objects, which are easier for users to discover through various browsing/searching GUIs.

The main goal of this paper is to determine the applicability of off-line evaluation metrics in learning the true usability of recommender systems (evaluated on-line in A/B testing). In total, 800 variants of recommenders were evaluated off-line w.r.t. 18 metrics covering rating-based, ranking-based, novelty and diversity evaluation. The off-line results were afterwards compared with the on-line evaluation of 12 selected recommender variants and, based on the results, we tried to learn and utilize an off-line to on-line results prediction model.
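As a rough illustration of the ranking-based metrics mentioned in the abstract (MRR, nDCG, AUC), the sketch below computes them for a single ranked list with binary relevance. The function names, the binary-relevance assumption and the toy item IDs are illustrative only; the metric implementations actually used for the paper live in the notebooks and may differ in details such as cut-offs or tie handling.

```python
import math

def reciprocal_rank(ranked_items, relevant):
    """1 / position of the first relevant item (0.0 if none is present)."""
    for position, item in enumerate(ranked_items, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def ndcg(ranked_items, relevant):
    """nDCG with binary gains and a log2 position discount."""
    dcg = sum(1.0 / math.log2(pos + 1)
              for pos, item in enumerate(ranked_items, start=1)
              if item in relevant)
    ideal_hits = min(len(relevant), len(ranked_items))
    idcg = sum(1.0 / math.log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

def auc(ranked_items, relevant):
    """Share of (relevant, irrelevant) pairs ranked in the right order."""
    n_pos = sum(1 for item in ranked_items if item in relevant)
    n_neg = len(ranked_items) - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0  # AUC is undefined here; 0.0 is an arbitrary convention
    correct_pairs = 0
    positives_seen = 0
    for item in ranked_items:
        if item in relevant:
            positives_seen += 1
        else:
            # every relevant item seen so far is ranked above this one
            correct_pairs += positives_seen
    return correct_pairs / (n_pos * n_neg)

# Toy example (hypothetical item IDs, not taken from the dataset)
ranking = ["a", "b", "c", "d"]
visited = {"b", "d"}
rr_value = reciprocal_rank(ranking, visited)
ndcg_value = ndcg(ranking, visited)
auc_value = auc(ranking, visited)
```

MRR is then the mean of `reciprocal_rank` over all evaluated users or sessions.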

Off-line results showed a great variance in performance w.r.t. different metrics, with the Pareto front covering 64% of the approaches. Furthermore, we observed that on-line results are considerably affected by the seniority of users. On-line metrics correlate positively with ranking-based metrics (AUC, MRR, nDCG) for novice users, while too high values of novelty had a negative impact on the on-line results for them.
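To make the Pareto front statement concrete: a recommender variant is on the front if no other variant is at least as good on every metric and strictly better on at least one. The sketch below is a minimal illustration under the assumption that higher is better for all metrics; the variant names and scores are made up and are not results from the paper.

```python
def dominates(a, b):
    """True if score vector a is >= b everywhere and > b somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(scores):
    """Return the names of variants not dominated by any other variant."""
    return [
        name
        for name, vec in scores.items()
        if not any(dominates(other_vec, vec)
                   for other_name, other_vec in scores.items()
                   if other_name != name)
    ]

# Hypothetical (ranking metric, novelty) scores, for illustration only
toy_scores = {
    "variant_a": (0.9, 0.1),
    "variant_b": (0.5, 0.5),
    "variant_c": (0.4, 0.4),  # dominated by variant_b
    "variant_d": (0.1, 0.9),
}
front = pareto_front(toy_scores)
```

With many metrics that trade off against each other, a large share of variants can end up non-dominated, which is consistent with a front as wide as the one reported above.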

Novelty compared to the previous version

This paper contains numerous extensions compared to our preliminary work in this field (https://arxiv.org/abs/1809.03186). The main ones are redefined VRR and novelty metrics, a more thorough analysis of the results, including the effect of user profile size, an evaluation of regression algorithms that aim to learn on-line CTR and VRR results from the off-line metrics, and additional on-line evaluations.
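The idea of learning on-line results from off-line metrics can be sketched with the simplest possible model: a least-squares line fitted from one off-line metric to on-line CTR. This is only an illustration of the general approach; the regression algorithms evaluated in the paper are not specified in this README, and all numbers below are synthetic placeholders rather than measured results.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new off-line value."""
    slope, intercept = model
    return slope * x + intercept

# Synthetic placeholder data: off-line nDCG and on-line CTR of four
# hypothetical recommender variants (NOT values from the experiments)
offline_ndcg = [0.10, 0.20, 0.30, 0.40]
online_ctr = [0.010, 0.014, 0.018, 0.022]

model = fit_line(offline_ndcg, online_ctr)
predicted_ctr = predict(model, 0.25)
```

In practice, such a model would take several off-line metrics as inputs and would be validated on variants held out from training, which is the role of the second on-line experiment described above.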
