Merge remote-tracking branch 'origin/sir_elliot'
vitowalteranelli committed Jul 1, 2021
2 parents 3bfc068 + cff99b6 commit 57629b4
Showing 333 changed files with 1,019,121 additions and 4,974 deletions.
12 changes: 10 additions & 2 deletions .gitignore
@@ -20,8 +20,6 @@ pip-selfcheck.json

/results/
/data/*
!/data/cat_dbpedia_movielens_1m
!/data/movielens_1m
/share/
**/__pycache__/**
*.log
@@ -47,3 +45,13 @@ MANIFEST
docs/source/_build
docs/source/_static
docs/source/_templates


!/data/cat_dbpedia_movielens_1m/features.tsv
!/data/cat_dbpedia_movielens_1m/map.tsv
!/data/cat_dbpedia_movielens_1m/properties.conf
!/data/movielens_1m/i_pop.tsv
!/data/movielens_1m/u_pop.tsv
!/data/cat_dbpedia_movielens_1m_v030/features.tsv
!/data/cat_dbpedia_movielens_1m_v030/map.tsv
!/data/cat_dbpedia_movielens_1m_v030/properties.conf
26 changes: 26 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,32 @@

All notable changes to this project will be documented in this file.

## [v0.3.0] - 2021-06-30
### Changed
- early stopping strategies
- offline evaluation of recommendation files (ProxyRecommender, RecommendationFolder)
- negative sampling evaluation (see the configuration sketch after this list)
- improved Microsoft Windows compatibility
- binarization of explicit datasets
- automatic loading of implicit datasets
- multiple prefiltering
- managing side information with modular loaders
- alignment of side information with training data
- improved Documentation: Model creation, Side Information loading, Early Stopping, Negative Sampling
- added nDCG as formulated in Rendle's 2020 KDD paper
- visual loader with TensorFlow pipeline
- added and fixed visual recsys methods:
- DVBPR
- VBPR
- DeepStyle
- ACF
- VNPR
- added new recommender methods:
- MF (Rendle's 2020 RecSys reproducibility paper)
- EASER
- RP3beta
- iALS
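
As an illustration of the negative-sampling evaluation listed above, here is a minimal configuration sketch modeled on the recsys_config.yml added in this commit; the YAML nesting and the relative path for the negatives file are assumptions (the diff view flattens indentation, and the commit itself uses an absolute path):

```yaml
experiment:
  dataset: ncf_ml1m
  data_config:
    strategy: fixed
    train_path: ../data/{0}/ml-1m.train.rating   # {0} is replaced with the dataset name
    test_path: ../data/{0}/ml-1m.test.rating
  binarize: True
  negative_sampling:
    strategy: fixed                               # evaluate against a fixed file of sampled negatives
    files: [ "../data/ncf_ml1m/ml-1m.test.negative" ]  # assumed relative path
  top_k: 10
  evaluation:
    cutoffs: 10
    simple_metrics: [nDCG, Recall, HR]
```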

## [v0.2.1] - 2021-03-27
### Changed

2 changes: 1 addition & 1 deletion basic_configuration.md
@@ -17,6 +17,6 @@ by merely passing a list of possible hyperparameter values, e.g., neighbors: [50

The reported models are selected according to nDCG@10 (a hyperparameter-grid sketch is shown after the table below).

|To see the full configuration file please visit the following [link](config_files/basic_configuration.yml)|
|To see the full configuration file please visit the following [link](config_files/basic_configuration_v030.yml)|
|-------------------------------------------------------------------------------------------------------------|
|**To run the experiment use the following [script](sample_basic.py)**|
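
A minimal sketch of how such a hyperparameter grid is declared, based on the ItemKNN block of config_files/basic_configuration_v030.yml added in this commit; the indentation and the placement of validation_metric under meta are assumptions, since the diff view flattens the YAML:

```yaml
experiment:
  models:
    ItemKNN:
      meta:
        verbose: True
        save_recs: True
        validation_metric: nDCG@10   # models are selected according to nDCG@10
      neighbors: [50, 70, 100]       # every listed value is explored during model selection
      similarity: [cosine, euclidean]
      implementation: standard
```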
21 changes: 3 additions & 18 deletions config_files/advanced_configuration.yml
@@ -2,12 +2,11 @@ experiment:
dataset: movielens_1m
data_config:
strategy: dataset
dataset_path: ../data/movielens_1m/dataset.tsv
dataset_path: ../data/{0}/dataset.tsv
prefiltering:
strategy: iterative_k_core
core: 10
splitting:
save_folder: ../data/movielens_1m/splitting/
test_splitting:
strategy: random_subsampling
folds: 1
@@ -17,28 +16,14 @@ experiment:
folds: 5
top_k: 50
evaluation:
simple_metrics: [nDCG, ACLT, APLT, ARP, PopREO]
simple_metrics: [nDCG, ACLT, ARP]
complex_metrics:
- metric: UserMADrating
clustering_name: Happiness
clustering_file: ../data/movielens_1m/u_happy.tsv
- metric: ItemMADrating
clustering_name: ItemPopularity
clustering_file: ../data/movielens_1m/i_pop.tsv
- metric: REO
clustering_name: ItemPopularity
clustering_file: ../data/movielens_1m/i_pop.tsv
- metric: RSP
clustering_name: ItemPopularity
clustering_file: ../data/movielens_1m/i_pop.tsv
- metric: BiasDisparityBD
user_clustering_name: Happiness
user_clustering_file: ../data/movielens_1m/u_happy.tsv
item_clustering_name: ItemPopularity
item_clustering_file: ../data/movielens_1m/i_pop.tsv
relevance_threshold: 1
wilcoxon_test: True
gpu: 1
gpu: 0
models:
NeuMF:
meta:
8 changes: 2 additions & 6 deletions config_files/basic_configuration.yml
@@ -1,24 +1,20 @@
experiment:
version: 0.2.1
dataset: cat_dbpedia_movielens_1m
data_config:
strategy: dataset
dataloader: KnowledgeChainsLoader
dataset_path: ../data/cat_dbpedia_movielens_1m/dataset.tsv
dataset_path: ../data/{0}/dataset.tsv
side_information:
map: ../data/cat_dbpedia_movielens_1m/map.tsv
features: ../data/cat_dbpedia_movielens_1m/features.tsv
properties: ../data/cat_dbpedia_movielens_1m/properties.conf
# prefiltering:
# strategy: user_average # Not applied in the paper experiments
splitting:
save_on_disk: True
save_folder: ../data/cat_dbpedia_movielens_1m/splitting/
test_splitting:
strategy: temporal_hold_out
test_ratio: 0.2
validation_splitting:
strategy: temporal_hold_out
test_ratio: 0.2
top_k: 50
evaluation:
cutoffs: [10, 5]
56 changes: 56 additions & 0 deletions config_files/basic_configuration_v030.yml
@@ -0,0 +1,56 @@
experiment:
version: 0.3.0
dataset: cat_dbpedia_movielens_1m
data_config:
strategy: dataset
dataset_path: ../data/cat_dbpedia_movielens_1m/dataset.tsv
side_information:
- dataloader: ChainedKG
map: ../data/cat_dbpedia_movielens_1m/map.tsv
features: ../data/cat_dbpedia_movielens_1m/features.tsv
properties: ../data/cat_dbpedia_movielens_1m/properties.conf
# prefiltering:
# strategy: user_average # Not applied in the paper experiments
splitting:
save_on_disk: True
save_folder: ../data/cat_dbpedia_movielens_1m/splitting/
test_splitting:
strategy: temporal_hold_out
test_ratio: 0.2
validation_splitting:
strategy: temporal_hold_out
test_ratio: 0.2
top_k: 50
evaluation:
cutoffs: [10, 5]
simple_metrics: [nDCG,Precision,ItemCoverage,EPC,Gini]
relevance_threshold: 1
gpu: 1
external_models_path: ../external/models/__init__.py
models:
Random:
meta:
verbose: True
save_recs: True
seed: 42
external.MostPop:
meta:
verbose: True
save_recs: True
validation_metric: nDCG@10
ItemKNN:
meta:
verbose: True
save_recs: True
validation_metric: nDCG@10
neighbors: [50, 70, 100]
similarity: [cosine, euclidean]
implementation: standard
AttributeItemKNN:
meta:
verbose: True
save_recs: True
validation_metric: nDCG@10
loader: ChainedKG
neighbors: [50, 70, 100]
similarity: [braycurtis, manhattan]
133 changes: 133 additions & 0 deletions config_files/recsys_config.yml
@@ -0,0 +1,133 @@
experiment:
dataset: ncf_ml1m
data_config:
strategy: fixed
train_path: ../data/{0}/ml-1m.train.rating
test_path: ../data/{0}/ml-1m.test.rating
binarize: True
negative_sampling:
strategy: fixed
files: [ "/home/ironman/PycharmProjects/Elliot/data/ncf_ml1m/ml-1m.test.negative" ]
top_k: 10
evaluation:
cutoffs: 10
simple_metrics: [nDCG, Recall, HR, Precision, MAP, MRR]
gpu: 0
external_models_path: ../external/models/__init__.py
models:
# Random:
# meta:
# save_recs: True
# external.MostPop:
# meta:
# verbose: True
# save_recs: True
# external.RendleMF: # from original paper
# meta:
# hyper_max_evals: 1
# hyper_opt_alg: tpe
# validation_rate: 1
# verbose: True
# save_recs: True
# optimize_internal_loss: True
# epochs: 256 # 256 original paper but 50 comes from NeuMF paper
# factors: 192
# lr: 0.002
# reg: 0.005
# m: 8
# random_seed: 42
# external.iALS: #from TOIS
# meta:
# hyper_max_evals: 20
# hyper_opt_alg: tpe
# verbose: True
# save_recs: True
# validation_rate: 20
# epochs: [uniform, 1, 500]
# scaling: [linear, log]
# factors: [uniform, 1, 200]
# alpha: [uniform, 10e-3, 50]
# epsilon: [uniform, 10e-3, 10]
# reg: [uniform, 10e-3, 10e-2]
# external.NeuMF: #from the original paper + Rendle
# meta:
# hyper_max_evals: 1
# hyper_opt_alg: tpe
# verbose: True
# save_recs: True
# validation_rate: 1
# optimize_internal_loss: True
# mf_factors: 64
# dropout: 0
# is_mf_train: True
# is_mlp_train: True
# batch_size: 256
# epochs: 100
# lr: 0.001
# m: 4
# ItemKNN: #from TOIS
# meta:
# save_recs: True
# verbose: True
# hyper_max_evals: 20
# hyper_opt_alg: tpe
# neighbors: [uniform, 5, 1000]
# similarity: [cosine, jaccard, dice, mahalanobis, euclidean]
# UserKNN: #from TOIS
# meta:
# hyper_max_evals: 20
# hyper_opt_alg: tpe
# save_recs: True
# verbose: True
# neighbors: [ uniform, 5, 1000 ]
# similarity: [cosine, jaccard, dice, mahalanobis, euclidean]
# MultiVAE: # from original paper
# meta:
# hyper_max_evals: 20
# hyper_opt_alg: tpe
# save_recs: True
# verbose: True
# # optimize_internal_loss: True
# lr: [loguniform, -11.512925464970229, 0] # exploration taken from TOIS
# epochs: 200
# batch_size: [ 128, 256, 512 ]
# intermediate_dim: 600
# latent_dim: 200
# dropout_pkeep: 0.5
# reg_lambda: [loguniform, -11.512925464970229, 0] # exploration taken from TOIS
# Slim: #from TOIS
# meta:
# hyper_max_evals: 1
# hyper_opt_alg: tpe
# verbose: True
# save_recs: True
# l1_ratio: 0.0000119
# alpha: 0.0788
# neighborhood: 544
# external.iALS:
# meta:
# verbose: True
# save_recs: True
# factors: 46
# alpha: 50
# epsilon: 10
# reg: 0.00001
# scaling: log
# epochs: 20
external.EASER:
meta:
verbose: True
save_recs: True
hyper_max_evals: 1
hyper_opt_alg: tpe
l2_norm: 1320
external.RP3beta: #from TOIS
meta:
hyper_max_evals: 1
hyper_opt_alg: tpe
verbose: True
save_recs: True
neighborhood: 546
alpha: 1.0807
beta: 0.7029
normalize_similarity: True