fix README.md

sisinflab · Mar 5, 2021 · a59578e
1 parent 2f03233
commit a59578e

Showing 2 changed files with 23 additions and 105 deletions.
diff --git a/advanced_configuration.md b/advanced_configuration.md
@@ -1,108 +1,23 @@
-This configuration file [link] takes movielens dataset from a specific path, then Elliot performs an exhaustive iterative k-core for both user 
-and item with a minimum number of 10 interactions. Later, a splitting strategy with test and validation solutions is adopted. 
-The test is split with a random subsampling for 1 fold and with a ratio of 20% with respect to the amount of data. Instead, 
-the validation portion is computed in cross-validation with 5 folds. In this way, models will declare in the following section are 
-trained 5 times (once per each train-validation pair) to estimate the validation performance.
+## Advanced Configuration
 
-The next section of this configuration file is devoted to declaring the evaluation metrics and which cut-off Elliot has 
-to investigate to perform this evaluation step. The framework accepts both simple metrics (metrics that do not exploit 
-external files o configurations) and complex metrics (like metrics related to bias o fairness investigation). Note that 
-Elliot has a top_k parameter useful to produce recommendation lists with a specific number of relevant items, and for 
-the evaluation could have specific cut-offs.
+The second scenario depicts a more complex experimental setting. 
+In the configuration, the user specifies an elaborate data splitting strategy, i.e., random_subsampling (for test splitting) 
+and random_cross_validation (for model selection), by setting a few splitting configuration fields. 
 
-The third part of this YAML structured file declares explicitly which models Elliot has to train and evaluate. 
-This section is the most expressive one because each model could be equipped with a specific hyperparameter exploration strategy. 
-Specifically, this file shows how NeuMF and MultiVae adopt a Bayesian optimization exploration named TPE (Tree Parzen Estimator), 
-which extracts 5 different model configurations that exploit the space strategy adopted by different parameters in both models.[to be continued]
+The configuration does not provide a cut-off value, and thus a top-k field value of 50 is assumed as the cut-off. 
 
-To see the full configuration file please visit the following [link](config_files/advanced_configuration.yml).
-To run the experiment use the following [script](sample_advanced.py).
+Moreover, the evaluation section includes the UserMADrating metric.
 
+Elliot treats it as a complex metric because it requires additional arguments.
 
-```yaml
-experiment:
-  dataset: movielens_1m
-  data_config:
-    strategy: dataset
-    dataset_path: ../data/movielens_1m/dataset.tsv
-  prefiltering:
-    strategy: iterative_k_core
-    core: 10
-  splitting:
-    save_folder: ../data/movielens_1m/splitting/
-    test_splitting:
-        strategy: random_subsampling
-	    folds: 1
-        test_ratio: 0.2
-    validation_splitting:
-        strategy: random_cross_validation
-        folds: 5
-  top_k: 50
-  evaluation:
-    cutoff: 10
-    simple_metrics: [nDCG, ACLT, APLT, ARP, PopREO]
-    complex_metrics: 
-    - metric: UserMADrating
-      clustering_name: Happiness
-      clustering_file: ../data/movielens_1m/u_happy.tsv
-    - metric: ItemMADrating
-      clustering_name: ItemPopularity
-      clustering_file: ../data/movielens_1m/i_pop.tsv
-    - metric: REO
-      clustering_name: ItemPopularity
-      clustering_file: ../data/movielens_1m/i_pop.tsv
-    - metric: RSP
-      clustering_name: ItemPopularity
-      clustering_file: ../data/movielens_1m/i_pop.tsv
-    - metric: BiasDisparityBD
-      user_clustering_name: Happiness
-      user_clustering_file: ../data/movielens_1m/u_happy.tsv
-      item_clustering_name: ItemPopularity
-      item_clustering_file: ../data/movielens_1m/i_pop.tsv
-    relevance_threshold: 1
-  gpu: 1
-  models:
-    NeuMF:
-      meta:
-        hyper_max_evals: 5
-        hyper_opt_alg: tpe
-        validation_rate: 5
-      lr: [loguniform, -10, -1]
-      batch_size: [128, 256, 512]
-      epochs: 50
-      mf_factors: [quniform, 8, 32, 1]
-      mlp_factors: [8, 16]
-      mlp_hidden_size: [(32, 16, 8), (64, 32, 16)]
-      prob_keep_dropout: 0.2
-      is_mf_train: True
-      is_mlp_train: True
-    MultiVAE:
-      meta:
-        hyper_max_evals: 5
-        hyper_opt_alg: tpe
-        validation_rate: 5
-      lr: [0.0005, 0.001, 0.005, 0.01]
-      epochs: 50
-      batch_size: [128, 256, 512]
-      intermediate_dim: [300, 400, 500]
-      latent_dim: [100, 200, 300]
-      dropout_pkeep: 1
-      reg_lambda: [0.1, 0.0, 10]
-    BPRMF:
-      meta:
-        hyper_max_evals: 5
-        hyper_opt_alg: rand
-        validation_rate: 5
-      lr: [0.0005, 0.001, 0.005, 0.01]
-      batch_size: [128, 256, 512]
-      epochs: 50
-      embed_k: [10, 50, 100]
-      bias_regularization: 0
-      user_regularization: [0.0025, 0.005, 0.01]
-      positive_item_regularization: [0.0025, 0.005, 0.01]
-      negative_item_regularization: [0.00025, 0.0005, 0.001]
-      update_negative_item_factors: True
-      update_users: True
-      update_items: True
-      update_bias: True
-```
+The user also wants to apply a more advanced hyperparameter optimization. For instance, for NeuMF, 
+Bayesian optimization with the Tree-structured Parzen Estimator is requested (i.e., hyper_opt_alg: tpe), with log-uniform 
+sampling for the learning-rate search space.
+
+Moreover, Elliot supports complex neural architecture search spaces expressed as lists of tuples. For instance, 
+(32, 16, 8) indicates that the neural network consists of three hidden layers with 32, 16, and 8 units, respectively.
+
+
+|To see the full configuration file please visit the following [link](config_files/advanced_configuration.yml)|
+|-------------------------------------------------------------------------------------------------------------|
+|**To run the experiment use the following [script](sample_advanced.py)**|
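
The splitting setup described in advanced_configuration.md (random_subsampling for the test set, random_cross_validation for model selection) can be sketched in plain Python. This is only an illustration and not Elliot's implementation: the function names, the toy interaction list, and the seed are assumptions, while the 0.2 test ratio and the 5 folds are taken from the YAML example removed in this commit.

```python
import random


def random_subsampling(interactions, test_ratio, seed=42):
    """Hold out a random `test_ratio` share of the interactions as the test set."""
    rng = random.Random(seed)
    shuffled = list(interactions)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def random_cross_validation(train, folds, seed=42):
    """Build `folds` (train, validation) pairs from the training portion."""
    rng = random.Random(seed)
    shuffled = list(train)
    rng.shuffle(shuffled)
    parts = [shuffled[i::folds] for i in range(folds)]  # disjoint slices
    pairs = []
    for i in range(folds):
        validation = parts[i]
        fold_train = [x for j, part in enumerate(parts) if j != i for x in part]
        pairs.append((fold_train, validation))
    return pairs


# Hypothetical toy data: 100 (user, item) pairs.
interactions = [(u, i) for u in range(20) for i in range(5)]
train, test = random_subsampling(interactions, test_ratio=0.2)
pairs = random_cross_validation(train, folds=5)
```

Under this sketch each model would be trained once per train-validation pair, i.e. five times, before being scored on the held-out test set.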
diff --git a/basic_configuration.md b/basic_configuration.md
@@ -1,3 +1,5 @@
+## Basic Configuration
+
 In the first scenario, the experiments require comparing a group of RSs whose parameters are optimized via a grid-search. 
 
 The configuration specifies the data loading information, i.e., semantic features source files, in addition to the filtering and splitting strategies. 
@@ -15,5 +17,6 @@ by merely passing a list of possible hyperparameter values, e.g., neighbors: [50
 
 The reported models are selected according to nDCG@10.
 
-To see the full configuration file please visit the following [link](config_files/basic_configuration.yml).
-To run the experiment use the following [script](sample_basic.py).
+|To see the full configuration file please visit the following [link](config_files/basic_configuration.yml)|
+|-------------------------------------------------------------------------------------------------------------|
+|**To run the experiment use the following [script](sample_basic.py)**|
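
The grid search mentioned in basic_configuration.md, where a hyperparameter is given as a list of candidate values, amounts to enumerating the Cartesian product of those lists. A minimal sketch, assuming a hypothetical `expand_grid` helper and made-up parameter values (neither is taken from Elliot or from the configuration file):

```python
from itertools import product


def expand_grid(space):
    """Expand a search space into every concrete configuration.

    List values are treated as candidates; scalar values are kept fixed.
    """
    keys = list(space)
    candidates = [v if isinstance(v, list) else [v] for v in space.values()]
    return [dict(zip(keys, combo)) for combo in product(*candidates)]


# Hypothetical search space; names and values are illustrative only.
space = {
    "neighbors": [10, 20],
    "similarity": ["cosine", "jaccard"],
    "implementation": "classic",
}
configs = expand_grid(space)
```

Here `configs` holds four dictionaries, one per combination of `neighbors` and `similarity`, each of which would be trained and compared on the validation metric.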