change to numpy
weiran-huang committed Jun 6, 2022
1 parent cf91563 commit 2f0f231
Showing 9 changed files with 150 additions and 191 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@

.DS_Store
157 changes: 83 additions & 74 deletions README.md
100644 → 100755
@@ -1,73 +1,68 @@
# Contents

- [LDP LinUCB Description](#ldp-linucb-description)
- [Model Architecture](#model-architecture)
- [Dataset](#dataset)
- [Environment Requirements](#environment-requirements)
- [Script Description](#script-description)
- [Script and Sample Code](#script-and-sample-code)
- [Script Parameters](#script-parameters)
- [Launch](#launch)
- [Model Description](#model-description)
- [Performance](#performance)
- [Description of Random Situation](#description-of-random-situation)
- [ModelZoo Homepage](#modelzoo-homepage)

# [LDP LinUCB Description](#contents)

[![Platform](https://img.shields.io/badge/platform-mindspore-blue)](https://www.mindspore.cn/install/en)
# LDP LinUCB

[![Platform](https://img.shields.io/badge/platform-numpy-blue)](https://numpy.org/install)
[![Top Language](https://img.shields.io/github/languages/top/huang-research-group/LDPbandit2020)](https://github.com/huang-research-group/LDPbandit2020/search?l=python)
[![Latest Release](https://img.shields.io/github/v/release/huang-research-group/LDPbandit2020)](https://github.com/huang-research-group/LDPbandit2020/releases)

Locally Differentially Private (LDP) LinUCB is a variant of the LinUCB bandit algorithm with a local differential privacy guarantee, which protects users' personal data with a theoretical guarantee.

[Paper](https://arxiv.org/abs/2006.00701): Kai Zheng, Tianle Cai, [Weiran Huang](https://www.weiranhuang.com), Zhenguo Li, Liwei Wang. "Locally Differentially Private (Contextual) Bandits Learning." *Advances in Neural Information Processing Systems*. 2020.
## Description

Locally Differentially Private (LDP) LinUCB is a variant of the LinUCB bandit algorithm with a local differential privacy guarantee, which protects users' personal data with a theoretical guarantee.

# [Model Architecture](#contents)
The server interacts with users in rounds. For each arriving user, the server first transfers the current model parameters to the user. On the user side, the model chooses an action to play based on the user feature (e.g., chooses a movie to recommend), and observes a reward (or loss) value from the user (e.g., the rating of the movie). The data to be transferred is then perturbed by adding Gaussian noise. Finally, the server receives the perturbed data and updates the model. Details can be found in the [paper](https://arxiv.org/abs/2006.00701).

The server interacts with users in rounds. For each arriving user, the server first transfers the current model parameters to the user. On the user side, the model chooses an action to play based on the user feature (e.g., chooses a movie to recommend), and observes a reward (or loss) value from the user (e.g., the rating of the movie). The data to be transferred is then perturbed by adding Gaussian noise. Finally, the server receives the perturbed data and updates the model. Details can be found in the [original paper](https://arxiv.org/abs/2006.00701).
Paper: Kai Zheng, Tianle Cai, [Weiran Huang](https://www.weiranhuang.com), Zhenguo Li, Liwei Wang, "[Locally Differentially Private (Contextual) Bandits Learning](https://arxiv.org/abs/2006.00701)", *Advances in Neural Information Processing Systems*, 2020.
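
The round described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the repository's `src/linucb.py`: the function name `ldp_linucb_round`, the per-arm `contexts` matrix, and the symmetrized Gaussian noise are hypothetical choices, and the sketch omits any safeguard that keeps the noisy matrix well conditioned.

```python
import numpy as np


def ldp_linucb_round(V, u, contexts, rewards, alpha, sigma, rng):
    """One user-side round (hypothetical sketch, not the repo's code).

    V, u      : model statistics received from the server (d x d, d)
    contexts  : one feature vector per arm, shape (num_arms, d)
    rewards   : reward of each arm for the current user, shape (num_arms,)
    alpha     : exploration weight of the UCB bonus
    sigma     : standard deviation of the Gaussian perturbation
    """
    # Ridge-style estimate of theta from the (noisy) server statistics.
    theta = np.linalg.solve(V, u)
    V_inv = np.linalg.inv(V)

    # UCB score: estimated reward plus exploration bonus sqrt(x^T V^-1 x).
    bonus = np.sqrt(np.einsum("ij,jk,ik->i", contexts, V_inv, contexts))
    arm = int(np.argmax(contexts @ theta + alpha * bonus))
    x, r = contexts[arm], rewards[arm]

    # Perturb the statistics before they leave the user's device.
    d = x.shape[0]
    noise = rng.normal(0.0, sigma, size=(d, d))
    noise = (noise + noise.T) / np.sqrt(2.0)  # keep the update symmetric
    noisy_outer = np.outer(x, x) + noise
    noisy_xr = x * r + rng.normal(0.0, sigma, size=d)
    return arm, noisy_outer, noisy_xr
```

On the server side, the returned noisy statistics would simply be accumulated, e.g. `V += noisy_outer` and `u += noisy_xr`, before the next user arrives.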

# [Dataset](#contents)
Note: An earlier MindSpore-based version can be found in [MindSpore Models (Gitee)](https://gitee.com/mindspore/models/tree/master/research/rl/ldp_linucb) or [v1.0.0](https://github.com/huang-research-group/LDPbandit2020/tree/v1.0.0).

Note that you can run the scripts on the dataset mentioned in the original paper. The following sections describe how to run the scripts using the dataset below.
## Dataset

Dataset used: [MovieLens 100K](https://grouplens.org/datasets/movielens/100k/)
Dataset used: [MovieLens 100K](https://grouplens.org/datasets/movielens/100k/) ([download](https://files.grouplens.org/datasets/movielens/ml-100k.zip))

- Dataset size: 5 MB, 100,000 ratings (1-5) from 943 users on 1682 movies.
- Data format: CSV/TXT files

# [Environment Requirements](#contents)
We process the dataset with `src/dataset.py`:
We first pick out all the users having at least one rating score.
Then SVD is applied to fill in the missing ratings, yielding a full rating table.
Finally, we normalize all the ratings to [-1,1].
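
The three preprocessing steps can be sketched as follows. This is an illustration only: the helper name `complete_and_normalize` is hypothetical, and the min-max rescaling is an assumption, since the exact normalization used in `src/dataset.py` is not shown here.

```python
import numpy as np


def complete_and_normalize(ratings, rank_k):
    """Hypothetical sketch of the preprocessing described above.

    ratings : (num_users, num_movies) array, 0 meaning "no rating"
    rank_k  : number of singular values kept for the low-rank completion
    """
    # Step 1: keep only users with at least one rating.
    ratings = ratings[ratings.sum(axis=1) > 0]

    # Step 2: a rank-k truncated SVD fills in the missing entries.
    U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
    approx = (U[:, :rank_k] * s[:rank_k]) @ Vt[:rank_k]

    # Step 3: rescale to [-1, 1] (min-max scaling is an assumption).
    lo, hi = approx.min(), approx.max()
    normalized = 2.0 * (approx - lo) / (hi - lo) - 1.0

    # The left singular vectors can serve as user features.
    return U[:, :rank_k], normalized


toy = np.array([[5.0, 0.0, 1.0],
                [0.0, 0.0, 0.0],   # user without ratings, dropped in step 1
                [4.0, 2.0, 0.0]])
features, full_ratings = complete_and_normalize(toy, rank_k=2)
```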

- Hardware (Ascend/GPU)
- Prepare hardware environment with Ascend or GPU processor.
- Framework
- [MindSpore](https://www.mindspore.cn/install/en)
- For more information, please check the resources below:
- [MindSpore Tutorials](https://www.mindspore.cn/tutorials/en/master/index.html)
- [MindSpore Python API](https://www.mindspore.cn/docs/api/en/master/index.html)

# [Script Description](#contents)
## Installation

Unzip the MovieLens dataset and place `ua.base` in the code directory.
Then run the following commands:

```bash
python -m venv venv # create a virtual environment named venv
source venv/bin/activate # activate the environment
pip install -r requirements.txt # install the dependencies
```

Code is tested in the following environment:
- numpy==1.21.6
- matplotlib==3.5.2

## [Script and Sample Code](#contents)

## Script and Sample Code

```console
├── model_zoo
    ├── README.md                           // descriptions about all the models
    ├── research
        ├── rl
            ├── ldp_linucb
                ├── README.md               // descriptions about LDP LinUCB
                ├── scripts
                │   ├── run_train_eval.sh   // shell script for running on Ascend
                ├── src
                │   ├── dataset.py          // dataset for movielens
                │   ├── linucb.py           // model
                ├── train_eval.py           // training script
                ├── result1.png             // experimental result
                ├── result2.png             // experimental result
├── LDPbandit2020
    ├── ua.base                     // downloaded data file
    ├── README.md                   // descriptions about the repo
    ├── requirements.txt            // dependencies
    ├── scripts
    │   ├── run_train_eval.sh       // shell script for training and evaluation
    ├── src
    │   ├── dataset.py              // dataset processing for movielens
    │   ├── linucb.py               // model
    ├── train_eval.py               // training and evaluation script
    ├── result1.png                 // experimental result
    ├── result2.png                 // experimental result
```

## [Script Parameters](#contents)

## Script Parameters

- Parameters for preparing MovieLens 100K dataset

@@ -85,52 +80,66 @@ Dataset used: [MovieLens 100K](https://grouplens.org/datasets/movielens/100k/)
'iter_num': 1e6 # number of iterations
```

## [Launch](#contents)

- running on Ascend
## Usage

```shell
python train_eval.py > result.log 2>&1 &
```bash
python train_eval.py --epsilon=8e5 --delta=1e-1 --alpha=1e-1
```
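
The flags in the command above take values in scientific notation. A minimal sketch of how such flags could be parsed is given below; it is an assumption for illustration, not the actual argument handling in `train_eval.py`, and the defaults are simply copied from `scripts/run_train_eval.sh`.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical parser for the flags used in the command above."""
    parser = argparse.ArgumentParser(description="LDP LinUCB (sketch)")
    parser.add_argument("--epsilon", type=float, default=8e5,
                        help="privacy parameter epsilon")
    parser.add_argument("--delta", type=float, default=1e-1,
                        help="privacy parameter delta")
    parser.add_argument("--alpha", type=float, default=1e-1,
                        help="exploration weight")
    # int("1e6") raises ValueError, so go through float first.
    parser.add_argument("--iter_num", type=lambda s: int(float(s)),
                        default=int(1e6), help="number of iterations")
    return parser.parse_args(argv)


args = parse_args(["--epsilon=8e5", "--delta=1e-1", "--alpha=1e-1", "--iter_num=1e6"])
```

Routing `--iter_num` through `float` first is the design choice worth noting: it lets the script accept `1e6` while still yielding an integer loop bound.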

The Python command above runs in the background; you can view the results in the file `result.log`.

The regret values are reported as follows:

```console
--> Step: 0, diff: 348.662, current_regret: 0.000, cumulative regret: 0.000
--> Step: 1, diff: 338.457, current_regret: 0.000, cumulative regret: 0.000
--> Step: 2, diff: 336.465, current_regret: 2.000, cumulative regret: 2.000
--> Step: 3, diff: 327.337, current_regret: 0.000, cumulative regret: 2.000
--> Step: 4, diff: 325.039, current_regret: 2.000, cumulative regret: 4.000
--> Step: 0, diff: 350.346, current regret: 0.000, cumulative regret: 0.000
--> Step: 1, diff: 344.916, current regret: 0.400, cumulative regret: 0.400
--> Step: 2, diff: 340.463, current regret: 0.000, cumulative regret: 0.400
--> Step: 3, diff: 344.849, current regret: 0.800, cumulative regret: 1.200
--> Step: 4, diff: 337.587, current regret: 0.000, cumulative regret: 1.200
...
--> Step: 999997, diff: 54.873, current regret: 0.000, cumulative regret: 962.400
--> Step: 999998, diff: 54.873, current regret: 0.000, cumulative regret: 962.400
--> Step: 999999, diff: 54.873, current regret: 0.000, cumulative regret: 962.400
Regret: 962.3999795913696, cost time: 562.508s
Theta: [64.96814 26.639004 21.260265 19.860786 18.405128 16.73249 15.778397 14.784237 13.298004 12.329174 12.149574 11.159462 10.170071 9.662151 8.269745 7.794155 7.3355427 7.3690567 5.790653 3.9999294]
Ground-truth theta: [88.59274289 28.4110571 22.59921103 21.77239171 20.3727694 19.27781873 17.40422888 16.8321811 15.52599173 14.62141299 14.21670515 12.55781785 11.29962158 10.97902155 10.32499178 9.33040444 8.88399318 8.28387461 6.86420729 4.47880342]
```

# [Model Description](#contents)

The [original paper](https://arxiv.org/abs/2006.00701) assumes that the norm of user features is bounded by 1 and the norm of rating scores is bounded by 2. For the MovieLens dataset, we normalize rating scores to [-1,1]. Thus, we set `sigma` in Algorithm 5 to be $4/\epsilon \times \sqrt{2 \times \ln(1.25/\delta)}$.
## Algorithm Modification

The [original paper](https://arxiv.org/abs/2006.00701) assumes that the norm of user features is bounded by 1 and the norm of rating scores is bounded by 2. For the MovieLens dataset, we normalize rating scores to [-1,1]. Thus, we set `sigma` in Algorithm 5 to $4/\epsilon \cdot \sqrt{2 \ln(1.25/\delta)}$.
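
As a quick check of the formula above, the noise scale can be computed directly. The helper name `gaussian_sigma` and the `sensitivity` argument are hypothetical; only the constant 4 and the expression itself come from the text.

```python
import math


def gaussian_sigma(epsilon, delta, sensitivity=4.0):
    """sigma = sensitivity / epsilon * sqrt(2 * ln(1.25 / delta))."""
    return sensitivity / epsilon * math.sqrt(2.0 * math.log(1.25 / delta))


# Values taken from the usage example: epsilon=8e5, delta=1e-1.
sigma = gaussian_sigma(epsilon=8e5, delta=1e-1)
```

A smaller `epsilon` or `delta` (stronger privacy) yields a larger `sigma`, i.e. more noise on the transferred data.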

## [Performance](#contents)

## Performance

The performance for different privacy parameters:

- x: number of iterations
- y: cumulative regret
- X axis: number of iterations
- Y axis: cumulative regret

![Result1](result1.png)

The performance compared with the optimal non-private regret O(sqrt(T)):
The performance compared with the optimal non-private regret $O(\sqrt{T})$:

- x: number of iterations
- y: cumulative regret divided by sqrt(T)
- X axis: number of iterations
- Y axis: cumulative regret divided by $\sqrt{T}$

![Result2](result2.png)
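
The quantity on the Y axis of the second plot can be derived from the per-round regret as sketched below. The helper name `normalized_regret` is hypothetical, and counting rounds from $T=1$ is an assumption; the five input values are the per-round regrets from the first log lines shown in the usage section.

```python
import numpy as np


def normalized_regret(per_round_regret):
    """Return the cumulative regret and the cumulative regret divided by sqrt(T)."""
    per_round_regret = np.asarray(per_round_regret, dtype=float)
    cumulative = np.cumsum(per_round_regret)
    rounds = np.arange(1, len(per_round_regret) + 1)  # T = 1, 2, ...
    return cumulative, cumulative / np.sqrt(rounds)


cumulative, ratio = normalized_regret([0.0, 0.4, 0.0, 0.8, 0.0])
```

If the cumulative regret grows like $\sqrt{T}$, this ratio flattens out as $T$ increases.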

# [Description of Random Situation](#contents)
It can be seen that our privacy-preserving performance is close to the optimal non-private performance.


In `train_eval.py`, we randomly sample a user at each round. We also add Gaussian noise to the data being transferred.
## Citation

# [ModelZoo Homepage](#contents)
If you find our work useful in your research, please consider citing:

Please check the official
[homepage](https://gitee.com/mindspore/models).
```
@article{zheng2020locally,
title={Locally differentially private (contextual) bandits learning},
author={Zheng, Kai and Cai, Tianle and Huang, Weiran and Li, Zhenguo and Wang, Liwei},
journal={Advances in Neural Information Processing Systems},
volume={33},
pages={12300--12310},
year={2020}
}
```
Empty file modified requirements.txt
100644 → 100755
Empty file.
Empty file modified result1.png
100644 → 100755
Empty file modified result2.png
100644 → 100755
30 changes: 15 additions & 15 deletions scripts/run_train_eval.sh
100644 → 100755
@@ -1,17 +1,17 @@
#!/bin/bash
# Copyright 2020 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================

python3 train_eval.py > result.log 2>&1 &
# parameters
epsilon=8e5
delta=1e-1
alpha=1e-1
iter_num=1e6
timestamp=$(date '+%s')

# preparation: run from the repository root and make sure ./results exists
SHELL_FOLDER=$(cd "$(dirname "$0")" && pwd)
cd "${SHELL_FOLDER}/../" || exit 1
mkdir -p ./results

# run training/evaluation and keep a copy of the output under ./results
python train_eval.py --epsilon=${epsilon} --delta=${delta} --alpha=${alpha} --iter_num=${iter_num} | tee "./results/${timestamp}_epsilon_${epsilon}_delta_${delta}_alpha_${alpha}_iter_num_${iter_num}.txt"
22 changes: 4 additions & 18 deletions src/dataset.py
100644 → 100755
@@ -1,24 +1,10 @@
# Copyright 2020 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""
MovieLens Environment.
Datasets can be downloaded at https://files.grouplens.org/datasets/movielens/ml-100k.zip
"""

import random
import numpy as np
from mindspore import Tensor

_MAX_NUM_ACTIONS = 1682
_NUM_USERS = 943
@@ -58,7 +44,7 @@ def __init__(self, data_file, num_movies, rank_k):
self._data_matrix = load_movielens_data(data_file)
# Keep only the first items
self._data_matrix = self._data_matrix[:, :num_movies]
# Filter the users with at least one rating score
# Pick out the users with at least one rating score
nonzero_users = list(
np.nonzero(
np.sum(
@@ -92,8 +78,8 @@ def observation(self):
"""Randomly select a user and return its feature."""
sampled_user = random.randint(0, self._data_matrix.shape[0] - 1)
self._current_user = sampled_user
return Tensor(self._feature[sampled_user])
return self._feature[sampled_user]

def current_rewards(self):
"""rewards for current user."""
return Tensor(self._approx_ratings_matrix[self._current_user])
return self._approx_ratings_matrix[self._current_user]