APPCorp

This repository contains the corpus and code for APPCorp.

APPCorp: A Corpus for Android Privacy Policy Document Structure Analysis

Shuang Liu, Fan Zhang, Baiyang Zhao, Renjie Guo, Tao Chen and Meishan Zhang

Introduction

APPCorp is a manually labelled corpus of 231 privacy policies (more than 566K words and 7,748 annotated paragraphs). We benchmark the corpus with three document classification models, namely Support Vector Machine (SVM), Hierarchical Attention Network (HAN) and Hierarchical Graph Attention Network (HGAT), and with two word representations, GloVe and BERT.

Requirements

pip install -r requirements.txt

Quick Start

# SVM
python train-svm.py --fold 9

# GloVe
## HAN
python train.py --config_file glove.cfg --emb glove --gpu 0 --fold 9

## HGAT
python train.py --config_file glove.cfg --emb glove --use_graph --gpu 0 --fold 9

# BERT
## HAN
python train.py --config_file bert.cfg --emb bert --gpu 0 --fold 9

## HGAT
python train.py --config_file bert.cfg --emb bert --use_graph --gpu 0 --fold 9
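The five commands above differ only in the script name, the embedding, and whether `--use_graph` is passed. As a minimal sketch, they can be generated programmatically, for example to launch every benchmark configuration from one driver script. The helper names `build_train_command` and `all_commands` are hypothetical and not part of the repository. The sketch assumes the config file is always named `<emb>.cfg`, as in the commands above, and uses only the flags shown in this README.

```python
def build_train_command(emb, use_graph=False, gpu=0, fold=9):
    """Build the argument list for one train.py run (HAN, or HGAT if use_graph).

    Hypothetical helper: it only reproduces the flags listed in Quick Start.
    """
    if emb not in ("glove", "bert"):
        raise ValueError(f"unsupported embedding: {emb!r}")
    # Assumption: the config file is named after the embedding (glove.cfg, bert.cfg).
    cmd = ["python", "train.py", "--config_file", f"{emb}.cfg", "--emb", emb]
    if use_graph:
        # Passing --use_graph selects HGAT; omitting it selects HAN.
        cmd.append("--use_graph")
    cmd += ["--gpu", str(gpu), "--fold", str(fold)]
    return cmd


def all_commands(gpu=0, fold=9):
    """Return the SVM command followed by the four HAN/HGAT commands."""
    commands = [["python", "train-svm.py", "--fold", str(fold)]]
    for emb in ("glove", "bert"):
        for use_graph in (False, True):
            commands.append(build_train_command(emb, use_graph, gpu, fold))
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for cmd in all_commands():
        print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run` directly if the runs should be launched from Python rather than copied into a shell.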

Classification Models

Download the classification models from Baidu Cloud (extraction code: siin).

Citation

@article{liu2021appcorp,
  title={APPCorp: A Corpus for Android Privacy Policy Document Structure Analysis},
  author={Shuang Liu and Fan Zhang and Baiyang Zhao and Renjie Guo and Tao Chen and Meishan Zhang}
}
