Getting started

Install the packages required for training

pip install datasets transformers tokenizers

Also install PyTorch by following the official installation instructions for your platform.

Install the utility packages

pip install pandas tqdm github

How to use

Preprocessing

data/preprocessing.ipynb

Training and masking testing

train_issue_bert.ipynb
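
The notebook's exact masking procedure is not described in this README. As a rough illustration of what a masking test generally involves, the sketch below hides one word of an issue title behind a `[MASK]` placeholder and keeps the hidden word as the expected answer. The function name `mask_word` and the whitespace-based splitting are assumptions made for this example; a real test would use the model's own tokenizer and mask token.

```python
def mask_word(text, index, mask_token="[MASK]"):
    """Replace the word at `index` with `mask_token`.

    Returns the masked text and the word that was hidden.
    Splitting on whitespace is a simplification for illustration only.
    """
    words = text.split()
    hidden = words[index]
    words[index] = mask_token
    return " ".join(words), hidden


# Title taken from the train data sample in this README.
masked, answer = mask_word("Bump boto3 from 1.12.36 to 1.12.38", 1)
print(masked)
print(answer)
```

A trained model would then be asked to fill in the placeholder, and its prediction compared with the hidden word.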

Data Format

Project data

id	name	language	total	issues	forked
1334	rails	Ruby	37188	12833	24355
1018	node	NaN	11155	7211	3944
4095	elasticsearch	Java	10157	6587	3570
340	netty	Java	9313	4647	4666
6815	vagrant	Ruby	9018	6618	2400

This table was used to filter project names.
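
The filtering criterion itself is not stated in this README. The following is a minimal sketch of how project names could be selected from a tab-separated table like the one above, assuming a hypothetical threshold on the `issues` column. The function name `select_project_names` and the `min_issues` parameter are illustrative, and the example uses only the Python standard library so it runs without extra packages.

```python
import csv
import io

# Sample rows copied from the project data table above (tab-separated).
PROJECT_TSV = (
    "id\tname\tlanguage\ttotal\tissues\tforked\n"
    "1334\trails\tRuby\t37188\t12833\t24355\n"
    "1018\tnode\tNaN\t11155\t7211\t3944\n"
    "4095\telasticsearch\tJava\t10157\t6587\t3570\n"
    "340\tnetty\tJava\t9313\t4647\t4666\n"
    "6815\tvagrant\tRuby\t9018\t6618\t2400\n"
)


def select_project_names(tsv_text, min_issues):
    """Return the names of projects with at least `min_issues` issues.

    The threshold is a hypothetical criterion chosen for illustration.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["name"] for row in reader if int(row["issues"]) >= min_issues]


names = select_project_names(PROJECT_TSV, min_issues=6000)
print(names)
```

With the sample rows and a threshold of 6000, `netty` (4647 issues) is excluded and the other four projects are kept.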

Train data

body	id	title
Bumps [rfc3986](https://github.com/python-hype...	596220186	Bump rfc3986 from 1.3.2 to 1.4.0
Bumps [boto3](https://github.com/boto/boto3...	596136922	Bump boto3 from 1.12.36 to 1.12.38
Bumps [botocore](https://github.com/boto/botoc...	596133846	Bump botocore from 1.15.36 to 1.15.38
Translations update from [Weblate](https://hos...	596128315	Translations update from Weblate
this will help to ensure that new projects are...	596064726	add rel="nofollow" to trending/latest on index...

We extracted this data from our issue database using the project names selected from the project data.
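
How the `title` and `body` columns are fed to the model is not specified here. One possible approach, shown as a sketch under that assumption, is to join each issue's title and body into a single text string. The function name `to_training_texts` and the newline separator are hypothetical choices, and the sample bodies are kept truncated exactly as they appear in the table above.

```python
import csv
import io

# Two sample rows copied from the train data table above (tab-separated).
TRAIN_TSV = (
    "body\tid\ttitle\n"
    "Bumps [boto3](https://github.com/boto/boto3...\t596136922\t"
    "Bump boto3 from 1.12.36 to 1.12.38\n"
    "Translations update from [Weblate](https://hos...\t596128315\t"
    "Translations update from Weblate\n"
)


def to_training_texts(tsv_text):
    """Join each row's title and body into one string (illustrative only)."""
    # QUOTE_NONE keeps quote characters inside issue bodies as plain text.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return [f"{row['title']}\n{row['body']}" for row in reader]


texts = to_training_texts(TRAIN_TSV)
print(len(texts))
print(texts[0])
```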
