tokenizer: the new wi17k_base tokenizer using the wikitext
Hk669 committed Jun 4, 2024
1 parent a8a5d66 commit 4f713b6
Showing 2 changed files with 34,387 additions and 1 deletion.
6 changes: 5 additions & 1 deletion .gitignore
@@ -153,4 +153,8 @@ logs
.DS_Store

output/
-*.pkl
+*.pkl
+
+# Ignore training files
+wk100k_base.py
+datasets/