edgar-10k-mda

These are the scripts I used to download Form 10-K filings from the EDGAR database and extract the MDA (Management's Discussion and Analysis) section from them.

Workflow

Download the index files of Form 10-K filings. The raw index files are saved to './data/index', and an aggregated index file is written to './year2016-2016.10k.index'.

python formindex.py --year_start 2016 --year_end 2016 --index_dir ./data/index --out_file ./year2016-2016.10k.index
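To illustrate what this step involves, here is a minimal sketch of filtering an EDGAR form index for 10-K rows. It is not the code in formindex.py. It assumes the index follows the layout of EDGAR's quarterly form.idx files (form type, company name, CIK, filing date, file name, separated by runs of spaces), and the helper names `index_url` and `parse_form_index` are hypothetical.

```python
def index_url(year, quarter):
    # Assumed location of EDGAR's quarterly form index (hypothetical helper).
    return "https://www.sec.gov/Archives/edgar/full-index/{}/QTR{}/form.idx".format(
        year, quarter
    )


def parse_form_index(text, form_type="10-K"):
    """Return one dict per row of `text` whose form type is exactly `form_type`."""
    records = []
    for line in text.splitlines():
        # CIK, filing date and file name contain no spaces, so split from the right.
        parts = line.rstrip().rsplit(None, 3)
        if len(parts) != 4:
            continue  # blank lines, separator rows, short header lines
        head, cik, date_filed, file_name = parts
        # Require a space after the form type so "10-K/A" is not matched.
        if not head.startswith(form_type + " ") or not cik.isdigit():
            continue
        records.append({
            "form_type": form_type,
            "company": head[len(form_type):].strip(),
            "cik": cik,
            "date_filed": date_filed,
            "file_name": file_name,
        })
    return records
```

Splitting from the right avoids depending on exact column offsets, at the cost of assuming the last three fields never contain whitespace.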

Download the Form 10-K filings listed in the previously generated index file and save them to the text directory './data/txt'.

python form10k.py --index_path ./year2016-2016.10k.index --txt_dir ./data/txt
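As a rough sketch of the download step (not the code in form10k.py), the snippet below fetches one filing given the file name from an index row. It assumes filings are served under https://www.sec.gov/Archives/ and that the server expects a descriptive User-Agent header. The function names and the local naming scheme are hypothetical.

```python
import os
import urllib.request

ARCHIVES_ROOT = "https://www.sec.gov/Archives/"  # assumed base URL


def filing_url(file_name):
    # file_name looks like "edgar/data/<cik>/<accession>.txt" in the index.
    return ARCHIVES_ROOT + file_name.lstrip("/")


def local_path(txt_dir, file_name):
    # Hypothetical naming scheme: "<cik>_<accession>.txt" inside txt_dir.
    parts = file_name.strip("/").split("/")
    return os.path.join(txt_dir, "_".join(parts[-2:]))


def download_filing(file_name, txt_dir, user_agent):
    """Download one filing unless it is already on disk; return its local path."""
    dest = local_path(txt_dir, file_name)
    if os.path.exists(dest):
        return dest  # skip files fetched on an earlier run
    os.makedirs(txt_dir, exist_ok=True)
    request = urllib.request.Request(
        filing_url(file_name), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request) as response, open(dest, "wb") as handle:
        handle.write(response.read())
    return dest
```

Skipping files that already exist makes the step safe to rerun after an interrupted download.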

Parse the MDA section of each downloaded filing and save it to the MDA directory './data/mda'.

python mdaparser.py --txt_dir ./data/txt --mda_dir ./data/mda
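One common way to locate the MDA section is to look for the "Item 7" heading and cut at the next "Item 7A" or "Item 8" heading. The sketch below illustrates that idea. It is an assumption about how such a parser could work, not a description of mdaparser.py, and `extract_mda` is a hypothetical name.

```python
import re

# Headings are assumed to start a line, e.g. "Item 7. Management's Discussion ...".
START = re.compile(r"^\s*item\s+7\s*[.:\-]?\s*management", re.IGNORECASE | re.MULTILINE)
END = re.compile(r"^\s*item\s+(7a|8)\s*[.:\-]?", re.IGNORECASE | re.MULTILINE)


def extract_mda(text):
    """Return the longest span from an Item 7 heading to the next Item 7A/8 heading."""
    best = ""
    for start in START.finditer(text):
        end = END.search(text, start.end())
        if end is None:
            continue
        candidate = text[start.start():end.start()].strip()
        # The table of contents also lists "Item 7", but that span is short,
        # so keeping the longest candidate favors the real section.
        if len(candidate) > len(best):
            best = candidate
    return best
```

Filings vary widely in formatting, so a heuristic like this will miss some documents. An empty return value signals that no section was found.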

Requirements: Python 3.5. Install the dependencies with:

pip install -r requirements.txt
