Mining EDGAR #6

Open
ebolyen opened this issue Feb 19, 2021 · 3 comments


ebolyen commented Feb 19, 2021

EDGAR maintains indices of all SEC filings, which you can find documentation for here:
https://www.sec.gov/edgar/searchedgar/accessing-edgar-data.htm
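
As a rough sketch of where those indices live, assuming the quarterly `full-index` layout described in the linked documentation (`Archives/edgar/full-index/<year>/QTR<n>/<kind>.idx`), a URL for a given filing date could be built like this. The function names are made up for illustration and nothing here downloads anything:

```python
from datetime import date

# Assumption: quarterly indices sit under this prefix, per the linked docs.
FULL_INDEX = "https://www.sec.gov/Archives/edgar/full-index"


def quarterly_index_url(year: int, quarter: int, kind: str = "form") -> str:
    """Build the URL of one quarterly index file (hypothetical helper)."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"quarter must be 1-4, got {quarter}")
    if kind not in ("company", "form", "master"):
        raise ValueError(f"unexpected index kind: {kind}")
    return f"{FULL_INDEX}/{year}/QTR{quarter}/{kind}.idx"


def index_url_for(filed: date, kind: str = "form") -> str:
    """Pick the quarterly index that should cover a given filing date."""
    quarter = (filed.month - 1) // 3 + 1
    return quarterly_index_url(filed.year, quarter, kind)


url = index_url_for(date(2000, 8, 17))
```

Before fetching in bulk, it is worth checking the access rules on that same documentation page.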

Something I have learned is that you can replace the .txt at the end of a filing's path (the part ending in the accession number, not the CIK) with -index.html, which gives you a far more parseable HTML file than the SGML that the indices point to. I also trust that SGML about as far as I can throw it, and since it contains PDFs and other blobs... that's not far. That said, parsing the SGML would give you the contents of that filing index with a single download.
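
For the single-download route, here is a minimal sketch of listing the files embedded in a full-submission `.txt`. It assumes each embedded file is wrapped in `<DOCUMENT>...</DOCUMENT>` with one-line header tags (`<TYPE>`, `<SEQUENCE>`, `<FILENAME>`, `<DESCRIPTION>`) ahead of `<TEXT>`. The `SAMPLE` string is made up for illustration and is not a real filing; real files may deviate from this shape.

```python
import re
from typing import Dict, Iterator

# Assumption: embedded files are delimited by <DOCUMENT> ... </DOCUMENT>.
DOC_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL)
# Assumption: header tags are unclosed and take up one line each.
HEADER_RE = re.compile(
    r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$", re.MULTILINE
)


def list_documents(submission: str) -> Iterator[Dict[str, str]]:
    """Yield the header fields of each embedded document, skipping bodies."""
    for match in DOC_RE.finditer(submission):
        # Only inspect what precedes <TEXT>, so blobs are never scanned.
        header = match.group(1).split("<TEXT>", 1)[0]
        yield {tag.lower(): value.strip() for tag, value in HEADER_RE.findall(header)}


# Made-up sample in the assumed layout (not taken from a real filing).
SAMPLE = """<DOCUMENT>
<TYPE>10-K
<SEQUENCE>1
<FILENAME>example.txt
<TEXT>
body
</TEXT>
</DOCUMENT>
<DOCUMENT>
<TYPE>EX-27
<SEQUENCE>2
<TEXT>
more
</TEXT>
</DOCUMENT>
"""

documents = list(list_documents(SAMPLE))
```

This only reads the headers, which sidesteps decoding the PDFs and other blobs entirely.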

example line from an index:

10-K        3COM CORP                                                     738076      2000-08-17  edgar/data/738076/0001005477-00-005922.txt          

URL-hacked index:
https://www.sec.gov/Archives/edgar/data/738076/0001005477-00-005922-index.html


Disclaimer: I have no idea what this company is; I just grabbed a random line with a 10-K filing.
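
The URL hack above can be sketched in a few lines. This assumes the last three whitespace-separated fields of an index line are the CIK, the filing date, and the path, and that the form type and company name are separated by two or more spaces. `IndexEntry`, `parse_index_line`, and `filing_index_url` are names made up for this sketch.

```python
import re
from typing import NamedTuple

ARCHIVES = "https://www.sec.gov/Archives/"


class IndexEntry(NamedTuple):
    form_type: str
    company: str
    cik: str
    date_filed: str
    path: str


def parse_index_line(line: str) -> IndexEntry:
    """Split one index line into its five fields (hypothetical helper)."""
    # Peel CIK, date and path off the right, since they contain no spaces.
    head, cik, date_filed, path = line.rstrip().rsplit(None, 3)
    # Assumption: 2+ spaces separate the form type from the company name.
    form_type, company = re.split(r"\s{2,}", head.strip(), maxsplit=1)
    return IndexEntry(form_type, company, cik, date_filed, path)


def filing_index_url(entry: IndexEntry) -> str:
    """Swap the trailing .txt for -index.html, as in the example URL."""
    if not entry.path.endswith(".txt"):
        raise ValueError(f"expected a .txt path, got {entry.path!r}")
    return ARCHIVES + entry.path[: -len(".txt")] + "-index.html"


# The example line from this thread.
EXAMPLE = (
    "10-K        3COM CORP"
    "                                                     738076"
    "      2000-08-17  edgar/data/738076/0001005477-00-005922.txt"
)

entry = parse_index_line(EXAMPLE)
index_url = filing_index_url(entry)
```

Splitting from the right keeps company names with internal spaces intact, though a name containing a double space would still be cut short by the form/company split.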

@michael-watson

@itsclaireh is it possible to get added to this repo and maybe make assignment stuff for this? I would like to explore exposing this data

@DrewMcArthur

> @itsclaireh is it possible to get added to this repo and maybe make assignment stuff for this? I would like to explore exposing this data

@michael-watson you should be able to fork the repo. Once this gets off the ground and has some organizational stuff set up, you could file a pull request!


pdeneka commented Feb 21, 2021

EDGAR is a subset of #25 but could definitely use some help.
