To empower the financial and research community, we are thrilled to open-source one year of exchange filings for the top 500 US companies!

What’s Included

Three Formats
  • Original PDFs for reference.
  • Machine-readable text for quick analysis.
  • Pre-processed vectors for AI and semantic search.
Massive Dataset
  • Over 5,713 files, spanning ~314 GB of data.
  • Updated monthly to ensure timely insights.

Easy Access
  • Full details on how to fetch the data from our S3 bucket are available on GitHub.
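As a rough illustration of what fetching a file might look like, here is a minimal Python sketch using only the standard library. The bucket name, region, and object key below are placeholders invented for this example, and the sketch assumes the objects are publicly readable over HTTPS; the GitHub instructions remain the authoritative source for the real values and access method.

```python
import shutil
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical placeholders: take the real bucket, region, and key layout
# from the GitHub instructions.
BUCKET = "example-filings-bucket"
REGION = "us-east-1"


def object_url(key: str, bucket: str = BUCKET, region: str = REGION) -> str:
    """Build the virtual-hosted-style HTTPS URL for an S3 object."""
    # Percent-encode the key but keep "/" so the prefix structure survives.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


def download(key: str, dest_dir: str = "filings") -> Path:
    """Download one object, assuming it is readable without credentials."""
    dest = Path(dest_dir) / Path(key).name
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(object_url(key)) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)  # stream to disk instead of loading into memory
    return dest
```

Calling `download("text/some-filing.txt")` (a made-up key) would save the file under `filings/`. If the bucket requires credentials or requester-pays access, an AWS SDK or the AWS CLI would be the better fit.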

Why Are We Doing This?

We believe that open data drives innovation. By providing access to this dataset, we aim to help:

  • Researchers and analysts build their own RAG (Retrieval-Augmented Generation) solutions.
  • Teams streamline workflows for investment research, risk management, and ESG analysis.
  • Innovators unlock new possibilities in AI-driven insights.