Extracting Financial Data from Corporate Filings with the SEC

By

Melvin Ma, Jian Xu and YuNong Wu

Python

Machine Learning

Our project is an application that uses machine learning, specifically a Naive Bayes classifier, to parse a text document and identify phrases that disclose an SEC investigation. The model reached its target accuracy of 95% with an 85:15 training/testing split. In the application, users search for a company by its CIK (Central Index Key) to retrieve a list of that company's filings from the SEC database; those filings can then be extracted and processed by the model. The goal of the project is to improve on the original workflow, a script that grabbed any sentence containing key words such as "investigation", by optimizing the extraction and processing of results and by providing a more robust approach to text classification.
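To illustrate the difference between the key-word workflow and a Naive Bayes classifier, here is a minimal sketch in pure Python. It is not the project's actual code: the sentences in TRAIN are invented toy data, and the names tokenize, keyword_baseline, NaiveBayes, and split_85_15 are hypothetical helpers introduced only for this example. It assumes a multinomial model with add-one smoothing, which may differ from the variant the project used.

```python
import math
import random
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_baseline(sentence, keywords=("investigation",)):
    """Key-word workflow: flag any sentence that contains a key word."""
    lowered = sentence.lower()
    return any(k in lowered for k in keywords)


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(sentence):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def split_85_15(items, seed=0):
    """Shuffle and split into 85% training and 15% testing portions."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 85 // 100
    return shuffled[:cut], shuffled[cut:]


# Invented toy data: label 1 = discloses an SEC investigation, 0 = does not.
TRAIN = [
    ("The SEC has opened a formal investigation into our accounting practices.", 1),
    ("We received a subpoena from the SEC in connection with an investigation.", 1),
    ("The company is the subject of an SEC investigation regarding revenue recognition.", 1),
    ("Our investigation of new markets led to higher marketing costs.", 0),
    ("Revenue increased due to strong product demand.", 0),
    ("The board approved a quarterly dividend.", 0),
]

model = NaiveBayes().fit([s for s, _ in TRAIN], [y for _, y in TRAIN])

if __name__ == "__main__":
    for sentence in (
        "The SEC issued a subpoena as part of its investigation.",
        "Our investigation of new markets increased costs.",
    ):
        print(keyword_baseline(sentence), model.predict(sentence), sentence)
```

On this toy data the key-word check flags both example sentences because each contains "investigation", while the classifier separates them by weighing the surrounding words. With a real labeled set, split_85_15 would hold out 15% of the sentences for measuring accuracy.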

Artifacts

Name	Description	Link
SEC Team's Presentation	This video shows how we designed and developed the software and demonstrates some of its features.	Link
