[FR]: use SEC company facts for equity fundamental commands #6654

Open
dijonkitchen opened this issue Sep 5, 2024 · 13 comments

@dijonkitchen

dijonkitchen commented Sep 5, 2024

What's the problem of not having this feature?
The SEC filings already have a ton of financial statement data/facts.

These are free for the public, but are not organized according to OpenBB's data model for wide usage.

This would allow users to have a good, default, free alternative to all the other data providers.

Describe the solution you would like
Since SEC company facts are now incorporated: https://docs.openbb.co/platform/reference/equity/compare/company_facts, the equity fundamental commands can be implemented for any company: https://docs.openbb.co/platform/reference/equity/fundamental

There may be some nuances with the naming of things like Revenues vs RevenueFromContractWithCustomerExcludingAssessedTax, but OpenBB can handle that in the transform.

Describe alternatives you've considered
Manually using company facts.

Additional information
N/A

@dijonkitchen changed the title from "[FR]: use SEC company facts for company fundamental commands" to "[FR]: use SEC company facts for equity fundamental commands" on Sep 5, 2024
@deeleeramone
Contributor

deeleeramone commented Sep 11, 2024

> There may be some nuances with the naming of things like Revenues vs RevenueFromContractWithCustomerExcludingAssessedTax, but OpenBB can handle that in the transform.

This is an understatement, but I can appreciate the sentiment and agree that there is a general need for this type of data access within the open source community.

There are many critical nuances that go along with the financial statement items, and this isn't something you can just "ask ChatGPT" expecting an answer that is factually correct and usable in the real world. I'll highlight some of the complex challenges associated with standardizing raw SEC data for use as a continuous time series that is directly comparable across companies and industries. If you have expertise in any particular area, feel free to jump in and help solve some of the larger problems.

  • Rendering individual reports is no problem, but the complications arise when attempting to track items across time, or when comparing similar companies.
  • RevenueFromContractWithCustomerExcludingAssessedTax is the correct item for only some companies, and every company will have used a different tag in earlier years. Which tag that is varies by company, and it is not necessarily a continuation of the same "field". You need some grounding in GAAP accounting to reliably determine equivalent fields between companies and over time.
  • How items are reported varies with the reporting company's industry; the tags will be different and carry a different context. A company may even be reclassified under NAICS and have to adjust its reporting to reflect that.
  • The definitions within a company's financial statements are somewhat arbitrary, and will not be consistent over time. Creating a line item that works over the entire history is not a simple task and requires a hierarchy of conditional candidates.
  • Most quarterly numbers have to be derived from the "as-reported" numbers to reflect pure Q/Q change.
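
The last point can be illustrated with a minimal sketch. It assumes a flow item is reported cumulatively (year-to-date) within one fiscal year, as is typical for cash flow items in 10-Q filings, so single-quarter values come from differencing and Q4 comes from the full-year figure minus the nine-month figure. The function name and the sample numbers are hypothetical, not part of OpenBB.

```python
def discrete_quarters(ytd: dict[str, float]) -> dict[str, float]:
    """Convert cumulative year-to-date values into single-quarter values.

    `ytd` maps period labels to cumulative values for ONE fiscal year:
    "Q1" (3 months), "Q2" (6 months), "Q3" (9 months), "FY" (12 months).
    """
    cumulative_periods = ["Q1", "Q2", "Q3", "FY"]
    quarter_labels = ["Q1", "Q2", "Q3", "Q4"]
    out: dict[str, float] = {}
    previous = 0.0
    for period, label in zip(cumulative_periods, quarter_labels):
        if period not in ytd:
            # A missing period breaks the chain, so stop rather than guess.
            break
        out[label] = ytd[period] - previous
        previous = ytd[period]
    return out


# Hypothetical cumulative figures for a single fiscal year.
sample = {"Q1": 100.0, "Q2": 230.0, "Q3": 390.0, "FY": 600.0}
quarters = discrete_quarters(sample)
print(quarters)  # {'Q1': 100.0, 'Q2': 130.0, 'Q3': 160.0, 'Q4': 210.0}
```

Stopping at the first gap is a deliberate choice here: a derived quarter is only meaningful when every earlier cumulative value in the same fiscal year is present.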

Inputs we need to make this happen:

  • A hierarchical dictionary of standardized line items based on tags, and/or the addition or subtraction of several tags, forming templates for each of the 3 financial statements that map a single company across time and across the various reporting styles.
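
As a rough illustration of what "tags, and/or adding/subtracting of several tags" could look like in practice, here is a minimal sketch. Each standardized line item gets an ordered list of candidates, and each candidate is a list of signed tags that are summed. The structure, the function name, and the candidate ordering are assumptions made for illustration only; the ordering has not been vetted as an accounting mapping.

```python
from typing import Optional

# Hypothetical template entry: candidates are tried in order, and each
# candidate is a list of (sign, tag) terms whose signed values are summed.
GROSS_PROFIT_CANDIDATES = [
    [(1, "GrossProfit")],
    [(1, "Revenues"), (-1, "CostOfRevenue")],
    [
        (1, "RevenueFromContractWithCustomerExcludingAssessedTax"),
        (-1, "CostOfGoodsAndServicesSold"),
    ],
]


def resolve(candidates: list, facts: dict[str, float]) -> Optional[float]:
    """Return the value of the first candidate whose tags are all reported."""
    for terms in candidates:
        if all(tag in facts for _, tag in terms):
            return sum(sign * facts[tag] for sign, tag in terms)
    return None  # nothing in the hierarchy matched this filer


# Hypothetical single-period facts for a filer that never tags GrossProfit.
facts = {"Revenues": 500.0, "CostOfRevenue": 300.0}
print(resolve(GROSS_PROFIT_CANDIDATES, facts))  # 200.0
```

Returning `None` instead of zero keeps "not reported" distinguishable from a genuine zero value.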

If anyone would like to help out for the greater good, please indicate in the comments and we can divide-and-conquer.

@gtkacz

gtkacz commented Sep 13, 2024

@deeleeramone I'd be willing to help out however I can!

@deeleeramone
Contributor

> @deeleeramone I'd be willing to help out however I can!

Awesome! What's the best way to leverage your strengths and areas of expertise?

@gtkacz

gtkacz commented Sep 20, 2024

> > @deeleeramone I'd be willing to help out however I can!
>
> Awesome! What's the best way to leverage your strengths and areas of expertise?

Honestly just list out whatever you need for us to get started and I could point out whichever portion I feel most comfortable doing. I have experience in web-scraping and software engineering if that helps.

@deeleeramone
Contributor

> Honestly just list out whatever you need for us to get started and I could point out whichever portion I feel most comfortable doing. I have experience in web-scraping and software engineering if that helps.

Scraping the web is not really applicable to what needs to happen here. What we need is:

> A hierarchical dictionary of standardized line items based on tags, and/or adding/subtracting of several tags, that form templates for each of 3 financial statements mapping single companies across time over the various reporting styles.

This requires a lot of background knowledge specific to US-GAAP accounting, SEC filings, and the XBRL language. Where we're going, there is no "follow these simple steps...", some assembly required.

@gtkacz

gtkacz commented Sep 21, 2024

The only thing of those I'm familiar with is XBRL, but I'd be willing to learn to help out however I can!

@dijonkitchen
Author

dijonkitchen commented Sep 22, 2024

Perhaps we're still missing a piece of the SEC API that'd be more focused: Single company concepts.

https://www.sec.gov/search-filings/edgar-application-programming-interfaces#:~:text=and%20across%20time.-,data.sec.gov/api/xbrl/companyconcept/,-The%20company%2Dconcept

This way, people can get all the historical data for one line item within a financial statement. We can then build up whole financial statements from there.

For example:

Just the Accounts Payable amounts for Alphabet,
https://data.sec.gov/api/xbrl/companyconcept/CIK0001652044/us-gaap/AccountsPayableCurrent.json
rather than everything for Alphabet:
https://data.sec.gov/api/xbrl/companyfacts/CIK0001652044.json

There seem to be only a limited number of company facts, so when there are multiple for one, we can merge them together and use the latest one.
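
A minimal sketch of the merge-and-keep-latest idea, under stated assumptions: observations look like the entries under `units` in a company-concept response (with `end`, `val`, and `filed` fields), and "latest" is taken to mean the most recently filed value for a given period end. Keying on `end` alone suits point-in-time items such as Accounts Payable; duration items would also need the period start. The function names and sample values are hypothetical.

```python
def company_concept_url(cik: int, tag: str, taxonomy: str = "us-gaap") -> str:
    """Build a company-concept URL; the CIK is zero-padded to 10 digits."""
    return (
        "https://data.sec.gov/api/xbrl/companyconcept/"
        f"CIK{cik:010d}/{taxonomy}/{tag}.json"
    )


def merge_latest(*series: list[dict]) -> dict[str, float]:
    """Merge observations from several tags, keyed by period end date.

    When more than one observation shares an end date, keep the one with
    the most recent filing date (ISO date strings sort chronologically).
    """
    best: dict[str, dict] = {}
    for observations in series:
        for obs in observations:
            current = best.get(obs["end"])
            if current is None or obs["filed"] > current["filed"]:
                best[obs["end"]] = obs
    return {end: best[end]["val"] for end in sorted(best)}


# Hypothetical observations for two tags that describe the same line item.
series_a = [
    {"end": "2023-12-31", "val": 100.0, "filed": "2024-02-01"},
    {"end": "2023-12-31", "val": 105.0, "filed": "2025-02-01"},  # restated
]
series_b = [{"end": "2022-12-31", "val": 90.0, "filed": "2023-02-01"}]

print(company_concept_url(1652044, "AccountsPayableCurrent"))
print(merge_latest(series_a, series_b))
```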

Thoughts?

@dijonkitchen
Author

dijonkitchen commented Sep 22, 2024

Actually, I do see that company concepts are already used by company facts: https://github.com/OpenBB-finance/OpenBB/blob/develop/openbb_platform/providers/sec/openbb_sec/utils/frames.py#L186

But we'd need it to be able to take in a fiscal_period so that we can get one row of an income statement at a time: https://github.com/OpenBB-finance/OpenBB/blob/develop/openbb_platform/providers/sec/openbb_sec/models/compare_company_facts.py#L161-L170
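
For reference, one way to address a single period is the period string used by the SEC Frames API (`CY####` for annual, `CY####Q#` for quarterly, `CY####Q#I` for instantaneous data). The sketch below only builds that string and the matching URL; the helper names are hypothetical and are not taken from the OpenBB code linked above. Note that frames are aligned to calendar periods, which may not coincide with a company's fiscal periods.

```python
from typing import Optional


def frame_period(
    year: int, quarter: Optional[int] = None, instantaneous: bool = False
) -> str:
    """Build a Frames API period string such as CY2019, CY2019Q1 or CY2019Q1I."""
    if quarter is not None and quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, 4 or None")
    if instantaneous and quarter is None:
        # Instantaneous (balance sheet) frames are addressed by quarter.
        raise ValueError("instantaneous frames require a quarter")
    period = f"CY{year}"
    if quarter is not None:
        period += f"Q{quarter}"
    if instantaneous:
        period += "I"
    return period


def frames_url(tag: str, period: str, unit: str = "USD", taxonomy: str = "us-gaap") -> str:
    """Build the Frames API URL for one concept, unit and period."""
    return f"https://data.sec.gov/api/xbrl/frames/{taxonomy}/{tag}/{unit}/{period}.json"


print(frames_url("AccountsPayableCurrent", frame_period(2019, 1, instantaneous=True)))
```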

Made a draft PR in #6685 if y'all want to take a look and/or improve.

@deeleeramone
Contributor

> There seem to be only a limited number of company facts, so when there are multiple for one, we can merge them together and use the latest one.

This list is somewhat comprehensive, but will not be complete. It was compiled manually by cross-examining a selection of recent XBRL filings, extracting the GAAP facts, and checking the Frames API for support. I could easily have missed several hundred facts, but I believe I covered the general broad strokes. Enough, at least, to justify providing choices so the user does not have to guess what they might be.

@deeleeramone
Contributor

> The only thing of those I'm familiar with is XBRL, but I'd be willing to learn to help out however I can!

Have a look through this - http://www.xbrlsite.com/2015/fro/us-gaap/html/ReportFrames/ - which contains XBRL schemas and mappings for the various types of reporting entity. What we are particularly interested in is how the "Try Order" can be used to build our hierarchical dictionary for mapping each fundamental accounting concept to its XBRL US-GAAP Taxonomy Concept(s).

The potential workflow would look something like this:

  • The user enters a ticker symbol, it then gets mapped to the CIK number.
  • That CIK number is then mapped to its Report Frame Code.
  • The Report Frame data would then be parsed for the schema and mapping rules.

This would result in a dictionary that would be sectioned into "balance", "income", "cash", and probably "supplementary".

Within each of these, we would have the fundamental accounting concepts as ordered keys (which would represent an item from the particular statement) and the values would be a list of the US-GAAP Taxonomy Concept(s), ordered in the "Try Order".

With this information, we would be able to reliably structure financial statements directly from the CompanyFacts API output. Getting full year tables would be a first reasonable goal after creating this dynamic mapping workflow.
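
A minimal sketch of the "Try Order" lookup against a CompanyFacts-style payload. It assumes the payload nests facts as `facts -> taxonomy -> tag`, and that try-order entries are written as `taxonomy:Tag` strings. The function name and the sample payload are hypothetical.

```python
from typing import Optional


def first_reported_tag(try_order: list[str], company_facts: dict) -> Optional[str]:
    """Return the first concept in `try_order` that the company has reported."""
    facts = company_facts.get("facts", {})
    for concept in try_order:
        taxonomy, _, tag = concept.partition(":")
        if tag in facts.get(taxonomy, {}):
            return concept
    return None  # none of the candidate concepts were reported


# Hypothetical, heavily trimmed payload for a partnership-style filer.
company_facts = {
    "facts": {"us-gaap": {"PartnersCapital": {"units": {"USD": []}}}}
}
equity_try_order = [
    "us-gaap:StockholdersEquity",
    "us-gaap:PartnersCapital",
    "us-gaap:MembersEquity",
]
print(first_reported_tag(equity_try_order, company_facts))  # us-gaap:PartnersCapital
```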

What are your thoughts?

@gtkacz

gtkacz commented Sep 28, 2024

@deeleeramone seems perfect for me, will get started on mapping as soon as I have the time! Couple of questions:

  • How would we store the map of tickers to CIK?
  • Could you please provide a simple schema of exactly how you want the data to be output?

@deeleeramone
Contributor

deeleeramone commented Sep 30, 2024

@gtkacz:

> How would we store the map of tickers to CIK?

This one already exists, and can be imported:

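from openbb_sec.utils.helpers import symbol_map

# symbol_map is awaited here, so run this in a notebook or async REPL;
# in a plain script, use asyncio.run(symbol_map("AAPL")) instead.
cik = await symbol_map("AAPL")
cik
# '0000320193'

> Could you please provide a simple schema of exactly how you want the data to be outputted to?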

This may need some tweaking, and I'm totally open to suggestions, but here's the general idea. "RollUp" indicates that item is an "Abstract" - which means it has children and is displayed as an indented level.

From: http://www.xbrlsite.com/2015/fro/us-gaap/html/ReportFrames/COMID-BSC-CF1-ISM-IEMIB-OILY-SPEC6/index.html

{
    "balance": [
        {
            "line_item": "fac:AssetsRollUp",
            "order": 1,
            "level": 1,
            "children": [
                {
                    "line_item": "fac:CurrentAssets",
                    "order": 1.1,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Debit",
                    "try_order": ["us-gaap:AssetsCurrent"],
                    "children": [],
                },
                {
                    "line_item": "fac:NonCurrentAssets",
                    "order": 1.2,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Debit",
                    "try_order": ["us-gaap:AssetsNoncurrent"],
                    "children": [],
                },
                {
                    "line_item": "fac:Assets",
                    "order": 1.3,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Debit",
                    "try_order": ["us-gaap:Assets", "us-gaap:AssetsCurrent"],
                    "children": [],
                },
            ],
        },
        {
            "line_item": "fac:LiabilitiesEquityRollUp",
            "order": 2,
            "level": 1,
            "children": [
                {
                    "line_item": "fac:LiabilitiesRollUp",
                    "order": 2.1,
                    "level": 2,
                    "children": [
                        {
                            "line_item": "fac:CurrentLiabilities",
                            "order": 2.101,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": ["us-gaap:LiabilitiesCurrent"],
                            "children": [],
                        },
                        {
                            "line_item": "fac:NoncurrentLiabilities",
                            "order": 2.102,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": ["us-gaap:LiabilitiesNoncurrent"],
                            "children": [],
                        },
                        {
                            "line_item": "fac:Liabilities",
                            "order": 2.103,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": ["us-gaap:Liabilities"],
                            "children": [],
                        },
                    ],
                },
                {
                    "line_item": "fac:CommitmentsAndContingencies",
                    "order": 2.2,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Credit",
                    "try_order": ["us-gaap:CommitmentsAndContingencies"],
                    "children": [],
                },
                {
                    "line_item": "fac:TemporaryEquity",
                    "order": 2.3,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Credit",
                    "try_order": [
                        "us-gaap:TemporaryEquityCarryingAmountIncludingPortionAttributableToNoncontrollingInterests",
                        "us-gaap:RedeemablePreferredStockCarryingAmount",
                        "us-gaap:TemporaryEquityValueExcludingAdditionalPaidInCapital",
                    ],
                    "children": [],
                },
                {
                    "line_item": "fac:EquityRollUp",
                    "order": 2.4,
                    "level": 2,
                    "children": [
                        {
                            "line_item": "fac:EquityAttributableToParent",
                            "order": 2.401,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": [
                                "us-gaap:StockholdersEquity",
                                "us-gaap:PartnersCapital",
                                "us-gaap:MembersEquity",
                            ],
                            "children": [],
                        },
                        {
                            "line_item": "fac:EquityAttributableToNoncontrollingInterest",
                            "order": 2.402,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": [
                                "us-gaap:MinorityInterest",
                                "us-gaap:PartnersCapitalAttributableToNoncontrollingInterest",
                                "us-gaap:MembersEquityAttributableToNoncontrollingInterest",
                            ],
                            "children": [],
                        },
                        {
                            "line_item": "fac:Equity",
                            "order": 2.403,
                            "level": 3,
                            "period_type": "As Of",
                            "balance": "Credit",
                            "try_order": [
                                "us-gaap:StockholdersEquityIncludingPortionAttributableToNoncontrollingInterest",
                                "us-gaap:PartnersCapitalIncludingPortionAttributableToNoncontrollingInterest",
                                "us-gaap:LimitedLiabilityCompanyLlcMembersEquityIncludingPortionAttributableToNoncontrollingInterest",
                            ],
                            "children": [],
                        },
                    ],
                },
                {
                    "line_item": "fac:LiabilitiesAndEquity",
                    "order": 2.5,
                    "level": 2,
                    "period_type": "As Of",
                    "balance": "Credit",
                    "try_order": [
                        "us-gaap:LiabilitiesAndStockholdersEquity",
                        "us-gaap:LiabilitiesAndPartnersCapital",
                    ],
                    "children": [],
                },
            ],
        },
    ]
}
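
To show how a template in this shape could be consumed, here is a minimal sketch that walks the nested `children` lists and produces flat, display-ordered rows. The sample is abbreviated from the schema above; the function name is hypothetical and the output format is only one possible choice.

```python
def flatten(items: list[dict]) -> list[dict]:
    """Depth-first walk of the template, returning rows in display order."""
    rows = []
    for item in sorted(items, key=lambda entry: entry["order"]):
        rows.append(
            {
                "line_item": item["line_item"],
                "level": item["level"],
                # Roll-ups (abstract items) carry no try_order of their own.
                "try_order": item.get("try_order", []),
            }
        )
        rows.extend(flatten(item.get("children", [])))
    return rows


# Abbreviated from the "balance" section of the schema above.
template = [
    {
        "line_item": "fac:AssetsRollUp",
        "order": 1,
        "level": 1,
        "children": [
            {"line_item": "fac:Assets", "order": 1.3, "level": 2,
             "try_order": ["us-gaap:Assets"], "children": []},
            {"line_item": "fac:CurrentAssets", "order": 1.1, "level": 2,
             "try_order": ["us-gaap:AssetsCurrent"], "children": []},
        ],
    }
]

rows = flatten(template)
for row in rows:
    # Indent by level so roll-ups read as parents of their children.
    print("  " * (row["level"] - 1) + row["line_item"])
```

Sorting on `order` at each level means the template does not have to be stored in display order, which makes hand-edited mappings a little more forgiving.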

@mudokenking

mudokenking commented Jan 5, 2025

edit: organized taxonomy links a bit, and bracketed links (is that markup? i told you...i suck at coding :/ plz forgive me).

I'm ready to pull my hair out trying to do this without openbb. Then I find this. Great. I can't code for my life, but here are some links I gathered. Hope they help. Sorry I couldn't contribute more.

Arelle: official xbrl, xsd validation platform from sec.gov devs
QXmledit: alternative
xsddiagram (needs mono....or windows)

TAXONOMY STUFF: definitely look at these
yeti links:
DIRECT (allows you to choose from different year, and different reports)
SRT_TAXONOMY_2024
GAAP_TAXONOMY_2024
DQC_GAAP_TAXONOMY_2024

FASB.ORG - main taxonomy page
From https://fasb.org/projects/fasb-taxonomies (2024 GAAP): the first link on that page is the yeti database, the links after it give you Excel files, and the final zip file has everything but needs XBRL software.
us-gaap rules from XBRL.US: 2024

International Financial Reporting Standards Foundation, International Accounting Standards Board:
IFRS.org, IASB (non-GAAP)

a kaggle project made by someone to use edgar's api
a python snippet/project from https://quant-trading.co to use the sec api. the code definitely needs refining/tweaking.

Best topic I found that discusses and tries to untangle all the different taxonomy standardizing (differences with facts/frames/concepts in different companies, and from year to year):
forum discussion from xbrl.us

SECEDGAR GITS: maybe some useful scripts here to inspire someone.
https://github.com/jadchaar/sec-edgar-api
https://github.com/janlukasschroeder/sec-api-python
https://github.com/gaulinmp/pyedgar
https://github.com/dgunning/edgartools
https://github.com/bellingcat/EDGAR
https://github.com/jadchaar/sec-cik-mapper
https://github.com/dfwcnj/edgarquery
https://github.com/sec-edgar/sec-edgar/blob/master/secedgar/cik_lookup.py
https://github.com/janlukasschroeder/sec-api
https://github.com/nlpaueb/edgar-crawler
https://github.com/joeyism/py-edgar
https://github.com/jadchaar/sec-edgar-downloader
https://github.com/Elijas/sec-downloader
https://github.com/sec-edgar/sec-edgar
https://law.mit.edu/pub/openedgar/release/1
https://sec-edgar.github.io/sec-edgar/
https://github.com/edgarminers/python-edgar
https://github.com/sachin-sankar/edgarpython
https://github.com/McKalvan/secpy
https://github.com/areed1192/python-sec
https://github.com/cthurber/sec-edgar-10k
https://github.com/farhadab/sec-edgar-financials
https://github.com/pratikrelekar/EdgarDSRS
https://github.com/sd3v/openinsiderData
https://github.com/john-friedman/datamule-python
https://github.com/cran/tidyedgar

I take no credit and don't attempt to receive credit for any of the work provided from these sources, only here to share sources.
