HTML table extraction from SEC filings.
Extracts <table> elements from filing HTML and converts them to tech.ml.dataset objects. Handles iXBRL inline-tagged content, colspan attributes, and mixed numeric/string columns.
Usage:
(require '[edgar.tables :as tables])
(tables/extract-tables filing)              ; => seq of datasets
(tables/extract-tables filing :nth 0)       ; => first table as dataset
(tables/extract-tables filing :min-rows 5)  ; => only tables with >=5 data rows
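To make the colspan handling mentioned above more concrete, here is a minimal sketch of one way spanned cells could be flattened into a rectangular grid before building a dataset. It is not the library's implementation: the cell shape ({:text ... :colspan ...}), the helper names expand-colspan and pad-rows, and the policy of filling the extra covered columns with nil are all assumptions made for illustration.

```clojure
;; Hypothetical helpers; edgar.tables may handle colspan differently.

(defn expand-colspan
  "Expand one row of cells into a flat vector of cell values.
  Each cell is a map with :text and an optional :colspan (default 1).
  The text is placed in the first column the cell covers and the
  remaining covered columns are filled with nil."
  [cells]
  (vec (mapcat (fn [{:keys [text colspan] :or {colspan 1}}]
                 (cons text (repeat (dec colspan) nil)))
               cells)))

(defn pad-rows
  "Right-pad every row with nil so that all rows have the same width."
  [rows]
  (let [width (apply max 0 (map count rows))]
    (mapv (fn [row] (into row (repeat (- width (count row)) nil)))
          rows)))

;; Two rows as they might come out of an HTML parser (made-up data).
(def raw-rows
  [[{:text "Revenue" :colspan 2} {:text "2023"}]
   [{:text "Products"} {:text "$"} {:text "100"}]])

(pad-rows (mapv expand-colspan raw-rows))
;; => [["Revenue" nil "2023"] ["Products" "$" "100"]]
```

Filling with nil rather than repeating the text keeps a spanned header from appearing as several identical column names; repeating the value would be an equally plausible policy.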
(extract-tables filing & {:keys [nth min-rows min-cols] :or {min-rows 2 min-cols 2}})
Extract HTML tables from a filing as a seq of tech.ml.dataset objects.
filing - a filing map (from e/filing, filings/get-filing, etc.)
Options:
:nth - return only the nth table (0-indexed); returns a single dataset or nil
:min-rows - only return tables with at least this many data rows (default 2)
:min-cols - only return tables with at least this many columns (default 2)
Returns a seq of datasets (or a single dataset when :nth is used).
Each dataset is named "table-N", where N is the table's original index in the HTML.
With the default thresholds, tables that look like layout or navigation markup (a single column, or fewer than 2 data rows) are filtered out automatically.
Example:
(require '[edgar.tables :as tables]
         '[edgar.api :as e])
(def f (e/filing "AAPL" :form "10-K"))
;; All data tables
(tables/extract-tables f)
;; Only tables with at least 5 data rows
(tables/extract-tables f :min-rows 5)
;; First table
(tables/extract-tables f :nth 0)
;; Third table
(tables/extract-tables f :nth 2)
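The :min-rows / :min-cols filtering and the "table-N" naming described above can be sketched in plain Clojure. This is only an illustration of the documented behaviour, not the library's code: the helpers data-table? and select-tables are hypothetical, tables are modelled as vectors of row vectors, and the first row is assumed to be a header that does not count as a data row.

```clojure
;; Hypothetical helpers that mimic the documented filtering rules.

(defn data-table?
  "True when a table (vector of row vectors, first row = header) has at
  least min-rows data rows and at least min-cols columns."
  [rows min-rows min-cols]
  (and (>= (dec (count rows)) min-rows)
       (>= (apply max 0 (map count rows)) min-cols)))

(defn select-tables
  "Name every table by its original position, then drop the tables that
  fail the size check. Option names and defaults mirror the docstring."
  [tables & {:keys [min-rows min-cols] :or {min-rows 2 min-cols 2}}]
  (->> tables
       (map-indexed (fn [i rows] {:name (str "table-" i) :rows rows}))
       (filter #(data-table? (:rows %) min-rows min-cols))))

;; Made-up sample input: three tables in document order.
(def sample-tables
  [[["Home"] ["Contents"]]                  ; single column (layout-like)
   [["Item" "2023" "2022"]
    ["Revenue" "10" "9"]
    ["Costs" "4" "3"]]                      ; 2 data rows, 3 columns
   [["Note" "Value"] ["a" "1"]]])           ; only 1 data row

(map :name (select-tables sample-tables))
;; => ("table-1")

(map :name (select-tables sample-tables :min-rows 1))
;; => ("table-1" "table-2")
```

Because names are assigned before filtering, a surviving table keeps the index it had in the HTML, which matches the naming rule stated in the docstring. How :nth interacts with filtering is not specified in the source, so the sketch leaves it out.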