downloader
Download manager for asynchronous parallel downloading.
Functions
download – Download a file synchronously.
download_async – Download a file asynchronously.
download_parallel – Download multiple files in parallel.
download_parallel_aiter – Download multiple files in parallel, return asynchronous iterator.
validate_stem_pattern – Validate parameter stem_pattern of module functions.
Classes
DownloadResult – Dataclass for result information from a finished download.
DownloadSpecs – Dataclass of parameter values for downloader.download_async().
Exceptions
CorruptDownloadError – SHA-256 checksum does not match expected value from API.
- xbrl_filings_api.downloader.download(url, to_dir, *, stem_pattern=None, filename=None, sha256=None, timeout=30.0)
Download a file synchronously.
See documentation of download_async.
- async xbrl_filings_api.downloader.download_async(url, to_dir, *, stem_pattern=None, filename=None, sha256=None, timeout=30.0)
Download a file asynchronously.
The directories in parameter to_dir will be created if they do not exist. If no filename is given, the name is derived from parameter url. If the file already exists, it will be overwritten.
If sha256 does not match the checksum of the downloaded file, xbrl_filings_api.downloader.exceptions.CorruptDownloadError will be raised and ".corrupt" will be appended to the name of the downloaded file.
If the download is interrupted, the file will be left with the suffix ".unfinished".
If no name could be derived from url, the file will be named file0001, file0002, etc. In this case a new file is always created.
- Parameters:
url (str) – URL to download.
to_dir (path-like) – Directory to save the file.
stem_pattern (str, optional) – Pattern to add to the filename stems. Placeholder "/name/" is always required.
filename (str, optional) – Name to be used for the saved file.
sha256 (str, optional) – Expected SHA-256 checksum as a hex string. Case-insensitive. No checksum is calculated if this parameter is not given.
timeout (float, default 30.0) – Maximum time in seconds to wait for an initial response from the server.
- Returns:
Local path where the downloaded file was saved.
- Return type:
- Raises:
xbrl_filings_api.downloader.exceptions.CorruptDownloadError – Attribute Filing.package_sha256 does not match the calculated hash of the package file.
requests.HTTPError – HTTP status error occurs.
requests.ConnectionError – Connection fails.
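The checksum behaviour described for download_async can be illustrated with a minimal standard-library sketch. It is not the library's implementation: verify_sha256 and ChecksumMismatchError are hypothetical names standing in for the internal check and for CorruptDownloadError. It only shows a case-insensitive hex comparison followed by a rename with the ".corrupt" suffix.

```python
import hashlib
from pathlib import Path


class ChecksumMismatchError(Exception):
    """Hypothetical stand-in for the library's CorruptDownloadError."""


def verify_sha256(path: Path, expected_hex: str) -> Path:
    """Return `path` if its SHA-256 matches `expected_hex`.

    On mismatch the file is renamed with a ".corrupt" suffix and an
    error is raised. Illustrative helper, not part of xbrl_filings_api.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    # hexdigest() is lowercase, so lowercase the expected value too.
    if digest.hexdigest() != expected_hex.lower():
        corrupt_path = path.with_name(path.name + ".corrupt")
        path.rename(corrupt_path)
        raise ChecksumMismatchError(f"saved as {corrupt_path.name}")
    return path
```

Passing an uppercase hex string works the same as a lowercase one, which matches the "Case-insensitive" note on the sha256 parameter.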
- xbrl_filings_api.downloader.download_parallel(items, *, max_concurrent=None, timeout=30.0)
Download multiple files in parallel.
The order of parameter items is not guaranteed to be preserved in the returned list.
See documentation of download_parallel_aiter.
- Parameters:
items (list of DownloadSpecs)
max_concurrent (int or None, default None)
timeout (float, default 30.0)
- Returns:
Contains information on the finished download.
- Return type:
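Because the returned list may be in a different order than items, one way to recover the request order is to store each item's index in info and sort the results back afterwards. The sketch below uses a stand-in dataclass (FakeResult, hypothetical) that mirrors only the path and info attributes mentioned in this page; restore_order is likewise an illustrative helper, not a library function.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FakeResult:
    """Stand-in for DownloadResult; only `path` and `info` mirror the docs."""
    path: Optional[str] = None
    info: Any = None


def restore_order(results: List[FakeResult], count: int) -> List[Optional[FakeResult]]:
    """Place results back into request order, using `info` as the index."""
    ordered: List[Optional[FakeResult]] = [None] * count
    for res in results:
        ordered[res.info] = res
    return ordered


# Results arrive in completion order, tagged with their request index.
finished = [FakeResult("c.zip", 2), FakeResult("a.zip", 0), FakeResult("b.zip", 1)]
ordered = restore_order(finished, 3)
print([r.path for r in ordered])  # ['a.zip', 'b.zip', 'c.zip']
```

Any value can be stored in info; an integer index is just the simplest choice when the only goal is to restore the original ordering.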
- async xbrl_filings_api.downloader.download_parallel_aiter(items, *, max_concurrent=None, timeout=30.0)
Download multiple files in parallel, return asynchronous iterator.
The ordering in parameter items defines the order in which the requests will be started. As the downloads take arbitrary periods of time to finish, the results are not guaranteed to be yielded in the same order. To keep track of individual downloads, both DownloadSpecs and DownloadResult provide an additional any-typed attribute info.
Yielded DownloadResult objects will not have a path attribute value when the sha256 check fails, even though the file is in fact saved with the filename suffix ".corrupt".
Calls function download_async via parameter items.
- Parameters:
items (list of DownloadSpecs) – Instances of DownloadSpecs accept the same parameters as function download_async with an additional no-op attribute info.
max_concurrent (int or None, default None) – Maximum number of simultaneous downloads allowed at any moment. If None, all downloads will be started immediately. If 1, downloading will be sequential.
timeout (float, default 30.0) – Maximum time in seconds to wait for the initial response from the server for a single download.
- Yields:
DownloadResult – Contains information on the finished download.
- Return type:
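The start-order and completion-order behaviour described above can be sketched with plain asyncio. This is an assumption about the general pattern (a semaphore for max_concurrent plus asyncio.as_completed), not the library's actual code. Job, fake_download, run_parallel and collect are hypothetical names, and the downloads are simulated with asyncio.sleep.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, List, Optional


@dataclass
class Job:
    """Stand-in for DownloadSpecs: a simulated duration plus a tracking tag."""
    delay: float
    info: Any = None


async def fake_download(job: Job, limiter: Optional[asyncio.Semaphore]) -> Any:
    """Simulate one download, waiting for a free slot if a limit is set."""
    if limiter is None:
        await asyncio.sleep(job.delay)
    else:
        async with limiter:
            await asyncio.sleep(job.delay)
    return job.info


async def run_parallel(
    jobs: List[Job], max_concurrent: Optional[int] = None
) -> AsyncIterator[Any]:
    """Yield each job's `info` as soon as that job finishes."""
    limiter = asyncio.Semaphore(max_concurrent) if max_concurrent else None
    # Tasks are created, and therefore started, in the order of `jobs`.
    tasks = [asyncio.create_task(fake_download(j, limiter)) for j in jobs]
    for next_done in asyncio.as_completed(tasks):
        yield await next_done


async def collect(jobs: List[Job], max_concurrent: Optional[int] = None) -> List[Any]:
    """Gather everything the async iterator yields into a list."""
    return [info async for info in run_parallel(jobs, max_concurrent)]
```

With no limit, a short job started second finishes before a long job started first, so results come back out of request order. With max_concurrent=1 the jobs run one after another and the request order is preserved.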
- xbrl_filings_api.downloader.validate_stem_pattern(stem_pattern)
Validate parameter stem_pattern of module functions.
- Parameters:
stem_pattern (str or None) – Stem pattern parameter.
- Raises:
ValueError – When stem pattern is invalid.
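As a rough illustration of the "/name/" placeholder rule, the sketch below checks that the placeholder is present and then substitutes the filename stem for it. Both helpers (check_stem_pattern, apply_stem_pattern) are hypothetical, and the substitution rule is an assumption based only on the parameter description; the real validate_stem_pattern may enforce additional rules.

```python
from pathlib import PurePath
from typing import Optional

PLACEHOLDER = "/name/"


def check_stem_pattern(stem_pattern: Optional[str]) -> None:
    """Raise ValueError unless the pattern is None or contains "/name/"."""
    if stem_pattern is not None and PLACEHOLDER not in stem_pattern:
        raise ValueError(f"stem_pattern must contain {PLACEHOLDER!r}")


def apply_stem_pattern(filename: str, stem_pattern: Optional[str]) -> str:
    """Substitute the filename stem for the placeholder, keeping the suffix."""
    check_stem_pattern(stem_pattern)
    if stem_pattern is None:
        return filename
    name = PurePath(filename)
    # Assumed behaviour: only the stem is rewritten, the extension is kept.
    return stem_pattern.replace(PLACEHOLDER, name.stem) + name.suffix


print(apply_stem_pattern("report.zip", "/name/_2023"))  # report_2023.zip
```

A pattern without the placeholder, such as "_2023", raises ValueError in this sketch, in line with the documented Raises entry.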
- class xbrl_filings_api.downloader.DownloadResult
Bases: object
Dataclass for result information from a finished download.
- __hash__()
Return hash(self).
- __repr__()
Return repr(self).
- info: Any = None
Value of DownloadSpecs.info for parallel downloads.
- class xbrl_filings_api.downloader.DownloadSpecs
Bases: object
Dataclass of parameter values for downloader.download_async().
Used as download instructions in lists for the parallel download functions; the values eventually end up as parameters for download_async(). Attribute info is only for keeping track of downloads and is not used as a function parameter.
- stem_pattern: str | None = None
Pattern to add to the filename stems.
Placeholder "/name/" is always required.
- __hash__()
Return hash(self).
- __repr__()
Return repr(self).