downloader
Download manager for asynchronous parallel downloading.
Functions
download – Download a file synchronously.
download_async – Download a file asynchronously.
download_parallel – Download multiple files in parallel.
download_parallel_aiter – Download multiple files in parallel, return asynchronous iterator.
validate_stem_pattern – Validate parameter stem_pattern of module functions.
Classes
DownloadResult – Dataclass for result information from a finished download.
DownloadSpecs – Dataclass of parameter values for download_async.
Exceptions
CorruptDownloadError – SHA-256 checksum does not match expected value from API.
- xbrl_filings_api.downloader.download(url, to_dir, *, stem_pattern=None, filename=None, sha256=None, timeout=30.0)
Download a file synchronously.
See documentation of download_async.
- async xbrl_filings_api.downloader.download_async(url, to_dir, *, stem_pattern=None, filename=None, sha256=None, timeout=30.0)
Download a file asynchronously.
The directories in parameter to_dir will be created if they do not exist. If no filename is given, the name is derived from parameter url. If the file already exists, it will be overwritten.
If the sha256 does not match the checksum of the downloaded file, xbrl_filings_api.downloader.exceptions.CorruptDownloadError will be raised and ".corrupt" will be appended to the name of the downloaded file.
If the download is interrupted, the file will be left with the suffix ".unfinished".
If no name can be derived from url, the file will be named file0001, file0002, etc. In this case a new file is always created.
- Parameters:
url (str) – URL to download.
to_dir (path-like) – Directory to save the file.
stem_pattern (str, optional) – Pattern to add to the filename stems. Placeholder "/name/" is always required.
filename (str, optional) – Name to be used for the saved file.
sha256 (str, optional) – Expected SHA-256 checksum as a hex string. Case-insensitive. No checksum is calculated if this parameter is not given.
timeout (float, default 30.0) – Maximum time in seconds to wait for an initial response from the server.
- Returns:
Local path where the downloaded file was saved.
- Return type:
- Raises:
xbrl_filings_api.downloader.exceptions.CorruptDownloadError – Attribute Filing.package_sha256 does not match the calculated hash of package file.
requests.HTTPError – HTTP status error occurs.
requests.ConnectionError – Connection fails.
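The naming and checksum rules described for download_async can be illustrated with a minimal standard-library sketch. It is not the library's implementation: the helper names name_from_url and verify_sha256 and the exception ChecksumMismatchError (a local stand-in for CorruptDownloadError) are hypothetical, and the file is assumed to be already on disk.

```python
import hashlib
from pathlib import Path, PurePosixPath
from typing import Optional
from urllib.parse import urlparse


class ChecksumMismatchError(Exception):
    """Hypothetical stand-in for CorruptDownloadError."""


def name_from_url(url: str) -> Optional[str]:
    """Return the last path segment of the URL, or None if there is none."""
    name = PurePosixPath(urlparse(url).path).name
    return name or None


def verify_sha256(path: Path, expected_hex: str) -> None:
    """Compare the file's SHA-256 digest with expected_hex, ignoring case.

    On mismatch, ".corrupt" is appended to the file name and an error
    is raised; on match the file is left untouched.
    """
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    if digest == expected_hex.lower():
        return
    path.rename(path.with_name(path.name + ".corrupt"))
    raise ChecksumMismatchError(f"SHA-256 mismatch for {path.name}")
```

When name_from_url returns None, a caller would fall back to a generated name such as file0001, as described above.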
- xbrl_filings_api.downloader.download_parallel(items, *, max_concurrent=None, timeout=30.0)
Download multiple files in parallel.
The order of parameter items is not guaranteed in the returned list.
See documentation of download_parallel_aiter.
- Parameters:
items (list of DownloadSpecs)
max_concurrent (int or None, default None)
timeout (float, default 30.0)
- Returns:
Contains information on the finished download.
- Return type:
- async xbrl_filings_api.downloader.download_parallel_aiter(items, *, max_concurrent=None, timeout=30.0)
Download multiple files in parallel, return asynchronous iterator.
The ordering in parameter items defines the order in which the requests will be started. As the downloads take arbitrary periods of time to finish, the yielded results are not guaranteed to arrive in the same order. To keep track of individual downloads, both DownloadSpecs and DownloadResult provide an additional attribute info, which accepts a value of any type.
Yielded DownloadResult objects will not have the path attribute value when the sha256 check fails, even though the file is in fact saved with the filename suffix ".corrupt".
Calls function download_async via parameter items.
- Parameters:
items (list of DownloadSpecs) – Instances of DownloadSpecs accept the same parameters as function download_async, with an additional no-op attribute info.
max_concurrent (int or None, default None) – Maximum number of simultaneous downloads allowed at any moment. If None, all downloads will be started immediately. If 1, downloading will be sequential.
timeout (float, default 30.0) – Maximum time in seconds to wait for the initial response from the server for a single download.
- Yields:
DownloadResult – Contains information on the finished download.
- Return type:
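The interplay of start order, completion order, max_concurrent and info can be sketched with asyncio alone. This is an assumption-laden illustration, not the library's code: Spec, Result, fake_download, parallel_aiter and collect are hypothetical names, and the delay field simulates transfer time in place of a real request.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, List, Optional


@dataclass
class Spec:
    """Hypothetical stand-in for DownloadSpecs."""
    url: str
    delay: float  # simulated transfer time; not a real attribute
    info: Any = None


@dataclass
class Result:
    """Hypothetical stand-in for DownloadResult."""
    url: str
    info: Any = None


async def fake_download(spec: Spec) -> Result:
    # Simulate a transfer; a real implementation would fetch spec.url.
    await asyncio.sleep(spec.delay)
    return Result(url=spec.url, info=spec.info)


async def parallel_aiter(
    items: List[Spec], max_concurrent: Optional[int] = None
) -> AsyncIterator[Result]:
    # With max_concurrent=None the limit equals the number of items,
    # so every download starts immediately.
    limit = asyncio.Semaphore(max_concurrent or len(items) or 1)

    async def guarded(spec: Spec) -> Result:
        async with limit:
            return await fake_download(spec)

    # Tasks are created, and therefore started, in the order of items.
    tasks = [asyncio.ensure_future(guarded(spec)) for spec in items]
    # Results are yielded in completion order, not in start order.
    for finished in asyncio.as_completed(tasks):
        yield await finished


async def collect(items: List[Spec], max_concurrent: Optional[int] = None) -> list:
    # Gather only the info values to show which download finished when.
    return [result.info async for result in parallel_aiter(items, max_concurrent)]
```

Running asyncio.run(collect(items)) with a slow first item and a fast second item yields the fast item's info first; with max_concurrent=1 the results follow the order of items, because each download finishes before the next one starts.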
- xbrl_filings_api.downloader.validate_stem_pattern(stem_pattern)
Validate parameter stem_pattern of module functions.
- Parameters:
stem_pattern (str or None) – Stem pattern parameter.
- Raises:
ValueError – When stem pattern is invalid.
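A minimal sketch of the one rule stated in this document, that the placeholder "/name/" is always required, might look as follows. The helper names check_stem_pattern and apply_stem_pattern are hypothetical, the substitution behaviour (placeholder replaced by the original stem, suffix kept) is an assumption, and the real validator may enforce further rules not shown here.

```python
from pathlib import PurePath
from typing import Optional

PLACEHOLDER = "/name/"


def check_stem_pattern(stem_pattern: Optional[str]) -> None:
    """Raise ValueError unless the pattern is None or contains the placeholder."""
    if stem_pattern is None:
        return
    if PLACEHOLDER not in stem_pattern:
        raise ValueError(f"stem_pattern must contain {PLACEHOLDER!r}")


def apply_stem_pattern(filename: str, stem_pattern: Optional[str]) -> str:
    """Substitute the original stem for the placeholder, keeping the suffix.

    Assumed behaviour for illustration only.
    """
    check_stem_pattern(stem_pattern)
    if stem_pattern is None:
        return filename
    path = PurePath(filename)
    return stem_pattern.replace(PLACEHOLDER, path.stem) + path.suffix
```

Under these assumptions, apply_stem_pattern("report.zip", "/name/_copy") gives "report_copy.zip", while a pattern without the placeholder raises ValueError.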
- class xbrl_filings_api.downloader.DownloadResult
Bases: object
Dataclass for result information from a finished download.
- __hash__()
Return hash(self).
- __repr__()
Return repr(self).
- info: Any = None
Value of DownloadSpecs.info for parallel downloads.
- class xbrl_filings_api.downloader.DownloadSpecs
Bases: object
Dataclass of parameter values for downloader.download_async().
Used as download instructions in lists for parallel download functions, which eventually pass them as parameters to download_async(). Attribute info is only for keeping track of downloads and is not used as a function parameter.
- stem_pattern: str | None = None
Pattern to add to the filename stems.
Placeholder "/name/" is always required.
- __hash__()
Return hash(self).
- __repr__()
Return repr(self).