core.utils
==========
.. py:module:: core.utils
Attributes
----------
.. autoapisummary::
core.utils._T
core.utils._KT
core.utils._unwanted_url_chars
core.utils._double_dash
core.utils._number_suffix
core.utils._repeated_spaces
core.utils._uuid
core.utils._email_regex
core.utils._multiple_newlines
core.utils._phone_inside_a_tags
core.utils._phone_ch_country_code
core.utils._phone_ch
core.utils._phone_ch_html_safe
core.utils.ALPHABET
core.utils.ALPHABET_RE
Classes
-------
.. autoapisummary::
core.utils.Bunch
core.utils.PostThread
Functions
---------
.. autoapisummary::
core.utils.local_lock
core.utils.normalize_for_url
core.utils.increment_name
core.utils.remove_repeated_spaces
core.utils.profile
core.utils.timing
core.utils.module_path_root
core.utils.module_path
core.utils.touch
core.utils.render_file
core.utils.hash_dictionary
core.utils.groupbylist
core.utils.linkify_phone
core.utils.top_level_domains
core.utils.linkify
core.utils.paragraphify
core.utils.to_html_ul
core.utils.ensure_scheme
core.utils.is_uuid
core.utils.is_non_string_iterable
core.utils.relative_url
core.utils.is_subpath
core.utils.is_sorted
core.utils.morepath_modules
core.utils.scan_morepath_modules
core.utils.get_unique_hstore_keys
core.utils.makeopendir
core.utils.append_query_param
core.utils.toggle
core.utils.binary_to_dictionary
core.utils.dictionary_to_binary
core.utils.safe_format
core.utils.safe_format_keys
core.utils.is_valid_yubikey
core.utils.is_valid_yubikey_format
core.utils.yubikey_otp_to_serial
core.utils.yubikey_public_id
core.utils.dict_path
core.utils.safe_move
core.utils.batched
Module Contents
---------------
.. py:data:: _T
.. py:data:: _KT
.. py:data:: _unwanted_url_chars
.. py:data:: _double_dash
.. py:data:: _number_suffix
.. py:data:: _repeated_spaces
.. py:data:: _uuid
.. py:data:: _email_regex
.. py:data:: _multiple_newlines
.. py:data:: _phone_inside_a_tags
:value: '(\\">|href=\\"tel:)?'
.. py:data:: _phone_ch_country_code
:value: '(\\+41|0041|0[0-9]{2})'
.. py:data:: _phone_ch
.. py:data:: _phone_ch_html_safe
.. py:data:: ALPHABET
:value: 'cbdefghijklnrtuv'
.. py:data:: ALPHABET_RE
.. py:function:: local_lock(namespace: str, key: str) -> collections.abc.Iterator[None]
Locks the given namespace/key combination on the current system,
automatically freeing it after the with statement has been completed or
once the process is killed.
Usage::
with local_lock('namespace', 'key'):
pass
.. py:function:: normalize_for_url(text: str) -> str
Takes the given text and makes it fit to be used in a URL.
That means replacing spaces and other unwanted characters with '-',
lowercasing everything and turning unicode characters into their closest
ascii equivalent using Unidecode.
See https://pypi.python.org/pypi/Unidecode
.. py:function:: increment_name(name: str) -> str
Takes the given name and adds a numbered suffix beginning at 1.
For example::
foo => foo-1
foo-1 => foo-2
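The two cases above can be reproduced with a short regex-based sketch. This is not the module's implementation: the name ``increment_name_sketch`` and the pattern ``_SUFFIX`` are hypothetical and only illustrate the documented behaviour.

```python
import re

# Hypothetical pattern: a trailing "-<digits>" suffix at the end of the name.
_SUFFIX = re.compile(r'-(\d+)$')


def increment_name_sketch(name: str) -> str:
    """Add '-1' to a name without suffix, otherwise bump the number."""
    match = _SUFFIX.search(name)
    if match is None:
        return f'{name}-1'
    number = int(match.group(1)) + 1
    # Keep everything in front of the old suffix and attach the new one.
    return f'{name[:match.start()]}-{number}'
```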
.. py:function:: remove_repeated_spaces(text: str) -> str
Removes repeated spaces in the text ('a  b' -> 'a b').
.. py:function:: profile(filename: str) -> collections.abc.Iterator[None]
Profiles the wrapped code and stores the result in the profiles folder
with the given filename.
.. py:function:: timing(name: str | None = None) -> collections.abc.Iterator[None]
Runs the wrapped code and prints the time in ms it took to run it.
The name is printed in front of the time, if given.
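A context manager with this behaviour could look roughly like the following sketch. The name ``timing_sketch`` and the exact output format are assumptions, the actual function may print differently.

```python
import time
from collections.abc import Iterator
from contextlib import contextmanager
from typing import Optional


@contextmanager
def timing_sketch(name: Optional[str] = None) -> Iterator[None]:
    """Print the wall-clock time of the wrapped block in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        # The name is only printed if one was given (assumed format).
        prefix = f'{name}: ' if name else ''
        print(f'{prefix}{elapsed_ms:.0f} ms')
```

Usage mirrors the other context managers in this module: ``with timing_sketch('query'): ...``.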
.. py:function:: module_path_root(module: types.ModuleType | str) -> str
.. py:function:: module_path(module: types.ModuleType | str, subpath: str) -> str
Returns a subdirectory in the given python module.
:mod:
A python module (actual module or string)
:subpath:
Subpath below that python module. Leading slashes ('/') are ignored.
.. py:function:: touch(file_path: str) -> None
Touches the file on the given path.
.. py:class:: Bunch(**kwargs: Any)
A simple but handy "collector of a bunch of named stuff" class.
For example::
point = Bunch(x=1, y=2)
assert point.x == 1
assert point.y == 2
point.z = 3
assert point.z == 3
Allows the creation of simple nested bunches, for example::
request = Bunch(**{'app.settings.org.my_setting': True})
assert request.app.settings.org.my_setting is True
.. py:method:: __getattr__(name: str) -> Any
.. py:method:: __eq__(other: object) -> bool
.. py:method:: __ne__(other: object) -> bool
.. py:function:: render_file(file_path: str, request: core.request.CoreRequest) -> webob.Response
Takes the given file_path (content) and renders it to the browser.
The file must exist on the local system and be readable by the current
process.
.. py:function:: hash_dictionary(dictionary: dict[str, Any]) -> str
Computes a sha256 hash for the given dictionary. The dictionary
is expected to only contain values that can be serialized by json.
That includes int, decimal, string, boolean.
Note that this function is not meant to be used for hashing secrets. Do
not include data in this dictionary that is secret!
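One common way to get a stable hash out of a dictionary is to serialize it as JSON with sorted keys first. The sketch below assumes that approach; ``hash_dictionary_sketch`` is a hypothetical name and the serialization details of the real function are not confirmed here.

```python
import hashlib
import json
from typing import Any


def hash_dictionary_sketch(dictionary: dict[str, Any]) -> str:
    """Return a sha256 hex digest for a JSON-serializable dictionary."""
    # sort_keys makes the result independent of the insertion order.
    encoded = json.dumps(dictionary, sort_keys=True).encode('utf-8')
    return hashlib.sha256(encoded).hexdigest()
```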
.. py:function:: groupbylist(iterable: collections.abc.Iterable[_T], key: None = ...) -> list[tuple[_T, list[_T]]]
groupbylist(iterable: collections.abc.Iterable[_T], key: collections.abc.Callable[[_T], _KT]) -> list[tuple[_KT, list[_T]]]
Works just like Python's ``itertools.groupby`` function, but instead
of returning generators, it returns lists.
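A minimal sketch of the idea, using the hypothetical name ``groupbylist_sketch``. Like ``itertools.groupby``, it only groups consecutive items, so the input usually needs to be sorted by the same key first.

```python
from itertools import groupby


def groupbylist_sketch(iterable, key=None):
    """Like itertools.groupby, but with every group turned into a list."""
    return [(k, list(group)) for k, group in groupby(iterable, key=key)]


# Only consecutive equal items end up in the same group.
pairs = groupbylist_sketch([1, 1, 2, 3, 3, 1])
```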
.. py:function:: linkify_phone(text: str) -> markupsafe.Markup
Takes a string and replaces valid phone numbers with html links. If a
phone number is matched, it will be replaced by the result of a callback
function, that does further checks on the regex match. If these checks do
not pass, the matched number will remain unchanged.
.. py:function:: top_level_domains() -> set[str]
.. py:function:: linkify(text: str | None) -> markupsafe.Markup
Takes plain text and injects html links for urls and email addresses.
By default the text is html escaped before it is linkified. This accounts
for the fact that we usually use this for text blocks that we mean to
extend with email addresses and urls.
If html is already possible, why linkify it?
Note: We need to clean the html after we've created it (linkify
parses escaped html and turns it into real html). As a consequence it
is possible to have html urls in the text that won't be escaped.
.. py:function:: paragraphify(text: str) -> markupsafe.Markup
Takes a text with newlines and groups its lines into paragraphs according
to the following rules:
If there's a single newline between two lines, a ``<br>`` will replace that
newline.
If there are multiple newlines between two lines, each line will become
a paragraph and the extra newlines are discarded.
.. py:function:: to_html_ul(value: str | None, convert_dashes: bool = True, with_title: bool = False) -> markupsafe.Markup
Linkifies the text and converts it to one or more ul's or paragraphs.
.. py:function:: ensure_scheme(url: str, default: str = 'http') -> str
ensure_scheme(url: None, default: str = 'http') -> None
Makes sure that the given url has a scheme in front, if none
was provided.
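The two signatures above suggest that ``None`` is passed through unchanged. A simplified sketch under that assumption follows; ``ensure_scheme_sketch`` is a hypothetical name and the check for ``'://'`` is a deliberately naive stand-in for real URL parsing.

```python
from typing import Optional


def ensure_scheme_sketch(
    url: Optional[str],
    default: str = 'http'
) -> Optional[str]:
    """Prefix the url with a default scheme if it has none."""
    if not url:
        # None (and the empty string) are returned as they are.
        return url
    if '://' in url:
        # Naive check: assume a scheme is already present.
        return url
    return f'{default}://{url}'
```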
.. py:function:: is_uuid(value: str | uuid.UUID) -> bool
Returns true if the given value is a uuid. The value may be a string
or of type UUID. If it's a string, the uuid is checked with a regex.
.. py:function:: is_non_string_iterable(obj: object) -> bool
Returns true if the given obj is an iterable, but not a string.
.. py:function:: relative_url(absolute_url: str | None) -> str
Removes everything in front of the path, including scheme, host,
username, password and port.
.. py:function:: is_subpath(directory: str, path: str) -> bool
Returns true if the given path is inside the given directory.
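Checks like this are typically done on resolved paths, so that ``..`` segments and symlinks cannot escape the directory. The sketch below assumes that approach and also assumes that the directory itself does not count as being inside itself; ``is_subpath_sketch`` is a hypothetical name.

```python
import os


def is_subpath_sketch(directory: str, path: str) -> bool:
    """Return True if path resolves to a location below directory."""
    directory = os.path.realpath(directory)
    path = os.path.realpath(path)
    if path == directory:
        # Assumption: the directory is not a subpath of itself.
        return False
    # commonpath compares whole path segments, not string prefixes,
    # so '/srv/files-old' is not treated as being inside '/srv/files'.
    return os.path.commonpath([directory, path]) == directory
```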
.. py:function:: is_sorted(iterable: collections.abc.Iterable[_typeshed.SupportsRichComparison], key: collections.abc.Callable[[_typeshed.SupportsRichComparison], _typeshed.SupportsRichComparison] = ..., reverse: bool = ...) -> bool
is_sorted(iterable: collections.abc.Iterable[_T], key: collections.abc.Callable[[_T], _typeshed.SupportsRichComparison], reverse: bool = ...) -> bool
Returns True if the iterable is sorted.
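A sketch of such a check that compares neighbouring items instead of sorting a copy. The name ``is_sorted_sketch`` is hypothetical, and treating equal neighbours as sorted is an assumption.

```python
def is_sorted_sketch(iterable, key=lambda item: item, reverse=False):
    """Return True if every item is in order relative to its successor."""
    keys = [key(item) for item in iterable]
    pairs = zip(keys, keys[1:])
    if reverse:
        return all(a >= b for a, b in pairs)
    return all(a <= b for a, b in pairs)
```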
.. py:function:: morepath_modules(cls: type[morepath.App]) -> collections.abc.Iterator[str]
Returns all morepath modules which should be scanned for the given
morepath application class.
We can't reliably know the actual morepath modules that
need to be scanned, which is why we assume that each module has
one namespace (like 'more.transaction' or 'onegov.core').
.. py:function:: scan_morepath_modules(cls: type[morepath.App]) -> None
Tries to scan all the morepath modules required for the given
application class. This is not guaranteed to stay reliable as there is
no sure way to discover all modules required by the application class.
.. py:function:: get_unique_hstore_keys(session: sqlalchemy.orm.Session, column: sqlalchemy.Column[dict[str, Any]]) -> set[str]
Returns a set of keys found in an hstore column over all records
of its table.
.. py:function:: makeopendir(fs: fs.base.FS, directory: str) -> fs.base.SubFS[fs.base.FS]
Creates and opens the given directory in the given PyFilesystem.
.. py:function:: append_query_param(url: str, key: str, value: str) -> str
Appends a single query parameter to a URL. This is faster than
using Purl, if and only if we only add one query param.
Also this function assumes that the value is already url encoded.
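The speed advantage comes from plain string concatenation instead of parsing the URL. A sketch under that assumption follows; ``append_query_param_sketch`` is a hypothetical name, and URLs with a fragment (``#...``) are not handled here.

```python
def append_query_param_sketch(url: str, key: str, value: str) -> str:
    """Append key=value to the url without parsing it.

    The value is used as-is, so it has to be url encoded by the caller.
    """
    separator = '&' if '?' in url else '?'
    return f'{url}{separator}{key}={value}'
```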
.. py:class:: PostThread(url: str, data: bytes, headers: collections.abc.Collection[tuple[str, str]], timeout: float = 30)
Bases: :py:obj:`threading.Thread`
POSTs the given data with the headers to the URL.
Example::
data = {'a': 1, 'b': 2}
data = json.dumps(data).encode('utf-8')
PostThread(
'https://example.com/post',
data,
(
('Content-Type', 'application/json; charset=utf-8'),
('Content-Length', str(len(data)))
)
).start()
This only works for external URLs! If posting to the server itself is
needed, use a process instead of the thread!
.. py:attribute:: url
.. py:attribute:: data
.. py:attribute:: headers
.. py:attribute:: timeout
:value: 30
.. py:method:: run() -> None
Method representing the thread's activity.
You may override this method in a subclass. The standard run() method
invokes the callable object passed to the object's constructor as the
target argument, if any, with sequential and keyword arguments taken
from the args and kwargs arguments, respectively.
.. py:function:: toggle(collection: set[_T], item: _T | None) -> set[_T]
Returns a new set where the item has been toggled.
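Toggling means removing the item if it is present and adding it otherwise, which is a symmetric difference with a one-element set. In the sketch below the name ``toggle_sketch`` is hypothetical, and returning an unchanged copy for ``None`` is an assumption.

```python
def toggle_sketch(collection: set, item) -> set:
    """Return a new set with the item added if missing, removed if present."""
    if item is None:
        # Assumption: nothing to toggle, return an unchanged copy.
        return set(collection)
    return collection ^ {item}
```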
.. py:function:: binary_to_dictionary(binary: bytes, filename: str | None = None) -> core.types.FileDict
Takes raw binary filedata and stores it in a dictionary together
with metadata information.
The data is compressed before it is stored in the dictionary. Use
:func:`dictionary_to_binary` to get the original binary data back.
.. py:function:: dictionary_to_binary(dictionary: core.types.LaxFileDict) -> bytes
Takes a dictionary created by :func:`binary_to_dictionary` and returns
the original binary data.
.. py:function:: safe_format(format: str, dictionary: dict[str, str | int | float], types: None = ..., adapt: collections.abc.Callable[[str], str] | None = ..., raise_on_missing: bool = ...) -> str
safe_format(format: str, dictionary: dict[str, _T], types: set[type[_T]] = ..., adapt: collections.abc.Callable[[str], str] | None = ..., raise_on_missing: bool = ...) -> str
Takes a user-supplied string with format blocks and returns a string
where those blocks are replaced by values in a dictionary.
For example::
>>> safe_format('[user] has logged in', {'user': 'admin'})
'admin has logged in'
:param format:
The format to use. Square brackets denote dictionary keys. To
literally print square brackets, mask them by doubling ('[[' -> '[').
:param dictionary:
The dictionary holding the variables to use. If the key is not found
in the dictionary, the bracket is replaced with an empty string.
:param types:
A set of types supported by the dictionary. Limiting this to safe
types like builtins (str, int, float) ensures that no values are
accidentally leaked through faulty __str__ representations.
Note that inheritance is ignored. Supported types need to be
whitelisted explicitly.
:param adapt:
An optional callable that receives the key before it is used. Returns
the same key or an altered version.
:param raise_on_missing:
True if missing keys should result in a runtime error (defaults to
False).
This is strictly meant for formats provided by users. Python's string
formatting options are clearly more powerful than this, but they are less
secure when the format comes from a user!
.. py:function:: safe_format_keys(format: str, adapt: collections.abc.Callable[[str], str] | None = None) -> list[str]
Takes a :func:`safe_format` string and returns the found keys.
.. py:function:: is_valid_yubikey(client_id: str, secret_key: str, expected_yubikey_id: str, yubikey: str) -> bool
Asks the yubico validation servers if the given yubikey OTP is valid.
:client_id:
The yubico API client id.
:secret_key:
The yubico API secret key.
:expected_yubikey_id:
The expected yubikey id. The yubikey id is defined as the first twelve
characters of any yubikey value. Each user should have a yubikey
associated with their account. If the yubikey value comes from a
different key, the key is invalid.
:yubikey:
The actual yubikey value that should be verified.
:return: True if yubico confirmed the validity of the key.
.. py:function:: is_valid_yubikey_format(otp: str) -> bool
Returns True if the given OTP has the correct format. Does not actually
contact Yubico, so this function may return True for some invalid keys.
.. py:function:: yubikey_otp_to_serial(otp: str) -> int | None
Takes a Yubikey OTP and calculates the serial number of the key.
The serial number is printed on the yubikey, in decimal and as a QR code.
Example::
>>> yubikey_otp_to_serial(
'ccccccdefghdefghdefghdefghdefghdefghdefghklv')
2311522
Adapted from Java::
https://github.com/Yubico/yubikey-salesforce-client/blob/
e38e46ee90296a852374a8b744555e99d16b6ca7/src/classes/Modhex.cls
If the key cannot be calculated, None is returned. This can happen if
the key is malformed.
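For the example above, the serial number equals the first twelve OTP characters read as a base-16 number in the modhex alphabet (``ALPHABET``). The sketch below shows only that reading. The name ``modhex_prefix_to_int`` is hypothetical, and it is not confirmed that the module's function behaves the same for every input.

```python
from typing import Optional

# Same value as ALPHABET above: each character stands for one hex digit.
MODHEX = 'cbdefghijklnrtuv'


def modhex_prefix_to_int(otp: str) -> Optional[int]:
    """Read the first twelve characters of an OTP as a modhex number."""
    public_id = otp[:12]
    if len(public_id) < 12:
        return None
    value = 0
    for char in public_id:
        digit = MODHEX.find(char)
        if digit == -1:
            # Not a modhex character, the OTP is malformed.
            return None
        value = value * 16 + digit
    return value


serial = modhex_prefix_to_int('ccccccdefghdefghdefghdefghdefghdefghdefghklv')
```

Here ``ccccccdefghd`` maps to the hex digits ``000000234562``, which is 2311522 in decimal.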
.. py:function:: yubikey_public_id(otp: str) -> str
Returns the yubikey identity given a token.
.. py:function:: dict_path(dictionary: dict[str, _T], path: str) -> _T
Gets the value of the given dictionary at the given path.
For example::
>>> data = {'foo': {'bar': True}}
>>> dict_path(data, 'foo.bar')
True
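The lookup can be sketched as a walk over the dot-separated keys. ``dict_path_sketch`` is a hypothetical name, and raising ``KeyError`` for a missing key is an assumption about the behaviour.

```python
from typing import Any


def dict_path_sketch(dictionary: dict[str, Any], path: str) -> Any:
    """Follow a dot-separated path through nested dictionaries."""
    value: Any = dictionary
    for key in path.split('.'):
        # A missing key raises KeyError (assumed behaviour).
        value = value[key]
    return value
```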
.. py:function:: safe_move(src: str, dst: str, tmp_dst: str | None = None) -> None
Rename a file from ``src`` to ``dst``.
Optionally provide a ``tmp_dst`` where the file will be copied to
before being renamed. This needs to be on the same filesystem as
``dst``, otherwise this will fail.
* Moves must be atomic. ``shutil.move()`` is not atomic.
* Moves must work across filesystems. Often temp directories and the
cache directories live on different filesystems. ``os.rename()`` can
throw errors if run across filesystems.
So we try ``os.rename()``, but if we detect a cross-filesystem copy, we
switch to ``shutil.move()`` with some wrappers to make it atomic.
Via https://alexwlchan.net/2019/03/atomic-cross-filesystem-moves-in-python
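The strategy described above can be sketched as follows. This is an illustration along the lines of the linked article, not the module's code: ``safe_move_sketch`` and the temporary file naming are assumptions.

```python
import errno
import os
import shutil
import tempfile
import uuid
from typing import Optional


def safe_move_sketch(src: str, dst: str, tmp_dst: Optional[str] = None) -> None:
    """Rename src to dst, falling back to copy + rename across filesystems."""
    try:
        os.rename(src, dst)
    except OSError as error:
        if error.errno != errno.EXDEV:
            raise
        # Cross-filesystem move: copy next to dst first (assumed naming),
        # then rename within the target filesystem, which is atomic.
        tmp = tmp_dst or f'{dst}.{uuid.uuid4()}.tmp'
        shutil.copyfile(src, tmp)
        os.rename(tmp, dst)
        os.unlink(src)


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, 'a.txt')
    target = os.path.join(folder, 'b.txt')
    with open(source, 'w') as f:
        f.write('hello')
    safe_move_sketch(source, target)
    moved = not os.path.exists(source) and os.path.exists(target)
```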
.. py:function:: batched(iterable: collections.abc.Iterable[_T], batch_size: int, container_factory: type[tuple] = ...) -> collections.abc.Iterator[tuple[_T, Ellipsis]]
batched(iterable: collections.abc.Iterable[_T], batch_size: int, container_factory: type[list]) -> collections.abc.Iterator[list[_T]]
batched(iterable: collections.abc.Iterable[_T], batch_size: int, container_factory: collections.abc.Callable[[collections.abc.Iterator[_T]], collections.abc.Collection[_T]]) -> collections.abc.Iterator[collections.abc.Collection[_T]]
Splits an iterable into batches of batch_size and puts them
inside a given collection (tuple by default).
The container_factory is necessary in order to consume the iterator
returned by islice. Otherwise this function would never return.
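The remark about ``islice`` can be illustrated with a short sketch: an ``islice`` object is always truthy, so the loop can only detect exhaustion after the slice has been turned into a real container. ``batched_sketch`` is a hypothetical name, not the module's implementation.

```python
from itertools import islice


def batched_sketch(iterable, batch_size, container_factory=tuple):
    """Yield containers holding up to batch_size items each."""
    iterator = iter(iterable)
    while True:
        # Materialize the slice, an empty container signals exhaustion.
        batch = container_factory(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

For example, ``list(batched_sketch(range(5), 2))`` gives ``[(0, 1), (2, 3), (4,)]``.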