core.utils
==========

.. py:module:: core.utils


Attributes
----------

.. autoapisummary::

   core.utils._T
   core.utils._KT
   core.utils._unwanted_url_chars
   core.utils._double_dash
   core.utils._number_suffix
   core.utils._repeated_spaces
   core.utils._uuid
   core.utils._email_regex
   core.utils._multiple_newlines
   core.utils._phone_inside_a_tags
   core.utils._phone_ch_country_code
   core.utils._phone_ch
   core.utils._phone_ch_html_safe
   core.utils.ALPHABET
   core.utils.ALPHABET_RE


Classes
-------

.. autoapisummary::

   core.utils.Bunch
   core.utils.PostThread


Functions
---------

.. autoapisummary::

   core.utils.local_lock
   core.utils.normalize_for_url
   core.utils.increment_name
   core.utils.remove_repeated_spaces
   core.utils.profile
   core.utils.timing
   core.utils.module_path_root
   core.utils.module_path
   core.utils.touch
   core.utils.render_file
   core.utils.hash_dictionary
   core.utils.groupbylist
   core.utils.linkify_phone
   core.utils.top_level_domains
   core.utils.linkify
   core.utils.paragraphify
   core.utils.to_html_ul
   core.utils.ensure_scheme
   core.utils.is_uuid
   core.utils.is_non_string_iterable
   core.utils.relative_url
   core.utils.is_subpath
   core.utils.is_sorted
   core.utils.morepath_modules
   core.utils.scan_morepath_modules
   core.utils.get_unique_hstore_keys
   core.utils.makeopendir
   core.utils.append_query_param
   core.utils.toggle
   core.utils.binary_to_dictionary
   core.utils.dictionary_to_binary
   core.utils.safe_format
   core.utils.safe_format_keys
   core.utils.is_valid_yubikey
   core.utils.is_valid_yubikey_format
   core.utils.yubikey_otp_to_serial
   core.utils.yubikey_public_id
   core.utils.dict_path
   core.utils.safe_move
   core.utils.batched
   core.utils.generate_fts_phonenumbers


Module Contents
---------------

.. py:data:: _T

.. py:data:: _KT

.. py:data:: _unwanted_url_chars

.. py:data:: _double_dash

.. py:data:: _number_suffix

.. py:data:: _repeated_spaces

.. py:data:: _uuid

.. py:data:: _email_regex

.. py:data:: _multiple_newlines

.. py:data:: _phone_inside_a_tags
   :value: '(\\">|href=\\"tel:)?'


.. py:data:: _phone_ch_country_code
   :value: '(\\+41|0041|0[0-9]{2})'


.. py:data:: _phone_ch

.. py:data:: _phone_ch_html_safe

.. py:data:: ALPHABET
   :value: 'cbdefghijklnrtuv'


.. py:data:: ALPHABET_RE

.. py:function:: local_lock(namespace: str, key: str) -> collections.abc.Iterator[None]

   Locks the given namespace/key combination on the current system,
   automatically freeing it after the with statement has been completed or
   once the process is killed.

   Usage::

       with local_lock('namespace', 'key'):
           pass


.. py:function:: normalize_for_url(text: str) -> str

   Takes the given text and makes it suitable for use in a URL.

   That means replacing spaces and other unwanted characters with '-',
   lowercasing everything and turning unicode characters into their closest
   ascii equivalent using Unidecode.

   See https://pypi.python.org/pypi/Unidecode


.. py:function:: increment_name(name: str) -> str

   Takes the given name and adds a numbered suffix beginning at 1.

   For example::

       foo => foo-1
       foo-1 => foo-2
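
   A minimal sketch of this behaviour, assuming the suffix is a trailing
   ``-<number>``. The name ``increment_name_sketch`` and its regex are
   illustrative and not necessarily how ``core.utils`` implements it::

       import re

       # Assumed suffix pattern: a dash followed by digits at the very end.
       _suffix_sketch = re.compile(r'-([0-9]+)$')


       def increment_name_sketch(name: str) -> str:
           match = _suffix_sketch.search(name)
           if match is None:
               # No numbered suffix yet, so start counting at 1.
               return f'{name}-1'
           # Replace the existing suffix with the next higher number.
           number = int(match.group(1)) + 1
           return f'{name[:match.start()]}-{number}'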


.. py:function:: remove_repeated_spaces(text: str) -> str

   Removes repeated spaces in the text ('a  b' -> 'a b'). 


.. py:function:: profile(filename: str) -> collections.abc.Iterator[None]

   Profiles the wrapped code and stores the result in the profiles folder
   with the given filename.


.. py:function:: timing(name: str | None = None) -> collections.abc.Iterator[None]

   Runs the wrapped code and prints the time in ms it took to run it.
   The name is printed in front of the time, if given.
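
   A context manager of this kind could be sketched as follows. The name
   ``timing_sketch`` and the exact output format are assumptions made for
   illustration only::

       import time
       from collections.abc import Iterator
       from contextlib import contextmanager
       from typing import Optional


       @contextmanager
       def timing_sketch(name: Optional[str] = None) -> Iterator[None]:
           start = time.perf_counter()
           try:
               yield
           finally:
               # Convert the elapsed seconds to milliseconds.
               duration_ms = (time.perf_counter() - start) * 1000.0
               if name:
                   print(f'{name}: {duration_ms:.0f} ms')
               else:
                   print(f'{duration_ms:.0f} ms')


       with timing_sketch('sum'):
           sum(range(1000))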


.. py:function:: module_path_root(module: types.ModuleType | str) -> str

.. py:function:: module_path(module: types.ModuleType | str, subpath: str) -> str

   Returns a subdirectory in the given python module.

   :module:
       A python module (actual module or string)

   :subpath:
       Subpath below that python module. Leading slashes ('/') are ignored.


.. py:function:: touch(file_path: str) -> None

   Touches the file on the given path. 


.. py:class:: Bunch(**kwargs: Any)

   A simple but handy "collector of a bunch of named stuff" class.

   See `<https://code.activestate.com/recipes/52308-the-simple-but-handy-collector-of-a-bunch-of-named/>`_.

   For example::

       point = Bunch(x=1, y=2)
       assert point.x == 1
       assert point.y == 2

       point.z = 3
       assert point.z == 3

   Allows the creation of simple nested bunches, for example::

       request = Bunch(**{'app.settings.org.my_setting': True})
       assert request.app.settings.org.my_setting is True


   .. py:method:: __getattr__(name: str) -> Any


   .. py:method:: __eq__(other: object) -> bool


   .. py:method:: __ne__(other: object) -> bool


.. py:function:: render_file(file_path: str, request: core.request.CoreRequest) -> webob.Response

   Takes the given file_path (content) and renders it to the browser.
   The file must exist on the local system and be readable by the current
   process.


.. py:function:: hash_dictionary(dictionary: dict[str, Any]) -> str

   Computes a sha256 hash for the given dictionary. The dictionary
   is expected to only contain values that can be serialized by json.

   That includes int, decimal, string, boolean.

   Note that this function is not meant to be used for hashing secrets. Do
   not include data in this dictionary that is secret!
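
   One way to hash a dictionary deterministically is sketched below. It
   assumes the dictionary is serialized as JSON with sorted keys and that
   the hex digest is returned; ``hash_dictionary_sketch`` is a hypothetical
   name and the actual serialization may differ::

       import hashlib
       import json


       def hash_dictionary_sketch(dictionary: dict) -> str:
           # Sorting the keys makes the result independent of insertion order.
           serialized = json.dumps(dictionary, sort_keys=True).encode('utf-8')
           return hashlib.sha256(serialized).hexdigest()


       first = hash_dictionary_sketch({'a': 1, 'b': 'two'})
       second = hash_dictionary_sketch({'b': 'two', 'a': 1})
       assert first == second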


.. py:function:: groupbylist(iterable: collections.abc.Iterable[_T], key: None = ...) -> list[tuple[_T, list[_T]]]
                 groupbylist(iterable: collections.abc.Iterable[_T], key: collections.abc.Callable[[_T], _KT]) -> list[tuple[_KT, list[_T]]]

   Works just like Python's ``itertools.groupby`` function, but instead
   of returning generators, it returns lists.
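
   The idea can be sketched on top of ``itertools.groupby``. The name
   ``groupbylist_sketch`` is illustrative. As with ``groupby``, only
   consecutive items with the same key are grouped::

       from itertools import groupby


       def groupbylist_sketch(iterable, key=None):
           # Materialize each group so the result can be reused and indexed.
           return [(k, list(group)) for k, group in groupby(iterable, key=key)]


       groups = groupbylist_sketch(['a', 'b', 'cc'], key=len)
       assert groups == [(1, ['a', 'b']), (2, ['cc'])]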


.. py:function:: linkify_phone(text: str) -> markupsafe.Markup

   Takes a string and replaces valid phone numbers with html links. If a
   phone number is matched, it will be replaced by the result of a callback
   function, that does further checks on the regex match. If these checks do
   not pass, the matched number will remain unchanged.


.. py:function:: top_level_domains() -> set[str]

.. py:function:: linkify(text: str | None) -> markupsafe.Markup

   Takes plain text and injects html links for urls and email addresses.

   By default the text is html escaped before it is linkified. This accounts
   for the fact that we usually use this for text blocks that we mean to
   extend with email addresses and urls.

   If html is already possible, why linkify it?

   Note: We need to clean the html after we've created it (linkify
   parses escaped html and turns it into real html). As a consequence it
   is possible to have html urls in the text that won't be escaped.


.. py:function:: paragraphify(text: str) -> markupsafe.Markup

   Takes a text with newlines and groups its lines into paragraphs
   according to the following rules:

   If there's a single newline between two lines, a <br> will replace that
   newline.

   If there are multiple newlines between two lines, each line will become
   a paragraph and the extra newlines are discarded.


.. py:function:: to_html_ul(value: str | None, convert_dashes: bool = True, with_title: bool = False) -> markupsafe.Markup

   Linkifies the text and converts it into one or more ul's or paragraphs.

.. py:function:: ensure_scheme(url: str, default: str = 'http') -> str
                 ensure_scheme(url: None, default: str = 'http') -> None

   Makes sure that the given url has a scheme in front, if none
   was provided.


.. py:function:: is_uuid(value: str | uuid.UUID) -> bool

   Returns true if the given value is a uuid. The value may be a string
   or of type UUID. If it's a string, the uuid is checked with a regex.


.. py:function:: is_non_string_iterable(obj: object) -> bool

   Returns true if the given obj is an iterable, but not a string. 


.. py:function:: relative_url(absolute_url: str | None) -> str

   Removes everything in front of the path, including scheme, host,
   username, password and port.


.. py:function:: is_subpath(directory: str, path: str) -> bool

   Returns true if the given path is inside the given directory. 


.. py:function:: is_sorted(iterable: collections.abc.Iterable[_typeshed.SupportsRichComparison], key: collections.abc.Callable[[_typeshed.SupportsRichComparison], _typeshed.SupportsRichComparison] = ..., reverse: bool = ...) -> bool
                 is_sorted(iterable: collections.abc.Iterable[_T], key: collections.abc.Callable[[_T], _typeshed.SupportsRichComparison], reverse: bool = ...) -> bool

   Returns True if the iterable is sorted. 
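
   A simple way to express this check is to compare the items with their
   sorted copy. This is a sketch under that assumption, with the
   hypothetical name ``is_sorted_sketch``. The real function may avoid the
   extra copy::

       def is_sorted_sketch(iterable, key=lambda item: item, reverse=False):
           items = list(iterable)
           # sorted() is stable, so an already sorted list compares equal.
           return items == sorted(items, key=key, reverse=reverse)


       assert is_sorted_sketch([1, 2, 3])
       assert not is_sorted_sketch([2, 1, 3])
       assert is_sorted_sketch([3, 2, 1], reverse=True)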


.. py:function:: morepath_modules(cls: type[morepath.App]) -> collections.abc.Iterator[str]

   Returns all morepath modules which should be scanned for the given
   morepath application class.

   We can't reliably know the actual morepath modules that
   need to be scanned, which is why we assume that each module has
   one namespace (like 'more.transaction' or 'onegov.core').


.. py:function:: scan_morepath_modules(cls: type[morepath.App]) -> None

   Tries to scan all the morepath modules required for the given
   application class. This is not guaranteed to stay reliable as there is
   no sure way to discover all modules required by the application class.


.. py:function:: get_unique_hstore_keys(session: sqlalchemy.orm.Session, column: sqlalchemy.Column[dict[str, Any]]) -> set[str]

   Returns a set of keys found in an hstore column over all records
   of its table.


.. py:function:: makeopendir(fs: fs.base.FS, directory: str) -> fs.base.SubFS[fs.base.FS]

   Creates and opens the given directory in the given PyFilesystem. 


.. py:function:: append_query_param(url: str, key: str, value: str) -> str

   Appends a single query parameter to a URL. This is faster than
   using Purl, if and only if we only add one query param.

   Also this function assumes that the value is already url encoded.
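
   A minimal sketch of such a helper, assuming it only has to decide
   between ``?`` and ``&`` as the separator. ``append_query_param_sketch``
   is a hypothetical name, and URL fragments are not handled here::

       def append_query_param_sketch(url: str, key: str, value: str) -> str:
           # Use '&' if the URL already carries a query string, '?' otherwise.
           separator = '&' if '?' in url else '?'
           return f'{url}{separator}{key}={value}'


       url = append_query_param_sketch('https://example.com', 'page', '2')
       assert url == 'https://example.com?page=2'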


.. py:class:: PostThread(url: str, data: bytes, headers: collections.abc.Collection[tuple[str, str]], timeout: float = 30)

   Bases: :py:obj:`threading.Thread`


   POSTs the given data with the headers to the URL.

   Example::

       data = {'a': 1, 'b': 2}
       data = json.dumps(data).encode('utf-8')
       PostThread(
           'https://example.com/post',
           data,
           (
               ('Content-Type', 'application/json; charset=utf-8'),
               ('Content-Length', str(len(data)))
           )
       ).start()

   This only works for external URLs! If posting to the server itself is
   needed, use a process instead of the thread!


   .. py:attribute:: url


   .. py:attribute:: data


   .. py:attribute:: headers


   .. py:attribute:: timeout
      :value: 30


   .. py:method:: run() -> None

      Method representing the thread's activity.

      You may override this method in a subclass. The standard run() method
      invokes the callable object passed to the object's constructor as the
      target argument, if any, with sequential and keyword arguments taken
      from the args and kwargs arguments, respectively.


.. py:function:: toggle(collection: set[_T], item: _T | None) -> set[_T]

   Returns a new set where the item has been toggled. 
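
   Toggling can be sketched as removing the item if it is present and
   adding it otherwise. The handling of ``None`` (the set is returned
   unchanged) is an assumption based on the signature, and
   ``toggle_sketch`` is a hypothetical name::

       def toggle_sketch(collection: set, item) -> set:
           if item is None:
               # Assumption: nothing to toggle, return an unchanged copy.
               return set(collection)
           if item in collection:
               return collection - {item}
           return collection | {item}


       assert toggle_sketch({1, 2}, 2) == {1}
       assert toggle_sketch({1}, 2) == {1, 2}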


.. py:function:: binary_to_dictionary(binary: bytes, filename: str | None = None) -> core.types.FileDict

   Takes raw binary filedata and stores it in a dictionary together
   with metadata information.

   The data is compressed before it is stored in the dictionary. Use
   :func:`dictionary_to_binary` to get the original binary data back.


.. py:function:: dictionary_to_binary(dictionary: core.types.LaxFileDict) -> bytes

   Takes a dictionary created by :func:`binary_to_dictionary` and returns
   the original binary data.


.. py:function:: safe_format(format: str, dictionary: dict[str, str | int | float], types: None = ..., adapt: collections.abc.Callable[[str], str] | None = ..., raise_on_missing: bool = ...) -> str
                 safe_format(format: str, dictionary: dict[str, _T], types: set[type[_T]] = ..., adapt: collections.abc.Callable[[str], str] | None = ..., raise_on_missing: bool = ...) -> str

   Takes a user-supplied string with format blocks and returns a string
   where those blocks are replaced by values in a dictionary.

   For example::

       >>> safe_format('[user] has logged in', {'user': 'admin'})
       'admin has logged in'

   :param format:
       The format to use. Square brackets denote dictionary keys. To
       literally print square brackets, mask them by doubling ('[[' -> '[').

   :param dictionary:
       The dictionary holding the variables to use. If the key is not found
       in the dictionary, the bracket is replaced with an empty string.

   :param types:
       A set of types supported by the dictionary. Limiting this to safe
       types like builtins (str, int, float) ensures that no values are
       accidentally leaked through faulty __str__ representations.

       Note that inheritance is ignored. Supported types need to be
       whitelisted explicitly.

   :param adapt:
       An optional callable that receives the key before it is used. Returns
       the same key or an altered version.

   :param raise_on_missing:
       True if missing keys should result in a runtime error (defaults to
       False).

   This is strictly meant for formats provided by users. Python's string
   formatting options are clearly more powerful, but they are less secure
   when the format string comes from a user!


.. py:function:: safe_format_keys(format: str, adapt: collections.abc.Callable[[str], str] | None = None) -> list[str]

   Takes a :func:`safe_format` string and returns the found keys. 


.. py:function:: is_valid_yubikey(client_id: str, secret_key: str, expected_yubikey_id: str, yubikey: str) -> bool

   Asks the yubico validation servers if the given yubikey OTP is valid.

   :client_id:
       The yubico API client id.

   :secret_key:
       The yubico API secret key.

   :expected_yubikey_id:
       The expected yubikey id. The yubikey id is defined as the first twelve
       characters of any yubikey value. Each user should have a yubikey
       associated with their account. If the yubikey value comes from a
       different key, the key is invalid.

   :yubikey:
       The actual yubikey value that should be verified.

   :return: True if yubico confirmed the validity of the key.


.. py:function:: is_valid_yubikey_format(otp: str) -> bool

   Returns True if the given OTP has the correct format. Does not actually
   contact Yubico, so this function may return True for some invalid keys.


.. py:function:: yubikey_otp_to_serial(otp: str) -> int | None

   Takes a Yubikey OTP and calculates the serial number of the key.

   The serial number is printed on the yubikey, in decimal and as a QR code.

   Example::

       >>> yubikey_otp_to_serial(
       ...     'ccccccdefghdefghdefghdefghdefghdefghdefghklv')
       2311522

   Adapted from Java::

       https://github.com/Yubico/yubikey-salesforce-client/blob/
       e38e46ee90296a852374a8b744555e99d16b6ca7/src/classes/Modhex.cls

   If the serial number cannot be calculated, None is returned. This can
   happen if the key is malformed.
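
   The documented example is consistent with reading the first twelve
   characters of the OTP as a modhex number, using the module's
   ``ALPHABET`` as the digits 0 to 15. The following is a sketch under
   that assumption. ``modhex_serial_sketch`` is a hypothetical name and
   the actual implementation, adapted from the Java source, may differ in
   detail::

       ALPHABET = 'cbdefghijklnrtuv'


       def modhex_serial_sketch(otp: str):
           public_id = otp[:12]
           if len(public_id) < 12:
               return None
           if any(char not in ALPHABET for char in public_id):
               # Malformed key: contains characters outside of modhex.
               return None
           value = 0
           for char in public_id:
               # Each modhex character encodes one hexadecimal digit.
               value = value * 16 + ALPHABET.index(char)
           return value


       otp = 'ccccccdefghdefghdefghdefghdefghdefghdefghklv'
       assert modhex_serial_sketch(otp) == 2311522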


.. py:function:: yubikey_public_id(otp: str) -> str

   Returns the yubikey identity given a token. 


.. py:function:: dict_path(dictionary: dict[str, _T], path: str) -> _T

   Gets the value of the given dictionary at the given path.

   For example::

       >>> data = {'foo': {'bar': True}}
       >>> dict_path(data, 'foo.bar')
       True
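
   A lookup like this can be sketched by splitting the path on dots and
   descending one key at a time. ``dict_path_sketch`` is a hypothetical
   name, and raising ``KeyError`` for missing keys is an assumption::

       def dict_path_sketch(dictionary: dict, path: str):
           value = dictionary
           for key in path.split('.'):
               # Descend one level per path segment.
               value = value[key]
           return value


       data = {'foo': {'bar': True}}
       assert dict_path_sketch(data, 'foo.bar') is True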


.. py:function:: safe_move(src: str, dst: str, tmp_dst: str | None = None) -> None

   Rename a file from ``src`` to ``dst``.

   Optionally provide a ``tmp_dst`` where the file will be copied to
   before being renamed. This needs to be on the same filesystem as
   ``dst``, otherwise the final rename will fail.

   * Moves must be atomic.  ``shutil.move()`` is not atomic.

   * Moves must work across filesystems.  Often temp directories and the
     cache directories live on different filesystems.  ``os.rename()`` can
     throw errors if run across filesystems.

   So we try ``os.rename()``, but if we detect a cross-filesystem copy, we
   switch to ``shutil.move()`` with some wrappers to make it atomic.

   Via https://alexwlchan.net/2019/03/atomic-cross-filesystem-moves-in-python
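
   The approach described above can be sketched as follows. The name
   ``safe_move_sketch`` and the temporary file naming are illustrative
   assumptions. The sketch omits the optional ``tmp_dst`` parameter::

       import errno
       import os
       import shutil
       import uuid


       def safe_move_sketch(src: str, dst: str) -> None:
           try:
               # Atomic as long as src and dst are on the same filesystem.
               os.rename(src, dst)
           except OSError as error:
               if error.errno != errno.EXDEV:
                   raise
               # Cross-filesystem move: copy next to dst under a unique
               # temporary name, then rename atomically and remove src.
               tmp_dst = f'{dst}.{uuid.uuid4()}.tmp'
               shutil.copyfile(src, tmp_dst)
               os.rename(tmp_dst, dst)
               os.unlink(src)

   Because the copy lands under a unique temporary name first, readers of
   ``dst`` never observe a partially written file.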


.. py:function:: batched(iterable: collections.abc.Iterable[_T], batch_size: int, container_factory: type[tuple] = ...) -> collections.abc.Iterator[tuple[_T, Ellipsis]]
                 batched(iterable: collections.abc.Iterable[_T], batch_size: int, container_factory: type[list]) -> collections.abc.Iterator[list[_T]]
                 batched(iterable: collections.abc.Iterable[_T], batch_size: int, container_factory: collections.abc.Callable[[collections.abc.Iterator[_T]], collections.abc.Collection[_T]]) -> collections.abc.Iterator[collections.abc.Collection[_T]]

   Splits an iterable into batches of batch_size and puts them
   inside a given collection (tuple by default).

   The container_factory is necessary in order to consume the iterator
   returned by islice. Otherwise this function would never return.
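
   The role of the container can be seen in a small sketch.
   ``batched_sketch`` is a hypothetical name. An ``islice`` object is
   always truthy, so each slice has to be consumed into a container
   before an empty batch, and thus the end of the input, can be detected::

       from itertools import islice


       def batched_sketch(iterable, batch_size, container_factory=tuple):
           iterator = iter(iterable)
           while True:
               # Consume up to batch_size items from the shared iterator.
               batch = container_factory(islice(iterator, batch_size))
               if len(batch) == 0:
                   return
               yield batch


       assert list(batched_sketch(range(5), 2)) == [(0, 1), (2, 3), (4,)]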


.. py:function:: generate_fts_phonenumbers(numbers: collections.abc.Iterable[str | None]) -> list[str]

   Generates a list of phone numbers in various formats for full text
   search. This includes the international, the national and the local
   format as well as the extension.