search.indexer

Classes

IndexTask

dict() -> new empty dictionary

Indexer

TypeMapping

TypeMappingRegistry

ORMLanguageDetector

Detects languages with the help of lingua-language-detector.

ORMEventTranslator

Handles the onegov.core orm events, translates them into indexing actions and puts the result into a queue for the indexer to consume.

Module Contents

class search.indexer.IndexTask

Bases: TypedDict

dict() -> new empty dictionary

dict(mapping) -> new dictionary initialized from a mapping object’s (key, value) pairs

dict(iterable) -> new dictionary initialized as if via:

    d = {}
    for k, v in iterable:
        d[k] = v

dict(**kwargs) -> new dictionary initialized with the name=value pairs in the keyword argument list. For example: dict(one=1, two=2)

action: Literal['index']
id: uuid.UUID | str | int
id_key: str
schema: str
tablename: str
owner_type: str
language: str
access: str
public: bool
suggestion: list[str]
tags: list[str]
last_change: datetime.datetime | None
publication_start: datetime.datetime | None
publication_end: datetime.datetime | None
properties: dict[str, str]
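
As an illustration, a task with the fields listed above can be written as a plain dictionary. The sketch below redeclares the fields in a local TypedDict (IndexTaskSketch, a hypothetical stand-in) so that it runs without onegov installed. All values, including the schema and table names, are invented for the example.

```python
from __future__ import annotations

import uuid
from datetime import datetime, timezone
from typing import Literal, TypedDict


class IndexTaskSketch(TypedDict):
    """Hypothetical local mirror of the documented IndexTask fields."""
    action: Literal['index']
    id: uuid.UUID | str | int
    id_key: str
    schema: str
    tablename: str
    owner_type: str
    language: str
    access: str
    public: bool
    suggestion: list[str]
    tags: list[str]
    last_change: datetime | None
    publication_start: datetime | None
    publication_end: datetime | None
    properties: dict[str, str]


# Every value below is made up for illustration.
task: IndexTaskSketch = {
    'action': 'index',
    'id': uuid.uuid4(),
    'id_key': 'id',
    'schema': 'example-schema',
    'tablename': 'pages',
    'owner_type': 'Page',
    'language': 'de',
    'access': 'public',
    'public': True,
    'suggestion': ['Town hall'],
    'tags': ['administration'],
    'last_change': datetime.now(timezone.utc),
    'publication_start': None,
    'publication_end': None,
    'properties': {'title': 'Town hall'},
}
```

Because a TypedDict is an ordinary dict at runtime, such a task can be put on a queue or grouped by key without any conversion.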
class search.indexer.Indexer(mappings: TypeMappingRegistry, queue: queue.Queue[Task], engine: sqlalchemy.engine.Engine, languages: set[str] | None = None)
queue: queue.Queue[Task]
mappings
engine
languages
index(tasks: list[IndexTask] | IndexTask, session: sqlalchemy.orm.Session | None = None) -> bool

Update the ‘search_index’ table (full text search index) of the given object(s)/task(s).

When several tasks are passed, they are assumed to all belong to the same schema and table, which allows the indexing process to be optimized.

When a session is passed we use that session’s transaction context and use a savepoint instead of our own transaction to perform the action.

Parameters:
  • tasks – A single task or a list of tasks to index

  • session – An optional active session whose transaction context is used

Returns:

True if the indexing was successful, False otherwise
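
The difference between running inside a caller's transaction (via a savepoint) and opening an own transaction can be sketched with the standard library's sqlite3 module. This is only an analogy under stated assumptions: the function index_rows, the nested flag and the id/body columns are hypothetical, and the actual Indexer works through SQLAlchemy, where a savepoint is typically opened with Session.begin_nested().

```python
import sqlite3


def index_rows(conn, rows, nested):
    """Insert rows and report success as a boolean.

    nested=True  -> work inside the caller's transaction via a savepoint
    nested=False -> open and finish an own transaction

    The connection is expected to be in autocommit mode
    (isolation_level=None) so that transactions are controlled manually.
    """
    conn.execute("SAVEPOINT idx" if nested else "BEGIN")
    try:
        conn.executemany(
            "INSERT INTO search_index (id, body) VALUES (?, ?)", rows
        )
    except sqlite3.Error:
        if nested:
            # Undo only our own work; the caller's transaction survives.
            conn.execute("ROLLBACK TO SAVEPOINT idx")
            conn.execute("RELEASE SAVEPOINT idx")
        else:
            conn.execute("ROLLBACK")
        return False
    conn.execute("RELEASE SAVEPOINT idx" if nested else "COMMIT")
    return True
```

With a savepoint, a failed batch is rolled back without discarding whatever the caller has already written in the surrounding transaction, which is the practical benefit of reusing the caller's session.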

execute_statement(session: sqlalchemy.orm.Session | None, schema: str, stmt: sqlalchemy.sql.expression.Executable, params: list[dict[str, Any]] | None = None) -> None
delete(tasks: list[IndexTask] | IndexTask, session: sqlalchemy.orm.Session | None = None) -> bool
process(session: sqlalchemy.orm.Session | None = None) -> int

Processes the queue in bulk.

Gathers all tasks and groups them by action and owner type.

Returns the number of successfully processed batches.
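
The gather-and-group step described above might look roughly like the following. This is a minimal sketch, not the actual implementation: drain_and_group is a hypothetical helper, and it assumes every queued task is a dictionary carrying 'action' and 'owner_type' keys (only the 'index' task shape is documented here).

```python
import queue
from collections import defaultdict


def drain_and_group(task_queue):
    """Empty the queue and group its tasks by (action, owner_type)."""
    grouped = defaultdict(list)
    while True:
        try:
            task = task_queue.get_nowait()
        except queue.Empty:
            break  # nothing left to gather
        grouped[(task['action'], task['owner_type'])].append(task)
        task_queue.task_done()
    return dict(grouped)
```

Each resulting group can then be handled as one batch, which fits the documented return value of process: the number of successfully processed batches.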

delete_search_index(schema: str) -> None

Deletes all records in the search index table of the given schema.

class search.indexer.TypeMapping(name: str, mapping: dict[str, Any], model: type[onegov.search.Searchable] | None = None)
__slots__ = ('name', 'mapping', 'model')
name
mapping
model = None
class search.indexer.TypeMappingRegistry
mappings: dict[str, TypeMapping]
__getitem__(key: str) -> TypeMapping
__iter__() -> collections.abc.Iterator[TypeMapping]
register_orm_base(base: type[object]) -> None

Takes the given SQLAlchemy base and registers all Searchable objects.

register_type(type_name: str, mapping: dict[str, Any], model: type[onegov.search.Searchable] | None = None) -> None

Registers the given type with the given mapping. The mapping is a dictionary representing the part below the mappings/type_name.

See:

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#mappings

When the mapping changes, a new index is created internally and the alias to this index (the external name of the index) is pointed to this new index.

As a consequence, a change in the mapping requires a reindex.

property registered_fields: set[str]

Goes through all the registered types and returns a set with all fields used by the mappings.
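
The relationship between register_type and registered_fields can be sketched as follows. RegistrySketch is a hypothetical stand-in, and it assumes each mapping is a flat dictionary whose top-level keys are the field names; the real mapping structure may be nested or differ otherwise.

```python
from __future__ import annotations

from typing import Any


class RegistrySketch:
    """Hypothetical, simplified stand-in for a type mapping registry."""

    def __init__(self) -> None:
        self.mappings: dict[str, dict[str, Any]] = {}

    def register_type(self, type_name: str, mapping: dict[str, Any]) -> None:
        # Assumption: a later registration replaces an earlier one.
        self.mappings[type_name] = mapping

    @property
    def registered_fields(self) -> set[str]:
        # Union of the top-level field names over all registered types.
        return {
            field
            for mapping in self.mappings.values()
            for field in mapping
        }
```

A field shared by several types, such as a title, appears only once in the resulting set.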

class search.indexer.ORMLanguageDetector(supported_languages: collections.abc.Sequence[str])

Bases: onegov.search.utils.LanguageDetector

Detects languages with the help of lingua-language-detector.

html_strip_expression
localized_properties(obj: onegov.search.Searchable) -> collections.abc.Iterator[str]
localized_texts(obj: onegov.search.Searchable, max_chars: int | None = None) -> collections.abc.Iterator[str]
detect_object_language(obj: onegov.search.Searchable) -> str
class search.indexer.ORMEventTranslator(mappings: TypeMappingRegistry, max_queue_size: int = 0, languages: collections.abc.Sequence[str] = ('de', 'fr', 'en'))

Handles the onegov.core orm events, translates them into indexing actions and puts the result into a queue for the indexer to consume.

The queue may be limited. Once the limit is reached, new events are no longer processed and an error is logged.
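
The limited-queue behaviour can be illustrated with a bounded queue.Queue from the standard library. This is a sketch under assumptions: BoundedPublisher is a hypothetical class, and whether the real translator flips its stopped flag exactly this way is not stated in this reference.

```python
import logging
import queue

log = logging.getLogger('search.sketch')


class BoundedPublisher:
    """Hypothetical sketch of a producer that stops once its queue is full."""

    def __init__(self, max_queue_size=0):
        # For queue.Queue, maxsize=0 means the queue is unbounded.
        self.queue = queue.Queue(maxsize=max_queue_size)
        self.stopped = False

    def put(self, task):
        """Enqueue a task; return False if it was dropped."""
        if self.stopped:
            return False
        try:
            self.queue.put_nowait(task)
        except queue.Full:
            log.error('The indexing queue is full, ignoring further events')
            self.stopped = True
            return False
        return True
```

Using put_nowait instead of a blocking put keeps the producer from stalling when the consumer falls behind, at the cost of dropping events once the limit is hit.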

queue: queue.Queue[Task]
mappings
detector
stopped = False
on_insert(schema: str, obj: object) -> None
on_update(schema: str, obj: object) -> None
on_delete(schema: str, obj: object) -> None
put(translation: Task) -> None
index(schema: str, obj: onegov.search.Searchable) -> None

Creates or updates the index entry for the given object.

delete(schema: str, obj: onegov.search.Searchable) -> None

Deletes the index entry of the given object.