repoze.catalog.catalog

class repoze.catalog.catalog.Catalog(family=None)
clear()
Clear all indexes in this catalog.
index_doc(docid, obj)
Register the document represented by obj in the indexes of this catalog under the document id docid.
reindex_doc(docid, obj)
Reindex the document referenced by docid using the object passed in as obj. This typically does the equivalent of unindex_doc followed by index_doc, but specialized indexes can override the method that this API calls to do less work.
search(**query)
Use the query terms to perform a query. Return a tuple of (num, resultseq) based on the merging of results from individual indexes.
unindex_doc(docid)
Unregister the document id from indexes of this catalog.
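
The following is a minimal, self-contained sketch of how the methods above fit together. It does not use repoze.catalog itself: ToyCatalog, ToyFieldIndex, and the apply method on the toy index are hypothetical names invented for illustration, and merging per-index results by intersection is an assumption of the sketch.

```python
from types import SimpleNamespace


class ToyFieldIndex:
    """Hypothetical stand-in for an index: remembers one attribute per docid."""

    def __init__(self, attr):
        self.attr = attr
        self.values = {}

    def index_doc(self, docid, obj):
        self.values[docid] = getattr(obj, self.attr, None)

    def unindex_doc(self, docid):
        self.values.pop(docid, None)

    def reindex_doc(self, docid, obj):
        # Default strategy: drop the old entry, then index the new state.
        self.unindex_doc(docid)
        self.index_doc(docid, obj)

    def clear(self):
        self.values.clear()

    def apply(self, value):
        return {docid for docid, v in self.values.items() if v == value}


class ToyCatalog:
    """Hypothetical catalog that fans each call out to its named indexes."""

    def __init__(self, **indexes):
        self.indexes = indexes

    def index_doc(self, docid, obj):
        for index in self.indexes.values():
            index.index_doc(docid, obj)

    def unindex_doc(self, docid):
        for index in self.indexes.values():
            index.unindex_doc(docid)

    def reindex_doc(self, docid, obj):
        for index in self.indexes.values():
            index.reindex_doc(docid, obj)

    def clear(self):
        for index in self.indexes.values():
            index.clear()

    def search(self, **query):
        result = None
        for name, value in query.items():
            docids = self.indexes[name].apply(value)
            # Assumption: per-index results are merged by intersection.
            result = docids if result is None else result & docids
        resultseq = sorted(result or ())
        return len(resultseq), resultseq


catalog = ToyCatalog(color=ToyFieldIndex("color"), size=ToyFieldIndex("size"))
catalog.index_doc(1, SimpleNamespace(color="red", size="L"))
catalog.index_doc(2, SimpleNamespace(color="red", size="S"))
num, docids = catalog.search(color="red", size="L")
```

In this sketch, search returns the number of matches together with the matching docids, mirroring the (num, resultseq) tuple described above.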

repoze.catalog.indexes.field

class repoze.catalog.indexes.field.CatalogFieldIndex(discriminator)
unindex_doc(docid)

See interface IInjection.

Base class overridden to be able to unindex None values.

repoze.catalog.indexes.keyword

class repoze.catalog.indexes.keyword.CatalogKeywordIndex(discriminator)

repoze.catalog.indexes.text

class repoze.catalog.indexes.text.CatalogTextIndex(discriminator, lexicon=None, index=None)
sort(result, reverse=False, limit=None, sort_type=None)

Sort by text relevance.

This only works if the query includes at least one text query, leading to a weighted result. This method raises TypeError if the result is not weighted.

A weighted result is a dictionary-ish object that has docids as keys and floating point weights as values. This method sorts the dictionary by weight and returns the sorted docids as a list.
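
As a rough illustration of the behaviour described above, the sketch below sorts a weighted result by weight. sort_by_relevance is a hypothetical helper, not the library's code. It assumes that the highest weight comes first by default and that reverse flips that order, and it ignores the sort_type argument.

```python
def sort_by_relevance(result, reverse=False, limit=None):
    """Return docids ordered by weight (hypothetical helper)."""
    if not hasattr(result, "items"):
        # A plain set of docids carries no weights, so it cannot be ranked.
        raise TypeError("result is not weighted")
    # Assumption: most relevant (highest weight) first unless reverse is set.
    ranked = sorted(result.items(), key=lambda item: item[1], reverse=not reverse)
    docids = [docid for docid, _weight in ranked]
    return docids if limit is None else docids[:limit]


weighted = {1: 0.2, 2: 0.9, 3: 0.5}
best_first = sort_by_relevance(weighted)
top_two = sort_by_relevance(weighted, limit=2)
```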

repoze.catalog.indexes.facet

class repoze.catalog.indexes.facet.CatalogFacetIndex(discriminator, facets, family=None)
counts(docids, omit_facets=())
Given a set of docids (usually returned from a query), provide count information for further facet narrowing. Optionally omit count information for the facets listed in omit_facets (a sequence of facets) and for their ancestors.
index_doc(docid, object)
Pass in an integer document id and an object from which the discriminator can obtain a sequence of facet specifiers, such as [‘style:gucci:handbag’].
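
To make the facet counting idea concrete, here is a small sketch under stated assumptions. It assumes that a specifier such as 'style:gucci:handbag' implies the ancestor facets 'style' and 'style:gucci', and that each document counts at most once per facet. expand and count_facets are hypothetical helpers and the plain dictionary stands in for the index; none of this is the library's implementation.

```python
from collections import Counter


def expand(facet):
    """Return the facet together with all of its ancestors."""
    parts = facet.split(":")
    return [":".join(parts[:i]) for i in range(1, len(parts) + 1)]


def count_facets(docid_to_facets, docids, omit_facets=()):
    """Count how many of the given docids fall under each facet."""
    omitted = set()
    for facet in omit_facets:
        # Omit the named facets and their ancestors.
        omitted.update(expand(facet))
    counts = Counter()
    for docid in docids:
        seen = set()
        for facet in docid_to_facets.get(docid, ()):
            seen.update(expand(facet))
        counts.update(seen - omitted)
    return dict(counts)


docs = {
    1: ["style:gucci:handbag"],
    2: ["style:gucci:shoes"],
    3: ["color:red"],
}
all_counts = count_facets(docs, [1, 2, 3])
narrowed = count_facets(docs, [1, 2, 3], omit_facets=["style:gucci"])
```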

repoze.catalog.indexes.path

class repoze.catalog.indexes.path.CatalogPathIndex(discriminator)

Index for model paths (tokens separated by ‘/’ characters)

A path index stores all path components of the physical path of an object.

Internal datastructure:

  • a physical path of an object is split into its components
  • every component is kept as a key of an OOBTree in self._indexes
  • the value is a mapping ‘level of the path component’ to ‘all docids with this path component on this level’
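
The layout described in the list above can be pictured with plain dictionaries. This is only a sketch: ToyPathIndex, insert_entry, and index_path are hypothetical names, nested dictionaries stand in for the OOBTree, and the level-based search shown here is an assumption about how such a structure could be queried.

```python
from collections import defaultdict


class ToyPathIndex:
    """Hypothetical model of the component -> level -> docids layout."""

    def __init__(self):
        # path component -> level in the path -> set of docids
        self._indexes = defaultdict(lambda: defaultdict(set))

    def insert_entry(self, comp, docid, level):
        self._indexes[comp][level].add(docid)

    def index_path(self, docid, path):
        # Split the physical path into components and record each level.
        comps = [c for c in path.split("/") if c]
        for level, comp in enumerate(comps):
            self.insert_entry(comp, docid, level)

    def search(self, path, default_level=0):
        # Intersect the docids found for each component at its expected level.
        comps = [c for c in path.split("/") if c]
        result = None
        for offset, comp in enumerate(comps):
            level = default_level + offset
            docids = self._indexes.get(comp, {}).get(level, set())
            result = set(docids) if result is None else result & docids
        return result or set()


index = ToyPathIndex()
index.index_path(1, "/a/b/c")
index.index_path(2, "/a/b/d")
index.index_path(3, "/x/b")
under_a_b = index.search("/a/b")
b_at_level_1 = index.search("b", default_level=1)
```

Note that the component b at level 1 matches all three documents, including the one under /x, because only the level of each component is recorded.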
apply(query)
getEntryForObject(docid)
Takes a document ID and returns all the information we have on that specific object.
insertEntry(comp, id, level)

Insert an entry.

comp is a path component, id is the docid, and level is the level of the component inside the path.

numObjects()
Return the number of distinct values.
search(path, default_level=0)

path is either a string representing a relative URL or a part of a relative URL or a tuple (path,level).

level >= 0 starts searching at the given level. level < 0 is not implemented yet.

repoze.catalog.indexes.path2

class repoze.catalog.indexes.path2.CatalogPathIndex2(discriminator, attr_discriminator=None)

Index for model paths (tokens separated by ‘/’ characters or tuples representing a model path).

A path index may be queried to obtain all subobjects (optionally limited by depth) of a certain path.

This index differs from the original repoze.catalog.indexes.path.CatalogPathIndex inasmuch as it actually retains a graph representation of the objects in the path space instead of relying on ‘level’ information; query results relying on this level information may or may not be correct for any given tree. Use of this index is suggested rather than the path index.

apply(query)
Search the path index using the query. If query is a string, a tuple, or a list, it is treated as the path argument to use to search. If it is any other object, it is assumed to be a dictionary with at least a value for the query key, which is treated as a path. The dictionary can also optionally specify the depth and whether to include the docid referenced by the path argument (the query key) in the set of docids returned (include_path). See the documentation for the search method of this class to understand paths, depths, and the include_path argument.
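
The query forms accepted by apply can be summarised with a short sketch. normalize_path_query is a hypothetical helper written for illustration, not part of the library. The defaults for depth and include_path are taken from the search signature of this class.

```python
def normalize_path_query(query):
    """Turn any accepted query form into explicit search arguments."""
    if isinstance(query, (str, tuple, list)):
        # A bare string, tuple, or list is the path itself.
        return {"path": query, "depth": None, "include_path": False}
    # Otherwise the query is treated as a dictionary with a 'query' key.
    return {
        "path": query["query"],
        "depth": query.get("depth"),
        "include_path": query.get("include_path", False),
    }


from_string = normalize_path_query("/path/to/object")
from_dict = normalize_path_query({"query": "/path/to", "depth": 1})
```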
apply_intersect(query, docids)
Default apply_intersect implementation
search(path, depth=None, include_path=False, attr_checker=None)

Provided a path string (e.g. /path/to/object), a path tuple (e.g. ('', 'path', 'to', 'object')), or a path list (e.g. ['', 'path', 'to', 'object']), search the index for document ids representing subelements of the path specified by the path argument.

If the path argument is specified as a tuple or list, its first element must be the empty string. If the path argument is specified as a string, it must begin with a / character. In other words, paths passed to the search method must be absolute.

If the depth argument is specified, return only documents up to that many levels below the path. Depth 0 will return the empty set (or only the docid for the path specified if include_path is also True). Depth 1 will return docids related to direct subobjects of the path (plus the docid for the path specified if include_path is also True). Depth 2 will return docids related to direct subobjects and the docids of the children of those subobjects, and so on.

If include_path is False, the docid of the object specified by the path argument is not returned as part of the search results. If include_path is True, the object specified by the path argument is returned as part of the search results.

If attr_checker is not None, it must be a callback that accepts two arguments: the first argument will be the attribute value found, the second argument is a sequence of all previous attributes encountered during this search (in path order). If attr_checker returns True, traversal will continue; otherwise, traversal will cease.
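
The depth and include_path rules above can be checked against a small model. The sketch below is not the index's graph implementation: it scans a flat, hypothetical mapping of path tuples to docids, ignores attr_checker, and uses the hypothetical helper names to_tuple and search_paths.

```python
def to_tuple(path):
    """Normalize an absolute path string, list, or tuple to a tuple."""
    if isinstance(path, str):
        if not path.startswith("/"):
            raise ValueError("path must be absolute")
        path = path.rstrip("/").split("/")  # '/a/b' -> ['', 'a', 'b']
    path = tuple(path)
    if not path or path[0] != "":
        raise ValueError("path must be absolute")
    return path


def search_paths(path_to_docid, path, depth=None, include_path=False):
    """Return docids below path, limited by depth (hypothetical model)."""
    root = to_tuple(path)
    found = set()
    for candidate, docid in path_to_docid.items():
        if candidate[:len(root)] != root:
            continue  # not inside the subtree rooted at path
        distance = len(candidate) - len(root)
        if distance == 0:
            # The object named by path is returned only on request.
            if include_path:
                found.add(docid)
        elif depth is None or distance <= depth:
            found.add(docid)
    return found


tree = {
    ("",): 0,
    ("", "a"): 1,
    ("", "a", "b"): 2,
    ("", "a", "b", "c"): 3,
    ("", "x"): 4,
}
everything_below_a = search_paths(tree, "/a")
direct_children = search_paths(tree, "/a", depth=1)
```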

repoze.catalog.document

class repoze.catalog.document.DocumentMap

A two-way map between addresses (e.g. location paths) and document ids.

The map is a persistent object meant to live in a ZODB storage.

Additionally, the map is capable of mapping ‘metadata’ to docids.

add(address, docid=())

Add a new document to the document map.

address is a string or other hashable object which represents a token known by the application.

docid, if passed, must be an int. In this case, remove any previous address stored for it before mapping it to the new address. Preserve any metadata for docid in this case.

If docid is not passed, generate a new docid.

Return the integer document id mapped to address.
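
A two-way map with the add semantics described above might look roughly like the sketch below. ToyDocumentMap is a hypothetical, non-persistent stand-in. The sequential docid generation and the handling of an address that is already mapped are assumptions of the sketch, not documented behaviour.

```python
import itertools

_marker = object()  # sentinel meaning "no docid was passed"


class ToyDocumentMap:
    """Hypothetical in-memory model of the address <-> docid mapping."""

    def __init__(self):
        self.docid_to_address = {}
        self.address_to_docid = {}
        self._counter = itertools.count(1)

    def add(self, address, docid=_marker):
        if docid is _marker:
            # Assumption: any unused integer will do as a new docid.
            docid = next(self._counter)
            while docid in self.docid_to_address:
                docid = next(self._counter)
        else:
            # Remove the previous address stored for this docid, if any.
            old_address = self.docid_to_address.pop(docid, None)
            if old_address is not None:
                self.address_to_docid.pop(old_address, None)
        # Assumption: an address maps to one docid, so drop a stale mapping.
        stale_docid = self.address_to_docid.pop(address, None)
        if stale_docid is not None:
            self.docid_to_address.pop(stale_docid, None)
        self.docid_to_address[docid] = address
        self.address_to_docid[address] = docid
        return docid

    def address_for_docid(self, docid):
        return self.docid_to_address.get(docid)

    def docid_for_address(self, address):
        return self.address_to_docid.get(address)


docmap = ToyDocumentMap()
first = docmap.add("/site/a")
same = docmap.add("/site/renamed", docid=first)
```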

add_metadata(docid, data)

Add metadata related to a given document id.

data must be a mapping, such as a dictionary.

For each key/value pair in data insert a metadata key/value pair into the metadata stored for docid.

Overwrite any existing values for the keys in data, leaving values unchanged for other existing keys.

Raise a KeyError if docid doesn’t relate to an address in the document map.
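
The merge semantics of add_metadata can be sketched with two plain dictionaries. The function below is a hypothetical free-standing model, not the method itself; the dictionary names are assumptions.

```python
def add_metadata(docid_to_address, docid_to_metadata, docid, data):
    """Merge data into the metadata stored for docid (hypothetical model)."""
    if docid not in docid_to_address:
        # Metadata may only be attached to a docid known to the map.
        raise KeyError(docid)
    # Keys present in data overwrite; other existing keys are left unchanged.
    docid_to_metadata.setdefault(docid, {}).update(data)


addresses = {1: "/site/a"}
metadata = {}
add_metadata(addresses, metadata, 1, {"title": "A", "size": 1})
add_metadata(addresses, metadata, 1, {"size": 2})
```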

address_for_docid(docid)

Retrieve an address for a given document id.

docid is an integer document id.

Return the address corresponding to docid.

If docid doesn’t exist in the document map, return None.

docid_for_address(address)

Retrieve a document id for a given address.

address is a string or other hashable object which represents a token known by the application.

Return the integer document id corresponding to address.

If address doesn’t exist in the document map, return None.

get_metadata(docid)

Return the metadata for docid.

Return a mapping of the keys and values set using add_metadata.

Raise a KeyError if metadata does not exist for docid.

new_docid()

Return a new document id.

The returned value is guaranteed not to be used already in this document map.
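
One way to satisfy this guarantee is to keep drawing candidates until an unused one is found. The sketch below assumes random integer candidates and hypothetical low and high bounds; the text above only promises that the returned id is not already in use.

```python
import random


def new_docid(used, low=-2**31, high=2**31):
    """Return an integer in [low, high) that is not in used (sketch)."""
    while True:
        candidate = random.randrange(low, high)
        if candidate not in used:
            return candidate


used_docids = {1, 2, 3}
fresh = new_docid(used_docids, low=0, high=5)
```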

remove_address(address)

Remove a document from the document map using an address.

address is a string or other hashable object which represents a token known by the application.

Remove any corresponding metadata for address as well.

Return True if address existed in the map, else return False.

remove_docid(docid)

Remove a document from the document map for the given document ID.

docid is an integer document id.

Remove any corresponding metadata for docid as well.

Return True if docid existed in the map, else return False.

remove_metadata(docid, *keys)

Remove metadata related to a given document id.

If docid doesn’t exist in the metadata map, raise a KeyError.

For each key in keys, remove the metadata value for the docid related to that key.

Do not raise any error if no value exists for a given key.

If no keys are specified, remove all metadata related to the docid.
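
The rules for remove_metadata can be modelled on a plain dictionary, as in the hypothetical function below. Deleting the whole entry when no keys are given is an assumption about what "remove all metadata" means here.

```python
def remove_metadata(docid_to_metadata, docid, *keys):
    """Remove some or all metadata for docid (hypothetical model)."""
    if docid not in docid_to_metadata:
        raise KeyError(docid)
    if not keys:
        # Assumption: with no keys, drop the entire metadata entry.
        del docid_to_metadata[docid]
        return
    stored = docid_to_metadata[docid]
    for key in keys:
        # Missing keys are ignored rather than raising an error.
        stored.pop(key, None)


stored_metadata = {1: {"title": "A", "size": 2}}
remove_metadata(stored_metadata, 1, "size", "not-there")
after_partial = dict(stored_metadata[1])
remove_metadata(stored_metadata, 1)
```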