lakesuperior.model.rdf package

Model for RDF entities: Term, Triple, Graph.

Members of this package are the core building blocks of the Lakesuperior RDF model. They are C extensions mostly used in higher layers of the application, but some of them also have a public Python API to allow efficient manipulation of large RDF datasets.

See individual modules for detailed documentation:

Submodules

lakesuperior.model.rdf.graph module

Graph class and factories.

class lakesuperior.model.rdf.graph.Graph

Bases: object

Fast implementation of a graph.

Most methods are meant to mimic those of RDFLib’s Graph, with less overhead. The class supports the same unusual but functional slicing notation.

A Graph contains a lakesuperior.model.structures.keyset.Keyset at its core and is bound to a LmdbTriplestore. This makes lookups and boolean operations very efficient because all these operations are performed on an array of integers.
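As a rough illustration of why integer keys make boolean operations cheap, here is a minimal pure-Python sketch. It is not Lakesuperior code: the `term_keys` dictionary, the key values and the `to_keys` helper are hypothetical stand-ins for the store's term index, and built-in sets stand in for the Keyset.

```python
# Hypothetical term index: each term maps to an integer key.
# In Lakesuperior this role is played by the store; here it is a plain dict.
term_keys = {"ex:alice": 1, "ex:bob": 2, "ex:knows": 3, "ex:likes": 4}


def to_keys(triple):
    """Translate a triple of terms into a triple of integer keys."""
    return tuple(term_keys[term] for term in triple)


# Two "graphs", each held as a set of integer key triples.
g1 = {
    to_keys(("ex:alice", "ex:knows", "ex:bob")),
    to_keys(("ex:alice", "ex:likes", "ex:bob")),
}
g2 = {
    to_keys(("ex:alice", "ex:knows", "ex:bob")),
    to_keys(("ex:bob", "ex:knows", "ex:alice")),
}

# Boolean operations compare integers only; no term is looked up.
union = g1 | g2
intersection = g1 & g2
difference = g1 - g2
```

Terms are only needed again when the result has to be shown to a user, which is the step that requires access to the store.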

In order to retrieve RDF values from a Graph, the underlying store must be looked up. This can be done in a different transaction than the one used to create or otherwise manipulate the graph.

Similarly, any operation such as adding, changing or looking up triples needs a store transaction.

Boolean operations between graphs (union, intersection, etc) and other operations that don’t require an explicit term as an input or output (e.g. __repr__ or size calculation) don’t require a transaction to be opened.

Every time a term is looked up in or added to a graph, even a temporary one, that term is added to the store and assigned a key. This is because in the majority of cases the term is likely to be stored permanently anyway, and it is more efficient to hash it and allocate it immediately. A cleanup function to remove all orphaned terms (those not in any triple or context index) can be devised later to compact the database.
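The "hash it and allocate it immediately" behavior can be pictured with the sketch below. Everything in it is an assumption made for illustration: the `TermIndex` class, its method names, the use of SHA-1 and the sequential integer keys are hypothetical and do not describe how Lakesuperior actually hashes or stores terms.

```python
import hashlib


class TermIndex:
    """Hypothetical in-memory stand-in for the store's term index."""

    def __init__(self):
        self._key_by_hash = {}  # term hash -> integer key
        self._term_by_key = {}  # integer key -> serialized term
        self._next_key = 1

    def key_for(self, term):
        """Return the key for a term, allocating one on first sight."""
        data = term.encode("utf-8")
        # Assumption: SHA-1 is used here only as an example hash function.
        digest = hashlib.sha1(data).digest()
        key = self._key_by_hash.get(digest)
        if key is None:
            key = self._next_key
            self._next_key += 1
            self._key_by_hash[digest] = key
            self._term_by_key[key] = data
        return key

    def term_for(self, key):
        """Return the term stored under a key."""
        return self._term_by_key[key].decode("utf-8")


index = TermIndex()
k1 = index.key_for("http://example.org/a")
k2 = index.key_for("http://example.org/b")
k3 = index.key_for("http://example.org/a")  # same term, same key
```

In this sketch a term that is only ever looked up still occupies an entry in the index, which is the kind of orphan a later cleanup pass would remove.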

Even though any operation may involve adding new terms to the store, a read-only transaction is sufficient. Lakesuperior will open a write transaction automatically only if necessary and only for the time needed to enter the new terms.

An instance of this class can be created from a serialized RDF string with the from_rdf() factory function, and can be converted to an rdflib.Graph instance with as_rdflib().

add

Add triples to the graph.

This method checks for duplicates.

Parameters: triples (iterable) – Iterable of 3-tuple triples.
as_rdflib

Return the data set as an RDFLib Graph.

Return type: rdflib.Graph
capacity
copy

Create a copy of the graph with a different (or no) URI.

Parameters: uri (str) – URI of the new graph. This should be different from the original.
data
empty_copy

Create an empty copy with same capacity and store binding.

Parameters: uri (str) – URI of the new graph. This should be different from the original.
keys

keys: lakesuperior.model.structures.keyset.Keyset

lookup

Look up triples by a pattern.

This function converts RDFLib terms into the serialized format stored in the graph’s internal structure and compares them bytewise.

Any or all of the lookup terms may be None; a None term matches any value in that position.

Return type: Graph
Returns: New Graph instance with matching triples.
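
The pattern semantics of lookup can be sketched in plain Python as follows. This is an illustrative stand-in, not the library's implementation: the module-level `lookup` function and the string terms are hypothetical, and plain string equality replaces the bytewise comparison of serialized terms described above.

```python
def lookup(triples, pattern):
    """Return the triples matching an (s, p, o) pattern.

    A None in any position matches every term in that position.
    """
    return {
        triple
        for triple in triples
        if all(
            want is None or want == have
            for want, have in zip(pattern, triple)
        )
    }


triples = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:likes", "ex:carol"),
    ("ex:bob", "ex:knows", "ex:carol"),
}

by_subject = lookup(triples, ("ex:alice", None, None))
by_predicate = lookup(triples, (None, "ex:knows", None))
everything = lookup(triples, (None, None, None))
```
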
remove

Remove triples by pattern.

The pattern used is similar to that of LmdbTriplestore.delete().

set

Set a single value for subject and predicate.

Remove all triples matching s and p before adding s p o.
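
The remove-then-add behavior can be illustrated with a short sketch over a plain Python set. The `set_value` function and the sample terms are hypothetical and only show the intended effect; they are not the Graph.set implementation.

```python
def set_value(triples, s, p, o):
    """Replace every triple with subject s and predicate p by (s, p, o)."""
    stale = {t for t in triples if t[0] == s and t[1] == p}
    triples.difference_update(stale)  # drop all previous values for (s, p)
    triples.add((s, p, o))            # then add the single new value


graph = {
    ("ex:doc", "ex:title", "Old title"),
    ("ex:doc", "ex:title", "Older title"),
    ("ex:doc", "ex:creator", "ex:alice"),
}

set_value(graph, "ex:doc", "ex:title", "New title")
```

After the call, `graph` holds exactly one title triple, and triples with other predicates are untouched.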

store
terms_by_type

Get all terms of a type: subject, predicate or object.

Parameters: type (str) – One of s, p or o.
txn_ctx
uri

uri: object

value

Get an individual value for a given predicate.

Parameters:
  • p (rdflib.term.Node) – Predicate to search for.
  • strict (bool) – If set to True, the method raises an error if more than one value is found. If False (the default), only the first result found is returned.
Return type:

rdflib.term.Node
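
The effect of the strict flag can be sketched as below. This is a hedged illustration only: the standalone `value` function, the use of ValueError, and returning None when nothing matches are all assumptions, since this page does not state which exception is raised or what is returned for a missing value.

```python
def value(triples, p, strict=False):
    """Return one object for predicate p from a list of (s, p, o) triples."""
    matches = [o for (_, tp, o) in triples if tp == p]
    if strict and len(matches) > 1:
        # Assumption: ValueError stands in for the unspecified error type.
        raise ValueError(f"Multiple values found for {p}: {matches}")
    # Assumption: None signals "no value found".
    return matches[0] if matches else None


triples = [
    ("ex:doc", "ex:title", "First title"),
    ("ex:doc", "ex:title", "Second title"),
    ("ex:doc", "ex:creator", "ex:alice"),
]

first_title = value(triples, "ex:title")             # non-strict: first match
creator = value(triples, "ex:creator", strict=True)  # single value: no error
missing = value(triples, "ex:missing")
```

A list is used here so that "first" is well defined; a set of triples has no guaranteed order.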

lakesuperior.model.rdf.graph.from_rdf

Create a Graph from a serialized RDF string.

This factory function takes the same arguments as rdflib.Graph.parse().

Parameters:
  • store – see Graph.__cinit__().
  • uri – see Graph.__cinit__().
  • *args – Positional arguments passed to RDFLib’s parse.
  • **kwargs – Keyword arguments passed to RDFLib’s parse.
Return type:

Graph

lakesuperior.model.rdf.term module

Term model.

Term is not defined as a Cython or Python class. It is a C structure, hence visible only to the Cython layer of the application.

Terms can be converted from and to RDFLib terms, and deserialized from, or serialized to, binary buffer structures. The serialized form is how terms are stored in the data store.

If use cases require a public API, a proper Term Cython class with a Python API could be developed in the future.

lakesuperior.model.rdf.triple module

Triple model.

This is a very lightweight implementation of a Triple model, available as C structures only. Two types of structures are defined: Triple, with pointers to lakesuperior.model.rdf.term objects, and BufferTriple, with pointers to byte buffers of serialized terms.