lakesuperior.model.rdf package
Model for RDF entities: Term, Triple, Graph.
Members of this package are the core building blocks of the Lakesuperior RDF model. They are C extensions mostly used in higher layers of the application, but some of them also have a public Python API to allow efficient manipulation of large RDF datasets.
See individual modules for detailed documentation:
Submodules
lakesuperior.model.rdf.graph module
Graph class and factories.
-
class lakesuperior.model.rdf.graph.Graph
Bases: object
Fast implementation of a graph.
Most functions should mimic RDFLib’s graph with less overhead. It uses the same funny but functional slicing notation.
A Graph contains a lakesuperior.model.structures.keyset.Keyset at its core and is bound to an LmdbTriplestore. This makes lookups and boolean operations very efficient because all these operations are performed on an array of integers.
In order to retrieve RDF values from a Graph, the underlying store must be looked up. This can be done in a different transaction than the one used to create or otherwise manipulate the graph.
Similarly, any operation such as adding, changing or looking up triples needs a store transaction.
Boolean operations between graphs (union, intersection, etc.) and other operations that don’t require an explicit term as an input or output (e.g. __repr__ or size calculation) don’t require a transaction to be opened.
Every time a term is looked up or added to even a temporary graph, that term is added to the store and creates a key. This is because in the majority of cases that term is likely to be stored permanently anyway, and it’s more efficient to hash it and allocate it immediately. A cleanup function to remove all orphaned terms (not in any triple or context index) can be later devised to compact the database.
Even though any operation may involve adding new terms to the store, a read-only transaction is sufficient. Lakesuperior will open a write transaction automatically only if necessary and only for the time needed to enter the new terms.
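The key-based design described above can be illustrated with a minimal pure-Python sketch. This is not Lakesuperior's implementation, which is a C extension bound to an LMDB store; ToyStore, ToyGraph and the example URIs are hypothetical names, used only to show why boolean operations need no term lookups while adding triples or reading values back does.

```python
class ToyStore:
    """Hypothetical stand-in for a term store: maps terms to integer keys."""

    def __init__(self):
        self._key_by_term = {}
        self._term_by_key = []

    def to_key(self, term):
        # Intern the term on first use, mirroring the behavior described
        # above for terms added to even a temporary graph.
        if term not in self._key_by_term:
            self._key_by_term[term] = len(self._term_by_key)
            self._term_by_key.append(term)
        return self._key_by_term[term]

    def from_key(self, key):
        return self._term_by_key[key]


class ToyGraph:
    """Hypothetical graph that holds only triples of integer keys."""

    def __init__(self, store, keys=()):
        self.store = store
        self.keys = set(keys)

    def add(self, triples):
        # Needs the store: every term must be converted to a key.
        for s, p, o in triples:
            self.keys.add(tuple(self.store.to_key(t) for t in (s, p, o)))

    def __or__(self, other):
        # Union compares integers only; no term is looked up.
        return ToyGraph(self.store, self.keys | other.keys)

    def __and__(self, other):
        # Intersection also works on integers only.
        return ToyGraph(self.store, self.keys & other.keys)

    def __len__(self):
        return len(self.keys)

    def triples(self):
        # Needs the store again: keys are resolved back to terms.
        return {
            tuple(self.store.from_key(k) for k in trp) for trp in self.keys
        }


store = ToyStore()
g1 = ToyGraph(store)
g2 = ToyGraph(store)
g1.add([("urn:s:1", "urn:p:1", "urn:o:1"), ("urn:s:1", "urn:p:2", "urn:o:2")])
g2.add([("urn:s:1", "urn:p:2", "urn:o:2"), ("urn:s:2", "urn:p:1", "urn:o:1")])

union = g1 | g2    # computed without touching the store
common = g1 & g2   # computed without touching the store
```

In this sketch only add() and triples() touch the store, which corresponds to the operations that the text says need a store transaction.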
An instance of this class can be created from a Python string of serialized RDF with the from_rdf() factory function or converted to an rdflib.Graph instance.
-
add
Add triples to the graph.
This method checks for duplicates.
Parameters: triples (iterable) – iterable of 3-tuple triples.
-
as_rdflib
Return the data set as an RDFLib Graph.
Return type: rdflib.Graph
-
capacity
-
copy
Create a copy of the graph with a different (or no) URI.
Parameters: uri (str) – URI of the new graph. This should be different from the original.
-
data
-
empty_copy
Create an empty copy with the same capacity and store binding.
Parameters: uri (str) – URI of the new graph. This should be different from the original.
-
keys
keys: lakesuperior.model.structures.keyset.Keyset
-
lookup
Look up triples by a pattern.
This function converts RDFLib terms into the serialized format stored in the graph’s internal structure and compares them bytewise.
Any or all of the lookup terms may be None.
Return type: Graph
Returns: New Graph instance with matching triples.
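The pattern semantics can be sketched in plain Python as follows. This is an illustration only: the lookup() function below is a hypothetical helper working on a set of string tuples, whereas the real method compares serialized terms bytewise and returns a new Graph.

```python
def lookup(triples, pattern):
    """Return the triples matching an (s, p, o) pattern.

    A None in any position of the pattern matches every term.
    """
    return {
        trp
        for trp in triples
        if all(want is None or want == got for want, got in zip(pattern, trp))
    }


data = {
    ("urn:s:1", "urn:p:1", "urn:o:1"),
    ("urn:s:1", "urn:p:2", "urn:o:2"),
    ("urn:s:2", "urn:p:1", "urn:o:1"),
}

by_subject = lookup(data, ("urn:s:1", None, None))
by_pred_obj = lookup(data, (None, "urn:p:1", "urn:o:1"))
everything = lookup(data, (None, None, None))
```

With all three positions set to None, the pattern matches every triple in the data set.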
-
remove
Remove triples by pattern.
The pattern used is similar to LmdbTripleStore.delete().
-
set
Set a single value for subject and predicate.
Remove all triples matching s and p before adding s p o.
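The replace-then-add behavior described here can be sketched as below. The set_value() helper and its signature are assumptions made for illustration; the actual signature of Graph.set is not documented in this section.

```python
def set_value(triples, s, p, o):
    """Drop every triple with subject s and predicate p, then add (s, p, o)."""
    kept = {trp for trp in triples if not (trp[0] == s and trp[1] == p)}
    kept.add((s, p, o))
    return kept


data = {
    ("urn:s:1", "urn:p:1", "urn:o:1"),
    ("urn:s:1", "urn:p:1", "urn:o:2"),
    ("urn:s:1", "urn:p:2", "urn:o:3"),
}

# Both existing values for (urn:s:1, urn:p:1) are replaced by a single one.
updated = set_value(data, "urn:s:1", "urn:p:1", "urn:o:9")
```

Triples with a different predicate, such as the urn:p:2 one above, are left untouched.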
-
store
-
terms_by_type
Get all terms of a type: subject, predicate or object.
Parameters: type (str) – One of s, p or o.
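A minimal sketch of what selecting terms by position means, using a hypothetical helper over a set of string tuples. The term_type parameter name and the error raised for an invalid value are assumptions, not documented behavior.

```python
def terms_by_type(triples, term_type):
    """Return the distinct terms found in one position of the triples.

    term_type is one of "s", "p" or "o".
    """
    positions = {"s": 0, "p": 1, "o": 2}
    if term_type not in positions:
        # Assumed behavior for this sketch only.
        raise ValueError(f"Invalid term type: {term_type!r}")
    idx = positions[term_type]
    return {trp[idx] for trp in triples}


data = {
    ("urn:s:1", "urn:p:1", "urn:o:1"),
    ("urn:s:1", "urn:p:2", "urn:o:2"),
    ("urn:s:2", "urn:p:1", "urn:o:1"),
}

subjects = terms_by_type(data, "s")
predicates = terms_by_type(data, "p")
objects = terms_by_type(data, "o")
```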
-
txn_ctx
-
uri
uri: object
-
-
lakesuperior.model.rdf.graph.from_rdf
Create a Graph from a serialized RDF string.
This factory function takes the same arguments as rdflib.Graph.parse().
Parameters:
- store – see Graph.__cinit__().
- uri – see Graph.__cinit__().
- *args – Positional arguments passed to RDFLib’s parse.
- **kwargs – Keyword arguments passed to RDFLib’s parse.
lakesuperior.model.rdf.term module
Term model.
Term is not defined as a Cython or Python class. It is a C structure, hence only visible to the Cython layer of the application.
Terms can be converted from/to RDFLib terms, and deserialized from, or serialized to, binary buffer structures. This is the form in which terms are stored in the data store.
If use cases require a public API, a proper Term Cython class with a Python API could be developed in the future.
lakesuperior.model.rdf.triple module
Triple model.
This is a very lightweight implementation of a Triple model, available as C structures only. Two types of structures are defined: Triple, with pointers to lakesuperior.model.rdf.term objects, and BufferTriple, with pointers to byte buffers of serialized terms.