lakesuperior.store.ldp_rs package

lakesuperior.store.ldp_rs.ROOT_RSRC_URI = rdflib.term.URIRef('info:fcres/')

Internal URI of root resource.

lakesuperior.store.ldp_rs.ROOT_UID = '/'

Root node UID.

Submodules

lakesuperior.store.ldp_rs.lmdb_store module

class lakesuperior.store.ldp_rs.lmdb_store.LmdbStore(path, identifier=None, create=True)[source]

Bases: lakesuperior.store.ldp_rs.lmdb_triplestore.LmdbTriplestore, rdflib.store.Store

LMDB-backed store.

This is an implementation of the RDFLib Store interface: https://github.com/RDFLib/rdflib/blob/master/rdflib/store.py

Handles the interaction with an LMDB store and builds an abstraction layer for triples.

This store class uses two LMDB environments (i.e. two files): one for the main (preservation-worthy) data and the other for the index data which can be rebuilt from the main database.

There are 4 main data sets (preservation-worthy data):

  • t:st (term key: serialized term; 1:1)
  • spo:c (joined S, P, O keys: context key; dupsort, dupfixed)
  • c: (context keys only, values are the empty bytestring; 1:1)
  • pfx:ns (prefix: pickled namespace; 1:1)

And 6 indices to optimize lookup for all possible combinations of bound and unbound terms in a triple:

  • th:t (term hash: term key; 1:1)
  • s:po (S key: joined P, O keys; dupsort, dupfixed)
  • p:so (P key: joined S, O keys; dupsort, dupfixed)
  • o:sp (O key: joined S, P keys; dupsort, dupfixed)
  • c:spo (context → triple association; dupsort, dupfixed)
  • ns:pfx (pickled namespace: prefix; 1:1)
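The role of the s:po, p:so and o:sp indices can be illustrated with a minimal in-memory sketch. It uses plain Python dictionaries instead of LMDB databases and plain strings instead of term keys; the class and method names (TinyIndex, add, lookup) are hypothetical and are not part of Lakesuperior. The actual store may choose its lookup strategy differently.

```python
from collections import defaultdict


class TinyIndex:
    """Toy in-memory analogue of the s:po, p:so and o:sp indices."""

    def __init__(self):
        self.spo = set()              # stand-in for the main triple data set
        self.s_po = defaultdict(set)  # S -> {(P, O)}
        self.p_so = defaultdict(set)  # P -> {(S, O)}
        self.o_sp = defaultdict(set)  # O -> {(S, P)}

    def add(self, s, p, o):
        """Store a triple and update all three single-term indices."""
        self.spo.add((s, p, o))
        self.s_po[s].add((p, o))
        self.p_so[p].add((s, o))
        self.o_sp[o].add((s, p))

    def lookup(self, s=None, p=None, o=None):
        """Return the triples matching a pattern; None means unbound."""
        # Use the index of the first bound term to get a candidate set.
        if s is not None:
            candidates = {(s, p_, o_) for p_, o_ in self.s_po.get(s, ())}
        elif p is not None:
            candidates = {(s_, p, o_) for s_, o_ in self.p_so.get(p, ())}
        elif o is not None:
            candidates = {(s_, p_, o) for s_, p_ in self.o_sp.get(o, ())}
        else:
            candidates = set(self.spo)
        # Filter the candidates by any remaining bound terms.
        pattern = (s, p, o)
        return {
            t for t in candidates
            if all(b is None or b == v for b, v in zip(pattern, t))
        }


idx = TinyIndex()
idx.add("a", "knows", "b")
idx.add("a", "likes", "c")
idx.add("b", "knows", "c")

by_subject = idx.lookup(s="a")
by_pred_obj = idx.lookup(p="knows", o="c")
```

Each bound term narrows the search to one index entry, so a lookup with at least one bound term does not scan the full set of triples.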

The default graph is defined in rdflib.graph.RDFLIB_DEFAULT_GRAPH_URI. Adding triples without context will add to this graph. Looking up triples without context (also in a SPARQL query) will look in the union graph instead of in the default graph. Also, removing triples without specifying a context will remove triples from all contexts.
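The context rules described above can be sketched with a toy quad store. This is only an illustration of the semantics in pure Python: TinyQuadStore, _matches and the DEFAULT_GRAPH placeholder value are assumptions made for the example, not Lakesuperior or RDFLib names.

```python
DEFAULT_GRAPH = "urn:example:default"  # placeholder, not the real default graph URI


def _matches(pattern, triple):
    """True if every bound (non-None) pattern term equals the triple term."""
    return all(b is None or b == v for b, v in zip(pattern, triple))


class TinyQuadStore:
    """Toy store mimicking the documented context semantics."""

    def __init__(self):
        self.quads = set()  # {(s, p, o, context)}

    def add(self, triple, context=None):
        # No context: the triple goes to the default graph.
        self.quads.add((*triple, context or DEFAULT_GRAPH))

    def triples(self, pattern, context=None):
        # No context: search the union of all graphs.
        return {
            q[:3] for q in self.quads
            if _matches(pattern, q[:3]) and context in (None, q[3])
        }

    def remove(self, pattern, context=None):
        # No context: remove matching triples from every graph.
        self.quads = {
            q for q in self.quads
            if not (_matches(pattern, q[:3]) and context in (None, q[3]))
        }


store = TinyQuadStore()
store.add(("s", "p", "o1"))                    # lands in the default graph
store.add(("s", "p", "o2"), "urn:example:g1")  # lands in a named graph

union_result = store.triples(("s", None, None))
default_result = store.triples(("s", None, None), DEFAULT_GRAPH)

store.remove(("s", None, None))                # clears both graphs
```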

bind(prefix, namespace)[source]

Bind a prefix to a namespace.

Parameters:
  • prefix (str) – Namespace prefix.
  • namespace (rdflib.URIRef) – Fully qualified URI of namespace.
close(commit_pending_transaction=False)[source]

Close the database connection.

Do this at server shutdown.

context_aware = True
formula_aware = False
graph_aware = True
namespace(prefix)[source]

Get the namespace for a prefix.

Parameters:prefix (str) – Namespace prefix.

namespaces()[source]

Get an iterator of all prefix: namespace bindings.

Return type:Iterator(tuple(str, rdflib.Namespace))
open(configuration=None, create=True)[source]

Open the store environment.

Parameters:
  • configuration (str) – If not specified on init, indicate the path to use for the store.
  • create (bool) – Create the file and folder structure for the store environment.
prefix(namespace)[source]

Get the prefix associated with a namespace.

Note: A namespace can be bound to only one prefix in this implementation.

Parameters:namespace (rdflib.Namespace) – Fully qualified namespace.
Return type:str or None
remove(triple_pattern, context=None)[source]

Remove triples by a pattern.

Parameters:
  • triple_pattern (tuple) – 3-tuple of either RDF terms or None, indicating the triple(s) to be removed. None is used as a wildcard.
  • context (rdflib.term.Identifier or None) – Context to remove the triples from. If None (the default) the matching triples are removed from all contexts.
remove_graph(graph)[source]

Remove all triples from graph and the graph itself.

Parameters:graph (rdflib.URIRef) – URI of the named graph to remove.
transaction_aware = True

lakesuperior.store.ldp_rs.lmdb_triplestore module

class lakesuperior.store.ldp_rs.lmdb_triplestore.LmdbTriplestore

Bases: lakesuperior.store.base_lmdb_store.BaseLmdbStore

Low-level triplestore layer.

This class extends the general-purpose BaseLmdbStore and maps triples and contexts to key-value records in LMDB. It can be used in the application context (env.app_globals.rdf_store), or an independent instance can be spun up in an arbitrary disk location.

This class provides the base for the RDFLib-compatible backend in lakesuperior.store.ldp_rs.lmdb_store.LmdbStore.

add

Add a triple and start indexing.

Parameters:
  • triple (tuple(rdflib.Identifier)) – Tuple of three identifiers.
  • context (rdflib.Identifier or None) – Context identifier. None inserts in the default graph.
  • quoted (bool) – Not used.
add_graph

Add a graph (context) to the database.

This creates an empty graph by associating the graph URI with the pickled None value. This prevents the graph from being removed when all of its triples are removed.

Parameters:graph (rdflib.URIRef) – URI of the named graph to add.
all_namespaces

Return all registered namespaces.

all_terms

Return all terms of a type (s, p, or o) in the store.

contexts

Get a list of all contexts.

Return type:set(URIRef)
dbi_flags = {b'c:_____': 10, b'c:spo__': 94, b'o:sp___': 94, b'p:so___': 94, b'po:s___': 118, b's:po___': 94, b'so:p___': 118, b'sp:o___': 118, b'spo:c__': 118, b't:st___': 10}
dbi_labels = [b't:st___', b'spo:c__', b'c:_____', b'pfx:ns_', b'ns:pfx_', b'th:t___', b's:po___', b'p:so___', b'o:sp___', b'po:s___', b'so:p___', b'sp:o___', b'c:spo__']
flags = 0
options = {'map_size': 1099511627776}
stats

Gather statistics about the database.

triple_keys

Top-level lookup method.

This method is used by triples, which returns native Python tuples, as well as by other methods that need to iterate over and filter triple keys without incurring the overhead of converting them to triples.

Parameters:
  • triple_pattern (tuple) – 3 RDFLib terms
  • context (rdflib.term.Identifier or None) – Context graph or URI, or None.
triples

Generator over matching triples.

Parameters:
  • triple_pattern (tuple) – 3 RDFLib terms
  • context (rdflib.Graph or None) – Context graph, if available.
Return type:

Iterator

Returns:

Generator over triples and contexts in which each result has the following format:

(s, p, o), generator(contexts)

Where the contexts generator lists all the contexts that the triple appears in.
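A minimal sketch of how a result in this shape might be produced and consumed, using plain tuples and strings in place of RDFLib terms. The helper triples_with_contexts is hypothetical and only mimics the documented result format; it is not the method's implementation.

```python
from collections import defaultdict


def triples_with_contexts(quads):
    """Yield ((s, p, o), generator(contexts)) pairs from (s, p, o, c) quads."""
    by_triple = defaultdict(list)
    for s, p, o, ctx in quads:
        by_triple[(s, p, o)].append(ctx)
    for triple, ctxs in by_triple.items():
        # The second member is a generator, so it can be consumed only once.
        yield triple, (c for c in ctxs)


quads = [
    ("s", "p", "o", "g1"),
    ("s", "p", "o", "g2"),
    ("s", "p", "o2", "g1"),
]

# Materialize each contexts generator while iterating over the results.
results = {
    triple: sorted(contexts)
    for triple, contexts in triples_with_contexts(quads)
}
```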

lakesuperior.store.ldp_rs.rsrc_centric_layout module

class lakesuperior.store.ldp_rs.rsrc_centric_layout.RsrcCentricLayout(config)[source]

Bases: object

This class exposes an interface to build graph store layouts. It also provides the basics of the triplestore connection.

Some store layouts are provided. New ones aimed at specific uses and optimizations of the repository may be developed by extending this class and implementing all its abstract methods.

A layout is implemented via application configuration. However, once contents are ingested in a repository, changing a layout will most likely require a migration.

The custom layout must be in the lakesuperior.store.rdf package and the class implementing the layout must be called StoreLayout. The module name is the one defined in the app configuration.

E.g. if the configuration indicates simple_layout the application will look for lakesuperior.store.rdf.simple_layout.SimpleLayout.

ask_rsrc_exists(uid)[source]

See base_rdf_layout.ask_rsrc_exists.

attr_map = {Namespace('info:fcsystem/graph/admin'): {'p': {rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#created'), rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#createdBy'), rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#hasParent'), rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#hasVersion'), rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#lastModified'), rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#lastModifiedBy'), rdflib.term.URIRef('http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#hasMimeType'), rdflib.term.URIRef('http://www.iana.org/assignments/relation/describedBy'), rdflib.term.URIRef('http://www.loc.gov/premis/rdf/v1#hasMessageDigest'), rdflib.term.URIRef('http://www.loc.gov/premis/rdf/v1#hasSize'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#hasMemberRelation'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#insertedContentRelation'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#membershipResource'), rdflib.term.URIRef('info:fcsystem/tombstone')}, 't': {rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#Binary'), rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#Container'), rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#Pairtree'), rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#Resource'), rdflib.term.URIRef('http://fedora.info/definitions/fcrepo#Version'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#BasicContainer'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#Container'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#DirectContainer'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#IndirectContainer'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#NonRDFSource'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#RDFSource'), rdflib.term.URIRef('http://www.w3.org/ns/ldp#Resource'), rdflib.term.URIRef('info:fcsystem/Tombstone')}}, Namespace('info:fcsystem/graph/structure'): {'p': {rdflib.term.URIRef('http://pcdm.org/models#hasMember'), 
rdflib.term.URIRef('http://www.w3.org/ns/ldp#contains')}}}

Human-manageable map of attribute routes.

This serves as the source for attr_routes.

attr_routes

This is a map that routes specific triples to certain graphs. It is a machine-friendly version of the static attribute attr_map, which is formatted for human readability and to avoid repetition. The attributes not mapped here (usually user-provided triples with no special meaning to the application) go to the fcmain: graph.

The output of this is a dict with a similar structure:

{
    'p': {
        <Predicate P1>: <destination graph G1>,
        <Predicate P2>: <destination graph G1>,
        <Predicate P3>: <destination graph G1>,
        <Predicate P4>: <destination graph G2>,
        [...]
    },
    't': {
        <RDF Type T1>: <destination graph G1>,
        <RDF Type T2>: <destination graph G3>,
        [...]
    }
}
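The relationship between attr_map and attr_routes amounts to inverting a graph-to-terms map into a term-to-graph map. The following sketch shows that inversion under stated assumptions: build_routes is a hypothetical helper, and short strings stand in for the rdflib URIRef and Namespace objects used by the real attributes. The actual property may be computed differently.

```python
def build_routes(attr_map):
    """Invert {graph: {kind: {terms}}} into {kind: {term: graph}}."""
    routes = {"p": {}, "t": {}}
    for graph, terms_by_kind in attr_map.items():
        for kind, terms in terms_by_kind.items():
            for term in terms:
                routes[kind][term] = graph
    return routes


# Abbreviated stand-in for the attr_map attribute.
attr_map = {
    "info:fcsystem/graph/admin": {
        "p": {"fcrepo:created", "fcrepo:lastModified"},
        "t": {"ldp:Container"},
    },
    "info:fcsystem/graph/structure": {
        "p": {"ldp:contains"},
    },
}

routes = build_routes(attr_map)
```

With the inverted map, finding the destination graph of a predicate or RDF type is a single dictionary lookup; a term that is absent from the map would fall through to the main user data graph.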
bootstrap()[source]

Delete all graphs and insert the basic triples.

count_rsrc()[source]

Return a count of first-class resources, subdivided into “live” resources and historic snapshots.

delete_rsrc(uid, historic=False)[source]

Delete all aspect graphs of an individual resource.

Parameters:
  • uid – Resource UID.
  • historic (bool) – Whether the UID is of a historic version.
find_refint_violations()[source]

Find all referential integrity violations.

This method looks for dangling relationships within a repository by checking the object of each triple: if the object is an in-repo resource reference and no resource with that URI exists in the repo, the triple is reported.

Return type:set
Returns:Triples referencing a repository URI that is not a resource.
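The check can be sketched in a few lines of pure Python. This is an illustration only: find_dangling is a hypothetical helper, strings stand in for RDF terms, and the assumption that in-repo references are recognized by the info:fcres/ prefix (the root resource URI documented at the top of this package) is made for the example.

```python
REPO_PREFIX = "info:fcres/"  # assumed marker of in-repo resource references


def find_dangling(triples, existing):
    """Return triples whose object is an in-repo URI with no matching resource."""
    return {
        (s, p, o) for s, p, o in triples
        if o.startswith(REPO_PREFIX) and o not in existing
    }


existing = {"info:fcres/a", "info:fcres/b"}
triples = {
    ("info:fcres/a", "ex:rel", "info:fcres/b"),        # valid reference
    ("info:fcres/a", "ex:rel", "info:fcres/missing"),  # dangling reference
    ("info:fcres/a", "ex:title", "Hello"),             # literal, ignored
}

violations = find_dangling(triples, existing)
```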
forget_rsrc(uid, inbound=True, children=True)[source]

Completely delete a resource and (optionally) its children and inbound references.

NOTE: inbound references in historic versions are not affected.

get_descendants(uid, recurse=True)[source]

Get descendants (recursive children) of a resource.

Parameters:uid (str) – Resource UID.
Return type:Iterator(rdflib.URIRef)
Returns:Subjects of descendant resources.
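The effect of the recurse flag can be shown with a small depth-first traversal over a parent-to-children map. The descendants function and the children dictionary below are hypothetical and use UID strings only; the real method queries the store and returns URIs.

```python
def descendants(children, uid, recurse=True):
    """Yield the children of uid and, if recurse is True, their descendants."""
    for child in children.get(uid, ()):
        yield child
        if recurse:
            yield from descendants(children, child, recurse)


# Hypothetical containment hierarchy keyed by parent UID.
children = {
    "/": ["/a", "/b"],
    "/a": ["/a/1"],
}

all_levels = list(descendants(children, "/"))
direct_only = list(descendants(children, "/", recurse=False))
```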
get_imr(uid, ver_uid=None, strict=True, incl_inbound=False, incl_children=True, **kwargs)[source]

See base_rdf_layout.get_imr.

get_inbound_rel(subj_uri, full_triple=True)[source]

Query inbound relationships for a subject.

This can be a list of either complete triples, or of subjects referring to the given URI. It excludes historic version snapshots.

Parameters:
  • subj_uri (rdflib.URIRef) – Subject URI.
  • full_triple (boolean) – Whether to return the full triples found or only the subjects. By default, full triples are returned.
Return type:

Iterator(tuple(rdflib.term.Identifier) or rdflib.URIRef)

Returns:

Inbound triples or subjects.

get_last_version_uid(uid)[source]

Get the UID of the last version of a resource.

This can be used for tombstones too.

get_metadata(uid, ver_uid=None, strict=True)[source]

This is an optimized query to get only the administrative metadata.

get_raw(subject, ctx=None)[source]

Get a raw graph of a non-LDP resource.

The graph is queried across all contexts or within a specific one.

Parameters:
  • subject (rdflib.term.URIRef) – URI of the subject.
  • ctx (rdflib.term.URIRef) – URI of the optional context. If None, all named graphs are queried.
Return type:

Graph

get_user_data(uid)[source]

Get all the user-provided data.

Parameters:uid (string) – Resource UID.
Return type:rdflib.Graph
get_version_info(uid)[source]

Get all metadata about a resource’s versions.

Parameters:uid (string) – Resource UID.
Return type:Graph
graph_ns_types = {Namespace('info:fcsystem/graph/admin'): rdflib.term.URIRef('info:fcsystem/AdminGraph'), Namespace('info:fcsystem/graph/structure'): rdflib.term.URIRef('info:fcsystem/StructureGraph'), Namespace('info:fcsystem/graph/userdata/_main'): rdflib.term.URIRef('info:fcsystem/UserProvidedGraph')}

RDF types of graphs by prefix.

ignore_vmeta_preds = {rdflib.term.URIRef('http://xmlns.com/foaf/0.1/primaryTopic')}

Predicates of version metadata to be ignored in output.

ignore_vmeta_types = {rdflib.term.URIRef('info:fcsystem/AdminGraph'), rdflib.term.URIRef('info:fcsystem/UserProvidedGraph')}

RDF types of version metadata to be ignored in output.

modify_rsrc(uid, remove_trp={}, add_trp={})[source]

Modify triples about a subject.

This method adds and removes triple sets from specific graphs, indicated by the term router. It also adds metadata about the changed graphs.

patch_rsrc(uid, qry)[source]

Patch a resource with SPARQL-Update statements.

The statements are executed only on the user-provided graph, so that their scope is limited to the resource.

Parameters:
  • uid (str) – UID of the resource to be patched.
  • qry (dict) – Parsed and translated query, or query string.
raw_query(qry_str)[source]

Perform a straight query to the graph store.

snapshot_uid(uid, ver_uid)[source]

Create a versioned UID string from a main UID and a version UID.

truncate_rsrc(uid)[source]

Remove all user-provided data from a resource and only leave admin and structure data.

uri_to_uid(uri)[source]

Convert an internal URI to a UID.