lakesuperior.store.ldp_nr package

Submodules

lakesuperior.store.ldp_nr.base_non_rdf_layout module

class lakesuperior.store.ldp_nr.base_non_rdf_layout.BaseNonRdfLayout(config)[source]

Bases: object

Abstract class for setting the non-RDF (bitstream) store layout.

Different layouts can be created by implementing all the abstract methods of this class. A non-RDF layout is not necessarily restricted to a traditional filesystem; for example, a layout persisting to HDFS can be written too.
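
As a rough illustration of what a non-filesystem layout could look like, the sketch below keeps bitstreams in a dictionary. It does not import lakesuperior: StandInNonRdfLayout is a hypothetical stand-in for BaseNonRdfLayout, and the choice of which methods are abstract, the return value of persist, the SHA-1 digest and the mem:// prefix are all assumptions made for the example.

```python
import hashlib
import io
from abc import ABCMeta, abstractmethod


class StandInNonRdfLayout(metaclass=ABCMeta):
    """Hypothetical stand-in for BaseNonRdfLayout (not the real class)."""

    def __init__(self, config):
        self.config = config

    @abstractmethod
    def persist(self, stream):
        """Store the stream in the designated persistence layer."""

    @abstractmethod
    def delete(self, id):
        """Delete a stream by its identifier (i.e. checksum)."""

    @abstractmethod
    def local_path(self, uuid):
        """Return the location of a stored stream."""


class InMemoryLayout(StandInNonRdfLayout):
    """Toy layout that keeps bitstreams in a dict keyed by checksum."""

    def __init__(self, config):
        super().__init__(config)
        self._blobs = {}

    def persist(self, stream):
        data = stream.read()
        # Assumption: SHA-1 hex digest is used as the identifier.
        cksum = hashlib.sha1(data).hexdigest()
        self._blobs[cksum] = data
        return cksum

    def delete(self, id):
        del self._blobs[id]

    def local_path(self, uuid):
        # Assumption: a made-up URI scheme, since there is no real path.
        return 'mem://' + uuid


layout = InMemoryLayout({})
cksum = layout.persist(io.BytesIO(b'hello'))
```

A layout targeting HDFS or an object store would follow the same shape, replacing the dictionary operations with calls to the relevant client library.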

delete(id)[source]

Delete a stream by its identifier (i.e. checksum).

file_ct

Calculate the store size on disk.

local_path(uuid)[source]

Return the local path of a file.

persist(stream)[source]

Store the stream in the designated persistence layer.

store_size

Calculate the store size on disk.

lakesuperior.store.ldp_nr.default_layout module

class lakesuperior.store.ldp_nr.default_layout.DefaultLayout(*args, **kwargs)[source]

Bases: lakesuperior.store.ldp_nr.base_non_rdf_layout.BaseNonRdfLayout

Default file layout.

This is a simple filesystem layout that stores binaries in pairtree folders in a local filesystem. Parameters can be specified for the

bootstrap()[source]

Initialize binary file store.

delete(uuid)[source]

See BaseNonRdfLayout.delete.

static local_path(root, uuid, bl=4, bc=4)[source]

Generate the resource path splitting the resource checksum according to configuration parameters.

Parameters:
  • uuid (str) – The resource UUID. This corresponds to the content checksum.

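
The checksum splitting can be sketched as follows. This is not the library's code: pairtree_path is a hypothetical helper, and it assumes that bl is the length of each branch segment, that bc is the number of branch levels, and that the remainder of the checksum becomes the file name.

```python
def pairtree_path(root, cksum, bl=4, bc=4):
    """Build a pairtree-style path from a checksum string.

    Assumed meaning of the parameters: ``bl`` is the number of characters
    per branch folder and ``bc`` is the number of branch folders.
    """
    # Never split past the end of the checksum.
    split_at = min(bl * bc, len(cksum))
    branches = [cksum[i:i + bl] for i in range(0, split_at, bl)]
    parts = [root] + branches
    # Whatever is left after the branch segments is used as the leaf name.
    remainder = cksum[split_at:]
    if remainder:
        parts.append(remainder)
    return '/'.join(parts)


path = pairtree_path('/data', 'abcdef0123456789abcdef')
# path == '/data/abcd/ef01/2345/6789/abcdef'
```

Splitting the checksum this way keeps the number of entries per folder bounded, which matters on filesystems that slow down with very large directories.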
persist(uid, stream, bufsize=8192, prov_cksum=None, prov_cksum_algo=None)[source]

Store the stream in the file system.

This method handles the file in chunks. For each chunk, it writes to a temp file and updates a checksum. Once the whole file has been written to disk and hashed, the temp file is moved to its final location, which is determined by the hash value.

Parameters:
  • uid (str) – UID of the resource.
  • stream (IOstream) – File-like object to persist.
  • bufsize (int) – Chunk size. 2**12 to 2**15 is a good range.
  • prov_cksum (str) – Checksum provided by the client to verify that the content received matches what has been sent. If None (the default) no verification will take place.
  • prov_cksum_algo (str) – Verification algorithm to validate the integrity of the user-provided data. If this is different from the default hash algorithm set in the application configuration, which is used to calculate the checksum of the file for storage purposes, a separate hash is calculated specifically for validation purposes. Clearly it’s more efficient to use the same algorithm and avoid a second checksum calculation.
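
The chunked write-and-hash flow described for persist can be sketched as below. This is a simplified illustration, not the actual implementation: persist_stream is a hypothetical function, the final location is a flat file named after the checksum rather than a pairtree path, and the client-provided checksum is assumed to use the same algorithm as the storage checksum, so the separate validation hash is omitted.

```python
import hashlib
import io
import os
import tempfile


def persist_stream(root, stream, bufsize=8192, algo='sha1', prov_cksum=None):
    """Write ``stream`` under ``root`` in chunks, naming it by its checksum."""
    os.makedirs(root, exist_ok=True)
    digest = hashlib.new(algo)
    # Create the temp file inside root so the final move stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=root)
    try:
        with os.fdopen(fd, 'wb') as tmp:
            while True:
                chunk = stream.read(bufsize)
                if not chunk:
                    break
                tmp.write(chunk)
                digest.update(chunk)
        cksum = digest.hexdigest()
        # Optional integrity check against the checksum sent by the client.
        if prov_cksum is not None and prov_cksum != cksum:
            raise ValueError('Provided checksum does not match content.')
        final_path = os.path.join(root, cksum)
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a partial temp file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return cksum, final_path


store_root = tempfile.mkdtemp()
payload = b'x' * 20000
cksum, stored_path = persist_stream(store_root, io.BytesIO(payload), bufsize=4096)
```

Because the file is hashed while it is being written, the content is read only once, and the final name is not known until the last chunk has been processed; that is why the temp file is needed.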