Blob Storage
Introduction
The Blob Storage feature allows Solutions to store Large Binary Data (that is, large files) with a minimal impact on the memory and disk space of FNZ Studio. This feature stores the **Blob** (Binary Large Object) outside the FNZ Studio Platform in external services such as network disks or cloud storage, but still exposes the same interfaces to Solutions, no matter where the Blobs are persisted. Another key characteristic is that Blobs can be accessed transparently from any machine that is part of the FNZ Studio cluster.
Blob Storage is based on the concept of a silo, that is, the place where the Blob is stored. A silo is mapped to a concrete location using configuration properties. This approach decouples the Solution from the underlying storage, so the storage can be changed, if necessary, without changing the Solution.
Finally, the Blob Storage allows managing not only Blobs, but also the metadata associated with them. For every Blob, three types of metadata can be managed:
- System Metadata: used to store generic information.
  - Filename: the original name of the file that was copied.
  - Created: the time at which the Blob was created.
- Platform Metadata: used to store FNZ Studio-specific information.
  - ProcessInstanceId: the ID of the Process Instance that created the Blob.
  - UserId: the user who created the Blob.
- Tags: information that the Solution finds useful and wants to store with the Blob.
Note that System Metadata and Platform Metadata are created by the Platform and are read-only for Solutions.
Configuration
In the architecture of the Blob Storage, a Provider is the software entity that ‘knows’ how to interact with the specific service where the Blobs are stored and makes this capability available. FNZ Studio comes with an out-of-the-box provider to store Blobs on a filesystem, but other providers can be added using extensions (e.g. the AWSBlobStorage extension). To use a Filesystem to store the Blobs, the following configuration property needs to be set:
nm.blob.storage.filesystem.silos = tmp-archive=tmp, archive=/mnt/blobs
The value of this property is a comma-separated list of key-value pairs. The first part of each pair is the name of the silo (e.g. tmp-archive, archive), while the second part is its mapping on the filesystem. This means that, if a Blob needs to be stored in the archive silo at Solution level, the Blob is stored in the /mnt/blobs directory ‘behind the scenes’. The directories in the property can be either relative (e.g. tmp), in which case they are relative to the FNZ Studio data home, or absolute (e.g. /mnt/blobs).
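As an illustration of this mapping, the sketch below stores the same local file in both silos defined in the property above. The file path and variable names are hypothetical; the comments only restate where each Blob lands according to that configuration.

```
// 'tmp-archive' maps to the relative directory tmp,
// so this Blob is stored under the FNZ Studio data home.
String $tmpKey := Blob:CopyLocalFileToBlob('tmp-archive', 'work/tmp/scan.pdf');

// 'archive' maps to the absolute directory /mnt/blobs.
String $archiveKey := Blob:CopyLocalFileToBlob('archive', 'work/tmp/scan.pdf');
```

The Solution only ever refers to the silo name, so remapping archive to another directory or to another provider should not require changing these calls.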
Usage
A set of Script Functions in the Blob namespace is available to interact with the Blob Storage. They are divided into two main categories:
- Script Functions to manage the Blobs (see Blob Script Functions)
- Script Functions to manage the metadata of the Blobs (see Blob Metadata Script Functions)
Note: All the Script Functions described below throw an exception if the silo specified in the parameter does not exist or is not accessible.
Blob Script Functions
Blob:CopyLocalFileToBlob
String blobKey : Blob:CopyLocalFileToBlob(String silo, String filePath);
This function takes the local file located at filePath and copies it to the desired silo. The blobKey is returned if the operation is successful. Overwriting a Blob is not possible because the blobKey depends on the current timestamp and a counter. While copying the Blob, the function also sets its metadata:
- Filename: the name of the file, which is derived from the filePath parameter.
- Created: Blob creation timestamp as an ISO 8601 string (e.g. 2020-04-22T14:53:29+0200).
- CreatedTimestamp: Blob creation timestamp in epoch format.
- ProcessInstanceId: the Process Instance ID from where the Blob was copied, if it can be retrieved from the execution context.
- UserId: the ID of the user who copied the Blob, if it can be retrieved.
Example:
Blob:CopyLocalFileToBlob('archive', 'work/tmp/scan.pdf');
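Because the blobKey is generated by the function, a Solution typically needs to keep the returned value in order to reference the Blob later. A minimal sketch, assuming illustrative variable names and using Blob:GetMetadata (described further below) to read back metadata that was set during the copy:

```
// Keep the generated key: it is needed for every later call on this Blob.
String $blobKey := Blob:CopyLocalFileToBlob('archive', 'work/tmp/scan.pdf');

// Read back metadata that the function set while copying.
String $filename := Blob:GetMetadata('archive', $blobKey, 'Filename');
String $created := Blob:GetMetadata('archive', $blobKey, 'Created');
```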
Blob:CopyBlobToLocalFile
Nothing : Blob:CopyBlobToLocalFile(String silo, String blobKey, String filePath);
CopyBlobToLocalFile is the dual of the previous function: it copies the Blob identified by silo and blobKey to the local file at filePath. If the local file already exists, a BlobStorageException is thrown.
Example:
Blob:CopyBlobToLocalFile('archive', '2020-04/08/11/114524_n10.pdf.dat', 'work/tmp/downloaded.pdf');
Blob:Delete
Nothing : Blob:Delete(String silo, String blobKey);
This function deletes a Blob when it is no longer needed. Deleting the Blob also deletes all its metadata.
Example:
Blob:Delete('archive', '2020-04/08/11/114524_n10.pdf.dat');
Blob:Exists
Boolean exists : Blob:Exists(String silo, String blobKey);
This function checks whether a Blob exists. The Blob to be checked is identified by silo and blobKey. True is returned if the Blob exists, false otherwise.
Example:
Blob:Exists('archive', '2020-04/08/11/114524_n10.pdf.dat');
Blob:List
Indexed String blobKeys : Blob:List(String silo, Blob:Filter filter);
This function allows searching for Blobs that match some given characteristics. It returns an Indexed Collection of blobKeys matching all the metadata and tags specified in the Blob:Filter. The filter allows specifying:
- filename: the name of the file associated to the Blob.
- createdDay: the day in which the Blob was created in the format yyyy-MM-dd.
- processInstanceId: the Process Instance ID from which the Blob was created.
- userId: the ID of the user who created the Blob.
- tags: Named Collection of strings where the user can specify key-value pairs of tags.
Example:
Blob:Filter $filter := new Blob:Filter;
$filter.createdDay := '2020-04-08';
//
Blob:List('archive', $filter);
Blob Metadata Script Functions
Blob:SetMetadata
Nothing : Blob:SetMetadata(String silo, String blobKey, String metadataName, String metadataValue);
This function sets metadata on an existing Blob. The Blob is referenced using silo and blobKey, while metadataName and metadataValue define the name and the value of the metadata to set, respectively. If the associated Blob does not exist, the metadata is not set and a BlobStorageException is thrown. The metadataName must start with Custom: (e.g. Custom:PassportNumber) to prevent System and Platform metadata from being overwritten.
Example:
Blob:SetMetadata('archive', '2020-04/08/11/114524_n10.pdf.dat', 'Custom:PassportNumber', 'AZ1234');
Blob:GetMetadata
String metadataValue : Blob:GetMetadata(String silo, String blobKey, String metadataName);
GetMetadata is the dual of SetMetadata. It takes no metadataValue parameter; instead, it returns the value of the requested metadata if it exists, and null otherwise. All metadata is accessible, provided the name is fully and correctly specified (e.g. Filename, Custom:PassportNumber).
Example:
Blob:GetMetadata('archive', '2020-04/08/11/114524_n10.pdf.dat', 'Custom:PassportNumber');
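Since Blob:GetMetadata returns null for metadata that does not exist, the result can be checked before it is used. The following is a sketch only: the If ... Then ... End construct and the null comparison are assumed to follow the usual FNZ Studio scripting syntax, and the value written as a fallback is purely illustrative.

```
String $blobKey := '2020-04/08/11/114524_n10.pdf.dat';
String $passport := Blob:GetMetadata('archive', $blobKey, 'Custom:PassportNumber');

// null means the metadata was never set or has been deleted.
If $passport == null Then
    Blob:SetMetadata('archive', $blobKey, 'Custom:PassportNumber', 'AZ1234');
End
```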
Blob:DeleteMetadata
Nothing : Blob:DeleteMetadata(String silo, String blobKey, String metadataName);
This function must be used to explicitly remove metadata. Nothing is returned, and metadata is specified as in the previous function.
Example:
Blob:DeleteMetadata('archive', '2020-04/08/11/114524_n10.pdf.dat', 'Custom:PassportNumber');
Blob:ListMetadata
Indexed String metadataNames : Blob:ListMetadata(String silo, String blobKey);
This function retrieves all existing metadata for a Blob. It returns an Indexed Collection of Strings containing the names of the metadata set on the Blob. If there is no metadata or the Blob does not exist, an empty list is returned.
Example:
Blob:ListMetadata('archive', '2020-04/08/11/114524_n10.pdf.dat');
Blob:SetTag
Nothing : Blob:SetTag(String silo, String blobKey, String tagName, String tagValue);
This function allows setting tags on an existing Blob. The signature is similar to the SetMetadata function, except the tagName does not need to start with Custom:.
Example:
Blob:SetTag('archive', '2020-04/08/11/114524_n10.pdf.dat', 'PassportNumber', 'AZ1234');
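To contrast the two naming rules, the sketch below attaches the same value to a newly stored Blob, once as a tag and once as custom metadata. The silo, file path, and variable name are illustrative.

```
String $blobKey := Blob:CopyLocalFileToBlob('tmp-archive', 'work/tmp/passport.pdf');

// Tag: the name is used as is.
Blob:SetTag('tmp-archive', $blobKey, 'PassportNumber', 'AZ1234');

// Custom metadata: the name must carry the Custom: prefix.
Blob:SetMetadata('tmp-archive', $blobKey, 'Custom:PassportNumber', 'AZ1234');
```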
Blob:GetTag
String tagValue : Blob:GetTag(String silo, String blobKey, String tagName);
GetTag is the dual of SetTag and, as with SetTag, tagName does not need to start with Custom:.
Example:
Blob:GetTag('archive', '2020-04/08/11/114524_n10.pdf.dat', 'PassportNumber');
Blob:DeleteTag
Nothing : Blob:DeleteTag(String silo, String blobKey, String tagName);
This function must be used to explicitly remove a tag. Nothing is returned, and the tag is specified as in the Blob:GetTag function.
Example:
Blob:DeleteTag('archive', '2020-04/08/11/114524_n10.pdf.dat', 'PassportNumber');
Blob:ListTags
Indexed String tagNames : Blob:ListTags(String silo, String blobKey);
This function returns all existing tags for a Blob, as an Indexed Collection of Strings containing the names of the tags set on the Blob. If there is no tag or the Blob does not exist, an empty list is returned. Note that the tags are a subset of the metadata of the Blob.
Example:
Blob:ListTags('archive', '2020-04/08/11/114524_n10.pdf.dat');
Which File Mechanism to Use
FNZ Studio provides various features to store data. The following list highlights their pros and cons:
- LocalFile: data is persisted on the local disk of the cluster node and, consequently, data is not available on the other nodes. If FNZ Studio is deployed using containers, data could be lost in case of destruction of the container. This approach is indicated for temporary data that needs to be processed on a node and then discarded.
- ClusterFile: data is stored in a Hazelcast distributed map, which puts pressure on system memory if the data to be stored is large. Data can be retrieved quickly and is available on all the cluster nodes. This approach is recommended for small sets of data with a limited life, because data occupies memory for as long as it is kept.
- Blob Storage: data is stored outside the FNZ Studio cluster, but is accessible from any cluster node. This approach is recommended for large sets of data that need to be stored indefinitely.
Typical Use Cases
The Blob Storage has been designed for Solutions where large documents need to be stored and retrieved. A typical use case is Client Onboarding, where you need to collect considerable quantities of documents, or large documents such as pictures or videos.
Another typical use case is when a Blob needs to be stored in two different storage solutions during its lifetime, a temporary one and a permanent one. Let’s imagine, for example, that a bank collects information about a potential customer during the onboarding phase, and wants to use a temporary silo to collect all the documentation. Once the onboarding is approved, those documents are then moved to a different silo to be kept permanently.
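The move described above can be sketched with the Script Functions documented on this page. This is a hedged outline, not a built-in operation: it assumes no dedicated move function is used, takes the silo names from the filesystem configuration example, and uses hypothetical values for the blobKey and the local path. The If ... Then ... End construct is assumed to follow the usual FNZ Studio scripting syntax.

```
// Hypothetical inputs: the key returned when the document was first stored
// in the temporary silo, and a local path that does not exist yet.
String $tmpKey := '2020-04/08/11/114524_n10.pdf.dat';
String $localPath := 'work/tmp/onboarding-transfer.pdf';

// 1. Copy the Blob from the temporary silo to a local file.
Blob:CopyBlobToLocalFile('tmp-archive', $tmpKey, $localPath);

// 2. Store the local file in the permanent silo; a new blobKey is returned.
String $archiveKey := Blob:CopyLocalFileToBlob('archive', $localPath);

// 3. Delete the temporary Blob only after the permanent one is confirmed.
If Blob:Exists('archive', $archiveKey) Then
    Blob:Delete('tmp-archive', $tmpKey);
End
```

Two points follow from the function descriptions. The Filename metadata of the new Blob is derived from the local path rather than from the original file, and deleting the temporary Blob also deletes its tags and custom metadata, so any values that must survive need to be read with Blob:GetTag or Blob:GetMetadata and set again on the new Blob before step 3. The intermediate local file is not removed by these calls and has to be cleaned up separately.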
AWSBlobStorage Extension
The AWSBlobStorage extension contains the Provider for the Blob Storage to store Blobs using the AWS S3 service. To use the AWSBlobStorage extension, the following Configuration Properties have to be set for the extension (all of them are required):
- com.nm.extensions.awsblobstorage.accessKey: AWS access key to access the S3 service.
- com.nm.extensions.awsblobstorage.secretKey: AWS secret key to access the S3 service. Note: The AWS access key and secret key, after being manually configured, are encrypted, stored in the FNZ Studio internal vault, and marked as [ENCRYPTED].
- com.nm.extensions.awsblobstorage.region: AWS region used to access the S3 service, e.g. eu-central-1.
- com.nm.extensions.awsblobstorage.silos: The silo specifications. The value of the property is a comma-separated list of key-value pairs. The key represents the name of the silo, while the value specifies the existing bucket to use, e.g. archive-s3=client-onboarding-bucket
AzureBlobStorage Extension
The AzureBlobStorage extension contains the Blob Storage Provider to store Blobs using Azure [1] Block Blobs.
To use the AzureBlobStorage extension, the following properties need to be configured (all of them are required):
- com.nm.extensions.azureblobstorage.accountName: the name of the Azure storage account. This account must be created before the extension is used. Note that, for security reasons, the value of this configuration property is hidden as soon as it is set.
- com.nm.extensions.azureblobstorage.accountKey: the security key used to access the storage account. It needs to be created before the extension is used. Note that, for security reasons, the value of this configuration property is hidden as soon as it is set.
- com.nm.extensions.azureblobstorage.silos: The silo specifications. The value of the property is a comma-separated list of key-value pairs. The key is the name of the silo, while the value specifies the existing bucket to use, e.g. archive-azure=client-onboarding-bucket