Flux can export documents with their metadata as “archive files” - ZIP files that contain an entry for each document and another entry for the XML metadata file associated with each document. Archive files can then be imported via the import-archive-files
command, providing a convenient mechanism for storing data and later importing it into a separate database.
Table of contents
Usage
The export-archive-files
command requires a query for selecting documents to export and a directory path for writing archive files to:
-
./bin/flux export-archive-files \ --connection-string "flux-example-user:password@localhost:8004" \ --collections example \ --path destination
-
bin\flux export-archive-files ^ --connection-string "flux-example-user:password@localhost:8004" ^ --collections example ^ --path destination
The following options control which documents are selected to be exported:
Option | Description |
---|---|
--collections | Comma-delimited sequence of collection names. |
--directory | A database directory for constraining on URIs. |
--options | Name of a REST API search options document; typically used with a string query. |
--query | A structured, serialized CTS, or combined query expressed as JSON or XML. |
--string-query | A string query utilizing MarkLogic’s search grammar. |
--uris | Newline-delimited sequence of document URIs to retrieve. |
You must specify at least one of --collections
, --directory
, --query
, --string-query
, or --uris
. You may specify any combination of those options as well, with the exception that --query
will be ignored if --uris
is specified.
You must then use the --path
option to specify a directory to write archive files to.
Windows-specific issues with zip files
In the likely event that you have one or more URIs with a forward slash - /
- in them, then creating a zip file with those URIs - which are used as the zip entry names - will produce confusing behavior on Windows. If you open the zip file via Windows Explorer, Windows will erroneously think the zip file is empty. If you open the zip file using 7-Zip, you will see a top-level entry named _
if one or more of your URIs begin with a forward slash. These are effectively issues that only occur when viewing the file within Windows and do not reflect the actual contents of the zip file. The contents of the file are correct and if you were to import them with Flux via the import-archive-files
command, you will get the expected results.
Controlling document metadata
Each exported document will have all of its associated metadata - collections, permissions, quality, properties, and metadata values - included in an XML document in the archive zip file. You can control which types of metadata are included with the --categories
option. This option accepts a comma-delimited sequence of the following metadata types:
collections
permissions
quality
properties
metadatavalues
If the option is not included, all metadata will be included.
Transforming document content
You can apply a MarkLogic REST transform to each document before it is written to an archive. A transform is configured via the following options:
Option | Description |
---|---|
--transform | Name of a MarkLogic REST transform to apply to the document before writing it. |
--transform-params | Comma-delimited list of transform parameter names and values - e.g. param1,value1,param2,value2. |
--transform-params-delimiter | Delimiter for --transform-params ; typically set when a value contains a comma. |
Specifying an encoding
MarkLogic stores all content in the UTF-8 encoding. You can specify an alternate encoding when exporting archives via the --encoding
option - e.g.:
-
./bin/flux export-archives \ --connection-string "flux-example-user:password@localhost:8004" \ --collections example \ --path destination \ --encoding ISO-8859-1
-
bin\flux export-archives ^ --connection-string "flux-example-user:password@localhost:8004" ^ --collections example ^ --path destination ^ --encoding ISO-8859-1
The encoding will be used for both document and metadata entries in each archive zip file.