Ingest Using MLCP

MarkLogic Content Pump (MLCP) is a standalone Java utility provided by MarkLogic. It provides a rich command-line interface for loading content into MarkLogic. You can read more in the MLCP User Guide.

Before you can ingest, you must already have a DHF project. You can create one with QuickStart or with the Gradle plugin.

You can have MLCP invoke your input flow by including three parameters with your MLCP command:

  • -transform_module
  • -transform_namespace
  • -transform_param

Input Flow Parameters

The -transform_module and -transform_namespace parameters must be set to the following:

  -transform_module "/data-hub/4/transforms/mlcp-flow-transform.xqy"
  -transform_namespace "http://marklogic.com/data-hub/mlcp-flow-transform"

For SJS (server-side JavaScript) transforms, use this module instead:

  -transform_module "/data-hub/4/transforms/mlcp-flow-transform.sjs"

The -transform_param parameter contains a comma-delimited list of key=value pairs that are passed to the mlcp-flow-transform module. The keys and their values are:

  • entity-name - the URL-encoded name of the entity to which the flow belongs.
  • flow-name - the URL-encoded name of the flow.
  • job-id - [Optional] a job ID. Any string is accepted; if none is provided, a UUID is generated for you.
  • options - [Optional] additional options to pass to the flow. Must be a JSON object.
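
As a minimal sketch of how these keys combine into one value, the bash function below joins them with commas and skips the optional keys when they are empty. The function name build_transform_param is hypothetical and not part of MLCP or the Data Hub Framework; the entity and flow names are assumed to be URL-encoded already.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MLCP or DHF): joins the documented keys
# into a single comma-delimited -transform_param value.
# Arguments: entity name, flow name, optional job id, optional JSON options.
build_transform_param() {
  local entity_name="$1"
  local flow_name="$2"
  local job_id="${3:-}"
  local options="${4:-}"
  local param="entity-name=${entity_name},flow-name=${flow_name}"

  # job-id and options are optional, so append them only when provided.
  if [ -n "$job_id" ]; then
    param="${param},job-id=${job_id}"
  fi
  if [ -n "$options" ]; then
    param="${param},options=${options}"
  fi
  printf '%s\n' "$param"
}

build_transform_param "MyAwesomeEntity" "MyAwesomeFlow" "someString" '{"your":"options"}'
```

Wrap the resulting value in single quotes when you pass it to -transform_param, so that the shell leaves the JSON braces and double quotes untouched.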

Spaces in Flow Names

MLCP does not allow spaces in the command line options for -output_collections and -transform_param. Prior to Data Hub Framework 2.0.0, there was no way to run a flow with a space in its name from standalone MLCP.

Since 2.0.0, you can URL-encode the name and the flow will run. For example, pass a flow named “My Awesome Flow” as flow-name=My%20Awesome%20Flow.
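
The encoding step can be sketched in bash as follows. The urlencode function is a hypothetical helper, not something MLCP or the Data Hub Framework provides, and the flow name “My Awesome Flow” is an illustrative assumption. It percent-encodes every character other than ASCII letters, digits, and the four unreserved punctuation characters, and it has only been considered here for ASCII names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percent-encodes a string so it can be used as an
# entity-name or flow-name value. Assumes ASCII input.
urlencode() {
  local LC_ALL=C            # treat the input byte by byte
  local input="$1"
  local output=""
  local char hex i

  for (( i = 0; i < ${#input}; i++ )); do
    char="${input:i:1}"
    case "$char" in
      # Unreserved characters pass through unchanged.
      [a-zA-Z0-9.~_-]) output+="$char" ;;
      # Everything else, including spaces and commas, becomes %XX.
      *)
        printf -v hex '%%%02X' "'$char"
        output+="$hex"
        ;;
    esac
  done
  printf '%s\n' "$output"
}

flow_name="$(urlencode "My Awesome Flow")"
echo "entity-name=MyAwesomeEntity,flow-name=${flow_name}"
```

Because commas are also encoded, a name that contains one cannot be mistaken for the separator between key=value pairs.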

MLCP Example

This is how you would run a flow named “MyAwesomeFlow” for the entity named “MyAwesomeEntity”.

  /path/to/mlcp import \
  ... \
  -transform_module "/data-hub/4/transforms/mlcp-flow-transform.xqy" \
  -transform_namespace "http://marklogic.com/data-hub/mlcp-flow-transform" \
  -transform_param 'entity-name=MyAwesomeEntity,flow-name=MyAwesomeFlow,job-id=someString,options={"your":"options"}'

If your flow is implemented with JavaScript, use this module:

  /path/to/mlcp import \
  ... \
  -transform_module "/data-hub/4/transforms/mlcp-flow-transform.sjs" \
  -transform_param 'entity-name=MyAwesomeEntity,flow-name=MyAwesomeFlow,job-id=someString,options={"your":"options"}'
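
If you script both variants, a small wrapper can choose the transform arguments for you. The sketch below is one way to do that in bash: set_transform_args and MLCP_TRANSFORM_ARGS are hypothetical names, and it mirrors the two examples above by passing -transform_namespace only with the XQuery module. It prints the command rather than running it, because the remaining import options are elided here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fills the global array MLCP_TRANSFORM_ARGS with the
# transform options for a flow written in XQuery ("xqy") or JavaScript ("sjs").
set_transform_args() {
  case "$1" in
    xqy)
      MLCP_TRANSFORM_ARGS=(
        -transform_module "/data-hub/4/transforms/mlcp-flow-transform.xqy"
        -transform_namespace "http://marklogic.com/data-hub/mlcp-flow-transform"
      )
      ;;
    sjs)
      MLCP_TRANSFORM_ARGS=(
        -transform_module "/data-hub/4/transforms/mlcp-flow-transform.sjs"
      )
      ;;
    *)
      echo "unknown flow language: $1" >&2
      return 1
      ;;
  esac
}

set_transform_args sjs
# Print the command for inspection; add your other import options before running it.
echo /path/to/mlcp import "${MLCP_TRANSFORM_ARGS[@]}" \
  -transform_param 'entity-name=MyAwesomeEntity,flow-name=MyAwesomeFlow'
```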

See Also