Gradle Tasks in DHF

Gradle is a third-party tool that automates build tasks. MarkLogic provides a Gradle plugin (ml-gradle) that automates many of the tasks required to manage a MarkLogic server.

The DHF Gradle Plugin (ml-data-hub) expands ml-gradle with DHF-specific tasks and uses it to deploy MarkLogic server resources (e.g., databases, users, roles, app servers). ml-gradle deploys these resources according to the configurations in the following directories:

  • hub-internal-config (your-project-root/src/main/hub-internal-config)
  • ml-config (your-project-root/src/main/ml-config)

This page lists all the Gradle tasks available in the DHF Gradle Plugin (ml-data-hub).

  • Tasks with names starting with ml are customized for DHF from the ml-gradle implementation.
  • Tasks with names starting with hub are created specifically for DHF.

See ml-gradle Common Tasks or ml-gradle Task Reference for the default (non-DHF) behavior of ml-gradle tasks.

Using Gradle in DHF

To use the DHF Gradle Plugin in DHF flows (i.e., ingest and harmonize), see DHF Gradle Plugin.

To pass parameters to Gradle tasks, use the -P option.

<div class="code-tabs">
  <ul class="nav nav-tabs" role="tablist">
    <li role="presentation" class="active">
      <a role="tab" aria-controls="tab1-1" href="#tab1-1">Unix Systems</a>
    </li>
    <li role="presentation" class="">
      <a role="tab" aria-controls="tab2-1" href="#tab2-1">Windows</a>
    </li>
  </ul>
  <div class="tab-content">
    <div role="tabpanel" class="tab-pane active" id="tab1-1">
      <pre class="cmdline">./gradlew taskname ... -PparameterName=parameterValue ...</pre>
    </div>
    <div role="tabpanel" class="tab-pane" id="tab2-1">
      <pre class="cmdline">gradlew.bat taskname ... -PparameterName=parameterValue ...</pre>
    </div>
  </div>
</div>

Running ml-gradle Tasks for Different Environments

You can run ml-gradle tasks for a specific environment (e.g., development, QA, production, local).

  1. For each environment, create a properties file with a filename in the format gradle-${env}.properties, where ${env} is the environment the file is intended for.

    Examples:

    • For a development environment, create a file called gradle-dev.properties.
    • For a QA environment, create a file called gradle-qa.properties.
    • For a production environment, create a file called gradle-prod.properties.

    By default, DHF uses gradle-local.properties for your local environment.

  2. Enter the environment-specific property settings inside the appropriate properties file. The contents of these environment files will override any values set in the gradle.properties file.

  3. To specify an environment at runtime, use the -PenvironmentName=xxx option.

    Example: To run a Gradle command against the production environment defined in gradle-prod.properties,

    ./gradlew taskname ... -PenvironmentName=prod ...
    gradlew.bat taskname ... -PenvironmentName=prod ...
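
The naming convention in the steps above can be checked before a task is run. The following is a minimal sketch, not part of DHF or ml-gradle: `env_properties_file` and `gradle_command_for_env` are hypothetical helper names, and the sketch only prints the Gradle command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: map an environment name to the properties
# filename described above (gradle-${env}.properties).
env_properties_file() {
  printf 'gradle-%s.properties\n' "$1"
}

# Hypothetical helper: print (do not run) the Gradle command for a task
# in a given environment, after checking that the matching properties
# file exists in the project directory.
# Usage: gradle_command_for_env <project-dir> <env> <task>
gradle_command_for_env() {
  project_dir=$1
  env_name=$2
  task=$3
  props="$project_dir/$(env_properties_file "$env_name")"
  if [ ! -f "$props" ]; then
    echo "missing $props" >&2
    return 1
  fi
  printf './gradlew %s -PenvironmentName=%s\n' "$task" "$env_name"
}
```

For example, `gradle_command_for_env . dev mlDeploy` prints the command only if gradle-dev.properties exists in the current directory, which catches a mistyped environment name early.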

MarkLogic Data Hub Setup Tasks

These tasks are used to configure the Data Hub Framework and manage the data hub.

mlDeploy
Uses `hubPreinstallCheck` to deploy your DHF project.
./gradlew mlDeploy
gradlew.bat mlDeploy
mlWatch
Extends ml-gradle's WatchTask by ensuring that modules in DHF-specific folders (`plugins` and `entity-config`) are monitored.
./gradlew mlWatch
gradlew.bat mlWatch
mlUpdateIndexes
Updates the properties of every database without creating or updating forests. Many properties of a database are related to indexing.
./gradlew mlUpdateIndexes
gradlew.bat mlUpdateIndexes
hubUpdate
Updates your DHF instance to a newer version. From your project's root folder, run the `hubUpdate` Gradle task with the `-i` option.
./gradlew hubUpdate -i
gradlew.bat hubUpdate -i

Before you run the hubUpdate task, edit the build.gradle file. Under plugins, change the value of 'com.marklogic.ml-data-hub' version to the new DHF version.

Example: If you are updating to DHF 4.2.0,

  plugins {
      id 'com.marklogic.ml-data-hub' version '4.2.0'
  }
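
Before running hubUpdate, it can help to confirm which plugin version build.gradle currently declares. The sketch below assumes the plugin line is written in the single-quoted form shown in the example above; `dhf_plugin_version` is a hypothetical helper name, not a DHF command.

```shell
#!/bin/sh
# Hypothetical helper: print the ml-data-hub plugin version declared in
# a build.gradle file. Assumes the single-quoted form
#   id 'com.marklogic.ml-data-hub' version 'X.Y.Z'
# and prints nothing if no such line is found.
dhf_plugin_version() {
  sed -n "s/.*id 'com\.marklogic\.ml-data-hub' version '\([^']*\)'.*/\1/p" "$1"
}
```

Running `dhf_plugin_version build.gradle` from the project root prints the declared version, so you can verify the edit before starting the update.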
For complete instructions to upgrade to a newer DHF version, see [Upgrading DHF](/marklogic-data-hub/upgrade/). Running the `hubUpdate` task with the `-i` option (info mode) shows exactly what the task does, including which configuration settings changed.
Example: A verbose report.
    Upgrading entity-config dir
    Upgrading hub-internal-config dir
    Processing /your-project-root/hub-internal-config/databases/job-database.json
    Setting "schema-database" to "%%mlStagingSchemasDbName%%"
    Setting "triggers-database" to "%%mlStagingTriggersDbName%%"
    Adding path range indexes to job-database.json
    Writing /your-project-root/hub-internal-config/databases/job-database.json to /your-project-root/src/main/hub-internal-config/databases/job-database.json
    Processing /your-project-root/hub-internal-config/databases/final-database.json
    Setting "schema-database" to "%%mlFinalSchemasDbName%%"
    Setting "triggers-database" to "%%mlFinalTriggersDbName%%"
    Writing /your-project-root/hub-internal-config/databases/final-database.json to /your-project-root/src/main/ml-config/databases/final-database.json
    Processing /your-project-root/hub-internal-config/databases/staging-database.json
    Setting "schema-database" to "%%mlStagingSchemasDbName%%"
    Setting "triggers-database" to "%%mlStagingTriggersDbName%%"
    Writing /your-project-root/hub-internal-config/databases/staging-database.json to /your-project-root/src/main/hub-internal-config/databases/staging-database.json
    Writing /your-project-root/hub-internal-config/databases/modules-database.json to /your-project-root/src/main/ml-config/databases/modules-database.json
    Processing /your-project-root/hub-internal-config/servers/job-server.json
    Setting "url-rewriter" to "/data-hub/4/tracing/tracing-rewriter.xml"
    Writing /your-project-root/hub-internal-config/servers/job-server.json to /your-project-root/src/main/hub-internal-config/servers/job-server.json
    Writing /your-project-root/hub-internal-config/servers/final-server.json to /your-project-root/src/main/ml-config/servers/final-server.json
    Processing /your-project-root/hub-internal-config/servers/staging-server.json
    Setting "url-rewriter" to "/data-hub/4/rest-api/rewriter.xml"
    Setting "error-handler" to "/data-hub/4/rest-api/error-handler.xqy"
    Writing /your-project-root/hub-internal-config/servers/staging-server.json to /your-project-root/src/main/hub-internal-config/servers/staging-server.json
    Upgrading user-config dir
    
hubInfo
Prints out basic info about the DHF configuration.
./gradlew hubInfo
gradlew.bat hubInfo
hubEnableDebugging
Enables extra debugging features in DHF.
./gradlew hubEnableDebugging
gradlew.bat hubEnableDebugging
hubDisableDebugging
Disables extra debugging features in DHF.
./gradlew hubDisableDebugging
gradlew.bat hubDisableDebugging
hubEnableTracing
Enables tracing in DHF.
./gradlew hubEnableTracing
gradlew.bat hubEnableTracing
hubDisableTracing
Disables tracing in DHF.
./gradlew hubDisableTracing
gradlew.bat hubDisableTracing
hubDeployUserArtifacts
Installs user artifacts, such as entities and mappings, to the MarkLogic server. (DHF 4.2 or later)
./gradlew hubDeployUserArtifacts
gradlew.bat hubDeployUserArtifacts

MarkLogic Data Hub Scaffolding Tasks

These tasks allow you to scaffold projects, entities, and flows.

hubInit
Initializes the current directory as a DHF project.
./gradlew hubInit
gradlew.bat hubInit
hubCreateEntity
Creates a boilerplate entity.
./gradlew hubCreateEntity -PentityName=yourentityname
gradlew.bat hubCreateEntity -PentityName=yourentityname
Parameter Description
entityName (Required) The name of the entity to create.
hubCreateInputFlow
Creates an input flow.
./gradlew hubCreateInputFlow -PentityName=yourentityname -PflowName=yourflowname -PdataFormat=(xml|json) -PpluginFormat=(xqy|sjs)
gradlew.bat hubCreateInputFlow -PentityName=yourentityname -PflowName=yourflowname -PdataFormat=(xml|json) -PpluginFormat=(xqy|sjs)
Parameter Description
entityName (Required) The name of the entity that owns the flow.
flowName (Required) The name of the input flow to create.
dataFormat xml or json. Default is json.
pluginFormat xqy or sjs. The plugin programming language.
hubCreateHarmonizeFlow
Creates a harmonization flow.
./gradlew hubCreateHarmonizeFlow -PentityName=yourentityname -PflowName=yourflowname -PdataFormat=(xml|json) -PpluginFormat=(xqy|sjs) -PmappingName=yourmappingname
gradlew.bat hubCreateHarmonizeFlow -PentityName=yourentityname -PflowName=yourflowname -PdataFormat=(xml|json) -PpluginFormat=(xqy|sjs) -PmappingName=yourmappingname
Parameter Description
entityName (Required) The name of the entity that owns the flow.
flowName (Required) The name of the harmonize flow to create.
dataFormat xml or json. Default is json.
pluginFormat xqy or sjs. The plugin programming language.
mappingName The name of a model-to-model mapping to use during code generation.
hubGeneratePii
Generates security configuration files for protecting entity properties designated as Personally Identifiable Information (PII). For details, see Managing Personally Identifiable Information.
./gradlew hubGeneratePii
gradlew.bat hubGeneratePii

MarkLogic Data Hub Flow Management Tasks

These tasks allow you to run flows and clean up.

hubRunFlow
Runs a harmonization flow.
./gradlew hubRunFlow -PentityName=yourentityname -PflowName=yourflowname -PbatchSize=100 -PthreadCount=4 -PsourceDB=data-hub-STAGING -PdestDB=data-hub-FINAL -PshowOptions=(true|false)
gradlew.bat hubRunFlow -PentityName=yourentityname -PflowName=yourflowname -PbatchSize=100 -PthreadCount=4 -PsourceDB=data-hub-STAGING -PdestDB=data-hub-FINAL -PshowOptions=(true|false)
Parameter Description
entityName (Required) The name of the entity containing the harmonize flow.
flowName (Required) The name of the harmonize flow to run.
batchSize The number of items to include in a batch. Default is 100.
threadCount The number of threads to run. Default is 4.
sourceDB The name of the database to run against. Default is the name of your staging database.
destDB The name of the database to put harmonized results into. Default is the name of your final database.
showOptions Whether or not to print out options that were passed in to the command. Default is false.

You can also pass custom key-value parameters to your flows. These key-value pairs are available in the $options (xqy) or options (sjs) passed to your flows. To pass custom key-value pairs, prefix each key with `dhf.` (including the period).

Example:
./gradlew hubRunFlow -PentityName=yourentityname -PflowName=yourflowname -Pdhf.myKey=myValue -Pdhf.myOtherKey=myOtherValue
gradlew.bat hubRunFlow -PentityName=yourentityname -PflowName=yourflowname -Pdhf.myKey=myValue -Pdhf.myOtherKey=myOtherValue

The following options become available:

    {
      "myKey": "myValue",
      "myOtherKey": "myOtherValue"
    }
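
When a flow takes several custom parameters, the `-Pdhf.` arguments can be assembled from plain key=value pairs. This is a minimal sketch: `dhf_options` is a hypothetical helper name, and it assumes that keys and values contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: turn key=value pairs into -Pdhf.key=value arguments.
# Assumes keys and values contain no whitespace.
dhf_options() {
  out=""
  for pair in "$@"; do
    out="$out -Pdhf.$pair"
  done
  # Strip the leading space added by the first iteration.
  printf '%s\n' "${out# }"
}

# Usage (left unquoted on purpose so the shell splits the arguments):
#   ./gradlew hubRunFlow -PentityName=yourentityname -PflowName=yourflowname \
#     $(dhf_options myKey=myValue myOtherKey=myOtherValue)
```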
hubExportJobs
Exports job records and their associated traces. This task does not affect the contents of the staging or final databases.
./gradlew hubExportJobs -PjobIds=list-of-ids -Pfilename=export.zip
gradlew.bat hubExportJobs -PjobIds=list-of-ids -Pfilename=export.zip
Parameter Description
jobIds A comma-separated list of job IDs to export. Any traces associated with those jobs will be exported.
filename The name of the zip file to generate, including the file extension. Default is jobexport.zip.
hubDeleteJobs
Deletes job records and their associated traces. This task does not affect the contents of the staging or final databases.
./gradlew hubDeleteJobs -PjobIds=list-of-ids
gradlew.bat hubDeleteJobs -PjobIds=list-of-ids
Parameter Description
jobIds (Required) A comma-separated list of job IDs to delete.
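
The jobIds parameter expects a single comma-separated value. If the job IDs are held as separate shell arguments, they can be joined first. This is a sketch only: `join_job_ids` is a hypothetical helper name, and the IDs in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: join its arguments with commas, producing the
# comma-separated list that the jobIds parameter expects.
join_job_ids() {
  old_ifs=$IFS
  IFS=,
  # "$*" joins the arguments using the first character of IFS.
  printf '%s\n' "$*"
  IFS=$old_ifs
}

# Usage (id1, id2, id3 are placeholder job IDs):
#   ./gradlew hubDeleteJobs -PjobIds="$(join_job_ids id1 id2 id3)"
```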

Uninstalling Your MarkLogic Data Hub

mlUndeploy
Removes all components of your data hub from the MarkLogic server, including databases, application servers, forests, and users.
./gradlew mlUndeploy -Pconfirm=true
gradlew.bat mlUndeploy -Pconfirm=true