Upgrade to DHF 4.1.x / 4.2.x

Prerequisites

Data Hub Framework 4.1.x / 4.2.x requires the following software:

Upgrade Notes and Steps

The notes and steps in this tab are for the following upgrade paths:

  • DHF 4.0.3 » 4.1.x or 4.2.x
  • DHF 4.0.2 » 4.1.x or 4.2.x
  • DHF 4.0.1 » 4.1.x or 4.2.x

Upgrade Notes

  • The hubUpdate task makes the following changes.

    • Archives existing configuration directories under your-project-root/src/main. (4.0.x is the old DHF version.)

      old directory          new archive directory
      hub-internal-config    hub-internal-config-4.0.x
      ml-config              ml-config-4.0.x
    • Overwrites the existing databases, server directories, and the security directory.

  • Running the hubUpdate task with the -i option (info mode) displays specifically what the task does, including configuration settings that changed.

    Example: A verbose report.
      Upgrading entity-config dir
      Upgrading hub-internal-config dir
      Processing /your-project-root/hub-internal-config/databases/job-database.json
      Setting "schema-database" to "%%mlStagingSchemasDbName%%"
      Setting "triggers-database" to "%%mlStagingTriggersDbName%%"
      Adding path range indexes to job-database.json
      Writing /your-project-root/hub-internal-config/databases/job-database.json to /your-project-root/src/main/hub-internal-config/databases/job-database.json
      Processing /your-project-root/hub-internal-config/databases/final-database.json
      Setting "schema-database" to "%%mlFinalSchemasDbName%%"
      Setting "triggers-database" to "%%mlFinalTriggersDbName%%"
      Writing /your-project-root/hub-internal-config/databases/final-database.json to /your-project-root/src/main/ml-config/databases/final-database.json
      Processing /your-project-root/hub-internal-config/databases/staging-database.json
      Setting "schema-database" to "%%mlStagingSchemasDbName%%"
      Setting "triggers-database" to "%%mlStagingTriggersDbName%%"
      Writing /your-project-root/hub-internal-config/databases/staging-database.json to /your-project-root/src/main/hub-internal-config/databases/staging-database.json
      Writing /your-project-root/hub-internal-config/databases/modules-database.json to /your-project-root/src/main/ml-config/databases/modules-database.json
      Processing /your-project-root/hub-internal-config/servers/job-server.json
      Setting "url-rewriter" to "/data-hub/4/tracing/tracing-rewriter.xml"
      Writing /your-project-root/hub-internal-config/servers/job-server.json to /your-project-root/src/main/hub-internal-config/servers/job-server.json
      Writing /your-project-root/hub-internal-config/servers/final-server.json to /your-project-root/src/main/ml-config/servers/final-server.json
      Processing /your-project-root/hub-internal-config/servers/staging-server.json
      Setting "url-rewriter" to "/data-hub/4/rest-api/rewriter.xml"
      Setting "error-handler" to "/data-hub/4/rest-api/error-handler.xqy"
      Writing /your-project-root/hub-internal-config/servers/staging-server.json to /your-project-root/src/main/hub-internal-config/servers/staging-server.json
      Upgrading user-config dir
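
Because hubUpdate archives the old directories and overwrites the live ones, a recursive diff is one way to spot custom settings that still need to be carried over. The following is a minimal sketch, not part of DHF. It builds a throwaway layout in a temporary directory (the file contents and the "custom-index" key are invented for illustration, and 4.0.3 is an assumed old version) and compares the archive with the regenerated directory. In a real project, only the diff line would be needed, run from your-project-root/src/main.

```shell
#!/bin/sh
# Scratch layout standing in for your-project-root/src/main (illustrative only).
demo=$(mktemp -d)
mkdir -p "$demo/ml-config-4.0.3/databases" "$demo/ml-config/databases"

# Archived file with a hypothetical customization; regenerated file without it.
printf '{"database-name": "final", "custom-index": true}\n' \
  > "$demo/ml-config-4.0.3/databases/final-database.json"
printf '{"database-name": "final"}\n' \
  > "$demo/ml-config/databases/final-database.json"

# diff exits with status 1 when the trees differ, so capture the output
# instead of treating a non-zero status as a failure.
report=$(diff -ru "$demo/ml-config-4.0.3" "$demo/ml-config" || true)
printf '%s\n' "$report"
```

Lines prefixed with "-" in the report exist only in the archive and are candidates for copying into the new files.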
      

Procedure

  1. In your build.gradle file, replace all occurrences of your old DHF version number with 4.1.1.

    Example: In the plugins section and the dependencies section,

       plugins {
           id 'net.saliman.properties' version '1.4.6'
           id 'com.marklogic.ml-data-hub' version '4.1.1'
       }
       ...
       dependencies {
         compile 'com.marklogic:marklogic-data-hub:4.1.1'
         compile 'com.marklogic:marklogic-xcc:9.0.6'
       }
    
  2. At your project’s root folder, run the hubUpdate -i Gradle task.

    Linux/macOS:  ./gradlew hubUpdate -i
    Windows:      gradlew.bat hubUpdate -i
  3. Edit your gradle.properties file, and add the following property.

       mlDHFVersion=4.1.0
    
  4. In your-project-root/src/main, copy any custom database/server configurations from the archived configuration files to the new ones. (4.0.x is the old DHF version.)

    copy from files in           paste to files in
    hub-internal-config-4.0.x    hub-internal-config
    ml-config-4.0.x              ml-config
  5. At your project’s root folder, run the mlDeploy Gradle task.

    Linux/macOS:  ./gradlew mlDeploy
    Windows:      gradlew.bat mlDeploy
  6. Run your ingest and harmonize flows.
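
Steps 1 and 3 of this procedure are mechanical edits and can be scripted. The sketch below is one possible approach, not an official tool. It assumes the old version is 4.0.3 and works on scratch copies of build.gradle and gradle.properties so that it is safe to run as is. The target values 4.1.1 and 4.1.0 are the ones given in the steps above.

```shell
#!/bin/sh
# Work on scratch copies so the sketch can be run anywhere.
demo=$(mktemp -d)
cat > "$demo/build.gradle" <<'EOF'
plugins {
    id 'net.saliman.properties' version '1.4.6'
    id 'com.marklogic.ml-data-hub' version '4.0.3'
}
dependencies {
    compile 'com.marklogic:marklogic-data-hub:4.0.3'
}
EOF
printf 'mlHost=localhost\n' > "$demo/gradle.properties"

old='4\.0\.3'   # assumed old DHF version, dots escaped for sed
new='4.1.1'

# Step 1: rewrite only lines that mention the DHF plugin or library,
# so unrelated version numbers (such as 1.4.6) are left alone.
sed -e "/ml-data-hub/s/$old/$new/" -e "/marklogic-data-hub/s/$old/$new/" \
  "$demo/build.gradle" > "$demo/build.gradle.new"
mv "$demo/build.gradle.new" "$demo/build.gradle"

# Step 3: add mlDHFVersion only if it is not already set.
grep -q '^mlDHFVersion=' "$demo/gradle.properties" ||
  printf 'mlDHFVersion=4.1.0\n' >> "$demo/gradle.properties"

cat "$demo/build.gradle" "$demo/gradle.properties"
```

Writing to a temporary file and moving it into place avoids relying on sed's in-place option, which behaves differently between GNU and BSD sed.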

The notes and steps in this tab are for the following upgrade paths:

  • DHF 4.0.0 » 4.1.x or 4.2.x

Upgrade Notes

  • DHF 4.0.0 is unique among DHF versions because it has two modules databases: one for the final app server/database and one for the staging app server/database. All other DHF versions before and after 4.0.0 have only one modules database. When upgrading to 4.1.0, the properties for those two modules databases must be commented out of or removed from gradle.properties (see step 4 of the procedure).

  • The hubUpdate task makes the following changes.

    • Archives existing configuration directories under your-project-root/src/main.

      old directory          new archive directory
      hub-internal-config    hub-internal-config-4.0.0
      ml-config              ml-config-4.0.0
    • Overwrites the existing databases, server directories, and the security directory.

  • Running the hubUpdate task with the -i option (info mode) displays specifically what the task does, including configuration settings that changed.

    Example: A verbose report.
      Upgrading entity-config dir
      Upgrading hub-internal-config dir
      Processing /your-project-root/hub-internal-config/databases/job-database.json
      Setting "schema-database" to "%%mlStagingSchemasDbName%%"
      Setting "triggers-database" to "%%mlStagingTriggersDbName%%"
      Adding path range indexes to job-database.json
      Writing /your-project-root/hub-internal-config/databases/job-database.json to /your-project-root/src/main/hub-internal-config/databases/job-database.json
      Processing /your-project-root/hub-internal-config/databases/final-database.json
      Setting "schema-database" to "%%mlFinalSchemasDbName%%"
      Setting "triggers-database" to "%%mlFinalTriggersDbName%%"
      Writing /your-project-root/hub-internal-config/databases/final-database.json to /your-project-root/src/main/ml-config/databases/final-database.json
      Processing /your-project-root/hub-internal-config/databases/staging-database.json
      Setting "schema-database" to "%%mlStagingSchemasDbName%%"
      Setting "triggers-database" to "%%mlStagingTriggersDbName%%"
      Writing /your-project-root/hub-internal-config/databases/staging-database.json to /your-project-root/src/main/hub-internal-config/databases/staging-database.json
      Writing /your-project-root/hub-internal-config/databases/modules-database.json to /your-project-root/src/main/ml-config/databases/modules-database.json
      Processing /your-project-root/hub-internal-config/servers/job-server.json
      Setting "url-rewriter" to "/data-hub/4/tracing/tracing-rewriter.xml"
      Writing /your-project-root/hub-internal-config/servers/job-server.json to /your-project-root/src/main/hub-internal-config/servers/job-server.json
      Writing /your-project-root/hub-internal-config/servers/final-server.json to /your-project-root/src/main/ml-config/servers/final-server.json
      Processing /your-project-root/hub-internal-config/servers/staging-server.json
      Setting "url-rewriter" to "/data-hub/4/rest-api/rewriter.xml"
      Setting "error-handler" to "/data-hub/4/rest-api/error-handler.xqy"
      Writing /your-project-root/hub-internal-config/servers/staging-server.json to /your-project-root/src/main/hub-internal-config/servers/staging-server.json
      Upgrading user-config dir
      

Procedure

  1. In your build.gradle file, replace all occurrences of your old DHF version number with 4.1.1.

    Example: In the plugins section and the dependencies section,

       plugins {
           id 'net.saliman.properties' version '1.4.6'
           id 'com.marklogic.ml-data-hub' version '4.1.1'
       }
       ...
       dependencies {
         compile 'com.marklogic:marklogic-data-hub:4.1.1'
         compile 'com.marklogic:marklogic-xcc:9.0.6'
       }
    
  2. At your project’s root folder, run the hubUpdate -i Gradle task.

    Linux/macOS:  ./gradlew hubUpdate -i
    Windows:      gradlew.bat hubUpdate -i
  3. In your-project-root/src/main, copy any custom database/server configurations from the archived configuration files to the new ones.

    copy from files in           paste to files in
    hub-internal-config-4.0.0    hub-internal-config
    ml-config-4.0.0              ml-config
  4. Edit your gradle.properties file.

    a. Remove the following properties:

       mlStagingModulesDbName
       mlStagingModulesForestsPerHost
       mlStagingModulePermissions
       mlFinalModulesDbName
       mlFinalModulesForestsPerHost
       mlFinalModulePermissions
    

    b. Add the following properties and replace the values accordingly.

       mlDHFVersion=4.1.0
       mlModulesDbName=data-hub-MODULES
       mlModulesForestsPerHost=1
    
  5. At your project’s root folder, run the mlDeploy Gradle task.

    Linux/macOS:  ./gradlew mlDeploy
    Windows:      gradlew.bat mlDeploy
  6. Run your ingest and harmonize flows.

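    If you use MarkLogic Content Pump (MLCP) for your input flows, run MLCP with the -transform_module option set to one of the following values, depending on whether you use the XQuery (.xqy) or the JavaScript (.sjs) transform: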

    -transform_module "/data-hub/4/transforms/mlcp-flow-transform.xqy"
    -transform_module "/data-hub/4/transforms/mlcp-flow-transform.sjs"
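
Step 4 of the procedure above (removing the six per-side modules properties and adding the three replacements) can be sketched as a small script. This is an illustration under assumptions, not a DHF utility: it edits a scratch gradle.properties whose starting values are invented, and it expects each property on its own line in name=value form.

```shell
#!/bin/sh
demo=$(mktemp -d)
props="$demo/gradle.properties"
cat > "$props" <<'EOF'
mlHost=localhost
mlStagingModulesDbName=data-hub-staging-MODULES
mlStagingModulesForestsPerHost=1
mlStagingModulePermissions=rest-reader,read
mlFinalModulesDbName=data-hub-final-MODULES
mlFinalModulesForestsPerHost=1
mlFinalModulePermissions=rest-reader,read
EOF

# Step 4a: drop the six staging/final modules properties.
# The pattern covers exactly the six names listed in the procedure.
grep -Ev '^ml(Staging|Final)Module(sDbName|sForestsPerHost|Permissions)=' \
  "$props" > "$props.new"
mv "$props.new" "$props"

# Step 4b: add the properties for the single modules database.
cat >> "$props" <<'EOF'
mlDHFVersion=4.1.0
mlModulesDbName=data-hub-MODULES
mlModulesForestsPerHost=1
EOF

cat "$props"
```

Properties that the pattern does not match, such as mlHost in this sample, pass through unchanged.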

Remarks

After running mlUndeploy, delete the following obsolete resources:

  • data-hub-staging-MODULES database and forest
  • data-hub-final-MODULES database and forest

The notes and steps in this tab are for the following upgrade paths:

  • DHF 3.0.0 » 4.1.x or 4.2.x
  • DHF 2.0.6 » 4.1.x or 4.2.x
  • DHF 2.0.5 » 4.1.x or 4.2.x
  • DHF 2.0.4 » 4.1.x or 4.2.x

Upgrade Notes

  • The hubUpdate Gradle task makes the following changes.

    • Renames old configuration directories under your project root.

      old directory          new directory
      hub-internal-config    hub-internal-config.old
      user-config            user-config.old
      entity-config          entity-config.old
    • Creates the new project directory structure (your-project-root/src/main and its subdirectories) and new files.

    • Copies some settings from the old configuration files to the new ones.

    • Updates all flows to use updated imports. See the notes to upgrade to 4.0.0 (https://marklogic.github.io/marklogic-data-hub/upgrade/upgrade-to-4_0_x/#upgrading-from-204-to-40x).

  • Running the hubUpdate task with the -i option (info mode) displays specifically what the task does, including configuration settings that changed.

    Example: A verbose report.
      Upgrading entity-config dir
      Upgrading hub-internal-config dir
      Processing /your-project-root/hub-internal-config/databases/job-database.json
      Setting "schema-database" to "%%mlStagingSchemasDbName%%"
      Setting "triggers-database" to "%%mlStagingTriggersDbName%%"
      Adding path range indexes to job-database.json
      Writing /your-project-root/hub-internal-config/databases/job-database.json to /your-project-root/src/main/hub-internal-config/databases/job-database.json
      Processing /your-project-root/hub-internal-config/databases/final-database.json
      Setting "schema-database" to "%%mlFinalSchemasDbName%%"
      Setting "triggers-database" to "%%mlFinalTriggersDbName%%"
      Writing /your-project-root/hub-internal-config/databases/final-database.json to /your-project-root/src/main/ml-config/databases/final-database.json
      Processing /your-project-root/hub-internal-config/databases/staging-database.json
      Setting "schema-database" to "%%mlStagingSchemasDbName%%"
      Setting "triggers-database" to "%%mlStagingTriggersDbName%%"
      Writing /your-project-root/hub-internal-config/databases/staging-database.json to /your-project-root/src/main/hub-internal-config/databases/staging-database.json
      Writing /your-project-root/hub-internal-config/databases/modules-database.json to /your-project-root/src/main/ml-config/databases/modules-database.json
      Processing /your-project-root/hub-internal-config/servers/job-server.json
      Setting "url-rewriter" to "/data-hub/4/tracing/tracing-rewriter.xml"
      Writing /your-project-root/hub-internal-config/servers/job-server.json to /your-project-root/src/main/hub-internal-config/servers/job-server.json
      Writing /your-project-root/hub-internal-config/servers/final-server.json to /your-project-root/src/main/ml-config/servers/final-server.json
      Processing /your-project-root/hub-internal-config/servers/staging-server.json
      Setting "url-rewriter" to "/data-hub/4/rest-api/rewriter.xml"
      Setting "error-handler" to "/data-hub/4/rest-api/error-handler.xqy"
      Writing /your-project-root/hub-internal-config/servers/staging-server.json to /your-project-root/src/main/hub-internal-config/servers/staging-server.json
      Upgrading user-config dir
      
  • If custom configurations (i.e., from user-config) are missing, you must manually copy them to ml-config.

  • Because DHF 3.0.0 and DHF 2.0.4+ had only a single schemas database and a single triggers database, you must decide whether to use those existing databases as the staging databases or as the final databases in DHF 4.1.0. The settings in gradle.properties (and possibly other configurations) depend on your decision.

    DHF 3.0.0 and 2.0.4+          DHF 4.x
    data-hub-SCHEMAS database     data-hub-staging-SCHEMAS database
                                  data-hub-final-SCHEMAS database
    data-hub-TRIGGERS database    data-hub-staging-TRIGGERS database
                                  data-hub-final-TRIGGERS database
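
As an illustration of how that decision shows up in gradle.properties, the sketch below assumes you keep the existing data-hub-SCHEMAS and data-hub-TRIGGERS databases on the staging side and use new databases on the final side. The property names are the ones listed in the procedure for this upgrade path. The choice of staging over final is only an example, and the scratch file stands in for your real gradle.properties.

```shell
#!/bin/sh
demo=$(mktemp -d)
props="$demo/gradle.properties"

# Assumed choice: the pre-4.x databases become the staging databases,
# and the final side gets the new default names.
cat > "$props" <<'EOF'
mlStagingSchemasDbName=data-hub-SCHEMAS
mlStagingTriggersDbName=data-hub-TRIGGERS
mlFinalSchemasDbName=data-hub-final-SCHEMAS
mlFinalTriggersDbName=data-hub-final-TRIGGERS
EOF

# Show which database each side now points at.
grep 'DbName=' "$props"
```

With this choice the two reused databases stay in service, so they would not be among the obsolete resources to delete afterwards. Choosing the final side instead would swap the two pairs of values.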

Procedure

  1. In your build.gradle file, replace all occurrences of your old DHF version number with 4.1.1.

    Example: In the plugins section and the dependencies section,

       plugins {
           id 'net.saliman.properties' version '1.4.6'
           id 'com.marklogic.ml-data-hub' version '4.1.1'
       }
       ...
       dependencies {
         compile 'com.marklogic:marklogic-data-hub:4.1.1'
         compile 'com.marklogic:marklogic-xcc:9.0.6'
       }
    
  2. At your project’s root folder, run the hubUpdate -i Gradle task.

    Linux/macOS:  ./gradlew hubUpdate -i
    Windows:      gradlew.bat hubUpdate -i
  3. Edit your gradle.properties file.

    a. Remove the properties associated with the following resources:

    • data-hub-TRACING server
    • data-hub-TRACING database
    • data-hub-TRIGGERS database
    • data-hub-SCHEMAS database

    b. Add the following properties and replace the values accordingly.

       mlDHFVersion=4.1.0
       mlStagingTriggersDbName=data-hub-staging-TRIGGERS
       mlStagingTriggersForestsPerHost=1
       mlStagingSchemasDbName=data-hub-staging-SCHEMAS
       mlStagingSchemasForestsPerHost=1
       mlFinalTriggersDbName=data-hub-final-TRIGGERS
       mlFinalTriggersForestsPerHost=1
       mlFinalSchemasDbName=data-hub-final-SCHEMAS
       mlFinalSchemasForestsPerHost=1
       mlHubAdminRole=hub-admin-role
       mlHubAdminUserName=hub-admin-user
    

    c. Add default modules permissions.

       mlModulePermissions=rest-reader,read,rest-writer,insert,rest-writer,update,rest-extension-user,execute,data-hub-role,read,data-hub-role,execute
    
  4. At your project’s root folder, run the mlDeploy Gradle task.

    Linux/macOS:  ./gradlew mlDeploy
    Windows:      gradlew.bat mlDeploy
  5. Run your ingest and harmonize flows.

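    If you use MarkLogic Content Pump (MLCP) for your input flows, run MLCP with the -transform_module option set to one of the following values, depending on whether you use the XQuery (.xqy) or the JavaScript (.sjs) transform: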

    -transform_module "/data-hub/4/transforms/mlcp-flow-transform.xqy"
    -transform_module "/data-hub/4/transforms/mlcp-flow-transform.sjs"
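
Step 3 of the procedure above adds a dozen properties, so a quick check before running mlDeploy can catch an omission. The helper below is a hypothetical sketch (missing_props is not a DHF or Gradle command). It assumes each property sits on its own line in name=value form, and it is demonstrated on a scratch file rather than a real project.

```shell
#!/bin/sh
# Print the name of every property from steps 3b and 3c that the file lacks.
missing_props() {
  file=$1
  for name in mlDHFVersion \
      mlStagingTriggersDbName mlStagingTriggersForestsPerHost \
      mlStagingSchemasDbName mlStagingSchemasForestsPerHost \
      mlFinalTriggersDbName mlFinalTriggersForestsPerHost \
      mlFinalSchemasDbName mlFinalSchemasForestsPerHost \
      mlHubAdminRole mlHubAdminUserName mlModulePermissions; do
    grep -q "^$name=" "$file" || printf '%s\n' "$name"
  done
}

# Demo on a scratch file that sets only two of the twelve properties.
demo=$(mktemp -d)
printf 'mlDHFVersion=4.1.0\nmlHubAdminRole=hub-admin-role\n' \
  > "$demo/gradle.properties"
missing=$(missing_props "$demo/gradle.properties")
printf 'missing:\n%s\n' "$missing"
```

In a real project, calling missing_props gradle.properties from the project root and getting no output would suggest that all twelve names are present. The check does not validate the values.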

Remarks

Before running mlUndeploy, delete the resources associated with the properties that were removed from the gradle.properties file:

  • data-hub-TRACING server
  • data-hub-TRACING database
  • data-hub-TRIGGERS database
  • data-hub-SCHEMAS database
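
One way to delete these resources is through the MarkLogic REST Management API. The sketch below is a dry run by default: it only prints the curl commands it would issue. The host and port (localhost:8002), the group name (Default), and the admin user are assumptions to adjust for your cluster, and the run helper and DRY_RUN variable are names made up for this example.

```shell
#!/bin/sh
# Assumptions: Manage app server on localhost:8002, group "Default",
# and a user named "admin" with rights to delete resources.
MANAGE_URL=${MANAGE_URL:-http://localhost:8002/manage/v2}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands only, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

plan=$(
  # Delete the app server first, because a database that is still
  # attached to a server cannot be removed. Expect MarkLogic to restart
  # after a server is deleted, so the later calls may need a short wait.
  run curl --anyauth -u admin -X DELETE \
    "$MANAGE_URL/servers/data-hub-TRACING?group-id=Default"
  # forest-delete=data also removes the attached forests and their data.
  for db in data-hub-TRACING data-hub-TRIGGERS data-hub-SCHEMAS; do
    run curl --anyauth -u admin -X DELETE \
      "$MANAGE_URL/databases/$db?forest-delete=data"
  done
)
printf '%s\n' "$plan"
```

Review the printed plan before setting DRY_RUN=0, and drop any database from the loop that you decided to keep using.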

The notes and steps in this tab are for the following upgrade paths:

  • DHF 2.0.3 and earlier versions » 4.1.x or 4.2.x

Procedure

  1. Upgrade to DHF 2.0.6.
  2. Follow the steps to upgrade from 3.0.0 or 2.0.4+ to DHF 4.1.0.

See Also