Project Directory Structure for DHF 4.1.x

A DHF 4.1.x project will have the following directory structure after initialization (through QuickStart or the hubInit Gradle task).

  ├─ build.gradle
  ├─ ...
  ├─ gradlew
  ├─ gradlew.bat
  ├─ gradle
  │  └─ wrapper
  │     ├─ gradle-wrapper.jar
  │     └─
  ├─ plugins
  │  ├─ entities
  │  │  ├─ entity1
  │  │  │  ├─ input
  │  │  │  │  ├─ inputflow1
  │  │  │  │  │  ├─ content.(sjs|xqy)
  │  │  │  │  │  ├─ headers.(sjs|xqy)
  │  │  │  │  │  ├─ main.(sjs|xqy)
  │  │  │  │  │  └─ triples.(sjs|xqy)
  │  │  │  │  ├─ ...
  │  │  │  │  └─ inputflowN
  │  │  │  └─ harmonize
  │  │  │     ├─ harmonizeflow1
  │  │  │     │  ├─ collector.(sjs|xqy)
  │  │  │     │  ├─ content.(sjs|xqy)
  │  │  │     │  ├─ headers.(sjs|xqy)
  │  │  │     │  ├─ main.(sjs|xqy)
  │  │  │     │  ├─ triples.(sjs|xqy)
  │  │  │     │  └─ writer.(sjs|xqy)
  │  │  │     ├─ ...
  │  │  │     └─ harmonizeflowN
  │  │  ├─ ...
  │  │  └─ entityN
  │  └─ mappings
  │     ├─ mappingName1
  │     │  ├─ mappingName-0.mapping.json
  │     │  ├─ ...
  │     │  └─ mappingName-N.mapping.json
  │     ├─ ...
  │     └─ mappingNameN
  ├─ src
  │  └─ main
  │     ├─ entity-config
  │     │  ├─ final-entity-options.xml
  │     │  ├─ staging-entity-options.xml
  │     │  └─ databases
  │     │     ├─ final-database.json
  │     │     └─ staging-database.json
  │     ├─ hub-internal-config
  │     │  ├─ databases
  │     │  │  ├─ job-database.json
  │     │  │  ├─ staging-database.json
  │     │  │  ├─ staging-schemas-database.json
  │     │  │  └─ staging-triggers-database.json
  │     │  ├─ schemas
  │     │  ├─ security
  │     │  │  ├─ privileges
  │     │  │  │  ├─ dhf-internal-data-hub.json
  │     │  │  │  ├─ dhf-internal-entities.json
  │     │  │  │  ├─ dhf-internal-mappings.json
  │     │  │  │  └─ dhf-internal-trace-ui.json
  │     │  │  ├─ roles
  │     │  │  │  ├─ data-hub-role.json
  │     │  │  │  └─ hub-admin-role.json
  │     │  │  └─ users
  │     │  │     ├─ data-hub-user.json
  │     │  │     └─ hub-admin-user.json
  │     │  ├─ servers
  │     │  │  ├─ job-server.json
  │     │  │  └─ staging-server.json
  │     │  └─ triggers
  │     ├─ ml-config
  │     │  ├─ databases
  │     │  │  ├─ final-database.json
  │     │  │  ├─ final-schemas-database.json
  │     │  │  ├─ final-triggers-database.json
  │     │  │  └─ modules-database.json
  │     │  ├─ entities.layout.json
  │     │  ├─ security
  │     │  │  ├─ privileges
  │     │  │  ├─ roles
  │     │  │  └─ users
  │     │  ├─ servers
  │     │  │  └─ final-server.json
  │     │  └─ triggers
  │     ├─ ml-modules
  │     └─ ml-schemas
  └─ .tmp


  ├─ build.gradle
  ├─ ...
  ├─ gradlew
  ├─ gradlew.bat
  ├─ gradle
  ├─ plugins
  ├─ src
  └─ .tmp
This file enables you to use Gradle to configure and manage your data hub instance. See the Gradle documentation.
This properties file defines variables needed by the data hub to install and run properly. Use this file to store values that apply to all instances of your data hub.
DHF determines your project’s environments (for example, dev, qa, prod, or local) from the override files that exist in your hub project. To create a new environment, create a new override file whose name ends with the environment name after the dash. Example: The override file for the local environment contains settings that override the variables in the shared properties file for your local environment.
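The naming rule above can be sketched in plain JavaScript. The file-name pattern used here (`<prefix>-<environment>.properties` with the prefix `gradle`), the function name, and the sample file list are all assumptions made for illustration; check your own project for the actual override file names.

```javascript
// Sketch only: derive environment names from override file names.
// ASSUMPTION: override files follow "<prefix>-<environment>.properties";
// the default prefix and the sample file names below are illustrative.
function environmentsFromFiles(fileNames, prefix = "gradle") {
  const pattern = new RegExp(`^${prefix}-(.+)\\.properties$`);
  return fileNames
    .map((name) => pattern.exec(name))
    .filter((match) => match !== null)
    .map((match) => match[1]);
}

const files = [
  "build.gradle",
  "gradle.properties",
  "gradle-local.properties",
  "gradle-qa.properties",
];
console.log(environmentsFromFiles(files)); // [ 'local', 'qa' ]
```

A file without a dash and an environment name is not treated as an environment, which matches the idea that environments exist only where an override file exists.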
gradlew, gradlew.bat
These are Unix/Linux and Windows executable files that run the Gradle wrapper in the gradle directory.


  └─ wrapper
     ├─ gradle-wrapper.jar

This directory contains the Gradle wrapper, which is a custom local version of Gradle, so Gradle doesn’t have to be installed separately. The Gradle wrapper is installed when you initialize a new DHF project.


  ├─ entities
  │  ├─ entity1
  │  │  ├─ input
  │  │  │  ├─ inputflow1
  │  │  │  ├─ ...
  │  │  │  └─ inputflowN
  │  │  └─ harmonize
  │  │     ├─ harmonizeflow1
  │  │     ├─ ...
  │  │     └─ harmonizeflowN
  │  ├─ ...
  │  └─ entityN
  └─ mappings
     ├─ mappingName1
     ├─ ...
     └─ mappingNameN

This directory contains project-specific server-side modules that are deployed into MarkLogic.

You can add custom server-side files in this directory.

When deployed to MarkLogic, ./plugins is equivalent to the root URI (/), so a library module at ./plugins/my-directory/my-lib.xqy is loaded into the modules database as /my-directory/my-lib.xqy.
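As a minimal sketch of that path-to-URI rule, the following plain JavaScript function maps a local file under ./plugins to the URI it would have in the modules database. The function name is hypothetical and is not part of DHF.

```javascript
// Sketch only: map a file under ./plugins to its modules database URI,
// following the rule that ./plugins corresponds to the root URI "/".
// The function name is hypothetical, not a DHF API.
function pluginPathToModuleUri(localPath) {
  const normalized = localPath.replace(/\\/g, "/"); // tolerate Windows separators
  const prefix = "./plugins/";
  if (!normalized.startsWith(prefix)) {
    throw new Error(`Not under ./plugins: ${localPath}`);
  }
  return "/" + normalized.slice(prefix.length);
}

console.log(pluginPathToModuleUri("./plugins/my-directory/my-lib.xqy"));
// /my-directory/my-lib.xqy
```

This describes custom files only. The entities subdirectory is deployed with DHF's own logic, so this simple rule may not describe where those modules end up.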


  ├─ entity1
  │  ├─ input
  │  │  ├─ inputflow1
  │  │  │  ├─ content.(sjs|xqy)
  │  │  │  ├─ headers.(sjs|xqy)
  │  │  │  ├─ main.(sjs|xqy)
  │  │  │  └─ triples.(sjs|xqy)
  │  │  ├─ ...
  │  │  └─ inputflowN
  │  └─ harmonize
  │     ├─ harmonizeflow1
  │     │  ├─ collector.(sjs|xqy)
  │     │  ├─ content.(sjs|xqy)
  │     │  ├─ headers.(sjs|xqy)
  │     │  ├─ main.(sjs|xqy)
  │     │  ├─ triples.(sjs|xqy)
  │     │  └─ writer.(sjs|xqy)
  │     ├─ ...
  │     └─ harmonizeflowN
  ├─ ...
  └─ entityN
This directory contains your entity definitions. An entity is a domain object like Employee or SalesOrder. Each entity directory contains two subdirectories: input and harmonize. DHF has custom logic to handle the deployment of this directory to MarkLogic.
The input subdirectory contains all of the input flows for a given entity. Input flows are responsible for creating an XML or JSON envelope during content ingest. Each input flow directory contains one server-side module for each part of the envelope (content, headers, and triples) and a main module that orchestrates them. You can optionally include a REST directory that contains custom MarkLogic REST extensions related to this input flow.
content.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for creating the content section of your envelope.
headers.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for creating the headers section of your envelope.
main.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for orchestrating your plugins.
triples.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for creating the triples section of your envelope.
The harmonize subdirectory contains all of the harmonize flows for a given entity. Harmonize flows are responsible for creating an XML or JSON envelope during content harmonization. Each harmonize flow directory contains one server-side module for each part of the envelope (content, headers, and triples), a main module, and the collector and writer modules described below. You can optionally include a REST directory that contains custom MarkLogic REST extensions related to this harmonize flow.
collector.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for returning a list of things to harmonize. Harmonization is a batch process that operates on one or more items. The returned items should be an array of strings. Each string can have any meaning you like: URI, identifier, sequence number, and so on.
content.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for creating the content section of your envelope.
headers.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for creating the headers section of your envelope.
main.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for orchestrating your plugins.
triples.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for creating the triples section of your envelope.
writer.(sjs|xqy)
The server-side module (XQuery or JavaScript) responsible for persisting your envelope into MarkLogic.
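To make the division of labor among the harmonize modules concrete, the following plain JavaScript sketch mimics a harmonize run outside MarkLogic. Every function name, signature, and the envelope shape here is a simplification assumed for illustration. It is not the DHF API, and real plugins run inside MarkLogic.

```javascript
// Sketch only: hypothetical stand-ins for the harmonize plugins.

// collector: returns an array of strings identifying the items to harmonize.
function collect() {
  return ["/orders/1.json", "/orders/2.json"];
}

// content, headers, triples: each builds one section of the envelope.
function createContent(id) {
  return { sourceId: id };
}
function createHeaders(id, content) {
  return { harmonizedFrom: id };
}
function createTriples(id, content, headers) {
  return [];
}

// writer: persists the envelope. An in-memory Map stands in for MarkLogic.
const store = new Map();
function write(id, envelope) {
  store.set(id, envelope);
}

// main: orchestrates the section plugins for one item.
function main(id) {
  const content = createContent(id);
  const headers = createHeaders(id, content);
  const triples = createTriples(id, content, headers);
  // ASSUMPTION: simplified envelope shape, for illustration only.
  return { envelope: { headers, triples, content } };
}

// Batch run: each collected item is passed to main, then written.
function runHarmonize() {
  for (const id of collect()) {
    write(id, main(id));
  }
  return store.size;
}

console.log(runHarmonize()); // 2
```

The sketch keeps the collector's output as opaque strings, mirroring the point that each string can mean whatever the flow needs it to mean.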
Items that used to be here have been moved to src/main/ml-modules in DHF 4.x.


  ├─ mapping1
  │  ├─ mapping1-0.mapping.json
  │  ├─ ...
  │  └─ mapping1-N.mapping.json
  ├─ ...
  └─ mappingN

This directory contains model-to-model mapping configuration artifacts that can be used to configure an input flow. For details, see Using Model-to-Model Mapping.

This directory contains all versions of a given model-to-model mapping. The name of the directory is the same as the mapping name. For details, see Using Model-to-Model Mapping.
This JSON file is a model-to-model mapping configuration file. QuickStart creates a new version each time you modify a mapping. See Using Model-to-Model Mapping.
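Because each modification produces a new numbered file, a tool that reads a mapping directory usually wants the highest version. The following plain JavaScript sketch picks it from a list of file names, assuming the `<name>-<N>.mapping.json` pattern shown in the tree. The function name and sample file names are hypothetical.

```javascript
// Sketch only: find the highest version number among mapping files named
// "<mappingName>-<N>.mapping.json". The function name is hypothetical.
function latestMappingVersion(mappingName, fileNames) {
  const suffix = ".mapping.json";
  const prefix = mappingName + "-";
  let latest = null;
  for (const file of fileNames) {
    if (!file.startsWith(prefix) || !file.endsWith(suffix)) continue;
    const versionText = file.slice(prefix.length, file.length - suffix.length);
    if (!/^\d+$/.test(versionText)) continue; // ignore non-numeric versions
    const version = Number(versionText);
    if (latest === null || version > latest) latest = version;
  }
  return latest; // null when no version file matches
}

console.log(latestMappingVersion("mapping1", [
  "mapping1-0.mapping.json",
  "mapping1-2.mapping.json",
  "mapping1-10.mapping.json",
])); // 10
```

Versions are compared as numbers rather than strings, so version 10 sorts after version 2.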


  └─ main
     ├─ entity-config
     │  └─ databases
     ├─ hub-internal-config
     │  ├─ databases
     │  ├─ schemas
     │  ├─ security
     │  ├─ servers
     │  └─ triggers
     ├─ ml-config
     │  ├─ databases
     │  ├─ entities.layout.json
     │  ├─ security
     │  ├─ servers
     │  └─ triggers
     ├─ ml-modules
     └─ ml-schemas


  ├─ final-entity-options.xml
  ├─ staging-entity-options.xml
  └─ databases
     ├─ final-database.json
     └─ staging-database.json

This directory contains two options files and two database configuration files, one of each for staging and for final. You can modify these files to configure indexes.


  ├─ databases
  │  ├─ job-database.json
  │  ├─ staging-database.json
  │  ├─ staging-schemas-database.json
  │  └─ staging-triggers-database.json
  ├─ schemas
  ├─ security
  │  ├─ privileges
  │  │  ├─ dhf-internal-data-hub.json
  │  │  ├─ dhf-internal-entities.json
  │  │  ├─ dhf-internal-mappings.json
  │  │  └─ dhf-internal-trace-ui.json
  │  ├─ roles
  │  │  ├─ data-hub-role.json
  │  │  └─ hub-admin-role.json
  │  └─ users
  │     ├─ data-hub-user.json
  │     └─ hub-admin-user.json
  ├─ servers
  │  ├─ job-server.json
  │  └─ staging-server.json
  └─ triggers

This directory contains subdirectories and JSON files that represent the minimum configuration necessary for DHF to function. Do not edit anything in this directory. If you need to override a configuration in this directory, create a file with the same name and directory structure under the ml-config directory and add any properties you’d like to override.
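The override rule (same file name, same directory structure, under ml-config) can be sketched as a path computation in plain JavaScript. The function name is hypothetical, and the paths are assumed to be relative to the project root.

```javascript
// Sketch only: compute where an override file belongs under ml-config for a
// given file under hub-internal-config (same name, same relative directory).
// The function name is hypothetical; paths are relative to the project root.
function overridePathFor(internalConfigPath) {
  const internalRoot = "src/main/hub-internal-config/";
  const overrideRoot = "src/main/ml-config/";
  if (!internalConfigPath.startsWith(internalRoot)) {
    throw new Error(`Not a hub-internal-config path: ${internalConfigPath}`);
  }
  return overrideRoot + internalConfigPath.slice(internalRoot.length);
}

console.log(overridePathFor("src/main/hub-internal-config/servers/staging-server.json"));
// src/main/ml-config/servers/staging-server.json
```

The file created at the computed path needs to contain only the properties you want to override, as described above.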

Each of the above JSON files conforms to the MarkLogic REST API for creating the corresponding resource (database, privilege, role, user, or server).


  ├─ databases
  │  ├─ final-database.json
  │  ├─ final-schemas-database.json
  │  ├─ final-triggers-database.json
  │  └─ modules-database.json
  ├─ entities.layout.json
  ├─ security
  │  ├─ privileges
  │  ├─ roles
  │  └─ users
  ├─ servers
  │  └─ final-server.json
  └─ triggers

This directory contains additional subdirectories and JSON files used to configure your DHF project. You can add custom modules and transforms, as well as other configuration assets, in this directory.

The following files are found in the ml-config directory only:

  • final-database.json
  • final-schemas-database.json
  • final-triggers-database.json
  • modules-database.json
  • final-server.json


This directory is the default ml-gradle location for artifacts to be deployed to the modules database.


This directory contains your project’s schemas which are loaded by ml-gradle.


This directory contains temporary hub artifacts.
