Getting Started Tutorial for DHF 3.x

Introducing QuickStart

This tutorial uses QuickStart, an easy-to-use user interface that you can run locally to start working with the Data Hub Framework quickly. With QuickStart, you will have a working data hub in a matter of minutes. No need to worry about deployment strategies or configuration details. Simply run the QuickStart .war (Java web application archive) and point it at your MarkLogic installation.

QuickStart is a DevOps tool. It is meant to be run on your development machine to aid you in quickly deploying your hub.

Before You Start

You might want to check out our high-level introductions before starting this tutorial.

Build an Online Shopping Hub

This tutorial will walk you through setting up a simple data hub for harmonizing online shopping data.

Imagine you’re a company that sells board games and board game accessories. You’ve been tasked with creating a data hub on top of MarkLogic. You must load all of your product and order data into MarkLogic and harmonize it for use in a new application.

You will take the following approach:

  1. Load product data as-is
  2. Harmonize product data
  3. Load order data as-is
  4. Harmonize order data
  5. Serve the data to downstream clients

In a Hurry? Download the completed version of this tutorial.

Prerequisites

Before you can begin this tutorial and work with the Data Hub Framework, you need to have some software installed.

  • Oracle’s Java 8 JDK

    Java versions later than 8 are currently not supported. We have not tested with OpenJDK. Not sure which Java version you have? Run the following from the command line:

    java -version
    

    If you have version 8, you will see something like the following:

    java version "1.8.0_40"
    

    The second number (the 8 in 1.8.0_40) denotes the version.

  • MarkLogic Server 9.0-5 up to the latest 9.x version

    MarkLogic Server must be installed and initialized. See the installation instructions.

    Not sure which MarkLogic version you have? Open your web browser to http://localhost:8001. After logging in, look at the top-left corner for the version info:

  • A modern web browser

    Chrome or Firefox works best. Use IE at your own risk.
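If you would rather script the version checks above than read them by eye, the following is a minimal sketch in Python. It is not part of the Data Hub Framework. The function names are hypothetical, and the sketch assumes the usual version formats: Java 8 and earlier report themselves as "1.x", Java 9 and later put the major version first, and MarkLogic versions look like "9.0-5" or "9.0-6.2".

```python
import re
import subprocess


def java_major_version(version_line):
    """Return the major Java version from a `java -version` line."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_line)
    if match is None:
        raise ValueError("unrecognized version line: %r" % version_line)
    first = int(match.group(1))
    second = match.group(2)
    # Java 8 and earlier report "1.x", so the second number is the version.
    # Java 9 and later report the major version first, for example "11.0.2".
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major_version():
    """Run `java -version` and return the major version it reports."""
    # `java -version` writes its report to stderr, not stdout.
    result = subprocess.run(["java", "-version"], stderr=subprocess.PIPE,
                            universal_newlines=True)
    for line in result.stderr.splitlines():
        if 'version "' in line:
            return java_major_version(line)
    raise RuntimeError("no version line found in `java -version` output")


def marklogic_is_supported(version):
    """Check a MarkLogic version string against the range stated in this
    tutorial: 9.0-5 up to the latest 9.x."""
    match = re.match(r"(\d+)\.(\d+)-(\d+)", version)
    if match is None:
        raise ValueError("unrecognized MarkLogic version: %r" % version)
    major, minor, patch = (int(group) for group in match.groups())
    return major == 9 and (minor, patch) >= (0, 5)
```

Calling installed_java_major_version() on your machine should return 8 if your Java installation meets the prerequisite. For MarkLogic, pass the version string you see in the Admin Interface at http://localhost:8001 to marklogic_is_supported().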

Common Concerns

I have a MarkLogic instance, but it already has awesome stuff in it. Will this tutorial mess that up?

No. The Data Hub Framework is installed on isolated databases and application servers. The default DHF ports (8010, 8011, 8012, 8013) may already be in use; in that case, you will be warned about the conflicts and given the opportunity to change them. The DHF will not harm any existing settings.
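QuickStart warns you about port conflicts itself, but if you want to know ahead of time whether the default ports are free, you could probe them with a short script. The sketch below is an illustration only and is not how QuickStart detects conflicts. The function names are hypothetical, and it assumes MarkLogic runs on localhost and that a port counts as "in use" when something accepts a TCP connection on it.

```python
import socket

# Default DHF ports as listed in this tutorial.
DEFAULT_DHF_PORTS = (8010, 8011, 8012, 8013)


def port_in_use(port, host="localhost", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def conflicting_ports(ports=DEFAULT_DHF_PORTS, host="localhost"):
    """Return the subset of ports that already have a listener."""
    return [port for port in ports if port_in_use(port, host)]
```

Calling conflicting_ports() returns an empty list when all four default ports are free. Any port it does return is one you would likely need to change when QuickStart prompts you.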

How difficult is it to remove this tutorial when I am finished?

Easy. Click Settings in the QuickStart top navigation, then click Uninstall Hub on the page that appears.

Procedure

  1. Install the Data Hub Framework
  2. Loading Products
    1. Create the Product Entity
    2. Create the Product Input Flow
    3. Load the Product Data As-Is
  3. Harmonizing Products
    1. Browse and Understand the Product Data
    2. Model the Product Entity
    3. Create a Model-to-Model Mapping for Product
    4. Harmonize the Product Data
  4. Loading Orders
    1. Create the Order Entity
    2. Create the Order Input Flow
    3. Load the Orders As-Is
  5. Harmonizing Orders
    1. Model the Order Entity
    2. Harmonize the Order Data
  6. Serve the Data Out of MarkLogic
  7. Wrapping Up

See Also