The geospatial query capabilities provided by GDS depend on service descriptors (which are also referred to as “models”). A service descriptor is a JSON document that defines a feature service and one or more layers. Each layer is defined by a data source, which specifies the set of data in MarkLogic that will be queried for that layer. Note that while the terminology and structure are very similar to how ArcGIS defines a feature service and layer, the intent is not to conform exactly to the ArcGIS definition.
The documentation below describes how to load and define service descriptors. For a working example, please see the example project in this repository.
If you plan on querying data via a MarkLogic TDE template, please read the instructions for creating a TDE first.
Defining a service descriptor
A service descriptor is a JSON object that consists of two required top-level keys: info and layers. Each of these is described below.
Info object
The info field must be a JSON object containing the following fields:
- name = required; a unique string name for identifying the feature service.
- description = optional string for describing the feature service.
For example, your initial descriptor file could look like this:
{
  "info": {
    "name": "MyService",
    "description": "This is an example of a feature service descriptor file."
  }
}
Layers array
The layers field must be an array of JSON objects, one for each layer. Each layer is a JSON object that can contain at least the following fields:
- id = required; unique number for identifying the layer.
- name = required; unique string for identifying the layer.
- description = optional string for describing the layer.
- geometryType = required string for identifying the type of geometry of each feature. Supported values are Point and Polygon.
- idField = optional string for identifying a column in the associated TDE that contains a value for identifying a feature; defaults to “OBJECTID”.
- boundingQuery = optional JSON object that captures a serialized CTS query which will constrain all queries on this layer for features.
- geometry = required JSON object that defines the features associated with this layer; defined further below.
- dataSources = array of JSON objects that define queryable data in MarkLogic; defined further below.
- schema = optional string; defined further below.
- view = optional string; defined further below.
The layer can contain any number of additional fields, which will be included when a client requests a service descriptor. For example, while GDS does not make use of an extent field, a user will typically want to include that when using the MarkLogic Koop provider so that ArcGIS clients can leverage this field.
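Putting the info object and the layer fields together, a complete descriptor with a single layer might look like the sketch below. Every value in it is an illustrative assumption rather than content from the example project: the layer name, the collection URI in boundingQuery, the schema and view names, the wgs84 coordinate system, the XPath, and the shape of the extent object (an additional field that GDS simply passes through).

```json
{
  "info": {
    "name": "MyService",
    "description": "This is an example of a feature service descriptor file."
  },
  "layers": [
    {
      "id": 0,
      "name": "Places",
      "description": "Hypothetical layer of point features.",
      "geometryType": "Point",
      "idField": "OBJECTID",
      "boundingQuery": {
        "collectionQuery": { "uris": ["places"] }
      },
      "geometry": {
        "format": "geojson",
        "coordinateSystem": "wgs84",
        "xpath": "//geometry"
      },
      "schema": "MySchema",
      "view": "Places",
      "extent": {
        "xmin": -180,
        "ymin": -90,
        "xmax": 180,
        "ymax": 90,
        "spatialReference": { "wkid": 4326 }
      }
    }
  ]
}
```

This layer uses schema and view to point at a TDE template; the dataSources array described under Data sources is the alternative.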
Geometry
The geometry field defines how geometry data is represented in a feature (typically a document in MarkLogic, but not restricted to this) so that GDS can query and return this data.
The value of the field is a JSON object with the following keys:
- format = optional string that defaults to geojson; must be one of geojson, gml, kml, rss, mcgm, any, cts, or custom.
- coordinateSystem = required string; must be one of the values listed for the coordinate-system option. See the MarkLogic documentation for more information on coordinate systems.
- pointFormat = only used when format is gml; string that describes how a point is formatted; must be either default (lat/long) or long-lat-point (long/lat).
- xpath = optional string, but required if geometry must be returned as part of a query for features; XPath expression that points to the geometry data in a feature.
- source = optional JSON object used in place of xpath; can have the following fields:
  - format = required string; must be one of geojson, wkt, or cts.
  - xpath = optional string; XPath expression that points to the geometry data in a feature.
  - column = optional string; name of a feature column containing the feature’s geometry data.
  - documentUriColumn = optional string; name of a feature column, instead of the internal MarkLogic fragment ID column, used for performing a join when extracting geometry for a feature.
- indexes = optional JSON object that controls how geometry data is queried and extracted. Supports the following child keys:
  - regionPath = optional array of JSON objects for defining a geospatial region query to be used for constraining all queries on the layer. Each JSON object must have a key of path and may also have a key of coordinateSystem.
  - element = optional array of JSON objects; for use when format is custom. Each object must be a serialized MarkLogic element index.
  - elementChild = optional array of JSON objects; for use when format is custom. Each object must be a serialized MarkLogic element child index.
  - elementPair = optional array of JSON objects; for use when format is custom. Each object must be a serialized MarkLogic element pair index.
  - elementAttributePair = optional array of JSON objects; for use when format is custom. Each object must be a serialized MarkLogic element attribute pair index.
  - path = optional array of JSON objects; for use when format is custom. Each object must be a serialized MarkLogic path index.
For most use cases, it should suffice to define the format, coordinateSystem, and xpath fields to describe a path to the geometry data in a document, where a document is associated with a single feature.
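For the less common case where the geometry is exposed as a feature column rather than addressed by an XPath, the source object can stand in for xpath. The fragment below is only a sketch of that alternative for a layer's geometry field: the wgs84 value and the column name geometry_wkt are hypothetical, and the top-level format is omitted so that it takes its documented default.

```json
{
  "geometry": {
    "coordinateSystem": "wgs84",
    "source": {
      "format": "wkt",
      "column": "geometry_wkt"
    }
  }
}
```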
Data sources
When defining a layer, a user has two choices for defining the source of data to be queried in MarkLogic:
- Use schema and view to identify a TDE template.
- Use the dataSources array.
The dataSources array is intended to provide more flexibility by allowing for multiple sources of data that can be joined together. Each object in the array can have the following fields:
- source = required string; either view or sparql.
- schema = required when source is view; the schema associated with a TDE template.
- view = required when source is view; the view associated with a TDE template.
- query = required when source is sparql; a SPARQL query.
- joinOn = required for the second data source; a JSON object with two keys, left and right, that identify how data from this data source should be joined with data from the first data source. The left key identifies a column in the first data source, while right identifies a column in the second data source. The object may optionally have a joinType key that defines the type of join; supported values are inner (the default), left outer, and full outer.
- joinFunction = optional string; one of joinInner (default), joinLeftOuter, and joinFullOuter. Only applied when at least one other data source exists to be joined to the first data source.
- fields = required JSON object when source is sparql; defines fields to add to each feature. Each key in this object is the name of an additional feature field. Each key is an object itself with a scalarType key that identifies the type of column.
- includeFields = optional array that identifies a subset of the field names to be included in each feature.
- fragmentIdColumn = optional string for when source is view; allows for specifying a column to be used for a fragment ID when performing a join. Specifically, this allows for passing in a value for the systemCols argument when GDS uses op.fromView to build an Optic pipeline based on a view.
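As a sketch of how two data sources might be combined, the fragment below joins a TDE view with a SPARQL query. Everything specific is assumed for illustration: the schema, view, and column names, the SPARQL query and its example.org predicate IRIs, and the scalarType value string. Following the field descriptions above, joinOn sits on the second data source, with left naming a column from the view and right naming a column produced by the SPARQL query.

```json
{
  "dataSources": [
    {
      "source": "view",
      "schema": "MySchema",
      "view": "Places",
      "includeFields": ["OBJECTID", "name"]
    },
    {
      "source": "sparql",
      "query": "SELECT ?placeId ?category WHERE { ?s <http://example.org/placeId> ?placeId ; <http://example.org/category> ?category }",
      "joinOn": {
        "left": "OBJECTID",
        "right": "placeId",
        "joinType": "left outer"
      },
      "fields": {
        "category": { "scalarType": "string" }
      }
    }
  ]
}
```

A left outer join is chosen here so that, presumably, features from the view are still returned when the SPARQL query has no matching row; omitting joinType would fall back to the default inner join.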
Loading service descriptors
Each service descriptor must be loaded to the content database in your MarkLogic application. The following approach is recommended to accomplish this:
- Put each service descriptor JSON file in the src/main/ml-data directory in your ml-gradle project; you can store these in any child directory that you wish.
- Add a collections.properties file to the directory containing your service descriptor files and add *=http://marklogic.com/feature-services to the file.
- Add a permissions.properties file to the directory containing your service descriptor files and add *=geo-data-services-reader,read,geo-data-services-writer,update to it, swapping out geo-data-services-reader and geo-data-services-writer with application-specific roles if desired.
- Run ./gradlew mlLoadData or ./gradlew mlDeploy to load the service descriptor files in your content database.
Verifying a service descriptor
After loading your service descriptors into your MarkLogic application’s content database, you can verify that they are accessible via simple requests in your browser or via curl. Note that the URLs below are not yet considered part of GDS’s public interface, as interacting with GDS directly is neither documented nor supported yet; the expectation is that clients will use the MarkLogic Koop provider or a similar tool that depends on GDS.
The examples below assume that you have installed the example project and thus use port 8095. Change this as needed for your own installation of GDS.
Additionally, it is recommended to authenticate as a user with the GDS roles as opposed to an admin or admin-like user. This user should also have at least read access to the data that can be queried via your service descriptors’ layers.
To see a list of service descriptors:
http://localhost:8095/v1/resources/modelService
To see a particular service descriptor (change the value of the rs:id parameter to the name of the service descriptor you wish to see):
http://localhost:8095/v1/resources/modelService?rs:id=GDeltExample
To query for data, you will need to submit an HTTP POST request, which can be done via a tool like curl. You can use the statement below, replacing user:password with the username and password of a user with the GDS roles:
curl --anyauth -u user:password -X POST 'localhost:8095/v1/resources/geoQueryService' --header 'Content-Type:application/json' --data-raw '{"params":{"id":"GDeltExample","layer":0,"method":"query"},"query":{"returnCountOnly":true}}'
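For readability, the request body passed to --data-raw in the curl command above is reproduced here with indentation; the content is unchanged. Reading it against the rest of this page, params.id appears to name the service descriptor and params.layer the layer id, while the query object asks only for a count of matching features.

```json
{
  "params": {
    "id": "GDeltExample",
    "layer": 0,
    "method": "query"
  },
  "query": {
    "returnCountOnly": true
  }
}
```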