MarkLogic Datasource Integration

Knowi provides data plumbing and visualization for MarkLogic, so you can go from data to interactive visual insights quickly.

Overview

  1. Connect, extract, and transform data from your MarkLogic database, using one of the following options:

    a. Through our UI to connect directly, if your MarkLogic servers are accessible from the cloud.

    b. Using our Cloud9Agent for datasources inside your network.

  2. Query, visualize, and track all your key metrics instantly.

UI Based Approach

Connecting

  1. Log in to Knowi and select Queries from the left sidebar.

  2. Click the New Datasource + button and select MarkLogic from the list of datasources.

  3. After navigating to the New Datasource page, either use the pre-configured settings for Cloud9 Charts' own demo MarkLogic database, or follow the prompts and configure the following details to set up connectivity to your own MarkLogic database:

    a. Datasource Name: Enter a name for your datasource
    b. Host Name: Enter the host name to connect to
    c. Port: Enter the database port
    d. Database Name: Enter database name or leave empty to use default database
    e. User: Enter the User ID to connect
    f. Password: Enter the password to connect to the database
    g. Database Properties: Additional database connection properties or URL parameters, for example ssl=true&anotherProp=anotherVal. To connect to an SSL-enabled XDBC App Server, set ssl=true.

  4. Establish Network connectivity and click on the Test Connection button.

    Note: The connection can be tested only after network connectivity has been established, either through Direct Connectivity or an SSH tunnel. For more information on connectivity and datasources, refer to the Connectivity & Datasources documentation.

  5. Click Save and start querying.
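The Database Properties value in step 3 is a plain key=value list joined with ampersands. As a minimal sketch, it can be assembled in Python as shown here. The helper name build_db_properties is hypothetical, and it is an assumption that Knowi accepts percent-encoded values for any special characters that urlencode escapes.

```python
from urllib.parse import urlencode


def build_db_properties(props):
    """Join key/value pairs into a key=value&key=value string.

    Hypothetical helper for illustration; not part of Knowi.
    """
    # urlencode keeps the insertion order of the dict and
    # percent-encodes characters that are unsafe in a query string.
    return urlencode(props)


db_properties = build_db_properties({"ssl": "true", "anotherProp": "anotherVal"})
print(db_properties)
```

The resulting string can be pasted into the Database Properties field as-is.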

Query

Set up a query using the Visual Builder or the Query Editor.

Visual Builder

Step 1: After connecting to the MarkLogic datasource, Knowi pulls a list of collections along with field samples. Using these, you can generate queries automatically through the visual builder in a no-code environment, either by dragging and dropping fields or by making selections from the drop-down.

Tip: You can also write queries directly in the Query Editor, a versatile text editor that supports multiple language modes (such as JavaScript and XQuery), Cloud9QL, and more.

Step 2: Define the data execution strategy by choosing one of two options, direct or non-direct execution.

Non-direct execution applies when you choose to run the query once or at scheduled intervals. For more information, see the Defining Data Execution Strategy documentation.

Step 3: Click on the Preview button to analyze the results of your Query and fine-tune the desired output, if required.

The result of your query is called a dataset. After reviewing the results, name your dataset and click the Create & Run button.

Query Editor

The Query Editor is a versatile text editor designed for editing code. It comes with a number of language modes, including XQuery and JavaScript, and add-ons such as Cloud9QL and the AI Assistant, which provide transformation and analysis capabilities like prediction modeling and cohort analysis when you need them.

AI Assistant

The AI Assistant query generator produces queries from plain English statements for searching the connected databases and retrieving information. The goal is to simplify and speed up the search process by generating relevant and specific queries automatically, reducing the need for manual input, and improving the probability of finding relevant information.

Step 1: Select Generate Query from the AI Assistant dropdown and describe, in plain English, the query you'd like to generate. Details can include table or collection names, fields, filters, etc.
Example: XQuery query to show description from feeds

Note: The AI Assistant uses OpenAI to generate a query. Only the question is sent to the OpenAI APIs, not the data.

Step 2: Define the data execution strategy by choosing one of two options, direct or non-direct execution.

Non-direct execution applies when you choose to run the query once or at scheduled intervals. For more information, see the Defining Data Execution Strategy documentation.

Step 3: Click on the Preview button to analyze the results of your Query and fine-tune the desired output, if required.

Note 1: OpenAI integration must be enabled by the admin before the AI Query Generator can be used.

Note 2: You can copy the API key from your personal OpenAI account and use it, or use the default key provided by Knowi.

{Account Settings > Customer Settings > OpenAI Integration}

The AI Assistant also offers the following features, which can be applied on top of the generated query:

Explain Query

Provides explanations for your existing query. For example, when an explanation was requested for a generated query, the AI Assistant returned this description:

This MarkLogic query is declaring a variable, $feeds, which contains two XML elements, each with a title and description. The query then returns the description of each feed. The output of the query would be:

Find Issues

Helps debug and troubleshoot the query. For example, running Find Issues on a generated query returned this message: The feeds is misspelled (should be "feeds")

Syntax Help

Ask questions about query syntax for this datasource. For example, a syntax request returned the following response: "The following XQuery code can be used to display records from a MarkLogic database:

let $records := fn:collection("records")
for $record in $records
return {$record/title} {$record/description} "

Semantics SPARQL with XQuery

Semantic SPARQL queries can be executed through XQuery as follows:

xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics" at "/MarkLogic/semantics.xqy";
sem:sparql('
  <SPARQL QUERY>
  ')

Example:

xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics" at "/MarkLogic/semantics.xqy";
sem:sparql('
  SELECT ?person
  WHERE { ?person <http://example.org/marklogic/predicate/livesIn> "London" }
  ')

For more details on Semantics and SPARQL, see MarkLogic Semantics Documentation.
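When the SPARQL text is produced programmatically, it has to be embedded in a single-quoted XQuery string literal. The following Python sketch wraps a SPARQL query in the template shown above. The helper name wrap_sparql is hypothetical. The escaping follows the XQuery string literal rules (a single quote is doubled and an ampersand is written as &amp;); it has not been verified against a specific MarkLogic version, so treat it as an assumption.

```python
# Template taken from the XQuery wrapper shown in this section.
XQUERY_TEMPLATE = (
    'xquery version "1.0-ml";\n'
    'import module namespace sem = "http://marklogic.com/semantics" '
    'at "/MarkLogic/semantics.xqy";\n'
    "sem:sparql('\n"
    "  {sparql}\n"
    "  ')"
)


def wrap_sparql(sparql):
    """Embed a SPARQL query in the sem:sparql XQuery wrapper (hypothetical helper)."""
    # Escape & first so the &amp; produced here is not touched again,
    # then double single quotes, since the literal is single-quoted.
    escaped = sparql.strip().replace("&", "&amp;").replace("'", "''")
    return XQUERY_TEMPLATE.format(sparql=escaped)


query = wrap_sparql(
    "SELECT ?person\n"
    '  WHERE { ?person <http://example.org/marklogic/predicate/livesIn> "London" }'
)
print(query)
```

The printed text can be pasted into the Query Editor or used as the queryStr value in an agent query file.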

AI Query Generator

The AI query generator automatically generates queries from plain English statements for searching the connected databases and retrieving information. The goal is to simplify and speed up the search process by automatically generating relevant and specific queries, reducing the need for manual input, and improving the probability of finding relevant information.

STEPS:

Step 1: Enter the details of the query you'd like to generate in plain English. Details can include table or collection names, fields, filters, etc. The AI generator uses OpenAI to generate a query. 

Only the question is sent to OpenAI APIs and not the data.

Example:

"Return Count of adjustedprice from claims2018age_gender (XML)"

Step 2: Click Run. The query is generated, and you can copy it to the clipboard and paste it into the Query Editor.

Step 3: Click on the Preview button to analyze the results of your Query and fine-tune the desired output, if required.

Note 1: OpenAI integration must be enabled by the admin before the AI Query Generator can be used.

Note 2: You can copy the API key from your personal OpenAI account and use it, or use the default key provided by Knowi.

{Account Settings > Customer Settings > OpenAI Integration}

Cloud9Agent (StandAlone) Configuration

As an alternative to the UI-based connectivity above, you can configure Cloud9Agent directly within your network to query MarkLogic. See Cloud9Agent to download and run your agent.

Highlights:

  • Pull data using XQuery and optionally manipulate the results further with Cloud9QL.
  • Execute queries on a schedule or one time.

The agent contains a datasource_example_markLogic.json and query_example_markLogic.json under the examples folder of the agent installation to get you started.

  • Edit those files to point to your database and modify the queries to pull your data.
  • Move them into the config directory (datasource_XXX.json files first, if the Agent is running).
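The ordering in the last bullet (datasource files before query files) can be scripted. The following is a minimal Python sketch that assumes the edited files sit in a staging folder of your choosing; the function name install_configs and both folder paths are hypothetical and should be adapted to your agent installation.

```python
import shutil
from pathlib import Path


def install_configs(staging_dir, config_dir):
    """Move edited JSON files into the agent config directory,
    datasource_*.json files first (hypothetical helper)."""
    staging = Path(staging_dir)
    config = Path(config_dir)
    config.mkdir(parents=True, exist_ok=True)

    # False sorts before True, so names starting with "datasource_" come first.
    files = sorted(
        staging.glob("*.json"),
        key=lambda p: (not p.name.startswith("datasource_"), p.name),
    )

    moved = []
    for path in files:
        shutil.move(str(path), str(config / path.name))
        moved.append(path.name)
    return moved
```

Calling install_configs("staging", "config") returns the file names in the order they were moved, which makes the ordering easy to confirm.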

Datasource Configuration:

Parameter    Comments
name         Unique datasource name.
datasource   Set value to marklogic.
host         Host or IP to connect to.
port         Port to connect to.
dbName       Database name to connect to, for example claimsdemo.
userId       User ID to connect, where applicable.
password     Password, where applicable.

Query Configuration:

Query Config Params   Comments
entityName            Dataset name identifier.
identifier            A unique identifier for the dataset. Either identifier or entityName must be specified.
dsName                Name of the datasource configured in the datasource_XXX.json file to execute the query against. Required.
queryStr              MarkLogic query to execute, for example an XQuery expression. Required.
frequencyType         One of minutes, hours, days, weeks, months. If this is not specified, the query is treated as a one-time query, executed upon Cloud9Agent startup (or when the query is first saved).
frequency             Indicates the frequency, if frequencyType is defined. For example, if this value is 10 and the frequencyType is minutes, the query is executed every 10 minutes.
startTime             Optional. Specifies when the query should run for the first time. If set, the frequency is determined from that time onwards. For example, if a weekly run is scheduled to start at 07/01/2014 13:30, the first run occurs on 07/01/2014 at 13:30, and the next run at the same time on 07/08/2014. The time is based on the local time of the machine running the Agent. Supported date formats: MM/dd/yyyy HH:mm, MM/dd/yy HH:mm, MM/dd/yyyy, MM/dd/yy, HH:mm:ss, HH:mm, mm.
c9QLFilter            Optional post-processing of the results using Cloud9QL. Typically uncommon against SQL-based datastores.
overrideVals          Enables data storage strategies to be specified. If this is not defined, the results of the query are added to the existing dataset. To replace all data for this dataset within Knowi, specify {"replaceAll":true}. To upsert data, specify {"replaceValuesForKey":["fieldA","fieldB"]}. This replaces all existing records in Knowi that have the same fieldA and fieldB with the current data, and inserts records where they are not present.
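The three storage strategies described for overrideVals can be modeled on plain lists of records. The Python sketch below is only an illustration of the described behavior, not Knowi's implementation; the function name apply_override and the sample records are hypothetical.

```python
def apply_override(existing, incoming, override_vals=None):
    """Return the dataset contents after one query run (illustrative model only)."""
    override_vals = override_vals or {}

    # {"replaceAll": true}: discard everything already stored.
    if override_vals.get("replaceAll"):
        return list(incoming)

    # {"replaceValuesForKey": [...]}: upsert on the listed fields.
    keys = override_vals.get("replaceValuesForKey")
    if keys:
        def key_of(record):
            return tuple(record.get(k) for k in keys)

        incoming_keys = {key_of(r) for r in incoming}
        # Keep stored records whose key is absent from the new results,
        # then add every new record (replacements and fresh inserts).
        kept = [r for r in existing if key_of(r) not in incoming_keys]
        return kept + list(incoming)

    # No overrideVals: results are appended to the existing dataset.
    return list(existing) + list(incoming)


existing = [
    {"fieldA": 1, "fieldB": "x", "value": 10},
    {"fieldA": 2, "fieldB": "y", "value": 20},
]
incoming = [
    {"fieldA": 1, "fieldB": "x", "value": 99},
    {"fieldA": 3, "fieldB": "z", "value": 30},
]
upserted = apply_override(existing, incoming, {"replaceValuesForKey": ["fieldA", "fieldB"]})
```

In this model the upsert keeps the untouched record for fieldA 2, replaces the record for fieldA 1, and inserts the record for fieldA 3.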

Examples

Datasource Example:

[
   {
      "name": "demoMarkLogic",
      "host": "54.205.52.22",
      "port": "8010",
      "dbName": "claimsdemo",
      "userId": "user",
      "password": "pass",
      "datasource": "marklogic"
   }
]
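A datasource file like the one above can also be generated rather than edited by hand. The following Python sketch builds the same structure with the json module. The function name build_datasource is hypothetical, the host and credentials are placeholders, and writing the port as a string simply mirrors the example above.

```python
import json


def build_datasource(name, host, port, db_name, user_id, password):
    """Return the text of a one-entry MarkLogic datasource file (hypothetical helper)."""
    entry = {
        "name": name,
        "host": host,
        "port": str(port),  # the example above stores the port as a string
        "dbName": db_name,
        "userId": user_id,
        "password": password,
        "datasource": "marklogic",
    }
    # The agent example wraps datasource entries in a JSON array.
    return json.dumps([entry], indent=3)


datasource_text = build_datasource(
    "demoMarkLogic", "marklogic.example.com", 8010, "claimsdemo", "user", "pass"
)
print(datasource_text)
```

The output can be saved as a datasource_XXX.json file in the agent config directory.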

Query Examples:

[
   {
      "entityName": "Total Claims",
      "queryStr": "let $sorted-claims :=\n    for $claim in collection(\"claimscsv\")/root\n    where $claim/id > 10190 and $claim/id < 10590\n    order by $claim/id\n    return $claim\nfor $claim at $count in subsequence($sorted-claims, 1, 10)\nreturn $claim",
      "c9QLFilter": "SELECT service_month, NET_PAID_AMT, BILL_AMT, MBR_AGE",
      "queryType": "XQuery",
      "dsName": "demoMarkLogic",
      "overrideVals": {
          "replaceAll": true
      },
      "frequencyType": "minutes",
      "frequency": 10
   }
]
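The queryStr value above carries escaped newlines and double quotes, which are easy to get wrong by hand. As a sketch, the same file can be produced from a readable multi-line XQuery string, letting Python's json module do the escaping. The variable names are hypothetical, and the frequencyType value follows the parameter table (minutes).

```python
import json

# The XQuery from the example, written as a normal multi-line string.
xquery = """let $sorted-claims :=
    for $claim in collection("claimscsv")/root
    where $claim/id > 10190 and $claim/id < 10590
    order by $claim/id
    return $claim
for $claim at $count in subsequence($sorted-claims, 1, 10)
return $claim"""

query_config = [
    {
        "entityName": "Total Claims",
        "queryStr": xquery,
        "c9QLFilter": "SELECT service_month, NET_PAID_AMT, BILL_AMT, MBR_AGE",
        "queryType": "XQuery",
        "dsName": "demoMarkLogic",
        "overrideVals": {"replaceAll": True},
        "frequencyType": "minutes",
        "frequency": 10,
    }
]

# json.dumps escapes the embedded newlines and double quotes.
query_text = json.dumps(query_config, indent=3)
print(query_text)
```

Parsing query_text back with json.loads returns the original XQuery unchanged, which is a quick way to confirm the escaping.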

The query runs every 10 minutes and replaces all data for that dataset in Knowi.
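The interaction of startTime, frequencyType, and frequency can be sketched in Python. This is an illustrative model of the behavior described in the parameter table, not the agent's scheduler. The function name upcoming_runs is hypothetical, only the MM/dd/yyyy HH:mm date format is handled, and months is left out because it is not a fixed-length interval.

```python
from datetime import datetime, timedelta

# frequencyType values that map directly onto timedelta arguments.
SUPPORTED_TYPES = ("minutes", "hours", "days", "weeks")


def upcoming_runs(start_time, frequency_type, frequency, count):
    """List the first `count` run times (illustrative model only)."""
    if frequency_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported frequencyType in this sketch: {frequency_type}")

    # MM/dd/yyyy HH:mm in the table corresponds to %m/%d/%Y %H:%M here.
    first = datetime.strptime(start_time, "%m/%d/%Y %H:%M")
    step = timedelta(**{frequency_type: frequency})
    return [first + i * step for i in range(count)]


weekly = upcoming_runs("07/01/2014 13:30", "weeks", 1, 2)
for run in weekly:
    print(run.strftime("%m/%d/%Y %H:%M"))
```

With a weekly schedule starting at 07/01/2014 13:30, the second run falls on 07/08/2014 at 13:30, matching the startTime description in the parameter table.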