Queries & Data Exploration

The Query section in Knowi forms the data engineering foundation in Knowi.

High Level Features:

  • Data exploration, query generation capabilities, and writing your own queries

  • Execution of native Queries across NoSQL, SQL, REST APIs and other sources

  • Join multiple Data Sources on the fly without traditional ETL processing

  • Optional transformations via UI or Cloud9QL on top of the returned result set

  • Directly run queries real-time in your database or save the query results (into our ElasticStore)

  • Schedule query executions

  • Drag & drop analytics on the results

  • Reusable dataset, from which multiple visualizations can be created, as well use that dataset as an input into other queries

  • Apply Machine Learning models on the Query

  • Ask ad hoc question to Dataset in plain English using Search-Based Analytics

  • Create widget alerts & trigger notification alerts

Overview

Knowi supports native queries into NoSQL databases, SQL databases and other sources. This allows you to skip a traditional ETL processing where you may need to send the data into a structured warehouse. Additionally, you can join queries on the fly. Queries can be either realtime against your database, or optionally for long queries, store the results for faster processing/shield your database from frequent analytics workloads.

In Knowi, the results of any query (realtime/direct queries as well as non-direct queries), is considered a dataset. A dataset is a reusable component that can span multiple datasources/queries that also abstracts the underlying complexity of execution modalities, runtime parameters and transformations. Further, datasets can be reused as inputs into other queries.

From the dataset, you can create multiple visualizations, each with a transformed view on the original data from the dataset.

Here's a high-level diagram of Knowi:

image alt text

In this documentation, we will explore the following flow:

Adding the Datasource > Generating the Query > Defining Data Execution Strategy > Dataset Settings (Optional) > Preview Final Results > Create & Run Dataset > Run Visualization on Dataset > View Data on Dashboard

Let's get started.

Connectivity & Datasources

Adding the Datasource and establishing network connectivity is the first step in the expedition of building visualizations with Knowi.

Please refer to the documentation on Connectivity and Datasources for more details.

Generating the Query

Depending on the type of Datasource you have selected, a list of collections/tables along with field samples (or column names in case of relational databases) will be automatically detected in the Data Explorer. With the help of these collections/tables, you can start generating the Query by either using the Visual Builder or by writing the query directly into the Query Editor.

Note: There are no data models to define or schema definitions required upfront.

Note: For unstructured/semi-structured Datasources, Knowi uses native APIs for the database to infer fields supported by the database, or fall back to sampling the data to determine fields.

Visual Builder

Visual Builder allows you to generate the Query in a no-code environment.

After selecting the Collections/Tables from the Data Explorer, you can simply choose the Metrics, Dimensions, and Filters to generate the query without involving any coding skills.

image alt text

FIELDS PURPOSE
Metrics Allows you to perform aggregations such as sum, count, avg., median, etc. along with Date manipulations
Dimensions Allows you to group date-based fields based on date, week, etc.
Sort Allows you to sort the results
Limit Allows you restrict the number of records to be fetched from the Datasource
Filters Allows you to specify the filters

Note: You can also directly drag and drop the fields from Data Explorer to Visual Builder for generating Queries.

image alt text

Collapse/Expand Data Explorer from Visual Query Builder

You can collapse or expand the data explorer section to adjust the available space.

image alt text

If you are using a relational database like Amazon Redshift and have enabled Detect Schema (at the time of adding the Datasource), then selecting the metrics from the Data Explorer will have a dimming effect on the collection where the path needs to be allowing you to select multiple tables from the dropdown list.

image alt text

Query Editor

The Query Editor is a versatile text editor designed for editing code and comes with a number of languages modes and add-ons that implement more advanced editing functionalities.

You can simply switch to Query Editor mode via a toggle switch and write the Query directly in the editor.

image alt text

You can also use Query Editor to inspect the Metrics, Dimensions, and Filters applied in the Visual Builder and make necessary edits.

image alt text

Some of the additional functionalities in the Query Editor includes key-maps as summarized in the table illustrated below:

Functionality Windows Mac
Search Ctrl-F Cmd-F
Find next Ctrl-G Cmd-G
Find previous Shift-Ctrl-G Shift-Cmd-G
Replace Shift-Ctrl-F Cmd-Option-F
Replace all Shift-Ctrl-R Shift-Cmd-Option-F
Jump to line Alt-G Cmd-G

Runtime Parameters

Queries can be templated with runtime parameters that can be passed in dynamically (from embedded applications to filters).

Please refer to the documentation on Runtime Parameters for more details.

Cloud9QL

Cloud9QL is a SQL-like syntax that can be used to post-process/transform the return data, to complement native Queries. This is not a replacement for the underlying Query but offers powerful analytics functions on the results returned.

Please refer to this documentation on Cloud9QL for more details.

Defining Data Execution Strategy

Direct Execution

With this mode, you can directly execute the Query on the original Datasource, without any storage in between. In this case, when a widget is displayed, it will fetch the data in real-time from the underlying Datasource.

Note: The loading time under Direct Execution is directly proportional to the time it takes for the Query to execute.

For direct queries against your datasource, you can enable cache option for a given Query with parameters to cache the results for the specified period of time after initial load which results into the following advantages:

  • If your Queries are slow, it can reduce load on your database

  • Faster performance and improved user experience after the first load

Note: If runtime parameters are used, it'll cache for each combination of parameters.

image alt text

Non-Direct Execution

For non-direct queries, results will be stored in Knowi's Elastic Store. This is useful in the following scenarios:

  • Long-running queries

  • Reduce load on your database for reporting workloads by offloading it to the ElasticStore

  • The results can be used as a parent dataset to other derived queries off this dataset. Useful in cases where the resulting dataset tracked in the ElasticStore runs into the millions of records

  • In cases with large raw datasets in the underlying database, this can be used to incrementally update the ElasticStore for that dataset without running the entire historical query

With ElasticStore usage, Overwrite Strategy will provide you control on how the data is updated for the dataset when the Query is run. The values can be one of the following:

Overwrite Strategy Execution
Upsert Selecting this will replace the existing values for the same key with the latest and insert new records where such keys do not exist.
Replace All Selecting this will replace all data in the current dataset with the latest run.
Replace All - Include Empty Same as Replace All, except that in cases where no data is found for the query run, it will update the dataset with no data (whereas Replace All will not update)
TTL - Time-Based Retention Selecting this will skip the records beyond a specific time period for each run. Example: DateField-3m will keep any records where DateField is within 3 months and will drop all prior records for each run.
Append Selecting this will leave the existing data as it is while adding new records for each run. **Caution**: This is not common and you may end up with duplicates if data is the same.

Non-direct execution can be enforced if you choose to run the Query once or at scheduled intervals.

Run Once

Select this option to execute the Query once.

image alt text

Key: Key to base the replacement strategy where you want to replace/insert data based on Keys (for Upserts and TTL - Time-based Retention) use cases. For any same key-value found for the current query run, it'll update values in the existing dataset with the new values. For new values, it'll insert new records. Please note that Key is case-sensitive.

Execute Query at Scheduled Intervals

Select this option to execute Query at scheduled intervals. You can define the execution frequency in terms of minute(s), hour(s), day(s), and month(s) and the records shall be updated based on your selection.

image alt text

Overwrite Strategy: You can define Overwrite Strategy to store the results into Knowi's ElasticStore and leverage on the benefits as discussed above. You can choose to ignore defining Overwrite Strategy if you want to append the data to the existing dataset.

Key: Key to base the replacement strategy off. For any same key-value found for the current query run, it'll update values in the existing dataset with the new values. For new values, it'll insert new records. Please note that Key is case-sensitive.

Dataset Settings

You can assign Name, Categories, and Description to your Dataset for easy management.

image alt text

Categories helps you set bucket your queries (into a folder like structure) for easy search from the Query listing page.

You can also assign a name to your Dataset by navigating to the top-left corner of the interface and clicking on the Edit.

image alt text

Preview Final Results

Click on the Preview button to analyze the results of your Query for

  • Fine-tuning the desired output

  • Configuring visualizations

image alt text

image alt text

Note: You can also click on the caret alongside the Preview button to define the number of records to be fetched from the Datasource.

image alt text

Save Results

Once you are satisfied with the preview of the final result, you can proceed to save the query. Based on the type of data strategy selected, you will be presented with two different options:

  • Save Dataset
  • Save & Run

Save Dataset

If Direct Query is selected as a data strategy, then Knowi will provide you the option to save the dataset. Upon clicking the Save Dataset button, the dataset configuration is saved but the query runs only when the user makes the request. Examples:

  • Running a scheduled email report
  • Loading the widgets in the dashboard

image alt text

A modal window will appear and provide you with the option to go to the Dashboard, Queries List, or Dataset page. You can make the desired selection or alternatively click on the Continue Editing button to keep making the changes.

image alt text

You can also click on the caret alongside the Save Dataset button and select Save & Refresh. Selecting this option will run the query and update the meta data.

image alt text

Save & Run

If Non-Direct Query is selected as a data strategy, then Knowi will provide you the option to save and run the query. Upon clicking the Save & Run button, the user can either run the query once or at scheduled intervals. The results of the query are stored in Knowi which can be driven from the Knowi datastore.

image alt text

You will be navigated directly to the dataset page providing you insights into query status, dataset lineage, data types, and much more.

image alt text

You can also click on the caret next to Save & Run to save the query as dataset. Clicking on Save Dataset will let you save the query but would not allow creating any widgets or executing the query. Such queries will be in grey bulb status.

image alt text

Create a Visualization on Dataset

Once you have successfully created the Dataset, you can create multiple different visualizations on the same dataset.

Click on the Go to Dataset page.

image alt text

Click on the Create Visualization button on the top right corner of the interface to get started.

image alt text

Here you can decide the type of visualization on the data you are looking for and create a new visualization, or apply transformations on top of the dataset and visualize it.

Data Transformation

Data Transformation allows you to manipulate the data from the dataset. You can either use Search-Based Analytics or use the Drag & Drop feature.

image alt text

Search-Based Analytics

Knowi allows you to ask questions in plain English using the search bar to fetch the data.

image alt text

image alt text

Note: You can also change, add, update, or delete the question(s) directly in the interface. In addition, you can make any changes to results.

image alt text

See Self-Service Analytics for more details.

Drag & Drop

You can also perform drag and drop on the top of Dataset and Knowi will perform calculations to fetch the data accordingly.

image alt text

See Self-Service Analytics for more details.

Visualization

You can view the results for the same under the Visualization tab. If you've specified any manipulations on your data using Data Transformation, the visualization tab will reflect data after manipulation.

image alt text

Any changes to the settings can be done via Settings, Chart Settings, Display & Formatting, and More Options.

Please refer to the documentation on Visualization & Chart Settings for more details.

Click the Save button on the top right corner of the interface to save the visualization.

image alt text

A modal window will appear on your screen that will prompt you to Create Widget. Enter the name for your Widget/Chart and select the Dashboard to which you want your visualization to add to. Hit the Create button once done.

image alt text

Note: You can also create a New Dashboard to add your visualization into.

image alt text

View Data on Dashboard

For more information, please refer to the documentation on Dashboards.

Datasource, Query, & Widget Listings

Saved Datasources, Queries, and Widgets can be viewed anytime in the listings page from which you can access, edit, run, share, clone along with other actions.

For more information, please refer to the documentation on Datasource, Query, & Widget Listings

Reusing Your Dataset

Knowi provides the option for reusing an already created dataset, where you can link another existing dataset into a listed Query as per your data requirements.

Note: The original Dataset will be the parent and the new one will be derived from it.

image alt text

The linked Datasets are also API enabled. You can fetch or query these datasets using our Management APIs. These Datasets can also be pushed into your APIs through webhooks on a scheduler or trigger condition.

Join Multiple Datasources

Knowi provides the option to connect multiple Datasources to process, blend and store combined results seamlessly.

Please refer to the documentation on Multi Data Source Joins for more details.

Apply ML Model

You can create Machine Learning models within the ML workspace within Knowi and use those models in your queries to blend your current data as well as the results from the model for the data. You can also define Models outside of Knowi and have those be referenced within Queries via a REST API.

Please refer to the documentation on Machine Learning for more details.