Queries & Data Exploration
The Query section in Knowi forms the data engineering foundation in Knowi.
High Level Features:
Data exploration, query generation capabilities, and writing your own queries
Execution of native Queries across NoSQL, SQL, REST APIs and other sources
Join multiple Data Sources on the fly without traditional ETL processing
Optional transformations via UI or Cloud9QL on top of the returned result set
Directly run queries real-time in your database or save the query results (into our ElasticStore)
Schedule query executions
Drag & drop analytics on the results
Reusable dataset, from which multiple visualizations can be created, as well use that dataset as an input into other queries
Apply Machine Learning models on the Query
Ask ad hoc question to Dataset in plain English using Search-Based Analytics
Create widget alerts & trigger notification alerts
Knowi supports native queries into NoSQL databases, SQL databases and other sources. This allows you to skip a traditional ETL processing where you may need to send the data into a structured warehouse. Additionally, you can join queries on the fly. Queries can be either realtime against your database, or optionally for long queries, store the results for faster processing/shield your database from frequent analytics workloads.
In Knowi, the results of any query (realtime/direct queries as well as non-direct queries), is considered a dataset. A dataset is a reusable component that can span multiple datasources/queries that also abstracts the underlying complexity of execution modalities, runtime parameters and transformations. Further, datasets can be reused as inputs into other queries.
From the dataset, you can create multiple visualizations, each with a transformed view on the original data from the dataset.
Here's a high-level diagram of Knowi:
In this documentation, we will explore the following flow:
Adding the Datasource > Generating the Query > Defining Data Execution Strategy > Dataset Settings (Optional) > Preview Final Results > Create & Run Dataset > Run Visualization on Dataset > View Data on Dashboard
Let's get started.
Connectivity & Datasources
Adding the Datasource and establishing network connectivity is the first step in the expedition of building visualizations with Knowi.
Please refer to the documentation on Connectivity and Datasources for more details.
Generating the Query
Depending on the type of Datasource you have selected, a list of collections/tables along with field samples (or column names in case of relational databases) will be automatically detected in the Data Explorer. With the help of these collections/tables, you can start generating the Query by either using the Visual Builder or by writing the query directly into the Query Editor.
Note: There are no data models to define or schema definitions required upfront.
Note: For unstructured/semi-structured Datasources, Knowi uses native APIs for the database to infer fields supported by the database, or fall back to sampling the data to determine fields.
Visual Builder allows you to generate the Query in a no-code environment.
After selecting the Collections/Tables from the Data Explorer, you can simply choose the Metrics, Dimensions, and Filters to generate the query without involving any coding skills.
|Allows you to perform aggregations such as sum, count, avg., median, etc. along with Date manipulations
|Allows you to group date-based fields based on date, week, etc.
|Allows you to sort the results
|Allows you restrict the number of records to be fetched from the Datasource
|Allows you to specify the filters
Note: You can also directly drag and drop the fields from Data Explorer to Visual Builder for generating Queries.
Collapse/Expand Data Explorer from Visual Query Builder
You can collapse or expand the data explorer section to adjust the available space.
If you are using a relational database like Amazon Redshift and have enabled Detect Schema (at the time of adding the Datasource), then selecting the metrics from the Data Explorer will have a dimming effect on the collection where the path needs to be allowing you to select multiple tables from the dropdown list.
The Query Editor is a versatile text editor designed for editing code and comes with a number of languages modes and add-ons that implement more advanced editing functionalities.
You can simply switch to Query Editor mode via a toggle switch and write the Query directly in the editor.
You can also use Query Editor to inspect the Metrics, Dimensions, and Filters applied in the Visual Builder and make necessary edits.
Some of the additional functionalities in the Query Editor includes key-maps as summarized in the table illustrated below:
|Jump to line
Queries can be templated with runtime parameters that can be passed in dynamically (from embedded applications to filters).
Please refer to the documentation on Runtime Parameters for more details.
Cloud9QL is a SQL-like syntax that can be used to post-process/transform the return data, to complement native Queries. This is not a replacement for the underlying Query but offers powerful analytics functions on the results returned.
Please refer to this documentation on Cloud9QL for more details.
Defining Data Execution Strategy
With this mode, you can directly execute the Query on the original Datasource, without any storage in between. In this case, when a widget is displayed, it will fetch the data in real-time from the underlying Datasource.
Note: The loading time under Direct Execution is directly proportional to the time it takes for the Query to execute.
For direct queries against your datasource, you can enable cache option for a given Query with parameters to cache the results for the specified period of time after initial load which results into the following advantages:
If your Queries are slow, it can reduce load on your database
Faster performance and improved user experience after the first load
Note: If runtime parameters are used, it'll cache for each combination of parameters.
For non-direct queries, results will be stored in Knowi's Elastic Store. This is useful in the following scenarios:
Reduce load on your database for reporting workloads by offloading it to the ElasticStore
The results can be used as a parent dataset to other derived queries off this dataset. Useful in cases where the resulting dataset tracked in the ElasticStore runs into the millions of records
In cases with large raw datasets in the underlying database, this can be used to incrementally update the ElasticStore for that dataset without running the entire historical query
With ElasticStore usage, Overwrite Strategy will provide you control on how the data is updated for the dataset when the Query is run. The values can be one of the following:
|Selecting this will replace the existing values for the same key with the latest and insert new records where such keys do not exist.
|Selecting this will replace all data in the current dataset with the latest run.
|Replace All - Include Empty
|Same as Replace All, except that in cases where no data is found for the query run, it will update the dataset with no data (whereas Replace All will not update)
|TTL - Time-Based Retention
|Selecting this will skip the records beyond a specific time period for each run. Example: DateField-3m will keep any records where DateField is within 3 months and will drop all prior records for each run.
|Selecting this will leave the existing data as it is while adding new records for each run. **Caution**: This is not common and you may end up with duplicates if data is the same.
Non-direct execution can be enforced if you choose to run the Query once or at scheduled intervals.
Select this option to execute the Query once.
Key: Key to base the replacement strategy where you want to replace/insert data based on Keys (for Upserts and TTL - Time-based Retention) use cases. For any same key-value found for the current query run, it'll update values in the existing dataset with the new values. For new values, it'll insert new records. Please note that Key is case-sensitive.
Execute Query at Scheduled Intervals
Select this option to execute Query at scheduled intervals. You can define the execution frequency in terms of minute(s), hour(s), day(s), and month(s) and the records shall be updated based on your selection.
Overwrite Strategy: You can define Overwrite Strategy to store the results into Knowi's ElasticStore and leverage on the benefits as discussed above. You can choose to ignore defining Overwrite Strategy if you want to append the data to the existing dataset.
Key: Key to base the replacement strategy off. For any same key-value found for the current query run, it'll update values in the existing dataset with the new values. For new values, it'll insert new records. Please note that Key is case-sensitive.
You can assign Name, Categories, and Description to your Dataset for easy management.
Categories helps you set bucket your queries (into a folder like structure) for easy search from the Query listing page.
You can also assign a name to your Dataset by navigating to the top-left corner of the interface and clicking on the Edit.
Preview Final Results
Click on the Preview button to analyze the results of your Query for
Fine-tuning the desired output
Note: You can also click on the caret alongside the Preview button to define the number of records to be fetched from the Datasource.
Once you are satisfied with the preview of the final result, you can proceed to save the query. Based on the type of data strategy selected, you will be presented with two different options:
- Save Dataset
- Save & Run
If Direct Query is selected as a data strategy, then Knowi will provide you the option to save the dataset. Upon clicking the Save Dataset button, the dataset configuration is saved but the query runs only when the user makes the request. Examples:
- Running a scheduled email report
- Loading the widgets in the dashboard
A modal window will appear and provide you with the option to go to the Dashboard, Queries List, or Dataset page. You can make the desired selection or alternatively click on the Continue Editing button to keep making the changes.
You can also click on the caret alongside the Save Dataset button and select Save & Refresh. Selecting this option will run the query and update the meta data.
Save & Run
If Non-Direct Query is selected as a data strategy, then Knowi will provide you the option to save and run the query. Upon clicking the Save & Run button, the user can either run the query once or at scheduled intervals. The results of the query are stored in Knowi which can be driven from the Knowi datastore.
You will be navigated directly to the dataset page providing you insights into query status, dataset lineage, data types, and much more.
You can also click on the caret next to Save & Run to save the query as dataset. Clicking on Save Dataset will let you save the query but would not allow creating any widgets or executing the query. Such queries will be in grey bulb status.
Create a Visualization on Dataset
Once you have successfully created the Dataset, you can create multiple different visualizations on the same dataset.
Click on the Go to Dataset page.
Click on the Create Visualization button on the top right corner of the interface to get started.
Here you can decide the type of visualization on the data you are looking for and create a new visualization, or apply transformations on top of the dataset and visualize it.
Data Transformation allows you to manipulate the data from the dataset. You can either use Search-Based Analytics or use the Drag & Drop feature.
Knowi allows you to ask questions in plain English using the search bar to fetch the data.
Note: You can also change, add, update, or delete the question(s) directly in the interface. In addition, you can make any changes to results.
See Self-Service Analytics for more details.
Drag & Drop
You can also perform drag and drop on the top of Dataset and Knowi will perform calculations to fetch the data accordingly.
See Self-Service Analytics for more details.
You can view the results for the same under the Visualization tab. If you've specified any manipulations on your data using Data Transformation, the visualization tab will reflect data after manipulation.
Any changes to the settings can be done via Settings, Chart Settings, Display & Formatting, and More Options.
Please refer to the documentation on Visualization & Chart Settings for more details.
Click the Save button on the top right corner of the interface to save the visualization.
A modal window will appear on your screen that will prompt you to Create Widget. Enter the name for your Widget/Chart and select the Dashboard to which you want your visualization to add to. Hit the Create button once done.
Note: You can also create a New Dashboard to add your visualization into.
View Data on Dashboard
For more information, please refer to the documentation on Dashboards.
Datasource, Query, & Widget Listings
Saved Datasources, Queries, and Widgets can be viewed anytime in the listings page from which you can access, edit, run, share, clone along with other actions.
For more information, please refer to the documentation on Datasource, Query, & Widget Listings
Reusing Your Dataset
Knowi provides the option for reusing an already created dataset, where you can link another existing dataset into a listed Query as per your data requirements.
Note: The original Dataset will be the parent and the new one will be derived from it.
The linked Datasets are also API enabled. You can fetch or query these datasets using our Management APIs. These Datasets can also be pushed into your APIs through webhooks on a scheduler or trigger condition.
Join Multiple Datasources
Knowi provides the option to connect multiple Datasources to process, blend and store combined results seamlessly.
Please refer to the documentation on Multi Data Source Joins for more details.
Apply ML Model
You can create Machine Learning models within the ML workspace within Knowi and use those models in your queries to blend your current data as well as the results from the model for the data. You can also define Models outside of Knowi and have those be referenced within Queries via a REST API.
Please refer to the documentation on Machine Learning for more details.