About Extreme Search Visualization

Introduction

Extreme Search Visualization (XSV) is designed as a "helper" app for Scianta Analytics' Extreme Search for Splunk. Co-developed by Scianta Analytics and Splunk Inc., Extreme Search (XS) is now part of the Splunk App for Enterprise Security (ES). XSV provides a robust set of tools to help you create, manage and explore Extreme Search knowledge objects. Additionally, XSV provides valuable documentation about Extreme Search beyond what is included in the app.

What is Extreme Search?

The Extreme Search Engine is an integrated collection of powerful cognitive-computing-based analytics functions engineered for speed and transportability. The suite provides fast, flexible, and comprehensive statistical reasoning, predictive analytics and conceptual query capabilities in any computing environment. Extreme Search for Splunk is an implementation of the Scianta Extreme Search Engine which is engineered specifically for Big Data using the Splunk platform. The engine is optimized to run multiple, parallel threads on each Splunk indexer and is compiled to operate natively in Linux, Windows or Mac OS. This allows Extreme Search to deliver extraordinary performance, even on gigantic data sets.

Extreme Search is, first and foremost, a powerful cognitive computing engine. It is designed to "think" the way you think and to "understand" concept-based queries written in natural language.

Converse with your data™

Splunk is arguably the world's most powerful Big Data engine. Extreme Search transforms Splunk into the world's most powerful human-centered, cognitive computing platform for Big Data. Extreme Search understands concepts in the same way you understand them. This allows you to converse with your data the same way you converse with your co-workers, using powerful qualitative concepts in place of arcane quantitative parameters.

Why is this important? Because it allows regular users to gain deep insight from their data simply by asking questions in a way they understand.

The Value Proposition in Enterprise Security

In addition to supporting qualitative expression in Splunk searches, Extreme Search is capable of periodically refining the meaning of concepts based on a changing environment. Splunk uses this capability to implement Dynamic Thresholding. In the Splunk App for Enterprise Security, this allows the use of semantic terms as thresholds, in place of hard-coded numerical values. For example, a correlation search in ES might produce a notable event if the number of infected hosts is "high". Prior to Extreme Search, that search contained some arbitrary value (200, for example) as a threshold value for a "high number of infected hosts". Is 200 an appropriate "high" threshold value? That depends entirely on the environment.

One of the challenges in implementing a complex, powerful Splunk app like ES has been the time it takes to review every correlation search in collaboration with a customer subject matter expert, replacing these arbitrary hard-coded threshold parameters with values that make sense for the organization. With Extreme Search, the correlation search might simply look for values that are "above low". Then a simple search can map the concept "low" to the organization's actual data. This has enabled a very real acceleration of time-to-value when deploying the Splunk App for Enterprise Security. ES can now be customized for a complex enterprise environment in hours instead of weeks.
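The idea of replacing a hard-coded threshold with one derived from the organization's own data can be sketched in a few lines of Python. This is only an illustration, not how Extreme Search builds its Contexts: the function names `derive_thresholds` and `is_high`, the use of mean and standard deviation as boundaries, and the sample counts are all assumptions.

```python
from statistics import mean, pstdev


def derive_thresholds(observed):
    """Derive illustrative 'low' and 'high' boundaries from observed values.

    Assumption: 'low' ends one standard deviation below the mean and
    'high' starts one standard deviation above it. Extreme Search may
    define its Concepts differently.
    """
    center = mean(observed)
    spread = pstdev(observed)
    return {"low_max": center - spread, "high_min": center + spread}


def is_high(value, thresholds):
    """Return True when value lies above the data-derived 'high' boundary."""
    return value > thresholds["high_min"]


# Hypothetical hourly counts of infected hosts in two environments.
small_site = [2, 3, 1, 4, 2, 3, 2, 3]
large_site = [180, 220, 190, 210, 200, 205, 195, 200]

small = derive_thresholds(small_site)
large = derive_thresholds(large_site)
```

With these invented numbers, six infected hosts counts as "high" at the small site but not at the large one, whereas a fixed threshold of 200 would never fire at the small site at all.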

Extreme Search also provides a very high level of granularity in setting thresholds. The notion of a "high number of login attempts" is unlikely to be the same on a Monday morning at 9AM as it would be on a Sunday at 2AM. Extreme Search can classify the context for number of login attempts based on any secondary control parameter, such as time of day, day of week, network type, product category, or employee role. This ability to support Highly Granular, User-Defined, Dynamic Thresholding is one of the important reasons Extreme Search is now included with the Splunk App for Enterprise Security.

Command Categories

Extreme Search commands can be divided into three operational categories:

Conceptual Search

Qualitative, concept-based data exploration.

Statistical Reasoning

Powerful regression and correlation tools for data analysis.

Predictive Analytics

State-of-the-art machine learning tools.

The Command Reference documentation page of this app describes each command in each of these three categories.

Conceptual Search

Terminology

To understand Extreme Search's concept-based search, it is useful to understand some terminology used in the product and its documentation:

Concept

An idea represented by a descriptive semantic term. In Extreme Search, these terms are usually user-defined as part of a Context. Tall, Short, Fast and Slow are Semantic Terms used to describe Concepts.

Context

A collection of terms that form a conceptually coherent view of a knowledge domain. Height might be a Context comprised of the Terms Tall and Short. Speed might be a Context comprised of the Terms Fast, Typical and Slow.

Semantic Term

A linguistic representation of a Concept. The semantic term Tall might be used to represent the Concept of people of large physical stature as a part of the Context Height.

For an introduction to conceptual search, click on "Learn More About Conceptual Search" on the "Overview" page in the XSV app, or select "Intro to Conceptual Search" from the XSV menu.

Statistical Reasoning

Scianta Extreme Search includes powerful statistical reasoning functions that facilitate analysis of very large data sets quickly and intuitively within Splunk. Extreme Search supports the following categories of Statistical Reasoning functions:

  • Linear Regression
  • Non-linear Regression
  • Auto Regression
  • Correlation

     

Learn more about Statistical Reasoning in Extreme Search in the Command Reference in the XSV app.

Predictive Analytics

The Scianta Extreme Search engine is part of the Scianta Analytics Cognitive Computing Suite. Our Cognitive Computing Suite delivers industry-leading, proprietary cognitive modeling, machine learning and predictive analytics to the world of Big Data. Extreme Search for Splunk includes concept-based predictive analytics functions that run natively, within Splunk. Learn more about Predictive Analytics in Extreme Search for Splunk in the Command Reference in the XSV app.

Using Extreme Search

We encourage you to review the Extreme Search Command Reference. Taken together, these commands deliver to Splunk users the power of Scianta Analytics' cognitive-computing-based conceptual search, statistical reasoning and predictive analytics technology natively, within the Splunk platform. As a query processor, the components can also be combined and used as a data filter to collect, filter, and rank information based on the qualitative semantics associated with each data element. Because Extreme Search is tightly integrated with Splunk, results are delivered alongside the results of native Splunk search commands.

Extreme Search is implemented as a set of extensions to Splunk's Search Processing Language (SPL). All Extreme Search functions may be entered as commands directly within the Splunk Search bar, or within scheduled searches and reports. Results are displayed within the Splunk web interface in the same manner as any other Splunk search. It is not necessary to use a specific Splunk app or an external interface to take advantage of conceptual search. Some Scianta Analytics cognitive computing suites, such as Scianta Analytics Extreme Vigilance™, execute Extreme Search functions and display results through their own user interfaces. The Splunk App for Enterprise Security has integrated Extreme Search qualitative expression within many ES searches, reports and dashboards. Please see the documentation for these systems to learn how they work with Extreme Search for Splunk. Documentation on the use of Extreme Search within the Splunk App for Enterprise Security is available in the Splunk online documentation.

Extreme Search Architectural Hierarchy

Knowledge objects in Extreme Search follow this organizational hierarchy:

App

Any Splunk App may contain Extreme Search knowledge objects.

Container

An App may contain one or more XS Containers. A Container is a special kind of CSV-formatted lookup file designed to contain one or more Contexts. Containers are generally transparent to the XS knowledge object hierarchy. They can best be thought of as a collection of Contexts.

Context

Containers contain one or more Contexts. A Context is a semantically coherent set of Concepts. All of the Concepts assembled into a particular Context must apply to the same knowledge domain. The Concepts Tall and Short may share a Context. Tall and Fast may not.

Context Class

Contexts may be classified based upon the value of a secondary field. If a Context contains Concepts associated with Network_Latency, you might define a Context Class for each network_type and a Default Class that applies to all network_types. The Classes would be named based upon the potential values of the network_type field. The value of the network_type field can then be used to select the appropriate Context to be used in your search. Your company's data may, for example, contain the following values for network_type: fast_ethernet, wifi, oc3, oc48, OC192, FDDI. It would be reasonable to assume that Network_Latency metrics for each of these network types might differ. Extreme Search allows you to define a Context Class for each of these network_type values. This allows the Concepts in the Context Network_Latency to be normalized, allowing you to ask Splunk to show you all of the networks, by network type, where the latency is somewhat long.
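The class-selection behavior described above can be pictured with a small Python sketch. It is a rough model, not the Extreme Search storage format: the `Context` dataclass, the `"default"` class name, the `is_long` helper, and every latency boundary are hypothetical, and simple numeric ranges stand in for the membership curves a real Concept would use.

```python
from dataclasses import dataclass, field

DEFAULT_CLASS = "default"  # hypothetical name for the fallback class


@dataclass
class Context:
    """A named Context whose Concept boundaries vary by class."""
    name: str
    # class name -> {concept name -> (lower bound, upper bound) in ms}
    classes: dict = field(default_factory=dict)

    def concepts_for(self, class_value):
        """Return the Concepts for a class, falling back to the default."""
        return self.classes.get(class_value, self.classes[DEFAULT_CLASS])


# Invented boundaries, for illustration only.
latency = Context(
    name="Network_Latency",
    classes={
        DEFAULT_CLASS: {"short": (0, 50), "long": (200, 1000)},
        "wifi": {"short": (0, 20), "long": (80, 500)},
        "oc48": {"short": (0, 2), "long": (10, 100)},
    },
)


def is_long(event, context):
    """Check an event's latency against 'long' for its network_type."""
    lower, _upper = context.concepts_for(event["network_type"])["long"]
    return event["latency_ms"] >= lower


events = [
    {"network_type": "wifi", "latency_ms": 15},
    {"network_type": "oc48", "latency_ms": 15},
    {"network_type": "FDDI", "latency_ms": 15},  # no class defined: uses default
]
long_events = [e["network_type"] for e in events if is_long(e, latency)]
```

In this sketch the same 15 ms reading counts as "long" only for the oc48 class, which is the kind of normalization by network type that Context Classes are meant to provide.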

Concept

A Concept is a descriptive semantic term. It is represented as a two-dimensional array of points. The X axis corresponds to the value of the field in question. The Y axis represents the "membership" of that value in the Concept, stored as a value from 0 to 1. We call the Y value the "Compatibility Index" (CIX). It's a measure of how compatible any value in the target field is with the Concept in question.

In our Network_Latency example, you might define the following three Concepts in the Context Network_Latency: short, nominal, long. Any value stored in the field Network_Latency has some CIX from 0 to 1 in each Concept. In a Network_Latency range of 10ms to 500ms, 500ms would have no membership in (or compatibility with) the Concept "short" (CIX=0) and 100% (CIX=1) membership in the Concept "long". In an even distribution of values of Network_Latency, one might expect 250ms to have 100% (CIX=1) membership in the Concept "nominal".
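As a rough illustration of a Concept stored as an array of points, the Python sketch below computes a CIX by linear interpolation between those points. The point coordinates, the straight-line shapes, and the names `cix`, `memberships` and `NETWORK_LATENCY` are assumptions made for this example; the curves Extreme Search actually generates may be shaped differently.

```python
from bisect import bisect_right


def cix(points, value):
    """Linearly interpolate the membership (0 to 1) of value in a Concept.

    points is a list of (x, y) pairs sorted by x. Values outside the
    listed range take the membership of the nearest end point.
    """
    xs = [x for x, _ in points]
    if value <= xs[0]:
        return points[0][1]
    if value >= xs[-1]:
        return points[-1][1]
    i = bisect_right(xs, value)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (value - x0) / (x1 - x0)


# Hypothetical Concepts for a latency range of 10 ms to 500 ms.
NETWORK_LATENCY = {
    "short": [(10, 1.0), (250, 0.0)],
    "nominal": [(10, 0.0), (250, 1.0), (500, 0.0)],
    "long": [(250, 0.0), (500, 1.0)],
}


def memberships(context, value):
    """Return the CIX of value in every Concept of the Context."""
    return {name: cix(points, value) for name, points in context.items()}


at_500 = memberships(NETWORK_LATENCY, 500)
at_250 = memberships(NETWORK_LATENCY, 250)
at_130 = memberships(NETWORK_LATENCY, 130)
```

Under these assumed shapes, a value of 130 ms is partly "short" and partly "nominal" at the same time, which is what lets a search rank results by degree of membership rather than by a single cut-off.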

Figure: XS Architectural Hierarchy

The XSV Context Explorer dashboard allows you to visually navigate Extreme Search knowledge objects using this hierarchy.


Figure: XSV Context Explorer

Implementation

Extreme Search, as packaged with the Splunk App for Enterprise Security, is a Splunk Supporting Add-on called "Splunk_SA_ExtremeSearch" at $SPLUNK_HOME/etc/apps/Splunk_SA_ExtremeSearch.

NOTE: XSV requires Extreme Search version 6.0.6 or later, which is included in ES version 3.3.1 and later. Note also that it is not necessary to use the Extreme Search Visualization application or the Splunk App for Enterprise Security to use Extreme Search. Since Extreme Search is implemented as a set of extensions to Splunk's search language, once it is installed, it is available to any Splunk app.

Help Resources

Detailed information about each XS and XSV command is provided within your Splunk environment by Splunk's interactive help system. In addition, the Extreme Search Visualization app includes Help pages like this one. Some important Help pages include:

  • Introduction to Conceptual Search
  • Hedges and Synonyms
  • Using Extreme Search Visualization
  • Instant Anomaly Detection
  • Extreme Search Command Reference
  • Scianta Online Docs: Extreme Search Visualization
  • Splunk Online Docs: Extreme Search