Data Mining Features

Data mining has been defined as "The nontrivial extraction of implicit, previously unknown, and potentially useful information from data". It uses machine learning, statistical, and visualization techniques to discover and present knowledge in a form that is easily comprehensible to humans.

There are five basic data mining tasks associated with sensorMINER. These are outlined here:

(1)      Indexing – This is the process of ordering and ranking time series traces by shape similarity.  Traces that are very similar to each other receive indices that are close to each other.  The principal tool for indexing in sensorMINER is time warping.

(2)      Clustering – This is the process of finding logical groupings or states in the time series.  SensorMINER employs the Gecko clustering software for this purpose.  Gecko was developed by Florida Tech.

(3)      Classification – This is the process of labeling data.  In sensorMINER we employ strictly one class, the normal or good class, and we look for any deviation from it.  (In multi-class systems, classes are added for specific defect types.)

(4)      Summarization – This is the process of condensing the raw data into a concise model.  SensorMINER utilizes several algorithms for this, including rule induction, box modeling, and path modeling.

(5)      Anomaly Detection – This is the process of finding data patterns that deviate substantially from training data.  Other terms for anomaly detection include “change detection” and “novelty detection”.  Anomaly detection is responsible for finding surprising, unusual, and uncharacteristic data patterns.  SensorMINER employs several different algorithms for anomaly detection, including Euclidean distance and finite state machine tracking.
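
To make the indexing task in (1) more concrete, here is a minimal sketch of ranking traces by shape similarity with the textbook dynamic time warping distance. It is an illustration only and not sensorMINER's implementation: the function names `dtw_distance` and `rank_by_similarity`, the absolute-difference cost, and the sample traces are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Textbook dynamic time warping distance (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def rank_by_similarity(reference: Sequence[float],
                       traces: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Order traces from most to least similar to the reference (hypothetical helper)."""
    scored = [(name, dtw_distance(reference, trace)) for name, trace in traces.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up example traces, for illustration only.
reference = [0, 1, 2, 1, 0]
traces = {
    "inverted": [0, -1, -2, -1, 0],
    "flat": [0, 0, 0, 0, 0],
    "shifted": [0, 0, 1, 2, 1, 0],
}
ranking = rank_by_similarity(reference, traces)
for name, distance in ranking:
    print(name, distance)
```

In this sketch the time-shifted copy of the reference ranks first because warping absorbs the delay, which is the property that makes time warping suitable for comparing traces by shape rather than by exact timing.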
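
The anomaly detection task in (5), combined with the single "normal" class described in (3), can be sketched in a similar way. The example below scores each window of new data by its Euclidean distance to the nearest window seen in normal training data and flags windows whose score exceeds a threshold. This is one simple way to use Euclidean distance, not a description of sensorMINER's algorithm: the windowing scheme, the function names, the threshold, and the data are all assumptions.

```python
import math
from typing import List, Sequence


def windows(series: Sequence[float], width: int) -> List[Sequence[float]]:
    """All contiguous windows of the given width."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length windows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anomaly_scores(train: Sequence[float], test: Sequence[float], width: int) -> List[float]:
    """Score each test window by its distance to the closest normal window."""
    normal = windows(train, width)
    return [min(euclidean(w, ref) for ref in normal) for w in windows(test, width)]


def flag_anomalies(train: Sequence[float], test: Sequence[float],
                   width: int, threshold: float) -> List[int]:
    """Start indices of test windows that deviate from the normal class."""
    scores = anomaly_scores(train, test, width)
    return [i for i, score in enumerate(scores) if score > threshold]


# Made-up data: training data contains only normal behavior (one class).
train = [0, 1, 0, 1, 0, 1, 0, 1]
test = [0, 1, 0, 5, 0, 1, 0]   # the spike at index 3 is the injected anomaly
scores = anomaly_scores(train, test, width=3)
flagged = flag_anomalies(train, test, width=3, threshold=1.0)
print(scores)
print(flagged)
```

Every window that overlaps the spike is flagged, while windows that match the training pattern score zero. In practice the window width and threshold would need to be tuned to the sensor data at hand.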
