Data Mining Features
Data mining has been defined as "The nontrivial extraction of implicit, previously unknown, and potentially useful information from data". It uses machine learning, statistical, and visualization techniques to discover and present knowledge in a form that is easily comprehensible to humans.
There are five basic data mining tasks associated with sensorMINER. These are outlined here:
(1) Indexing - This is the process of ordering and ranking various time series traces based on their shape similarity. Traces that are very similar to each other would have indices that are close to each other. The principal tool for indexing in sensorMINER is time warping.
(2) Clustering - This is the process of finding logical groupings or states in the time series. SensorMINER employs the Gecko clustering software for this purpose. Gecko was developed by Florida Tech.
(3) Classification - This is the process of labeling data. SensorMINER employs strictly one class, the normal or good class, and reports any deviation from it. (In multi-class systems, classes are added for specific defect types.)
(4) Summarization - This is the process of condensing the raw data into a concise model. SensorMINER utilizes several algorithms for this, including rule induction, box modeling, and path modeling.
(5) Anomaly Detection - This is the process of finding data patterns that deviate substantially from training data. Other terms for anomaly detection include “change detection” and “novelty detection”. Anomaly detection is responsible for finding surprising, unusual, and uncharacteristic data patterns. SensorMINER employs several different algorithms for anomaly detection, including Euclidean distance and finite state machine tracking.
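
To make two of the techniques named above more concrete, here is a minimal Python sketch of time warping (used for indexing) and Euclidean distance (used for anomaly detection). It is an illustration only, not sensorMINER's implementation: the function names, the textbook dynamic time warping recurrence, the nearest-neighbor scoring, and the threshold parameter are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D traces.

    Assumption: absolute difference as the local cost, no window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def rank_by_similarity(reference, traces):
    """Return trace positions ordered from most to least similar in shape."""
    return sorted(range(len(traces)),
                  key=lambda k: dtw_distance(reference, traces[k]))


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length windows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_anomalous(window, training_windows, threshold):
    """Flag a window whose nearest training window is farther than threshold.

    Assumption: all windows have the same length; threshold is user-chosen.
    """
    nearest = min(euclidean_distance(window, t) for t in training_windows)
    return nearest > threshold
```

For example, rank_by_similarity([1, 2, 3], [[5, 6, 7], [1, 2, 2, 3], [1, 2, 4]]) places the stretched trace [1, 2, 2, 3] first, because time warping lets the repeated value align at no cost. In the anomaly check, the training windows stand in for the single "good" class, so a large distance to every one of them marks a deviation.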
Click here for more information on Data Mining: View Technical Papers