Tuesday, September 2, 2014

Data Mining Algorithms

Choosing the Right Algorithm

Choosing the best algorithm to use for a specific analytical task can be a challenge. While you can use different algorithms to perform the same business task, each algorithm produces a different result, and some algorithms can produce more than one type of result. For example, you can use the Microsoft Decision Trees algorithm not only for prediction, but also as a way to reduce the number of columns in a dataset, because the decision tree can identify columns that do not affect the final mining model.

Choosing an Algorithm by Type

Analysis Services includes the following algorithm types:
  • Classification algorithms predict one or more discrete variables, based on the other attributes in the dataset.
  • Regression algorithms predict one or more continuous variables, such as profit or loss, based on other attributes in the dataset.
  • Segmentation algorithms divide data into groups, or clusters, of items that have similar properties.
  • Association algorithms find correlations between different attributes in a dataset. The most common application of this kind of algorithm is for creating association rules, which can be used in a market basket analysis.
  • Sequence analysis algorithms summarize frequent sequences or episodes in data, such as a Web path flow.