What is Feature Selection in Business Intelligence?

The purpose of feature selection, also called feature reduction, is to eliminate from the dataset a subset of variables that are not deemed relevant for the purpose of the data mining activities. One of the most critical aspects of a learning process is choosing the combination of predictive variables best suited to accurately explaining the investigated phenomenon.

Feature reduction has several potential advantages. With fewer columns, learning algorithms run more quickly on the reduced dataset than on the original one. Moreover, the models generated after uninfluential attributes have been removed are often more accurate and easier to understand.

Feature selection methods can be classified into three main categories: filter methods, wrapper methods, and embedded methods.


Filter methods. Filter methods select the relevant attributes before moving on to the subsequent learning phase and are therefore independent of the specific algorithm being used. 

The attributes deemed most significant are selected for learning, while the rest are excluded. Several alternative statistical metrics have been proposed to assess the predictive capability and relevance of a group of attributes. 

Generally, these are monotone metrics, in the sense that their value increases or decreases with the number of attributes considered. The simplest filter method for supervised learning assesses each single attribute by its level of correlation with the target, and consequently selects the attributes that appear most strongly correlated with it.
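
As a concrete illustration, the sketch below ranks attributes by their absolute Pearson correlation with the target and keeps the top k. The helper name filter_select, the synthetic data, and the choice of Pearson correlation are illustrative assumptions, not prescribed by the text.

```python
# A minimal filter-method sketch: rank attributes by absolute Pearson
# correlation with the target and keep the top k (all names illustrative).
import numpy as np
import pandas as pd

def filter_select(X: pd.DataFrame, y: pd.Series, k: int) -> list:
    """Return the k attributes most correlated (in absolute value) with y."""
    scores = {col: abs(np.corrcoef(X[col], y)[0, 1]) for col in X.columns}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Example with synthetic data: the target depends only on x1 and x3
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 5)), columns=[f"x{i}" for i in range(5)])
y = 3 * X["x1"] - 2 * X["x3"] + rng.normal(size=100)
print(filter_select(X, y, k=2))  # expected to pick x1 and x3
```

Because each attribute is scored on its own, independently of any learning algorithm, this selection can precede and be combined with any subsequent model, which is exactly the independence property noted above.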


Wrapper methods. If the purpose of the data mining investigation is classification or regression, and consequently performances are assessed mainly in terms of accuracy, the selection of predictive variables should be based not only on the level of relevance of every single attribute but also on the specific learning algorithm being utilized.

Wrapper methods are able to meet this need since they assess a group of variables using the same classification or regression algorithm used to predict the value of the target variable. 

Each time, the algorithm uses a different subset of attributes for learning, identified by a search scheme that works on the set of all possible combinations of variables, and it selects the set of attributes that guarantees the best result in terms of accuracy.

Wrapper methods are usually burdensome from a computational standpoint, since assessing every combination identified by the search scheme requires running the entire training phase of the learning algorithm. An example of the use of a wrapper method for attribute selection is given in Section 8.5 in the context of multiple linear regression models.
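
The sketch below conveys the idea in its purest (exhaustive) form, assuming scikit-learn is available and using cross-validated accuracy as the assessment. The classifier choice (logistic regression) and the function name wrapper_select are illustrative assumptions; enumerating every subset is only viable for small attribute counts, which is exactly the computational burden noted above.

```python
# A hedged wrapper-method sketch: score every non-empty attribute subset
# with the same classifier later used for prediction, via cross-validated
# accuracy. Practical only for small n.
from itertools import combinations
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def wrapper_select(X, y, feature_names):
    """X: NumPy feature matrix, y: class labels. Returns (best names, score)."""
    best_score, best_subset = float("-inf"), None
    for r in range(1, len(feature_names) + 1):
        for subset in combinations(range(len(feature_names)), r):
            score = cross_val_score(LogisticRegression(max_iter=1000),
                                    X[:, list(subset)], y, cv=5).mean()
            if score > best_score:
                best_score, best_subset = score, subset
    return [feature_names[i] for i in best_subset], best_score
```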


Embedded methods. For embedded methods, the attribute selection process lies inside the learning algorithm, so that the selection of the optimal set of attributes is made directly during model generation. Classification trees, described in Chapter 10, are an example of embedded methods. At each tree node, they use an evaluation function that estimates the predictive value of a single attribute or of a linear combination of variables. In this way, the relevant attributes are automatically selected and determine the rule for splitting the records at the corresponding node.
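
One way to see this in practice: after fitting a classification tree, the attributes it actually used for splitting can be read off its impurity-based importances. The sketch below uses scikit-learn's DecisionTreeClassifier on the Iris data purely as an illustration; the depth limit and zero importance cut-off are illustrative choices.

```python
# An embedded-selection sketch: the tree chooses its own split attributes
# while the model is built, so the selected features fall out of training.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# Attributes with nonzero importance are those used in splitting rules
selected = [name for name, imp in zip(data.feature_names,
                                      tree.feature_importances_) if imp > 0.0]
print(selected)
```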

Filter methods are the best choice when dealing with very large datasets, whose observations are described by a large number of attributes. In these cases, the application of wrapper methods is inappropriate due to very long computation times.

Moreover, filter methods are flexible and can in principle be combined with any learning algorithm. However, when the size of the problem at hand is moderate, it is preferable to turn to wrapper or embedded methods, which in most cases afford higher accuracy than filter methods.

As described above, wrapper methods select the attributes according to a search scheme that inspects in sequence several subsets of attributes and applies the learning algorithm to each subset in order to assess the resulting accuracy of the corresponding model. 

If a dataset contains n attributes, there are 2^n possible subsets, so an exhaustive search procedure would require excessive computation times even for moderate values of n; with n = 20 attributes, for instance, there are already more than one million subsets to evaluate.

As a consequence, attribute selection in wrapper methods is usually heuristic, based in most cases on a greedy logic that evaluates an adequately defined relevance indicator for each attribute and then selects attributes according to their level of relevance.

In particular, three distinct myopic search schemes can be followed: forward, backward and forward-backward search.


Forward. According to the forward search scheme, also referred to as bottom-up search, the exploration starts with an empty set of attributes and subsequently introduces the attributes one at a time based on the ranking induced by the relevance indicator. The algorithm stops when the relevance index of all the attributes still excluded is lower than a prefixed threshold.
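
A sketch of the forward scheme follows, using the gain in cross-validated accuracy from adding one attribute as the relevance indicator. The threshold value and the logistic-regression learner are illustrative assumptions, and X is assumed to be a NumPy feature matrix.

```python
# A minimal forward (bottom-up) search sketch.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def forward_search(X, y, threshold=0.001):
    n = X.shape[1]
    selected, score = [], 0.0
    while len(selected) < n:
        # Score each still-excluded attribute when added to the current set
        gains = {}
        for j in range(n):
            if j in selected:
                continue
            cand = selected + [j]
            gains[j] = cross_val_score(LogisticRegression(max_iter=1000),
                                       X[:, cand], y, cv=5).mean() - score
        best = max(gains, key=gains.get)
        if gains[best] < threshold:   # no excluded attribute is relevant enough
            break
        selected.append(best)
        score += gains[best]
    return selected
```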


Backward. The backward search scheme, also referred to as top-down search, begins the exploration by selecting all the attributes and then eliminates them one at a time based on the preferred relevance indicator. The algorithm stops when the relevance index of every attribute still included in the model is higher than a prefixed threshold.
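
A matching sketch of the backward scheme: start from all attributes and repeatedly drop the one whose removal costs the least accuracy, stopping once every remaining attribute matters more than the threshold. As before, the threshold and learner are illustrative assumptions.

```python
# A minimal backward (top-down) search sketch.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def backward_search(X, y, threshold=0.001):
    n = X.shape[1]
    selected = list(range(n))
    score = cross_val_score(LogisticRegression(max_iter=1000),
                            X, y, cv=5).mean()
    while len(selected) > 1:
        # Cost of removing each still-included attribute
        losses = {}
        for j in selected:
            cand = [i for i in selected if i != j]
            losses[j] = score - cross_val_score(LogisticRegression(max_iter=1000),
                                                X[:, cand], y, cv=5).mean()
        worst = min(losses, key=losses.get)
        if losses[worst] > threshold:  # every remaining attribute is relevant
            break
        selected.remove(worst)
        score -= losses[worst]
    return selected
```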


Forward-backward. The forward-backward method represents a trade-off between the previous schemes: at each step the best attribute among those excluded is introduced and the worst attribute among those included is eliminated. In this case too, threshold values for the included and excluded attributes determine the stopping criterion.
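
A compact sketch combining the two moves is given below. The entry/exit thresholds, the iteration cap that rules out cycling, the guard against immediately dropping a just-added attribute, and the learner are all illustrative assumptions, not prescribed by the text.

```python
# A forward-backward (stepwise) search sketch.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def cv_score(X, y, cols):
    return cross_val_score(LogisticRegression(max_iter=1000),
                           X[:, cols], y, cv=5).mean()

def stepwise_search(X, y, enter=0.002, exit_=0.001):
    n, selected, score = X.shape[1], [], 0.0
    for _ in range(2 * n):              # hard cap rules out cycling
        added = None
        # Forward step: introduce the best attribute among those excluded
        excluded = [j for j in range(n) if j not in selected]
        if excluded:
            gains = {j: cv_score(X, y, selected + [j]) - score for j in excluded}
            best = max(gains, key=gains.get)
            if gains[best] >= enter:
                selected.append(best)
                score += gains[best]
                added = best
        # Backward step: eliminate the worst attribute among those included
        removable = [j for j in selected if j != added]
        if len(selected) > 1 and removable:
            losses = {j: score - cv_score(X, y, [i for i in selected if i != j])
                      for j in removable}
            worst = min(losses, key=losses.get)
            if losses[worst] <= exit_:
                selected.remove(worst)
                score -= losses[worst]
                continue
        if added is None:
            break                        # nothing entered or left the model
    return selected
```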

The various wrapper methods differ in the choice of the relevance measure as well as the threshold preset values for the stopping rule of the algorithm.






