What is Principal Component Analysis in Business Intelligence?
Principal component analysis (PCA) is the most widely known technique of attribute reduction by means of projection.
Generally speaking, the purpose of this method is to obtain a projective transformation that replaces a subset of the original numerical attributes with a lower number of new attributes, obtained as their linear combination, with minimal loss of information.
Experience shows that such a transformation of the attributes often improves the accuracy of the learning models subsequently developed.
Before applying the principal component method, it is expedient to standardize the data, so as to obtain for all the attributes the same range of values, usually represented by the interval [−1, 1]. Moreover, the mean of each attribute aj is made equal to 0 by applying the transformation

    x̃ij = xij − μ̄j,   (6.6)

where μ̄j denotes the sample mean of attribute aj. Let X̃ denote the matrix resulting from applying the transformation (6.6) to the original data, and let V = X̃′X̃ be the covariance matrix of the attributes (for a definition of the covariance and variance matrices, see Section 7.3.1). If the correlation matrix is used to develop the principal component analysis method instead of the covariance matrix, the transformation (6.6) is not required. Starting from the n attributes in the original dataset, represented by the matrix X, the principal component method derives n orthogonal vectors, namely the principal components, which constitute a new basis of the space Rn.
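As a rough sketch in NumPy (the dataset here is hypothetical, and V is computed as the sample covariance matrix), the zero-mean transformation and the covariance matrix might be obtained as follows:

```python
import numpy as np

# Hypothetical dataset: m = 5 records, n = 3 numerical attributes.
X = np.array([[2.0, 4.0, 1.0],
              [3.0, 5.0, 0.0],
              [4.0, 6.0, 2.0],
              [5.0, 7.0, 1.0],
              [6.0, 8.0, 3.0]])

# Transformation (6.6): subtract the mean of each attribute,
# so that every column of X_tilde has mean 0.
X_tilde = X - X.mean(axis=0)

# Sample covariance matrix of the attributes, V = X'X / (m - 1).
m = X_tilde.shape[0]
V = X_tilde.T @ X_tilde / (m - 1)

print(np.allclose(X_tilde.mean(axis=0), 0))   # True
print(np.allclose(V, np.cov(X, rowvar=False)))  # True
```

Dividing by m − 1 gives the unbiased sample covariance, matching NumPy's own `np.cov` default.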
Principal components are better suited than the original attributes to explain fluctuations in the data, in the sense that usually a subset consisting of q principal components, with q<n, has an information content that is almost equivalent to that of the original dataset. As a consequence, the original data are projected into a lower-dimensional space of dimension q having the same explanatory capability.
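The number q is often chosen by looking at the cumulative share of variance explained by the leading components. A small sketch, using a hypothetical dataset in which the third attribute nearly duplicates the first, so that two components suffice:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical dataset: the third attribute is almost a copy of the first,
# so the data effectively vary along only two directions.
base = rng.normal(size=(100, 2))
X = np.column_stack([base[:, 0], base[:, 1],
                     base[:, 0] + 0.01 * rng.normal(size=100)])

X_tilde = X - X.mean(axis=0)
V = np.cov(X_tilde, rowvar=False)

# Eigenvalues of the covariance matrix measure the variance
# explained along each principal component.
eigvals = np.linalg.eigvalsh(V)[::-1]           # sorted, largest first
explained = np.cumsum(eigvals) / eigvals.sum()  # cumulative share of variance

# Smallest q whose components explain at least 95% of the total variance.
q = int(np.searchsorted(explained, 0.95) + 1)
print(q)
```

With this construction two eigenvalues dominate and the third is negligible, so q comes out as 2: the data can be projected into a two-dimensional space with almost no loss of explanatory capability.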
Principal components are generated in sequence by means of an iterative algorithm. The first component is determined by solving an appropriate optimization problem, in order to explain the highest percentage of variation in the data.
At each iteration, the next principal component is selected, among those vectors that are orthogonal to all components already determined, as the one which explains the maximum percentage of variance not yet explained by the previously generated components. At the end of the procedure, the principal components are ranked in non-increasing order with respect to the amount of variance that they are able to explain.
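Although the text describes an iterative variance-maximization procedure, the same components can be obtained all at once from the eigendecomposition of the symmetric covariance matrix V, whose eigenvectors are the principal components and whose eigenvalues are the variances they explain. A sketch on synthetic data, checking orthogonality and the non-increasing variance ranking:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic dataset with n = 4 correlated attributes.
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))

X_tilde = X - X.mean(axis=0)
V = np.cov(X_tilde, rowvar=False)

# Eigendecomposition of V: each eigenvector is a principal component,
# its eigenvalue the amount of variance it explains.
eigvals, eigvecs = np.linalg.eigh(V)
order = np.argsort(eigvals)[::-1]       # rank by explained variance
eigvals, W = eigvals[order], eigvecs[:, order]

# The components form an orthonormal basis of R^n ...
assert np.allclose(W.T @ W, np.eye(4))
# ... ranked in non-increasing order of explained variance.
assert np.all(np.diff(eigvals) <= 1e-12)

# Project the data onto the first q = 2 principal components.
q = 2
Z = X_tilde @ W[:, :q]
print(Z.shape)  # (200, 2)
```

Projecting X̃ onto the first q columns of W yields the q-dimensional representation of the data discussed above.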