Combining Weighting and Orthogonalization in Unsupervised Feature Selection

Abstract

Document features are sometimes noisy, redundant, or irrelevant, which biases the results of document processing. In this study, we propose a feature selection method that combines the result of Random Projection Gram-Schmidt Orthogonalization (RPGSO) with the result of weighting. The RPGSO method yields a ranking of the term features of all documents based on their weights. To test the effectiveness of the proposed method, we use it to cluster the dataset with different numbers of features selected according to the RPGSO rank. The method was tested on datasets of news documents, with F-Measure as the evaluation criterion. Based on the test results, the proposed method generates document clusters with an average F-Measure that is 6% higher than that of the RPGSO method.