Title: Rough Set Extensions for Feature Selection
Authors: Mac Parthaláin, Neil
Issue Date: 23-Sep-2009
Publisher: Aberystwyth University
Description: Rough set theory (RST) was proposed as a mathematical tool for analysing imprecise, uncertain or incomplete information or knowledge. It is of fundamental importance to artificial intelligence, particularly in the areas of knowledge discovery, machine learning, decision support systems, and inductive reasoning. At the heart of RST is the idea of employing only the information contained within the data; thus, unlike many other methods, it requires no probability distribution information or assignments. RST relies on the concept of indiscernibility to group equivalent elements and generate knowledge granules. These granules are then used to build a structure to approximate a given concept. This framework has, unsurprisingly, proven successful when applied to the task of feature selection. Feature selection (FS) is the problem of selecting the input attributes which are most predictive of a given outcome. Unlike other dimensionality reduction methods, feature selection algorithms preserve the original semantics of the features following reduction. FS has been applied to tasks involving datasets that contain huge numbers of features (in the order of tens of thousands), which would be impossible to process otherwise. Recent examples of such problems include text processing and web content classification. FS techniques have also been applied to small and medium-sized datasets in order to discover the most information-rich features. The application of rough sets to FS has resulted in many efficient algorithms. However, due to the granularity of the approximations generated by the rough set approach, there is often a resulting level of uncertainty. This uncertain information is usually ignored for FS (by the very fact that it is 'uncertain').
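The indiscernibility and approximation ideas described above can be illustrated with a minimal sketch. This is not taken from the thesis: the toy decision table, the attribute names `a`, `b`, `d`, and the function names are all assumptions made for illustration, and the code follows the standard textbook (crisp) rough set definitions rather than any of the fuzzy-rough extensions the thesis proposes.

```python
from collections import defaultdict

# Hypothetical toy decision table: conditional attributes "a", "b"; decision "d".
rows = [
    {"a": 0, "b": 0, "d": "no"},   # object 0
    {"a": 0, "b": 0, "d": "yes"},  # object 1 (indiscernible from 0 on a, b)
    {"a": 0, "b": 1, "d": "yes"},  # object 2
    {"a": 1, "b": 1, "d": "yes"},  # object 3
    {"a": 1, "b": 0, "d": "no"},   # object 4
]


def partition(rows, attrs):
    """Group object indices into indiscernibility classes (granules) for attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def approximations(rows, attrs, concept):
    """Return the (lower, upper) approximations of a set of object indices."""
    lower, upper = set(), set()
    for block in partition(rows, attrs):
        if block <= concept:   # granule lies entirely inside the concept
            lower |= block
        if block & concept:    # granule overlaps the concept
            upper |= block
    return lower, upper


def dependency(rows, attrs, decision):
    """Fraction of objects in the positive region of the decision given attrs."""
    positive = set()
    for concept in partition(rows, [decision]):
        lower, _ = approximations(rows, attrs, concept)
        positive |= lower
    return len(positive) / len(rows)


yes = {i for i, row in enumerate(rows) if row["d"] == "yes"}
lower, upper = approximations(rows, ["a", "b"], yes)
boundary = upper - lower  # objects whose membership of the concept is uncertain
```

In this sketch the boundary region (`upper - lower`) holds the objects that cannot be classified with certainty from the chosen attributes. A dependency-based feature selector would compare `dependency(rows, subset, "d")` across attribute subsets and, as the abstract notes, would normally disregard that boundary information.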
In this thesis, a number of methods are proposed which attempt to use the uncertain information to improve the performance of rough sets, and extensions thereof, for the task of FS. These approaches are applied to two application domains where the reduction of features is of high importance: mammographic image analysis and complex systems monitoring. The utility of the approaches is demonstrated and compared empirically with several other dimensionality reduction techniques. In several experimental evaluations, the approaches are shown to equal or improve on the classification accuracy obtained from unreduced data. Based on the new fuzzy-rough approaches, further developments and techniques are also presented in this thesis. The first of these is the application of a nearest neighbour classifier for the classification of real-valued data. This technique is evaluated within the mammographic imaging application. In addition, a novel unsupervised feature selection approach (UFRFS) is proposed, which reduces features by eliminating those which are considered redundant. Both the fuzzy-rough classifier mentioned above and UFRFS are employed and evaluated for the complex systems monitoring application.
Sponsorship: EPSRC
Other Identifiers: N. Mac Parthaláin, Rough Set Extensions for Feature Selection, PhD Thesis, Aberystwyth University, Wales, 2009.
Type Of Material: OTHER
Appears in Collections: Computer Science

Items in HannanDL are protected by copyright, with all rights reserved, unless otherwise indicated.