A linear wrapper method for detection of atypical points in classification.

Hashemi Mohammadabad, Saeed.

dc.contributor.author	Hashemi Mohammadabad, Saeed.	en_US
dc.date.accessioned	2014-10-21T12:36:57Z
dc.date.available	2005
dc.date.issued	2005	en_US
dc.identifier.other	AAINR00952	en_US
dc.identifier.uri	http://hdl.handle.net/10222/54680
dc.description	The focus of this research is the detection of atypical data in a dataset using a linear wrapper approach. Atypical points are considered to be the misclassified points that the proposed algorithm (Atypical Sequential Removing: ASR) finds not useful to the classification task. They may include outliers and/or overlapping samples. The majority of the available atypical detection techniques apply a filter approach, in which there is no requirement for the filter to be consistent with the classifier in use. The fastest available wrapper techniques, on the other hand, have a quadratic running time, which is prohibitive in practice for sample subset selection. The approach presented in this research is a linear wrapper technique that, instead of using any predetermined criteria, uses only the classifier itself and a performance measure to identify atypical points in the data. As a result, it is expected to be more consistent with the classifier in use. Using a cross-validation scheme, ASR manages to give a reliable test performance while identifying and ranking the atypical points in the whole dataset. To ensure that ASR does not remove informative misclassified points, Ada-boost was compared with S-boost (trained with the data without atypicals). The results showed that when a significant portion of misclassified points was removed from the training set, S-boost performed very close to Ada-boost. In the comparison between ASR and the Mahalanobis filter method, the results show that ASR was more accurate in identifying atypical points, was more consistent with the classifier in use (keeping its performance as high as that of the classifier with no removal from the training set), and was able to remove 30% more points than the Mahalanobis filter.
However, the assertion in the literature that removing some points from the training set can enhance the performance of classifiers was not confirmed for overall performance under the linear wrapper tested. Experiments on 20 benchmark datasets and 7 classifiers show promising results and confirm that this linear wrapper method has some advantages and can be used for atypical detection.	en_US
dc.description	Thesis (Ph.D.)--Dalhousie University (Canada), 2005.	en_US
dc.language	eng	en_US
dc.publisher	Dalhousie University	en_US
dc.subject	Computer Science.	en_US
dc.title	A linear wrapper method for detection of atypical points in classification.	en_US
dc.type	text	en_US
dc.contributor.degree	Ph.D.	en_US
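
The abstract describes a wrapper that uses only the classifier and a performance measure, under cross-validation, to identify and rank atypical points. The record does not give ASR's actual steps, so the following is only a minimal illustrative sketch of that general idea and not the thesis's algorithm. The nearest-centroid classifier, the function names (`fit_centroids`, `predict`, `rank_atypical`), the parameters `k` and `repeats`, and the toy data are all assumptions made for this example. The number of classifier trainings is `k * repeats`, independent of the dataset size, which is one way a wrapper might avoid quadratic cost.

```python
import random


def fit_centroids(points, labels):
    """Train a toy nearest-centroid classifier: return {label: centroid}."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: sq_dist(model[y]))


def rank_atypical(points, labels, k=5, repeats=10, seed=0):
    """Count how often each point is misclassified while held out.

    Hypothetical helper: points that are misclassified in most repeats
    are treated as candidate atypical points and ranked first.
    """
    rng = random.Random(seed)
    n = len(points)
    misses = [0] * n
    for _ in range(repeats):
        order = list(range(n))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]  # k disjoint folds
        for fold in folds:
            held = set(fold)
            train = [i for i in order if i not in held]
            model = fit_centroids([points[i] for i in train],
                                  [labels[i] for i in train])
            for i in fold:
                if predict(model, points[i]) != labels[i]:
                    misses[i] += 1
    ranking = sorted(range(n), key=lambda i: misses[i], reverse=True)
    return misses, ranking


# Toy data (assumed): two well-separated clusters plus one point that
# sits in the class-1 cluster but carries the class-0 label.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5),
          (5.2, 5.4)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

misses, ranking = rank_atypical(points, labels)
print("most atypical index:", ranking[0], "missed", misses[ranking[0]], "times")
```

On this toy data the mislabeled last point is flagged in every repeat, while the points inside their own clusters are never flagged. A real study would substitute the classifier and performance measure actually in use.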

Files in this item

Name: NR00952.PDF
Size: 5.849Mb
Format: PDF

This item appears in the following Collection(s)

Faculty of Graduate Studies Online Theses
