Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification

Aurangzeb, Khan and Baharum, Baharudin and Khairullah, Khan (2010) Efficient Feature Selection and Domain Relevance Term Weighting Method for Document Classification. In: 2010 Second International Conference on Computer Engineering and Applications.

Abstract

Feature selection is of paramount concern in the document classification process, because it improves the efficiency and accuracy of the text classifier. The Vector Space Model is used to represent the "Bag of Words" (BOW) of the documents with term weighting. Representing documents through this model has some limitations: it ignores term dependencies, structure, and the ordering of terms in documents. To overcome this problem, a semantic-based feature vector is proposed, which uses an ontology to extract the concept of a term together with its co-occurring and associated terms. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency/inverse document frequency (TF-IDF) with BOW feature selection for text classification.
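
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of TF-IDF weighting over a bag-of-words representation. It is not the authors' proposed method and is not taken from the paper. The function name tfidf_vectors, the whitespace tokenizer, the tf * log(N / df) variant, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Hypothetical helper for illustration: tokenization is a plain
    lowercase whitespace split, and tf is normalized by document length.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


# Made-up sample documents, not the dataset used in the paper.
docs = [
    "stock market prices fall",
    "market prices rise",
    "football match ends in draw",
]
vectors = tfidf_vectors(docs)
```

Each document becomes an unordered map from terms to weights, so word order and relations between terms are discarded. This is the limitation of the BOW model that the abstract describes.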

Item Type: Conference or Workshop Item (Paper)
Subjects: T Technology > T Technology (General)
Departments / MOR / COE: Departments > Computer Information Sciences
Depositing User: Dr Baharum Baharudin
Date Deposited: 26 Sep 2011 09:36
Last Modified: 19 Jan 2017 08:24
URI: http://scholars.utp.edu.my/id/eprint/6431
