Semantic Based Features Selection and Weighting Method for Text Classification

Aurangzeb, Khan and Baharum, Baharudin and Khairullah, Khan (2010) Semantic Based Features Selection and Weighting Method for Text Classification. In: ITSIM'10, June 2010, Kuala Lumpur, Malaysia.

PDF - Published Version
Restricted to Registered users only

349Kb

Abstract

Feature selection and weighting are of vital concern in the text classification process, since they improve the efficiency and accuracy of the text classifier. The Vector Space Model represents documents using the "Bag of Words" (BOW) model with term weighting. Document representation through this model has some limitations: it ignores term dependencies and the structure and ordering of terms in documents. To overcome this problem, a Semantics-Based Feature Vector using Part of Speech (POS) is proposed, which is used to extract the concept of terms using WordNet, co-occurring terms, and associated terms. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency/inverse document frequency (TF-IDF) with BOW feature selection for text classification.
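
For reference, the TF-IDF with BOW baseline named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names tokenize and tfidf_vectors, the whitespace tokenizer, the plain tf × log(N/df) weighting, and the example sentences are all assumptions made for this sketch. It also shows the limitation the abstract mentions: two documents with the same words in a different order receive identical vectors.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace only.
    # A real pipeline would also handle punctuation and stop words.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of raw strings."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        # Weight = relative term frequency * log(N / document frequency).
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


# Hypothetical example documents, chosen only to illustrate the point.
docs = ["dog bites man", "man bites dog", "the river bank flooded"]
vocab, vectors = tfidf_vectors(docs)

# Word order is lost: the first two documents get the same vector.
same_vector = vectors[0] == vectors[1]
```

Because the representation only counts terms, "dog bites man" and "man bites dog" are indistinguishable here, which is the kind of ordering and dependency information a semantics-based feature vector aims to recover.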

Item Type: Conference or Workshop Item (Paper)
Subjects: T Technology > T Technology (General)
Departments / MOR / COE: Departments > Computer Information Sciences
ID Code: 6432
Deposited By: Dr Baharum Baharudin
Deposited On: 26 Sep 2011 09:36
Last Modified: 19 Jan 2017 08:24
