A Hybrid Attribute Selection Approach for Text Classification

The application of text mining in organizations is growing. Text classification, an important type of text mining problem, is characterized by a large attribute space and therefore requires an efficient and effective attribute selection procedure. There are two general attribute selection approaches: the filter approach and the wrapper approach. While the wrapper approach is potentially more effective in finding the best attribute subset, it is cost-prohibitive in most text classification applications. In this paper, we propose a hybrid attribute selection approach that is both efficient and effective for text classification problems. We apply the proposed approach to detect and prevent Internet abuse in the workplace, which is becoming a major problem in modern organizations. The empirical evaluations we conducted using a variety of classification algorithms, indexing schemes, and attribute selection methods demonstrate the utility of the proposed approach. We found that combining the filter and wrapper approaches not only boosts the accuracy of text classifiers but also reduces computational costs significantly.
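The general idea of combining the two approaches can be illustrated with a minimal sketch. It is not the procedure from the paper: the abstract does not name the scoring function, search strategy, or classifier, so the choices here (information gain as the filter score, greedy forward selection as the wrapper search, a leave-one-out 1-nearest-neighbour classifier, and the toy data) are assumptions made purely for illustration. The filter cheaply shrinks the attribute space first, so the expensive wrapper search only has to evaluate a small candidate set.

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum((c / n) * log2(c / n) for c in counts.values())

def info_gain(X, y, j):
    """Information gain of binary attribute j (assumed filter score)."""
    gain = entropy(y)
    for value in (0, 1):
        subset = [label for row, label in zip(X, y) if row[j] == value]
        if subset:
            gain -= len(subset) / len(y) * entropy(subset)
    return gain

def filter_step(X, y, k):
    """Filter: rank attributes by a classifier-independent score, keep the top k."""
    ranked = sorted(range(len(X[0])), key=lambda j: info_gain(X, y, j), reverse=True)
    return ranked[:k]

def loo_accuracy(X, y, attrs):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (assumed learner)."""
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        others = (j for j in range(len(X)) if j != i)
        nearest = min(others, key=lambda j: sum(xi[a] != X[j][a] for a in attrs))
        correct += (y[nearest] == yi)
    return correct / len(X)

def wrapper_step(X, y, candidates):
    """Wrapper: greedy forward selection, restricted to the filtered candidates."""
    selected, best_acc = [], 0.0
    improved = True
    while improved:
        improved = False
        best_attr = None
        for a in candidates:
            if a in selected:
                continue
            acc = loo_accuracy(X, y, selected + [a])
            if acc > best_acc:
                best_acc, best_attr, improved = acc, a, True
        if improved:
            selected.append(best_attr)
    return selected, best_acc

# Hypothetical toy data: rows are documents, columns are binary term indicators.
X = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
y = ["abuse"] * 4 + ["work"] * 4

candidates = filter_step(X, y, k=2)              # cheap pass over all attributes
selected, acc = wrapper_step(X, y, candidates)   # costly search over few attributes
print(candidates, selected, acc)
```

In this sketch the wrapper trains and evaluates the classifier only on subsets drawn from the `k` filtered attributes rather than from the full vocabulary, which is where the computational saving of a hybrid scheme would come from.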

Publication

Journal of the Association for Information Systems Volume 11, Issue 9, pp. 491-518, September 2010

Authors

Chen-Huei Chou, Atish P. Sinha, and Huimin Zhao

About Journal of the Association for Information Systems

The Journal of the Association for Information Systems (JAIS), the flagship journal of the Association for Information Systems, publishes the highest-quality scholarship in the field of information systems. It is inclusive in topics, level and unit of analysis, theory, method, and philosophical and research approach, reflecting all aspects of information systems globally. The Journal promotes innovative, interesting, and rigorously developed conceptual and empirical contributions and encourages theory-based multi- or interdisciplinary research.

JAIS has a strong reputation for publishing high-quality, theory-focused articles in the information systems field, as evidenced by its rapidly rising rankings. The numbers of submissions, published articles, and readers have grown steadily during the past five years, so that in all respects JAIS qualifies as an A-level journal.