Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier
H.S. Hota1, Akhilesh Kumar Shrivas2, S.K. Singhai3

1Dr. H.S. Hota, Asst. Professor, Department of Computer and IT, Guru Ghasidas University Bilaspur, (C.G.), India.
2Akhilesh Kumar Shrivas, Research Scholar, C.V. Raman University, Bilaspur (C.G.), India.
3Dr. S.K. Singhai, Department of Electronics Engineering, Govt. Engineering College, Bilaspur (C.G.), India.

Manuscript received on 21 January 2013 | Revised Manuscript received on 28 January 2013 | Manuscript published on 30 January 2013 | PP: 164-169 | Volume-1 Issue-6, January 2013 | Retrieval Number: F0427021613/2013©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: With increased bandwidth and robust infrastructure for Internet access, the number of Internet users is growing rapidly. Users frequently rely on e-mail for fast communication of audio, video and textual data, but they also face the problem of unwanted messages, known as spam e-mail. To filter these unwanted messages, a classifier must be placed in the network or on the computer. In this paper, three types of techniques, Artificial Neural Network (ANN), decision tree and statistical techniques, are explored for designing and developing an e-mail classifier. Experimental work was performed on an e-mail data set obtained from the UCI repository, which was partitioned in three different ways to find the most suitable partition for the various models. A suitable ensemble model is chosen based on various error measures calculated after training and testing the models. The final ensemble model is evaluated in terms of accuracy, precision, recall, F-measure and gain chart. The highest accuracy, 94.35%, is obtained with the ensemble of C5.0 and SVM using a 60%-40% (training-testing) partition.
Keywords: C5.0, Support Vector Machine (SVM), Artificial Neural Network (ANN), Ensemble model.
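
The evaluation measures named in the abstract (accuracy, precision, recall and F-measure) can be illustrated with a minimal Python sketch. It assumes binary labels with 1 for spam and 0 for legitimate mail, and uses the standard confusion-matrix definitions. The function name `classification_metrics` and the sample labels are hypothetical and are not taken from the paper or its experiments.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F-measure for binary labels.

    Assumption: `positive` marks the spam class. Definitions follow the
    standard confusion-matrix formulas, not any paper-specific variant.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Illustrative labels only (1 = spam, 0 = legitimate); not data from the paper.
example_true = [1, 1, 1, 0, 0, 0, 0, 1]
example_pred = [1, 1, 0, 0, 0, 1, 0, 1]
example_metrics = classification_metrics(example_true, example_pred)
```

In a study of this kind, such measures would be computed on the held-out testing partition for each candidate model, so that models and partitions can be compared on the same footing.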

Scope of the Article: Computer Network