Performance of Classification Algorithms for Prediction of Crop Cultivation by Reducing the Dimensionality of a Data Set
I. Rajeshwari1, K. Shyamala2
1Mrs. I. Rajeshwari, Associate Professor of Computer Science, Queen Mary's College, Chennai, Tamilnadu, India. Email:
2Dr. K. Shyamala, Associate Professor of Computer Science, Dr. Ambedkar Govt. Arts College, Vyasarpadi, Chennai, Tamilnadu, India. Email: 

Manuscript received on November 15, 2019. | Revised Manuscript received on November 23, 2019. | Manuscript published on November 30, 2019. | PP: 101-105 | Volume-8 Issue-4, November 2019. | Retrieval Number: C6285098319/2019©BEIESP | DOI: 10.35940/ijrte.C6285.118419

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Agriculture plays an important role in the Indian economy. Maximizing crop productivity is one of the main tasks that farmers face in their day-to-day lives. Many farmers also lack basic knowledge of the nutrient content of their soil and of how to select the crops that best suit it and thereby improve crop productivity. In this work, the dataset was taken from the soil test centres of Dindigul district, Tamilnadu. The parameters are 12 nutrients present in soil samples collected from different regions of Dindigul district. Using principal component analysis (PCA), the dataset was reduced to 8 parameters. Data mining classification techniques, namely Decision Tree, KNN, Kernel SVM, Linear SVM, Logistic Regression, Naive Bayes and Random Forest, are applied to the original and the dimensionality-reduced datasets to predict the crops to be cultivated based on the soil nutrients available. The performance of the algorithms is analysed using the following metrics: Accuracy Score, Cohen's Kappa, Precision, Recall and F-Measure, Hamming Loss, Explained Variance Score, Mean Absolute Error, Mean Squared Error and Mean Squared Logarithmic Error. The confusion matrix and classification report are also used for the analysis. The Decision Tree is found to be the best algorithm for the soil datasets, and dimensionality reduction does not affect the prediction.
Keywords: Classification, Data Mining, Decision tree, Soil nutrient.
Scope of the Article: Classification.
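
Illustrative sketch: several of the classification metrics named in the abstract (Accuracy Score, Cohen's Kappa, Hamming Loss, and per-class Precision, Recall and F-Measure) can be computed directly from true and predicted labels. The minimal Python sketch below uses only the standard library. The crop labels and predictions are hypothetical values invented for illustration; they are not taken from the Dindigul soil dataset, and the function names are assumptions rather than the authors' code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def hamming_loss(y_true, y_pred):
    """Fraction of misclassified samples (single-label multiclass case)."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def precision_recall_f1(y_true, y_pred, label):
    """One-vs-rest precision, recall and F1 for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical crop labels (assumed for illustration only).
y_true = ["paddy", "paddy", "maize", "maize", "cotton", "cotton", "paddy", "maize"]
y_pred = ["paddy", "maize", "maize", "maize", "cotton", "paddy", "paddy", "maize"]

acc = accuracy(y_true, y_pred)        # 6 of 8 correct -> 0.75
loss = hamming_loss(y_true, y_pred)   # 2 of 8 wrong -> 0.25
kappa = cohen_kappa(y_true, y_pred)   # (0.75 - 23/64) / (1 - 23/64) = 25/41
paddy_scores = precision_recall_f1(y_true, y_pred, "paddy")

print(f"accuracy={acc:.3f} hamming_loss={loss:.3f} kappa={kappa:.3f}")
print("paddy precision/recall/F1:", paddy_scores)
```

For single-label multiclass prediction, Hamming Loss is simply one minus the Accuracy Score, so the two metrics carry the same information in this setting. Cohen's Kappa additionally discounts the agreement expected by chance.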