Optimizing Crop Yield Prediction: Data-Driven Analysis and Machine Learning Modeling Using USDA Datasets

Ravindra Yadav¹*, Anita Seth² and Naresh Dembla³

¹Department of Information Technology, Institute of Engineering and Technology, Devi Ahilya Vishwavidyalaya, Khandwa Road Indore, Madhya Pradesh, India.

²Department of Electronic and Telecommunication, Institute of Engineering and Technology, Devi Ahilya Vishwavidyalaya, Khandwa Road Indore, Madhya Pradesh, India.

³Computer Department, Institute of Professional Studies, Institute of Engineering and Technology, Devi Ahilya Vishwavidyalaya, Khandwa Road Indore, Madhya Pradesh, India.

Corresponding Author E-mail: Ryadav@ietdavv.edu.in

DOI : http://dx.doi.org/10.12944/CARJ.12.1.22

Article Publishing History

Received: 13 Dec 2023
Accepted: 27 Jan 2024
Published Online: 23 Feb 2024

Review Details

Reviewed by: Dr. Jerald Anthony Esteban
Second Review by: Dr. Muhammad Usman
Final Approval by: Dr. R. Pandiselvam

Abstract:

This study combines exploratory data analysis (EDA) with a range of machine learning models to forecast crop yields from USDA data covering 2003 to 2013, with the aim of supporting precision agriculture. Beyond predicting agricultural output, we sought to identify the underlying factors that affect yield. A thorough EDA of a wide range of agricultural data, including weather patterns and USDA-sourced soil composition, gave important insight into the variables that drive differences in crop output and served as the basis for our machine learning modelling. We assessed and compared the performance of several algorithms, including Bagging Regressor, KNN, Decision Trees, Gradient Boost, Random Forest, and Linear Regression. Accuracy varied noticeably across models: Random Forest, Decision Trees, and Bagging Regressor achieved high accuracy (98.56%, 97.62%, and 98.59%, respectively), whereas KNN and Linear Regression were less accurate, indicating their limits for this task. Applying k-fold cross-validation further strengthened the robustness of our results and highlighted the importance of model validation in crop yield prediction; some models showed changes in accuracy under cross-validation, which revealed more about their reliability. In addition to examining the variables that affect agricultural productivity, this study highlights how widely the forecasting power of machine learning models can differ. Our findings support well-informed agricultural decision-making by using technology to improve crop production estimates. The ultimate goal of this research is to help stakeholders optimize agricultural productivity and adopt sustainable practices.
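
The k-fold comparison of models described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' pipeline. The data are synthetic, the single `rainfall` feature and every function name are hypothetical, R² stands in for the unspecified accuracy metric, and only two of the listed model families (Linear Regression and KNN) are shown, in simplified one-feature form. In practice a library such as scikit-learn would normally be used instead.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_linear(xs, ys):
    """Ordinary least squares on one feature; returns a predict function."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def fit_knn(xs, ys, k=3):
    """k-nearest-neighbours regression: average the k closest training targets."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)

    return predict


def cross_validate(fit, xs, ys, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, repeat for every fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        test = set(held_out)
        train = [i for i in range(len(xs)) if i not in test]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in held_out]
        scores.append(r2_score([ys[i] for i in held_out], preds))
    return scores


# Synthetic, made-up data (NOT the USDA dataset): yield peaks at moderate rainfall.
rng = random.Random(42)
rainfall = [rng.uniform(0.0, 10.0) for _ in range(200)]
crop_yield = [-(x - 6.0) ** 2 + 40.0 + rng.gauss(0.0, 1.0) for x in rainfall]

linear_scores = cross_validate(fit_linear, rainfall, crop_yield)
knn_scores = cross_validate(fit_knn, rainfall, crop_yield)
print(f"Linear Regression mean R^2 over 5 folds: {mean(linear_scores):.3f}")
print(f"KNN mean R^2 over 5 folds: {mean(knn_scores):.3f}")
```

Reporting the per-fold scores as well as their mean shows how stable each model is across different train/test splits, which is the kind of reliability information the abstract attributes to cross-validation. On this toy nonlinear dataset the nonparametric model happens to fit better than the straight line; that ranking is a property of the synthetic data and says nothing about the study's own results.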

Keywords:

Crop yield; Decision Trees; Gradient Boost; KNN; Linear Regression; Random Forest; XGBoost

Copy the following to cite this article:

Yadav R., Seth A., Dembla N. Optimizing Crop Yield Prediction: Data-Driven Analysis and Machine Learning Modeling Using USDA Datasets. Curr Agri Res 2024; 12(1). doi : http://dx.doi.org/10.12944/CARJ.12.1.22

Copy the following to cite this URL:

Yadav R., Seth A., Dembla N. Optimizing Crop Yield Prediction: Data-Driven Analysis and Machine Learning Modeling Using USDA Datasets. Curr Agri Res 2024; 12(1). Available from: https://bit.ly/3UPOPNi
