This repository provides selected scripts and notebooks related to inference and testing of the proposed machine learning models. The full training pipeline and raw dataset are not included due to data-sharing restrictions. This release is intended for the demonstration and reproducibility of the model evaluation workflow using processed sample data.
Efficient fertilizer use is important for sustainable intensification, yet uniform recommendations tend to ignore sharp spatial and seasonal variability in soils, climate, and crop response. This study develops a framework that combines machine learning with constrained optimization to generate site-specific recommendations for nitrogen (N), phosphorus (P₂O₅), and potassium (K₂O), using a national-scale dataset of 7,180 Moroccan cereal data points spanning three seasons and eight regions. A diverse suite of 47 model variants (linear, kernel, tree-based, ensemble, stacking, and neural architectures) was compared under random and temporal sampling regimes to evaluate interpolation versus forecast performance. The best-performing model achieved high yield-prediction accuracy under the random split (sMAPE
- Built a scalable machine learning and optimization pipeline on 7,180 Moroccan field trials (three seasons, eight regions). The pipeline benchmarks 47 model variants (linear, kernel, tree-based, ensemble, stacking, and neural architectures) under random and temporal splits, with model interpretability analysis, and 10 optimization algorithms (deterministic, stochastic, metaheuristic, learning-based, and hybrid) applied to the top-performing machine learning models.
- Under the random regime, the best-performing model achieved a yield-prediction accuracy of sMAPE ≈ 4.5% ($R^2$ ≈ 0.96), capturing strong nonlinear effects driven by geospatial, seasonal, and nutrient-soil interaction features. Under the temporal (out-of-distribution) regime, the best-performing model reached sMAPE ≈ 17.8% ($R^2$ ≈ 0.17), where spatial structure and regional shifts were the dominant explanatory factors.
- Metaheuristic optimization (Simulated Annealing, Bayesian Optimization, and Particle Swarm Optimization) generated site-specific NPK recommendations, increasing yields by up to 683 kg/ha (≈ 20% over a 3.4 t/ha baseline) while simultaneously improving nutrient-use efficiency under environmental constraints.
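
The two evaluation regimes and the reported metrics can be illustrated with a minimal sketch. This is not the repository's implementation: the sMAPE formula below is one common definition (the exact variant used in the study is not stated here), and the function names `smape`, `r2`, `random_split`, and `temporal_split`, as well as the use of a per-sample season label, are hypothetical choices made for illustration.

```python
import numpy as np


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (assumed definition: mean of |e| / ((|y| + |y_hat|) / 2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    # Where both values are zero, count the term as zero error instead of dividing by zero.
    ratio = np.divide(
        np.abs(y_true - y_pred), denom, out=np.zeros_like(denom), where=denom > 0
    )
    return 100.0 * ratio.mean()


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def random_split(n_samples, test_frac=0.2, seed=0):
    """Random regime: shuffle all samples, so train and test share seasons (interpolation)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_frac))
    return np.sort(idx[n_test:]), np.sort(idx[:n_test])


def temporal_split(seasons, held_out_season):
    """Temporal regime: hold out one whole season, so the test set is unseen in time (forecast)."""
    seasons = np.asarray(seasons)
    test_mask = seasons == held_out_season
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)
```

Under these assumptions, a model would be fitted on the training indices returned by either split and scored with `smape` and `r2` on the test indices. The gap between the two regimes reported above (sMAPE ≈ 4.5% versus ≈ 17.8%) corresponds to scoring the same kind of model on `random_split` versus `temporal_split` test sets.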
├── data/ # Dataset documentation and access notes
├── notebooks/ # Notebooks for data processing and feature engineering, model training, feature analysis, and optimization
├── results/ # Model evaluation, ensemble results, feature analysis, and optimization outputs
├── paper/ # Manuscript, figures, and supplementary material
├── README.md # Project overview and usage instructions
└── requirements.txt # Python dependencies
If you use this work, please cite:
Ennaji, O., Belgaid, A., & El Allali, A. Machine Learning-Based Optimization of Site-Specific NPK Fertilizer Recommendation. Mohammed VI Polytechnic University, Morocco. 2025.
- Oumnia Ennaji – [email protected]
- Abdelghani Belgaid – [email protected]
- Achraf El Allali (Corresponding Author) – [email protected]
This project is released under a non-commercial, research-use-only license. Use, distribution, and modification are permitted strictly for academic and research purposes. Commercial use is prohibited without prior written permission.