Soil Quality Prediction — Bangladesh

RMSTU CSE MID-2 ML Research Project
AUTHORS: M Abdur Rabbi Tota  |  Prathay Barua  |  Md Mynuddin
Model: Random Forest + Bayesian Optimization (Optuna)  |  Dataset: SPAS-Dataset-BD  |  Target: AP Ratio (Production / Area)

Location and Crop

District
Season
Crop Name

Climate Conditions


Prediction

Engineered Features (auto-computed)

Model Metrics (test set)

Metric Value
R² 0.5372
MAE 1.1596
RMSE 1.9064
MAPE 160404606.47%
10-Fold CV RMSE 1.8696
p-value (paired t-test) 0.0001
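
For readers who want to reproduce error metrics of this kind, here is a minimal sketch in plain Python. The function name regression_metrics and the sample numbers are hypothetical and are not taken from the project. R² is included on the assumption that it is one of the reported metrics. The epsilon guard mirrors how some libraries avoid division by zero in MAPE. It also shows one plausible reason a MAPE can reach millions of percent: a single true value at or near zero dominates the average.

```python
import math

def regression_metrics(y_true, y_pred, eps=1e-10):
    """Compute R^2, MAE, RMSE and MAPE (in percent) for two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    # MAPE divides each error by the true value; eps stands in for exact zeros.
    mape = 100.0 * sum(abs(e) / max(abs(t), eps) for t, e in zip(y_true, errors)) / n
    return {"r2": r2, "mae": mae, "rmse": rmse, "mape_percent": mape}

# Made-up values: the third true value is 0, as can happen with an AP Ratio.
y_true = [2.0, 4.0, 0.0, 6.0]
y_pred = [2.5, 3.5, 0.5, 6.0]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(name, value)
```

In this toy example MAE and RMSE stay below 0.5 while MAPE exceeds a billion percent, so MAE and RMSE are the more informative numbers whenever the target can be close to zero.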

Best Hyperparameters (Optuna, 50 trials)

Parameter Value
n_estimators 251
max_depth 23
min_samples_leaf 3
min_samples_split 10
max_features sqrt
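
The values above map directly onto scikit-learn's RandomForestRegressor. The sketch below assumes scikit-learn is the Random Forest implementation, which the page does not state. It trains on synthetic stand-in data, not on SPAS-Dataset-BD, and it does not reproduce the Optuna search (the search space is not given here).

```python
import random
from sklearn.ensemble import RandomForestRegressor

# Best hyperparameters as listed in the table above.
best_params = {
    "n_estimators": 251,
    "max_depth": 23,
    "min_samples_leaf": 3,
    "min_samples_split": 10,
    "max_features": "sqrt",
}

# Synthetic stand-in data (NOT the real dataset): 60 rows, 4 features.
rng = random.Random(0)
X = [[rng.random() for _ in range(4)] for _ in range(60)]
y = [2.0 * row[0] + row[1] + rng.gauss(0.0, 0.1) for row in X]

# random_state is an added assumption, used only to make the run repeatable.
model = RandomForestRegressor(**best_params, random_state=0)
model.fit(X, y)
print(model.predict(X[:3]))
```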

AP Ratio = Total Production ÷ Cultivated Area.
A higher ratio means more output per unit of land — a proxy for soil productivity under the given crop, season, and climate conditions.
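
As a quick illustration of the formula above, here is a minimal sketch. The function name ap_ratio and the sample numbers are hypothetical. Production and area are in whatever units the dataset uses.

```python
def ap_ratio(total_production: float, cultivated_area: float) -> float:
    """Return production per unit of cultivated area."""
    if cultivated_area <= 0:
        raise ValueError("cultivated_area must be positive")
    return total_production / cultivated_area

# Made-up example: 1200 units of output from 400 units of land.
print(ap_ratio(1200.0, 400.0))  # 3.0
```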

Three Bangladesh-specific features are computed automatically from your inputs:

  • Monsoon Moisture Index — combines humidity, temperature, and monsoon season weight (1.5× for Kharif, 0.8× for Rabi).
  • Saltwater Intrusion Risk — non-zero only for the 16 coastal districts; peaks in Kharif 2 when tidal surges are worst.
  • Seasonal Soil Stress — measures stress from diurnal temperature and humidity swings, highest in Kharif 2.
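
The exact formulas behind these three features are not given on this page. The sketch below is one hypothetical way to compute features with the properties described above, and it should not be read as the project's implementation. Only the season weights (1.5 for Kharif, 0.8 for Rabi), the coastal-only rule, and the Kharif 2 peak come from the text. The function names, the input names, the "Kharif 1" label, the numeric levels, and the short example district list are all assumptions.

```python
# Season weights quoted in the text; applying 1.5 to both Kharif seasons is an assumption.
SEASON_WEIGHT = {"Kharif 1": 1.5, "Kharif 2": 1.5, "Rabi": 0.8}

# Illustrative subset only; the app's full list of 16 coastal districts is not shown here.
EXAMPLE_COASTAL = {"Satkhira", "Khulna", "Bagerhat", "Patuakhali", "Cox's Bazar"}

def monsoon_moisture_index(humidity_pct, temp_c, season):
    # Hypothetical form: season weight x humidity fraction x temperature.
    return SEASON_WEIGHT[season] * (humidity_pct / 100.0) * temp_c

def saltwater_intrusion_risk(district, season):
    # Zero for every district outside the coastal set.
    if district not in EXAMPLE_COASTAL:
        return 0.0
    # Hypothetical levels; only the ordering (Kharif 2 highest) follows the text.
    return {"Kharif 1": 0.6, "Kharif 2": 1.0, "Rabi": 0.3}[season]

def seasonal_soil_stress(temp_max_c, temp_min_c, hum_max_pct, hum_min_pct, season):
    # Hypothetical form: daily temperature swing plus a scaled humidity swing.
    swing = (temp_max_c - temp_min_c) + (hum_max_pct - hum_min_pct) / 10.0
    factor = {"Kharif 1": 1.0, "Kharif 2": 1.2, "Rabi": 0.9}[season]
    return factor * swing

print(monsoon_moisture_index(80.0, 30.0, "Kharif 2"))
print(saltwater_intrusion_risk("Khulna", "Kharif 2"))
print(saltwater_intrusion_risk("Dhaka", "Kharif 2"))
print(seasonal_soil_stress(34.0, 24.0, 95.0, 65.0, "Kharif 2"))
```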

These features were validated with SHAP; all three rank among the top contributors to the model's predictions.