Welcome to the Medical Cost Insurance Prediction Project! This application predicts medical insurance costs with machine learning models. Built with a React frontend and a Flask backend, it lets users enter personal details (age, BMI, smoker status, etc.) and returns a predicted cost. The project showcases a full-stack data science pipeline, from data preprocessing to model deployment, and was finalized on June 21, 2025.
- Predict insurance costs with a Hybrid Ensemble model (XGBoost, CatBoost, LightGBM) achieving R²: 0.8871, RMSE: $4048.21
- User-friendly React interface styled with Tailwind CSS
- Robust Flask API serving ML predictions
- Comprehensive documentation and Git workflow for collaboration
The project uses the insurance.csv dataset (1,338 records) with features like age, sex, bmi, children, smoker, and region to predict charges.
The backend processes data and runs ML models, while the frontend provides an interactive UI.
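As a rough illustration of the preprocessing step, the sketch below turns one record with the dataset's six features into a numeric vector. It assumes the categorical values of the commonly distributed `insurance.csv` (`male`/`female`, `yes`/`no`, and four US regions); the function name `encode_record` and the encoding scheme are hypothetical and may differ from what the backend actually does.

```python
# Hypothetical encoding; the project's real preprocessing may differ.
# Assumed region values, based on the commonly distributed insurance.csv.
REGIONS = ("northeast", "northwest", "southeast", "southwest")


def encode_record(record):
    """Turn one insurance.csv-style record into a list of floats."""
    region = str(record["region"]).strip().lower()
    if region not in REGIONS:
        raise ValueError(f"unknown region: {record['region']!r}")
    features = [
        float(record["age"]),
        1.0 if str(record["sex"]).strip().lower() == "male" else 0.0,
        float(record["bmi"]),
        float(record["children"]),
        1.0 if str(record["smoker"]).strip().lower() == "yes" else 0.0,
    ]
    # One-hot encode the region in the fixed order given by REGIONS.
    features.extend(1.0 if region == name else 0.0 for name in REGIONS)
    return features


# Illustrative input only.
example = {"age": 19, "sex": "female", "bmi": 27.9, "children": 0,
           "smoker": "yes", "region": "southwest"}
print(encode_record(example))
```

Tree-based models such as XGBoost, CatBoost, and LightGBM do not need feature scaling, so a plain numeric encoding like this is often sufficient as a starting point.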
Ideal for:
- Data science enthusiasts
- Developers
- Academic projects
- Frontend: React, Tailwind CSS, JavaScript (CDN-hosted)
- Backend: Python, Flask, Scikit-learn, XGBoost, CatBoost, LightGBM
- Tools: Git, Node.js (for frontend dev), Python 3.8+, Conda/Pip, Vercel (for deployment)
health-cost-insurance/
│
├── frontend/ # React frontend code
│   ├── public/
│   │   └── index.html
│   ├── src/
│   │   ├── App.jsx
│   │   ├── components/
│   │   └── styles/
│   ├── package.json
│   └── vercel.json
│
├── backend/ # Flask backend code
│   ├── api/
│   │   └── app.py
│   ├── models/
│   ├── data/
│   │   └── insurance.csv
│   ├── requirements.txt
│   ├── utils/
│   └── vercel.json
│
├── docs/
│   └── Medical_Cost_Insurance_Prediction_Final_Documentation.md
│
├── .gitignore
└── README.md
- Git, Node.js v16+, Python 3.8+, Conda (optional)
- Vercel CLI (`npm install -g vercel`)
- GitHub and Vercel accounts
git clone https://github.com/your-username/health-cost-insurance.git
cd health-cost-insurance
cd backend
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
python api/app.py
Runs at: http://localhost:5000
cd ../frontend
npm install
npm start
Runs at: http://localhost:3000
Open your browser at: http://localhost:3000
Input your details (age, BMI, etc.) to get a predicted insurance cost.
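The backend can also be exercised without the UI. The sketch below builds a JSON POST request for a prediction using only the standard library. The route `/api/predict` and the JSON field names are assumptions, since the README does not list the API's endpoints; check `backend/api/app.py` for the real ones.

```python
import json
import urllib.request


def build_prediction_request(details, base_url="http://localhost:5000"):
    """Build (but do not send) a JSON POST request for a prediction."""
    body = json.dumps(details).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/predict",  # hypothetical route, not confirmed
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Field names mirror the dataset columns; the API may expect others.
request = build_prediction_request({
    "age": 19, "sex": "female", "bmi": 27.9,
    "children": 0, "smoker": "yes", "region": "southwest",
})
print(request.get_method(), request.full_url)

# To send it once the backend is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```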
git init
git add .
git commit -m "Initial commit: Project setup"
git remote add origin https://github.com/your-username/health-cost-insurance.git
git branch -M main
git push -u origin main
git pull origin main # Get latest
# Edit files
git add .
git commit -m "Your message here"
git push origin main
git checkout -b feature/your-feature
# Make changes
git add .
git commit -m "Add feature"
git push origin feature/your-feature
Open a Pull Request on GitHub to merge.
`backend/requirements.txt` must include:
flask==2.0.1
gunicorn
scikit-learn
xgboost
catboost
lightgbm
pandas
numpy
flask-cors
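Before deploying, it can help to confirm that none of the packages above is missing from the file. The helper below is a minimal sketch of such a check; `missing_requirements` is a hypothetical name, and the parsing only handles simple `name` or `name==version` style lines.

```python
import re

# Package names taken from the list above.
REQUIRED = {"flask", "gunicorn", "scikit-learn", "xgboost", "catboost",
            "lightgbm", "pandas", "numpy", "flask-cors"}


def missing_requirements(requirements_text):
    """Return the required package names absent from requirements.txt text."""
    found = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Keep only the package name, dropping version specifiers and extras.
        name = re.split(r"[=<>!~\[; ]", line, maxsplit=1)[0]
        found.add(name.lower().replace("_", "-"))
    return sorted(REQUIRED - found)


# Illustrative input: a file that lists only four of the nine packages.
sample = "flask==2.0.1\ngunicorn\nscikit-learn\nxgboost\n"
print(missing_requirements(sample))
```

To check the real file, pass its contents instead of `sample`, for example by reading `backend/requirements.txt` with `pathlib.Path.read_text()`.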
backend/vercel.json:
{
"version": 2,
"builds": [
{
"src": "api/app.py",
"use": "@vercel/python"
}
],
"routes": [
{
"src": "/(.*)",
"dest": "/api/app.py"
}
]
}
- Push to GitHub:
cd backend
git add .
git commit -m "Configure Flask for Vercel"
git push origin main
- Deploy on Vercel:
- Import GitHub repo
- Set backend/ as root
- Choose "Other" framework
- Deploy & get backend URL
frontend/vercel.json:
{
"rewrites": [
{
"source": "/api/(.*)",
"destination": "https://health-cost-insurance-backend.vercel.app/api/$1"
}
]
}
- Update API URLs in `App.jsx` to use `/api/`
- Push changes:
cd frontend
git add .
git commit -m "Configure React for Vercel"
git push origin main
- Deploy on Vercel:
- Import GitHub repo
- Set frontend/ as root
- Detects React, deploys automatically
- Open frontend Vercel URL
- Test predictions
- Fix any errors via logs
| Model | R² | RMSE ($) |
|---|---|---|
| XGBoost | 0.8685 | 4517 |
| Hybrid Ensemble | 0.8871 | 4048.21 |
The Hybrid Ensemble combines XGBoost, CatBoost, and LightGBM to explain 88.71% of the variance in charges, with an RMSE of $4,048.21 compared with $4,517 for XGBoost alone.
See docs/Medical_Cost_Insurance_Prediction_Final_Documentation.md for full analysis.
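For readers unfamiliar with the two metrics, the sketch below shows how R² and RMSE are computed and how predictions from several models can be blended. The README does not say how the Hybrid Ensemble combines its three models, so the unweighted average here is an assumption, and all numbers are made up for illustration rather than taken from this project.

```python
import math


def average_predictions(*model_predictions):
    """Unweighted mean of several models' predictions, row by row."""
    return [sum(row) / len(row) for row in zip(*model_predictions)]


def r2_score(actual, predicted):
    """Share of variance in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Made-up numbers for illustration only; not results from this project.
actual = [1000.0, 2000.0, 3000.0, 4000.0]
model_a = [1100.0, 1900.0, 3200.0, 3900.0]
model_b = [900.0, 2100.0, 3000.0, 4300.0]
model_c = [1000.0, 2000.0, 2800.0, 4100.0]

blended = average_predictions(model_a, model_b, model_c)
print("R2:", r2_score(actual, blended))
print("RMSE:", rmse(actual, blended))
```

In this toy data the individual models' errors partly cancel when averaged, which is the usual motivation for blending several gradient-boosted models.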
- Fork the repo
- Create a branch: `git checkout -b feature/your-feature`
- Commit: `git commit -m "Add feature"`
- Push: `git push origin feature/your-feature`
- Open a Pull Request
- Data preprocessing
- Model training
- Evaluation metrics
- Visualizations
See: docs/Medical_Cost_Insurance_Prediction_Final_Documentation.md
- Dataset: `insurance.csv` (public domain)
- Libraries: Scikit-learn, XGBoost, CatBoost, LightGBM, React, Flask
- Inspired by real-world healthcare cost prediction problems
Feel free to open an issue or contribute. Happy coding!