An end-to-end deep learning pipeline for autonomous steering angle prediction using the Udacity Self-Driving Car Simulator.
- Team Members
- Project Overview
- Roadmap
- Data Collection
- Data Processing & Augmentation
- Dataset & DataLoader
- Model Architecture
- Training
- Deployment & Real-Time Inference
- Installation
- Usage
- Results & Visualizations
- License
- Acknowledgments
- File Structure
- Abdelrhman Salah Kamal Sayed
- Ibrahiem Montaser Fathy Elsfary
- Adel Tamer Adel Badran
- Sarah Elsayed Ahmed Zahran
- Hend Khalid Anwar Elkholy
- Doha Mohamed Abd Almajeed Darwish
- Hasnaa Mohamed El Dasoky
This repository implements a complete behavioral cloning pipeline that predicts continuous steering angles directly from front-facing RGB camera images. The system is trained on data collected from the Udacity Self-Driving Car Simulator and achieves real-time autonomous driving performance within the same environment.
Key Features:
- Multi-camera data utilization (center, left, right) with steering angle correction (±0.15)
- Steering distribution balancing via histogram-based undersampling (25 bins, 1600 samples per bin)
- NVIDIA PilotNet-inspired preprocessing pipeline (crop → YUV conversion → blur → resize to 66×200)
- Comprehensive data augmentation including random flip, pan, affine scaling, and brightness adjustment
- Real-time inference server with dynamic throttle control and steering smoothing
| Phase | Status |
|---|---|
| Data acquisition & preparation | ✅ Completed |
| Preprocessing & augmentation | ✅ Completed |
| Dataset & DataLoader | ✅ Completed |
| NVIDIA Model implementation | ✅ Completed |
| Training & validation pipeline | ✅ Completed |
| Deployment (simulator control) | ✅ Completed |
Primary Source: Udacity Self-Driving Car Simulator (recording mode)
Dataset Used: Kaggle – zaynena/selfdriving-car-simulator
Dataset Structure:
- `track1data/` → 31,845 samples (Track 1 only)
- `track2data/` → 65,484 samples (Track 2 only)
- `dataset/` → 97,329 samples (both tracks combined) ✅ Used in this project
Data Format (driving_log.csv):
Each row contains seven columns: Center, Left, Right, Steering, Throttle, Brake, Speed. The three camera images are synchronized at each timestep, and side cameras are augmented with a ±0.15 steering correction to simulate recovery behavior from lane edges.
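A minimal sketch (not the repository's exact code) of how the three camera columns can be expanded into (image path, steering) pairs with the ±0.15 correction; the column order and the absence of a CSV header row are assumptions.

```python
import pandas as pd

CORRECTION = 0.15  # steering offset applied to the side cameras

# Assumed column layout of driving_log.csv (no header row)
cols = ["center", "left", "right", "steering", "throttle", "brake", "speed"]
log = pd.read_csv("dataset/driving_log.csv", names=cols)

samples = []
for _, row in log.iterrows():
    steering = float(row["steering"])
    samples.append((row["center"].strip(), steering))               # center camera, unchanged
    samples.append((row["left"].strip(), steering + CORRECTION))    # left camera: steer back toward center
    samples.append((row["right"].strip(), steering - CORRECTION))   # right camera: steer back toward center
```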
Steering Distribution:
The original dataset exhibits significant bias toward near-zero steering angles. To address this imbalance, histogram-based undersampling is applied across 25 bins, limiting each bin to 1600 samples. Negative steering values indicate right turns, while positive values represent left turns.
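A sketch of the balancing step under the stated settings (25 bins, at most 1,600 samples per bin). Whether balancing runs before or after the camera expansion follows the notebook; the snippet below simply operates on the (path, steering) list built above, and the shuffling seed is an assumption.

```python
import numpy as np

NUM_BINS, MAX_PER_BIN = 25, 1600

steering = np.array([s for _, s in samples])
_, bin_edges = np.histogram(steering, bins=NUM_BINS)

rng = np.random.default_rng(42)
keep = []
for i, (lo, hi) in enumerate(zip(bin_edges[:-1], bin_edges[1:])):
    # half-open bins, except the last one which includes its upper edge
    upper = (steering <= hi) if i == NUM_BINS - 1 else (steering < hi)
    in_bin = np.where((steering >= lo) & upper)[0]
    rng.shuffle(in_bin)
    keep.extend(in_bin[:MAX_PER_BIN])      # drop the surplus from over-represented bins

balanced = [samples[i] for i in sorted(keep)]
```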
Preprocessing Pipeline:
1. Crop: y=60:135, x=0:320 (remove sky and vehicle hood)
2. Convert: RGB → YUV color space
3. Blur: Gaussian blur (3×3 kernel)
4. Resize: 200×66 (width × height) using INTER_AREA interpolation
5. Normalize: [0, 255] → [0, 1]
6. Transform: HWC → CHW tensor format

Augmentation:
- Random Horizontal Flip (50% probability): Mirrors image and inverts steering angle
- Random Horizontal Pan (±10% range): Simulates lateral position variation with corresponding steering adjustment
- Affine Scaling (1.0–1.4×, 50% probability): Varies apparent distance to simulate depth changes
- Brightness Adjustment (-0.8 to +0.2, 50% probability): Simulates varying lighting conditions
All augmentations are implemented with Albumentations for fast, efficient image processing.
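A sketch of the preprocessing steps above, plus an Albumentations pipeline approximating the listed augmentations. The specific Albumentations transforms and parameter values are assumptions, and the steering-angle adjustments tied to flipping and panning are handled separately in the dataset code, not inside this pipeline.

```python
import cv2
import numpy as np
import albumentations as A

def preprocess(img_rgb: np.ndarray) -> np.ndarray:
    img = img_rgb[60:135, 0:320]                                      # 1. crop sky and hood
    img = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)                        # 2. RGB -> YUV
    img = cv2.GaussianBlur(img, (3, 3), 0)                            # 3. 3x3 Gaussian blur
    img = cv2.resize(img, (200, 66), interpolation=cv2.INTER_AREA)    # 4. resize to 200x66 (W x H)
    img = img.astype(np.float32) / 255.0                              # 5. normalize to [0, 1]
    return img.transpose(2, 0, 1)                                     # 6. HWC -> CHW

augment = A.Compose([
    A.HorizontalFlip(p=0.5),                                          # mirror (steering sign flipped in the dataset)
    A.ShiftScaleRotate(shift_limit_x=0.1, shift_limit_y=0.0,
                       scale_limit=0.0, rotate_limit=0, p=0.5),       # horizontal pan (~±10%)
    A.Affine(scale=(1.0, 1.4), p=0.5),                                # apparent-distance change
    A.RandomBrightnessContrast(brightness_limit=(-0.8, 0.2),
                               contrast_limit=0.0, p=0.5),            # lighting variation
])
```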
Dataset & DataLoader:
- Custom PyTorch `DrivingDataset` class with integrated augmentation support
- Multi-camera expansion: 3× data multiplication (center, left, right cameras)
- Train/validation split: 80/20 with fixed random seed (`random_state=42`)
- Batch size: 64
- DataLoader workers: 4 with pinned memory for optimized GPU transfer
Final Dataset Sizes:
- Training: ~77,863 samples (after balancing and camera expansion)
- Validation: ~19,466 samples
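A sketch of the Dataset/DataLoader wiring described above, reusing `balanced`, `preprocess`, and `augment` from the earlier sketches; the real `DrivingDataset` lives in the training notebook, and the scikit-learn split shown here is one way to realize the 80/20 split with `random_state=42`.

```python
import cv2
import torch
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split

class DrivingDataset(Dataset):
    def __init__(self, samples, augment=None):
        self.samples = samples          # list of (image_path, steering) pairs
        self.augment = augment          # optional Albumentations pipeline (training only)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        path, steering = self.samples[idx]
        img = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)
        if self.augment is not None:
            img = self.augment(image=img)["image"]
        img = preprocess(img)           # crop / YUV / blur / resize / normalize / CHW
        return torch.from_numpy(img), torch.tensor(steering, dtype=torch.float32)

train_samples, val_samples = train_test_split(balanced, test_size=0.2, random_state=42)
train_loader = DataLoader(DrivingDataset(train_samples, augment=augment), batch_size=64,
                          shuffle=True, num_workers=4, pin_memory=True)
val_loader = DataLoader(DrivingDataset(val_samples), batch_size=64,
                        shuffle=False, num_workers=4, pin_memory=True)
```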
The system implements an end-to-end learning approach using a Convolutional Neural Network to map raw pixel data from a single front-facing camera directly to steering commands, optimizing all processing steps simultaneously. This architecture is based on NVIDIA's proven 2016 design for autonomous driving.
Network Structure:
The network consists of 9 layers: one normalization layer (flatten), five convolutional layers, and three fully connected layers. Input images are converted to YUV color space before being fed into the network.
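A sketch of a PilotNet-style network for the 3×66×200 YUV input: five convolutional layers followed by three fully connected layers and the final steering output. The ELU activations and exact layer sizes follow common PilotNet implementations and may differ slightly from the notebook.

```python
import torch
import torch.nn as nn

class PilotNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, kernel_size=5, stride=2), nn.ELU(),
            nn.Conv2d(24, 36, kernel_size=5, stride=2), nn.ELU(),
            nn.Conv2d(36, 48, kernel_size=5, stride=2), nn.ELU(),
            nn.Conv2d(48, 64, kernel_size=3), nn.ELU(),
            nn.Conv2d(64, 64, kernel_size=3), nn.ELU(),
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),                       # 64 x 1 x 18 feature map for a 66x200 input
            nn.Linear(64 * 1 * 18, 100), nn.ELU(),
            nn.Linear(100, 50), nn.ELU(),
            nn.Linear(50, 10), nn.ELU(),
            nn.Linear(10, 1),                   # single steering-angle output
        )

    def forward(self, x):
        return self.regressor(self.features(x)).squeeze(1)
```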
This project also explores a Vision Transformer approach for end-to-end steering angle regression, extending beyond traditional CNN-based methods.
- Backbone: `vit_tiny_patch16_224` (via timm), pretrained on ImageNet, selected for its balance between computational efficiency and feature extraction quality
- Custom Head: Classification head replaced with a regression MLP comprising linear layers, GELU activations for smoother gradients, and dropout (0.1) for regularization
- Global Pooling: Global Average Pooling (GAP) condenses spatial patch tokens into a comprehensive global context vector
Training Strategy (Two-Stage Fine-Tuning):
To prevent catastrophic forgetting of pretrained weights, a two-stage optimization strategy is employed:
- Warm-up / Linear Probing (Epochs 0-3): Backbone frozen; only the regression head is trained with learning rate 1e-4 to adapt to the driving domain
- Full Fine-Tuning (Epoch 4+): Entire network unfrozen with learning rate decayed to 1e-5 (weight decay 1e-6) to refine feature representations while preserving prior knowledge
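A sketch of the ViT variant and its two-stage schedule described above; the regression-head width (128) and the ViT's 224×224 input resizing are assumptions.

```python
import timm
import torch
import torch.nn as nn

class ViTSteering(nn.Module):
    def __init__(self):
        super().__init__()
        # num_classes=0 drops the classifier; global_pool="avg" applies GAP over the patch tokens
        self.backbone = timm.create_model("vit_tiny_patch16_224", pretrained=True,
                                          num_classes=0, global_pool="avg")
        self.head = nn.Sequential(
            nn.Linear(self.backbone.num_features, 128), nn.GELU(), nn.Dropout(0.1),
            nn.Linear(128, 1),
        )

    def forward(self, x):                        # x: (B, 3, 224, 224)
        return self.head(self.backbone(x)).squeeze(1)

model = ViTSteering()

# Stage 1 (epochs 0-3): linear probing — freeze the backbone, train only the head
for p in model.backbone.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-4)
# ... train the head for a few epochs ...

# Stage 2 (epoch 4+): full fine-tuning at a lower learning rate
for p in model.backbone.parameters():
    p.requires_grad = True
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5, weight_decay=1e-6)
```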
The training process utilizes Mean Squared Error (MSE) loss for steering angle regression with the Adam optimizer. Model checkpoints are saved at regular intervals, and the best-performing model based on validation loss is preserved for deployment.
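A minimal training-loop sketch matching that description (MSE loss, Adam, keep the best validation checkpoint). The epoch count, learning rate, and output filename are placeholders, and `PilotNet`, `train_loader`, and `val_loader` come from the sketches above.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = PilotNet().to(device)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

best_val = float("inf")
for epoch in range(30):
    model.train()
    for images, steering in train_loader:
        images, steering = images.to(device), steering.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), steering)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = sum(criterion(model(x.to(device)), y.to(device)).item()
                       for x, y in val_loader) / len(val_loader)

    if val_loss < best_val:                  # keep the best model for deployment
        best_val = val_loss
        torch.save(model.state_dict(), "nvidia_model.pth")
```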
A compact, high-performance autonomous driving engine using Adaptive Response Control for smooth, stable, and intelligent real-time driving.
- Curve Severity Classification (STRAIGHT → EXTREME)
- Dynamic Steering Boost (1.0× → 2.7×)
- Adaptive Smoothing per curve category
- Multi-Profile Throttle Control
- Braking (negative throttle)
- Reverse throttle in EXTREME recovery
- Safe acceleration + speed targeting
- Speed Bands for stability (3–23 km/h)
- Camera Correction for left/right feeds
- Real-Time Image Recorder
- Professional Console Telemetry
| Category | Boost | Response | Speed |
|---|---|---|---|
| 🔴 EXTREME | 1.5–2.7× | 95% | 3–7 km/h |
| 🟠 VERY_SHARP | 1.4–1.9× | 90% | 5–10 km/h |
| 🟡 SHARP | 1.3–1.7× | 85% | 7–13 km/h |
| 🟢 MEDIUM | 1.2–1.5× | 70% | 9–17 km/h |
| 🔵 GENTLE | 1.1–1.3× | 55% | 15–20 km/h |
| ⚪ STRAIGHT | 1.0× | 40% | 18–23 km/h |
- Negative throttle for braking
- Reverse pulses for EXTREME recovery
- Category-based acceleration
- Throttle smoothing 0.35–0.85
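An illustrative sketch of how one Adaptive Response Control step could map a predicted angle to the categories in the table above. The angle thresholds and the specific boost/response/speed values chosen within each range are hypothetical; the real logic lives in the deployment script.

```python
def classify_curve(steering: float):
    """Map |steering| to (category, boost, response, target_speed_kmh). Thresholds are hypothetical."""
    a = abs(steering)
    if a < 0.05: return "STRAIGHT",   1.00, 0.40, 23.0
    if a < 0.15: return "GENTLE",     1.20, 0.55, 20.0
    if a < 0.30: return "MEDIUM",     1.35, 0.70, 17.0
    if a < 0.50: return "SHARP",      1.50, 0.85, 13.0
    if a < 0.70: return "VERY_SHARP", 1.65, 0.90, 10.0
    return "EXTREME", 2.00, 0.95, 7.0

def control_step(prev_steering: float, raw_steering: float, speed: float):
    """One control update: boosted and smoothed steering plus a simple throttle toward the target speed."""
    _, boost, response, target_speed = classify_curve(raw_steering)
    boosted = max(-1.0, min(1.0, raw_steering * boost))       # clamp after boosting
    steering = prev_steering + response * (boosted - prev_steering)
    throttle = 0.5 if speed < target_speed else -0.1          # brake (negative throttle) when too fast
    return steering, throttle
```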
A fully integrated autonomous driving loop using Socket.IO enables comprehensive system monitoring:
The simulator streams live telemetry (camera and sensor data) via Socket.IO to a Python inference server running the PilotNet model, which returns steering and throttle commands in real time. All data is simultaneously broadcast through Socket.IO to a web dashboard that displays the live camera feed, a 3D animated steering wheel, and real-time Plotly charts:
- Steering angle timeline
- Speed and acceleration plots
- Steering distribution histogram
- Speed vs. angle heatmap
- Key events chart with a detailed event list
- Overview chart combining speed and steering angle
A separate FastAPI endpoint with a neon-themed UI provides instant steering prediction from uploaded images. All components are synchronized in real-time for comprehensive system monitoring.
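For orientation, here is the standard python-socketio + eventlet skeleton that this kind of simulator loop builds on. It is a sketch, not the repository's deployment script: the adaptive throttle control, steering smoothing, image recording, and dashboard broadcasting layer on top of it, and `PilotNet`/`preprocess` come from the sketches above.

```python
import base64
from io import BytesIO

import eventlet
import numpy as np
import socketio
import torch
from PIL import Image

sio = socketio.Server()
model = PilotNet()
model.load_state_dict(torch.load("nvidia_model.pth", map_location="cpu"))
model.eval()

@sio.on("telemetry")
def telemetry(sid, data):
    # The simulator sends a base64-encoded front-camera frame plus speed/steering/throttle readings
    image = Image.open(BytesIO(base64.b64decode(data["image"])))
    x = torch.from_numpy(preprocess(np.asarray(image))).unsqueeze(0)
    with torch.no_grad():
        steering = float(model(x))
    speed = float(data["speed"])
    throttle = 0.8 if speed < 15.0 else 0.1          # placeholder for the adaptive throttle logic
    sio.emit("steer", data={"steering_angle": str(steering), "throttle": str(throttle)})

app = socketio.WSGIApp(sio)
eventlet.wsgi.server(eventlet.listen(("", 4567)), app)
```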
```
python auto_mode.py model.pth [image_folder] [options]
```

Options:

```
--camera {center,left,right}   Camera stream selection (default: center)
--steer_correction FLOAT       Left/right camera correction (default: 0.2)
--alpha FLOAT                  Steering smoothing factor (default: 0.2)
--max_limit FLOAT              Maximum speed on straight sections in km/h (default: 15.0)
--max_throttle FLOAT           Maximum throttle value 0-1 (default: 0.8)
--port INT                     Server port (default: 4567)
```

Installation:

```bash
git clone https://github.com/your-username/self-driving-car-bc.git
cd self-driving-car-bc

python -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows

pip install --upgrade pip
pip install -r requirements.txt
```

Usage:

```bash
# Basic usage with center camera
python drive.py nvidia_model.pth

# Advanced usage with custom settings
python drive.py best_model.pth run_images/ \
    --camera center \
    --max_limit 20.0 \
    --max_throttle 0.9 \
    --alpha 0.3
```

Running in the simulator:

- Launch the Udacity simulator
- Select the desired track
- Click "Autonomous Mode"
- The simulator automatically connects to `localhost:4567`
Expected console output:

```
📦 Loading model from: nvidia_model.pth
✅ Model loaded and set to eval
📸 Saving images to: output_images
🚀 Starting server on port 4567 ...
🎯 Enhanced angle boost in curves
🐢 Lower speeds for safety
⚡ Max throttle: 0.65
```
The model successfully completes full autonomous laps on Track 2 with smooth cornering, dynamic speed adjustment based on curve severity, and stable recovery from lane deviations using side camera training data.
Model loading errors:

Ensure you are using the correct checkpoint format. If issues persist, try:

```python
torch.load(model_path, map_location='cpu')
```

Simulator connection issues:

- Verify firewall settings allow connections on port 4567
- Ensure the simulator is in Autonomous Mode
- Confirm no other process is using port 4567

Vehicle drifts off track:

- Reduce `--max_limit` (try 10-12 km/h)
- Increase steering smoothing with `--alpha` (try 0.3-0.4)
- Consider retraining with additional data or improved distribution balancing

Jerky or oscillating steering:

- Increase `--alpha` for more aggressive smoothing
- Verify the model was trained with sufficient augmentation
This project is licensed under the MIT License
- Udacity for the Self-Driving Car Simulator and curriculum inspiration
- NVIDIA for pioneering end-to-end learning research: End-to-End Deep Learning for Self-Driving Cars
- Kaggle dataset provider: zaynena
- PyTorch and Albumentations communities for excellent tools and libraries
File Structure:

```
CNN-Based Behavioral Cloning for Autonomous Driving/
├── Deployment/                      # Deployment and inference components
│   ├── predictor/                   # Standalone prediction service
│   ├── sim_server/                  # Simulator communication server
│   └── sim_web/                     # Web-based visualization dashboard
├── Saved Models/                    # Trained model checkpoints
│   ├── vit_model.pth                # Vision Transformer weights
│   └── nvidia_model.pth             # NVIDIA PilotNet weights
├── Installation/                    # Setup and configuration files
│   ├── requirements.txt             # Python dependencies
│   └── setup_instructions.md        # Detailed installation guide
├── Notebooks/                       # Training notebooks
│   └── Self_Driving_Car_Sim.ipynb   # NVIDIA PilotNet & ViT training
└── README.md                        # Project documentation
```
Happy autonomous driving! 🚗💨

