A Python-Based Machine Learning Visualization Tool for ARFF Datasets
ARFF Tree Explorer is a Python-based desktop application designed to load ARFF datasets, train multiple classification models, and visualize decision trees in an intuitive, beginner-friendly GUI.
The tool makes learning ML more accessible by combining:
- Python GUI (Tkinter)
- ARFF file processing
- Decision tree visualization
- Classifier performance comparison
- User-friendly workflow for education & research
This system acts as a lightweight, open-source alternative to Weka.
- Load ARFF files
- Automatic preprocessing
- Evaluate models using 5-fold cross-validation
- Compare accuracy of:
  - J48 (Entropy-based Decision Tree)
  - REPTree
  - Random Forest
  - Decision Stump
  - Logistic Regression (LMT Approximation)
- Select model from dropdown
- Train automatically
- Visualize decision trees using Matplotlib
- Explore splits, nodes, and decision paths
- Built fully in Python
- Handles categorical encoding
- Model training + evaluation in real-time
- Simple, minimal, student-friendly interface
User GUI (Tkinter)
↓
Data Processing (Pandas + ARFF Loader)
↓
Model Training (Scikit-learn)
↓
Classifier Comparison / Tree Visualization
↓
Matplotlib Output (Decision Trees)
| Component | Technology |
|---|---|
| Language | Python 3.x |
| GUI | Tkinter |
| ML Models | Scikit-learn |
| Data Loader | scipy.io.arff |
| Visualization | Matplotlib |
| IDE | VS Code |
ARFF-Tree-Explorer/
├── main.py
├── assets/
│   ├── screenshots/
│   └── icons/
├── sample_datasets/
├── requirements.txt
└── README.md
pip install pandas scikit-learn matplotlib scipy

python main.py

- Upload ARFF file
- System loads and preprocesses data
- Compare classifiers
- Visualize decision trees
- ARFF parsed using `scipy.io.arff`
- Categorical features encoded with `LabelEncoder`
- Models trained via Scikit-learn
- Tree visualized using `plot_tree()`
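The first two steps above (parsing and encoding) can be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: the `load_arff` helper, the in-memory sample data, and the assumption that the class attribute is the last column are all hypothetical.

```python
import io

import pandas as pd
from scipy.io import arff
from sklearn.preprocessing import LabelEncoder

# Tiny in-memory ARFF file (made-up data) so the sketch is self-contained.
SAMPLE_ARFF = """@relation weather
@attribute outlook {sunny,overcast,rainy}
@attribute temperature numeric
@attribute play {yes,no}
@data
sunny,85,no
overcast,83,yes
rainy,70,yes
sunny,72,no
"""


def load_arff(source):
    """Parse ARFF from a path or text file object and label-encode nominal columns.

    Hypothetical helper; missing values ("?") are not handled here.
    """
    data, meta = arff.loadarff(source)
    df = pd.DataFrame(data)
    encoders = {}
    for name in meta.names():
        attr_type, _values = meta[name]
        if attr_type == "nominal":
            # scipy returns nominal values as bytes; decode before encoding.
            df[name] = df[name].str.decode("utf-8")
            encoders[name] = LabelEncoder()
            df[name] = encoders[name].fit_transform(df[name])
    return df, encoders


df, encoders = load_arff(io.StringIO(SAMPLE_ARFF))
X = df.iloc[:, :-1]  # assumption: the class attribute is the last column
y = df.iloc[:, -1]
print(df)
```

Keeping one `LabelEncoder` per column makes it possible to map encoded values back to the original category names later, for example when labelling tree nodes.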
- Evaluates multiple ML models
- Uses 5-fold cross-validation
- Outputs accuracy scores
- Trains selected classifier
- Generates complete tree plot
- Supports J48, REPTree, RandomForest, etc.
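The comparison and visualization steps listed above might look something like the sketch below. The mapping from Weka-style names to scikit-learn estimators is an assumption made for illustration (for example, cost-complexity pruning stands in for REPTree's reduced-error pruning); the project's actual settings may differ. The iris dataset is used only to keep the example self-contained.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Assumed scikit-learn stand-ins for the Weka-style model names.
MODELS = {
    "J48": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "REPTree": DecisionTreeClassifier(ccp_alpha=0.01, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Stump": DecisionTreeClassifier(max_depth=1, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}


def compare_models(X, y, models=MODELS, folds=5):
    """Return the mean k-fold cross-validated accuracy for each model."""
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


X, y = load_iris(return_X_y=True)
scores = compare_models(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")

# Train the selected classifier on the full data and draw its tree.
selected = clone(MODELS["J48"]).fit(X, y)
fig, ax = plt.subplots(figsize=(10, 6))
nodes = plot_tree(selected, filled=True, ax=ax)
```

In the GUI the figure would be shown with an interactive Matplotlib backend rather than `Agg`. Note that `plot_tree()` draws a single tree, so a Random Forest would need one of its `estimators_` picked out for plotting.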
- Fast dataset loading
- Accurate ML model comparison
- Clear visual decision trees
- Smooth GUI interaction
- Tested on multiple UCI ARFF datasets
- Only ARFF supported (no CSV/XLSX)
- GUI may freeze with large datasets
- No threading / background tasks
- No exporting of tree images
- No hyperparameter tuning
- Support CSV, XLSX, JSON
- Multi-threading for large data
- Save/Export trees
- Add hyperparameter tuning panel
- Zoomable, interactive tree viewer
- Dark mode for GUI
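As one possible direction for the planned "Save/Export trees" feature, a fitted tree could be written to an image through Matplotlib's `savefig`. The sketch below is only an illustration of that idea: `export_tree_png` is a hypothetical helper, not part of the current tool, and the iris data is a placeholder.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen; no display is needed for export
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree


def export_tree_png(model, target, feature_names=None, class_names=None, dpi=150):
    """Draw a fitted decision tree and save it as PNG to a path or binary buffer."""
    fig, ax = plt.subplots(figsize=(12, 8))
    plot_tree(
        model,
        feature_names=feature_names,
        class_names=class_names,
        filled=True,
        ax=ax,
    )
    fig.savefig(target, format="png", dpi=dpi, bbox_inches="tight")
    plt.close(fig)  # free the figure so repeated exports do not accumulate


iris = load_iris()
model = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
model.fit(iris.data, iris.target)

buffer = io.BytesIO()
export_tree_png(model, buffer, iris.feature_names, list(iris.target_names))
png_bytes = buffer.getvalue()
```

Passing a file path such as `"tree.png"` instead of the buffer would write the image to disk.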
(Complete documentation, flowcharts, case studies, screenshots)
Download:
Download / View Project Report (PDF)
- Md. Zehadul Islam
- Md. Abdullah Al Moin