This repository contains the technical proof of concept developed as part of my bachelor thesis.
The project explores how analog shelf information in brick-and-mortar grocery retail can be transformed into structured digital data using computer vision. More specifically, it focuses on detecting price tags in shelf images, extracting relevant information such as product names and prices, and storing the results in a machine-readable format.
The repository includes code for:
- model training for price tag detection
- price tag detection on shelf images
- cropped tag extraction
- structured information extraction
- basic end-to-end processing
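As an illustration of the cropped tag extraction step, here is a minimal sketch of cutting a detected region out of a shelf image. It is not the code from this repository: the `Box` type, the `pad_and_clamp` and `crop` helpers, the padding value, and the toy image are all hypothetical, and a real pipeline would probably crop with an imaging library instead of nested lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def pad_and_clamp(box: Box, width: int, height: int, pad: int = 4) -> Box:
    """Grow the box by `pad` pixels on each side, staying inside the image."""
    return Box(
        left=max(0, box.left - pad),
        top=max(0, box.top - pad),
        right=min(width, box.right + pad),
        bottom=min(height, box.bottom + pad),
    )


def crop(image: list[list[int]], box: Box) -> list[list[int]]:
    """Cut the box out of an image stored as rows of pixel values."""
    return [row[box.left:box.right] for row in image[box.top:box.bottom]]


# Toy 6x4 "image" whose pixel values encode their own position.
image = [[10 * y + x for x in range(6)] for y in range(4)]
detected = Box(left=2, top=1, right=4, bottom=3)  # made-up detection result
safe = pad_and_clamp(detected, width=6, height=4, pad=1)
tag = crop(image, safe)
```

Padding the box slightly before cropping is one way to avoid cutting off characters at the tag border. Clamping keeps the crop valid when a tag sits at the image edge.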
This is not a full production system or consumer-facing platform. It is a research-oriented prototype that demonstrates the technical feasibility of the proposed approach.
The repository is structured as follows:
- model_training/: notebooks and artefacts for model selection, hyperparameter search, and final training
- detect/: detection logic for localizing price tags
- extract/: extraction logic for reading product information from detected tags
- store/: components for storing structured outputs
- utils/: helper functions
- gradio_app.py: simple interface for testing the pipeline
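To make "structured outputs" more concrete, the following sketch shows one possible machine-readable record for a single price tag. The `PriceTagRecord` fields, the `normalize_price` helper, the default currency, and the sample values are assumptions made for illustration. The actual schema used in this repository may differ.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PriceTagRecord:
    """One extracted price tag (hypothetical schema)."""
    product_name: str
    price: Optional[str]   # decimal string such as "1.99", None if unreadable
    currency: str = "EUR"  # assumed default, not taken from the repository


def normalize_price(raw: str) -> Optional[str]:
    """Turn text such as '1,99 €' or '2.49' into a canonical decimal string."""
    match = re.search(r"(\d+)[.,](\d{2})", raw)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


# Made-up sample values standing in for text read from a cropped tag.
record = PriceTagRecord("Whole Milk 1L", normalize_price("1,99 €"))
line = json.dumps(asdict(record), ensure_ascii=False)
```

Keeping the price as a decimal string avoids floating-point rounding when the records are later loaded into a database or spreadsheet.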
The full pipeline can be tested through the Gradio app.
- Create and activate a virtual environment.
- Install the required dependencies:
    pip install -r requirements.txt
- Add your own OpenAI API key to the .env file:
    OPENAI_API_KEY=your_api_key_here
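Before starting the app, it can help to check that the key was actually picked up. The snippet below is a small standalone sketch using only the standard library. The `load_env_file` and `key_is_usable` helpers are hypothetical and are not part of this repository, and the app itself may read the .env file differently.

```python
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def key_is_usable(values: dict[str, str]) -> bool:
    """True if a key is present and is not the placeholder from the setup steps."""
    key = values.get("OPENAI_API_KEY", "")
    return bool(key) and key != "your_api_key_here"
```

Calling `key_is_usable(load_env_file())` from the repository root returns `False` if the .env file is missing, the key is empty, or the placeholder value was left in place.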
Start the App
Run:
    python gradio_app.py
Gradio then prints a local URL in the terminal, which you can open in your browser.
Notes
The full training dataset is not included in this repository due to storage constraints. It can be made available upon request.
Some folders may contain experimental or generated artefacts created during development and testing.