kitperform/strictly-jupyter

 
 

Strictly Open

Goal: To be the most structured, versatile, up-to-date and open Strictly Come Dancing results data resource known to us.

Project stage

The project is currently in the very early stages of discovery.

  • Work done so far:
    • Researched existing projects: Four Tens and StrictlyDB. Verdict: the open-source Four Tens includes some analysis but is not up to date; StrictlyDB is a closed-source website and its data does not appear to be open.

    • Analysed the Wikipedia data. Verdict: very comprehensive and complete. Some data can be derived from other sections, so not all of it needs processing.

    • Produced a highly normalised data model based on the analysis of the Wikipedia data: https://github.com/kitperform/strictly-jupyter/blob/main/data-model.md

    • Evaluated scraping tools: Scrapy and Beautiful Soup. Verdict: use Beautiful Soup. I initially got carried away with the virtues of Scrapy based on reading about it, but Beautiful Soup is comprehensive and fits this project much better. Bertie had recommended it as well.

    • Avoiding the "run before walk" problem: started preparing to evaluate other projects in Jupyter, but have not got that far yet.

    • Thinking about what questions I want to answer, some notes here:

  • Next steps:

    • Add the following as tickets in GitHub Projects and work through them:

Setting up on Ubuntu

sudo add-apt-repository universe
sudo apt install python3 python3-pip ipython3
sudo apt install python3.12-venv
python3 -m venv venv
source venv/bin/activate
pip3 install jupyter
jupyter notebook
deactivate
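With the environment active, a first notebook cell could try out the Beautiful Soup approach chosen above. This is only a minimal sketch: it assumes `beautifulsoup4` has also been installed into the venv (the commands above install Jupyter only), and `SAMPLE_HTML` and `parse_wikitable` are hypothetical names working on a made-up, simplified table rather than a real Wikipedia page.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for a Wikipedia "wikitable" of results.
# Real pages are messier (rowspan/colspan, footnotes), which this ignores.
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>Couple</th><th>Scores</th><th>Dance</th></tr>
  <tr><td>Couple A</td><td>30 (7, 8, 7, 8)</td><td>Waltz</td></tr>
  <tr><td>Couple B</td><td>24 (6, 6, 6, 6)</td><td>Jive</td></tr>
</table>
"""


def parse_wikitable(html):
    """Return each data row of the first wikitable as a dict keyed by header."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="wikitable")
    if table is None:
        return []
    rows = table.find_all("tr")
    if not rows:
        return []
    # The first row is assumed to hold the column headers.
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    records = []
    for row in rows[1:]:
        cells = [cell.get_text(strip=True) for cell in row.find_all(["td", "th"])]
        # Skip rows whose cell count does not match (e.g. merged cells).
        if len(cells) == len(headers):
            records.append(dict(zip(headers, cells)))
    return records


rows = parse_wikitable(SAMPLE_HTML)
for record in rows:
    print(record)
```

Keeping the parser separate from any fetching code means it can be tested against saved HTML before touching the live site.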

Notes on which environment to use

https://stackoverflow.com/a/47559925/227926
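Sketch: normalised storage with derived totals

To illustrate the points above about a highly normalised model and data that can be derived rather than processed, here is a rough sketch using Python's built-in `sqlite3`. The table and column names are hypothetical and are not taken from data-model.md; the 1 to 10 range check is likewise an assumption. Individual judges' scores are stored once, and a couple's total for a dance is computed by a query instead of being scraped separately.

```python
import sqlite3

# In-memory database; all table and column names here are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE couple (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE judge (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE performance (
    id INTEGER PRIMARY KEY,
    couple_id INTEGER NOT NULL REFERENCES couple(id),
    week INTEGER NOT NULL,
    dance TEXT NOT NULL
);
CREATE TABLE score (
    performance_id INTEGER NOT NULL REFERENCES performance(id),
    judge_id INTEGER NOT NULL REFERENCES judge(id),
    points INTEGER NOT NULL CHECK (points BETWEEN 1 AND 10),  -- assumed range
    PRIMARY KEY (performance_id, judge_id)
);
""")

# Made-up sample data: one couple, four judges, one dance.
conn.execute("INSERT INTO couple VALUES (1, 'Couple A')")
conn.executemany(
    "INSERT INTO judge VALUES (?, ?)",
    [(1, "Judge 1"), (2, "Judge 2"), (3, "Judge 3"), (4, "Judge 4")],
)
conn.execute("INSERT INTO performance VALUES (1, 1, 1, 'Waltz')")
conn.executemany(
    "INSERT INTO score VALUES (1, ?, ?)",
    [(1, 7), (2, 8), (3, 7), (4, 8)],
)

# The total is derived from the individual scores, never stored.
totals = conn.execute("""
    SELECT c.name, p.week, p.dance, SUM(s.points)
    FROM score AS s
    JOIN performance AS p ON p.id = s.performance_id
    JOIN couple AS c ON c.id = p.couple_id
    GROUP BY p.id
""").fetchall()
print(totals)
```

The same pattern would apply to other derivable values, such as leaderboard positions, if they turn out to be computable from the stored scores.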
