Strictly Open

Goal: To be the most structured, versatile, up-to-date and open Strictly Come Dancing results data resource known to us.

Project stage

The project is currently in the very early stages of discovery.

Work done so far:
- Researched existing projects: Four Tens and StrictlyDB. Verdict: open-source projects such as Four Tens include some analysis but are not up to date. StrictlyDB is a closed-source website, and its data doesn't appear to be open.
- Analysed Wikipedia data. Verdict: very comprehensive and complete. Some data can be derived from other sections, so not all of it needs processing.
- Produced a highly normalised data model based on the analysis of Wikipedia data: https://github.com/kitperform/strictly-jupyter/blob/main/data-model.md
- Evaluated scraping tools: Scrapy and Beautiful Soup. Verdict: use Beautiful Soup. I initially got carried away with the virtues of Scrapy from reading about it, but Beautiful Soup is comprehensive and fits this project much better. Bertie had recommended it as well.
- Avoiding the "run before you can walk" problem: started preparing to evaluate other projects in Jupyter, but have not got far yet.
- Thinking about which questions I want to answer. Some notes here:
  - https://github.com/kitperform/strictly-jupyter/blob/main/scraping-evaluation.ipynb
  - also:
    - what makes a successful contestant?
    - has scoring inflated over the years?
    - who is more generous with scores?
    - how much do the judges' votes versus the viewers' votes help contestants progress?
- Next steps:
  - add the following as tickets in GitHub Projects and work through them:
    - explore other projects that apply pandas, to get a feel for the analysis and further insight into which questions I want to answer
      - continue with: https://github.com/kitperform/strictly-jupyter/blob/main/four-tens-experimentation.ipynb
    - prototype Beautiful Soup code to scrape data into my database model, start experimenting
      - write code to link tables via foreign keys
    - give a lightning talk at PyData Southampton on TBD
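
As a rough illustration of the scraping and foreign-key steps listed above, here is a minimal sketch, not the project's actual code. It assumes Beautiful Soup is installed (`pip3 install beautifulsoup4`), parses a hand-written HTML table standing in for a Wikipedia results table, and loads it into two SQLite tables linked by a foreign key. The sample rows and the `couple` / `performance` tables are hypothetical and much simpler than the model in data-model.md.

```python
import sqlite3

from bs4 import BeautifulSoup

# Hypothetical stand-in for a Wikipedia results table.
# The couples, dances and scores are made up for illustration.
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>Couple</th><th>Dance</th><th>Score</th></tr>
  <tr><td>Couple A</td><td>Waltz</td><td>30</td></tr>
  <tr><td>Couple B</td><td>Tango</td><td>27</td></tr>
  <tr><td>Couple A</td><td>Samba</td><td>33</td></tr>
</table>
"""


def parse_rows(html):
    """Return one dict per data row, keyed by the table's header cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="wikitable")
    rows = table.find_all("tr")
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    parsed = []
    for tr in rows[1:]:
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        parsed.append(dict(zip(headers, cells)))
    return parsed


def load(conn, rows):
    """Create the two hypothetical tables and insert the scraped rows."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE couple (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
    )
    conn.execute(
        """CREATE TABLE performance (
               id INTEGER PRIMARY KEY,
               couple_id INTEGER NOT NULL REFERENCES couple(id),
               dance TEXT NOT NULL,
               score INTEGER NOT NULL
           )"""
    )
    for row in rows:
        # Store each couple once, then look up its key for the child row.
        conn.execute(
            "INSERT OR IGNORE INTO couple (name) VALUES (?)", (row["Couple"],)
        )
        (couple_id,) = conn.execute(
            "SELECT id FROM couple WHERE name = ?", (row["Couple"],)
        ).fetchone()
        conn.execute(
            "INSERT INTO performance (couple_id, dance, score) VALUES (?, ?, ?)",
            (couple_id, row["Dance"], int(row["Score"])),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
rows = parse_rows(SAMPLE_HTML)
load(conn, rows)

# Join the tables back together to total each couple's scores.
totals = conn.execute(
    """SELECT couple.name, SUM(performance.score)
       FROM performance
       JOIN couple ON couple.id = performance.couple_id
       GROUP BY couple.name
       ORDER BY couple.name"""
).fetchall()
print(totals)
```

Real Wikipedia tables tend to have merged cells, footnote markers and inconsistent headers, so the parsing step would likely need more care than this sketch shows.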

Setting up on Ubuntu

sudo add-apt-repository universe
sudo apt install python3 python3-pip ipython3
sudo apt install python3.12-venv
python3 -m venv venv
source venv/bin/activate
pip3 install jupyter
jupyter notebook
deactivate
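
After these steps it can be useful to confirm that a notebook is really running inside the virtual environment rather than the system Python. The following is a small sketch using only the standard library. The helper name `in_virtualenv` is made up for this example.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter was started from a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```

Running this in a notebook cell shows which interpreter the kernel is using.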

Notes on which environment to use

https://stackoverflow.com/a/47559925/227926
