denisecase/buzzline-06-world

buzzline-06-world

Python Binder License: MIT Jupyter DuckDB

From Streaming to Historical Analysis

Streaming data comes in many forms. We've looked at routing data in motion using Kafka pipelines. This project shows the other side: analyzing streaming data after it has been captured. It addresses a common question: once real-time signals are stored, how do analysts detect patterns in them?

Batch and streaming use much the same detection logic, just in a different context. Many of our prior tools will be useful in this analysis as well.

Overview

Two independent worlds are represented in separate DuckDB files (civic_world_a.duckdb and civic_world_b.duckdb). Each world provides information about social media posting behavior. To protect privacy, the platform provider doesn't expose the account holder or the exact content they posted. Instead, it implements a "privacy-preserving" API that shares only aggregate information. The public and analytics researchers can use this proposed API to look for signs of possible coordination or manipulation.
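To make the idea of sharing only aggregates concrete, here is a minimal sketch. The field names (`account`, `text`, `topic`, `minute`) and the function `aggregate_posts` are hypothetical illustrations, not the schema or API used in this project. The sketch counts posts per topic and time bucket and drops everything that identifies an account or its content.

```python
from collections import Counter


def aggregate_posts(raw_posts, bucket_minutes=60):
    """Count posts per (topic, time bucket); account and text are never copied out."""
    counts = Counter()
    for post in raw_posts:
        bucket = post["minute"] // bucket_minutes  # integer time bucket index
        counts[(post["topic"], bucket)] += 1
    return dict(counts)


# Hypothetical raw events that only the platform provider would see.
raw = [
    {"account": "u1", "text": "hello", "topic": "transit", "minute": 5},
    {"account": "u2", "text": "hi", "topic": "transit", "minute": 42},
    {"account": "u3", "text": "hey", "topic": "parks", "minute": 75},
]

shared = aggregate_posts(raw)
```

Only `shared` would leave the platform in this sketch: it holds counts keyed by topic and hour, with no account names and no post text.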

Purpose: Data analysts explore the two worlds of simulated behavior to see if they can detect possible coordination.

Challenge

  1. Analyze civic_world_a.duckdb and civic_world_b.duckdb.
  2. Determine which world shows organic civic discourse and which world shows coordinated manipulation.
  3. Use the provided Jupyter Notebook (analysis.ipynb) for analysis.
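A first step before any analysis is to see what each world file contains. The sketch below is one way to do that, assuming the `duckdb` Python package is installed and the files sit under `data/worlds/` as shown in Project Organization. The helper names `world_paths` and `list_relations` are illustrative and not part of the provided notebook.

```python
from pathlib import Path

WORLD_FILES = ("civic_world_a.duckdb", "civic_world_b.duckdb")


def world_paths(repo_root="."):
    """Return {file stem: path} for both world files under data/worlds/."""
    worlds_dir = Path(repo_root) / "data" / "worlds"
    return {Path(name).stem: worlds_dir / name for name in WORLD_FILES}


def list_relations(db_path):
    """List (name, type) for every table and view in one DuckDB file."""
    import duckdb  # third-party; imported lazily so world_paths works without it

    # Open read-only so exploration cannot modify the world file.
    con = duckdb.connect(str(db_path), read_only=True)
    try:
        return con.execute(
            "SELECT table_name, table_type "
            "FROM information_schema.tables ORDER BY table_name"
        ).fetchall()
    finally:
        con.close()
```

Calling `list_relations(path)` for each path in `world_paths()` prints a quick inventory of both worlds. When running from inside `notebooks/`, pass `repo_root=".."`.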

Interactive Charts:

  • The notebook uses interactive Plotly charts which do not render on GitHub.
  • Recommended: See Getting Started below to get the analysis working locally on your machine.
  • For an interactive preview, see MyBinder Analysis Notebook (it's free, please be patient).

Project Organization

/data/
  /worlds/                 # DuckDB files for analysis
    civic_world_a.duckdb
    civic_world_b.duckdb

/docs/                     # Background information

/notebooks/
  analysis.ipynb           # Partially implemented analysis

Prepared Views

These six prepared views help compare the two worlds:

  • View 1: Compares burst and synchrony across topics
  • View 2: Examines temporal posting patterns (the "when")
  • View 3: Analyzes account age and automation correlation
  • View 4: Measures content coordination (duplication patterns)
  • View 5: Identifies highest-scoring suspicious events
  • View 6: Synthesizes all signals for an overall assessment

Each view tests a hypothesis about coordinated vs organic behavior.
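As a rough illustration of what a burst signal and a duplication signal can look like, here is a minimal sketch. These are simplified stand-in metrics chosen for illustration; they are not the formulas used by the prepared views, and the inputs (minute indexes and content fingerprints) are hypothetical.

```python
from collections import Counter
from statistics import median


def burst_score(post_minutes):
    """Peak posts-per-minute divided by the median over minutes with any posts."""
    counts = Counter(post_minutes)
    if not counts:
        return 0.0
    per_minute = list(counts.values())
    return max(per_minute) / median(per_minute)


def duplicate_share(fingerprints):
    """Fraction of posts whose content fingerprint occurs more than once."""
    counts = Counter(fingerprints)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeated = sum(n for n in counts.values() if n > 1)
    return repeated / total


# Hypothetical inputs: evenly spread posts vs. a sudden spike in one minute.
steady = [0, 1, 2, 3, 4, 5, 6, 7]
spiky = [0, 1, 2, 3] + [4] * 12

steady_burst = burst_score(steady)
spiky_burst = burst_score(spiky)
```

A score near 1 means posts are spread evenly over active minutes, while a large score means one minute dominates. Either signal alone is weak evidence, which is why the views are meant to be read together.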

Getting Started

  1. Copy this template repo into your GitHub account.
  2. Clone your new buzzline-06-world repository down to your machine.
  3. Create and activate your local project virtual environment (.venv) and install key tools.

Follow the standard project setup described at pro-analytics-01 for more detailed instructions.

CheatSheet: Commands to Manage Virtual Environment

These commands:

  1. Create a local project virtual environment in a folder named .venv.
  2. Activate the virtual environment.
  3. Install and upgrade key tools in .venv.
  4. Install and upgrade required project dependencies.

Windows PowerShell (recommended Option A + requirements.txt)

py -m venv .venv
.\.venv\Scripts\activate
py -m pip install --upgrade pip setuptools wheel
py -m pip install --upgrade -r requirements.txt

Windows PowerShell (advanced Option B + pyproject.toml)

uv venv
.\.venv\Scripts\activate
uv pip install --upgrade pip setuptools wheel
uv pip install -e ".[dev]"

Mac/Linux/WSL (recommended Option A + requirements.txt)

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip setuptools wheel
python3 -m pip install --upgrade -r requirements.txt

Mac/Linux/WSL (advanced Option B + pyproject.toml)

uv venv
source .venv/bin/activate
uv pip install --upgrade pip setuptools wheel
uv pip install -e ".[dev]"
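
After installation, a quick check can confirm that the active environment sees the packages the notebook depends on. This is a minimal sketch: the package list (`duckdb`, `plotly`) is assumed from the tools named in this README, and `missing_packages` is an illustrative helper rather than part of the project.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Assumed package list; adjust to match requirements.txt or pyproject.toml.
REQUIRED = ["duckdb", "plotly"]

missing = missing_packages(REQUIRED)
print("All set." if not missing else f"Missing packages: {missing}")
```

If anything is reported missing, confirm that .venv is activated and rerun the install command for your platform.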

After Making Useful Changes

Execute the notebooks, then add, commit, and push your changes:

git add .
git commit -m "custom message"
git push -u origin main

Example Analysis Charts

Examples of comparing the social media behavior of the two worlds (without compromising privacy).

Timeline Analysis

New Accounts Posting via API
