DATA SCRAPING/CODING

LATENT IMAGE ANIMATOR

Duration

Spring 2025

1 month

DATA SCRAPING & LATENT MOTION

Tools

Terminal, LIA, Python

Project Overview

This project explores web scraping methods to collect large datasets and experiment with different ways to visualize them. To approach this, I wanted to work with a dataset that was both engaging and rich in visual potential. Being interested in boxing and Muay Thai, I found an online archive containing photos of title fights from 1964 to 2025. This dataset provided a unique opportunity to analyze decades of combat sports history through imagery, allowing me to experiment with different ways to represent trends, patterns, and narratives within the sport. By extracting and organizing this data, I wanted to explore how visual elements, such as color, motion, and composition, can convey information beyond traditional charts and graphs.

Initial Data Scraping

Using ParseHub, I scraped data from Sport Photo Gallery to compile an archive of boxing title fight images from 1964 to 2025. To capture the data effectively, I ran two separate web scrapers: one to extract images only and another to include both images and fight names. Below, I’ve attached a screenshot of my workflow on ParseHub to illustrate the process.

Data Visualization

I used Jupyter Notebook to visualize the image dataset with several techniques: feature representations based on image pyramids (multi-scale image representation) and CNN (Convolutional Neural Network) feature extraction, and the dimensionality reduction methods t-SNE (t-Distributed Stochastic Neighbor Embedding) and UMAP (Uniform Manifold Approximation and Projection). Together, these let me explore the high-dimensional image feature space by projecting it into lower dimensions for visualization.
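To make the "multi-scale image representation" idea concrete, here is a minimal sketch of an image pyramid in plain Python. It is not the notebook code used in the project: the function names (`downsample`, `build_pyramid`, `pyramid_features`) are hypothetical, the image is a toy grid of numbers, and each level is produced by simple 2x2 averaging rather than the Gaussian blur a full pyramid would normally use.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def downsample(image: Image) -> Image:
    """Halve width and height by averaging each 2x2 block of pixels."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image: Image, levels: int) -> List[Image]:
    """Return [original, half-size, quarter-size, ...] with at most `levels` entries."""
    pyramid = [image]
    while len(pyramid) < levels and len(pyramid[-1]) >= 2 and len(pyramid[-1][0]) >= 2:
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def pyramid_features(image: Image, levels: int = 3) -> List[float]:
    """Flatten every pyramid level into one feature vector."""
    return [px for level in build_pyramid(image, levels) for row in level for px in row]


# Toy 4x4 "image"; a real photo would be loaded and converted to grayscale first.
toy = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 2, 2],
    [4, 4, 2, 2],
]
features = pyramid_features(toy, levels=3)
print(len(features))  # 16 + 4 + 1 values
```

One feature vector per image, built this way, is the kind of high-dimensional input that a reducer such as t-SNE or UMAP can then project down to two dimensions for plotting.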

Chronological Data Visualization

This part of the project focuses on creating a chronologically organized video that sequences all images by year, from the earliest to the latest. Using Python and OpenCV, the images are first sorted by date. Here the year is parsed from each image's scraped title, though file names, EXIF data, or manually provided timestamps could serve the same purpose. The sorted images are then compiled into a video that moves from one year to the next.
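The sort-then-compile step can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual script: it assumes the scraped images have already been downloaded to local files, and the record keys (`path`, `year`), the function names, the frame rate, the frame size, and the output file name are all hypothetical. The OpenCV calls used (`cv2.imread`, `cv2.resize`, `cv2.VideoWriter`, `cv2.VideoWriter_fourcc`) are standard ones.

```python
from typing import Dict, List, Tuple


def order_frames(records: List[Dict]) -> List[str]:
    """Return image paths ordered oldest to newest; ties keep their input order."""
    return [r["path"] for r in sorted(records, key=lambda r: r["year"])]


def write_video(paths: List[str], out_path: str, fps: float = 12.0,
                size: Tuple[int, int] = (640, 480)) -> int:
    """Write the images at `paths` to a video file and return the frame count."""
    import cv2  # imported here so the ordering helper works without OpenCV

    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    written = 0
    try:
        for path in paths:
            frame = cv2.imread(path)
            if frame is None:  # unreadable or missing file
                continue
            writer.write(cv2.resize(frame, size))  # every frame must match `size`
            written += 1
    finally:
        writer.release()
    return written


# Hypothetical records; in practice these would come from the sorted image list.
records = [
    {"path": "images/b.jpg", "year": 2025},
    {"path": "images/a.jpg", "year": 1964},
    {"path": "images/c.jpg", "year": 1987},
]
ordered = order_frames(records)
print(ordered)
# write_video(ordered, "timeline.mp4")  # uncomment once the image files exist
```

Resizing every frame to one fixed size matters because a video writer expects all frames to share the dimensions it was opened with.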

The date sorter code is provided below:

import json
import re
import pandas as pd

# Load the JSON file
file_path = "/Users/thipopp21/Desktop/run_results.json"
with open(file_path, "r") as f:
    data = json.load(f)

# Extract the list from the JSON file
image_list = data.get("selection3", [])  # Ensure it retrieves the correct list

# Function to extract the year from the title
def extract_year(text):
    match = re.search(r"\b(18\d{2}|19\d{2}|20\d{2})\b", text)  # Match years from 1800–2099
    return int(match.group(0)) if match else None  # Return year as an integer or None if not found

# Process the list to extract years
sorted_images = [
    {
        "image": item["image"],
        "title": item["selection4"],
        "year": extract_year(item["selection4"])
    }
    for item in image_list
]

# Remove entries where no year was found
sorted_images = [img for img in sorted_images if img["year"] is not None]

# Sort the list by year (oldest to newest)
sorted_images.sort(key=lambda x: x["year"])

# Save the sorted list to a new JSON file
sorted_file_path = "/Users/thipopp21/Desktop/sorted_images.json"
with open(sorted_file_path, "w") as f:
    json.dump(sorted_images, f, indent=4)

# Convert to DataFrame and save as CSV
df = pd.DataFrame(sorted_images)
df.to_csv("sorted_images.csv", index=False)

print(f"Sorted images saved to {sorted_file_path}")
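As a quick check of how the year extraction behaves, the snippet below runs the same regular expression on a few titles. The titles are made up for illustration and are not taken from the scraped data.

```python
import re


def extract_year(text):
    """Return the first four-digit year (1800-2099) found in `text`, or None."""
    match = re.search(r"\b(18\d{2}|19\d{2}|20\d{2})\b", text)
    return int(match.group(0)) if match else None


# Hypothetical titles, not taken from the scraped data.
samples = [
    "Fighter A v Fighter B, 1964",
    "Round 12 of the 2025 title bout",
    "Undated title fight",
    "Rematch 1971, reprinted 1999",
]
years = [extract_year(title) for title in samples]
print(years)
```

Two behaviors are worth keeping in mind: a title with no year returns None and is therefore dropped from the sorted list, and a title containing more than one year is filed under the first one that appears.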

Initial Result

Edited Video

The final output of my project is a one-minute animated video depicting a boxing match, created by sequencing a series of still images. To maintain a cohesive visual experience, I rendered the video in monotone so that the audience remains focused on the movement and intensity of the fight rather than being distracted by the varying colors of the source images. Additionally, I refined the animation by cutting out certain frames from the initial sequence. This eliminated abrupt transitions and enhanced the overall fluidity of the motion.
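The two clean-up steps, unifying color and cutting abrupt frames, could be approximated in code along these lines. This is only a sketch under assumptions: the write-up does not say how the frames were selected for removal, so the threshold rule here is one possible stand-in, and the function names, the toy frames, and the threshold value are hypothetical. The grayscale conversion uses the common BT.601 luma weights.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255
Frame = List[List[Pixel]]
GrayFrame = List[List[int]]


def to_gray(frame: Frame) -> GrayFrame:
    """Convert an RGB frame to grayscale using the BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in frame
    ]


def mean_abs_diff(a: GrayFrame, b: GrayFrame) -> float:
    """Average per-pixel brightness difference between two same-sized frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def drop_abrupt(frames: List[GrayFrame], threshold: float) -> List[GrayFrame]:
    """Keep a frame only if it is within `threshold` of the last kept frame."""
    kept = frames[:1]
    for frame in frames[1:]:
        if mean_abs_diff(kept[-1], frame) <= threshold:
            kept.append(frame)
    return kept


# Toy 1x2 frames: the third one is a sudden bright "flash" between darker frames.
color_frames = [
    [[(10, 10, 10), (20, 20, 20)]],
    [[(14, 14, 14), (22, 22, 22)]],
    [[(250, 250, 250), (240, 240, 240)]],
    [[(18, 18, 18), (25, 25, 25)]],
]
gray_frames = [to_gray(f) for f in color_frames]
kept = drop_abrupt(gray_frames, threshold=30.0)
print(len(kept))
```

A brightness difference is a crude measure of visual similarity, so a rule like this would only catch the most jarring jumps. Judging composition or pose would still need a manual pass.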

I wish I had started with a dataset that had more visual coherence. While I aimed to capture the intensity of a boxing fight, the variations in the original images made it challenging to create 'seamless' transitions. Converting everything to monotone helped unify the visuals, but I still had to cut certain frames to smooth out the motion.

After learning this technique, I’m excited to apply it to future projects with a more carefully curated dataset. I’d definitely want to animate something where the transitions feel more natural and fluid, allowing for a smoother, more immersive visual experience.