Discovering Movie Magic: A SQL Analysis of Top IMDB Films Using Python and DuckDB
Want to dive deep into the cinematic world's best movies? This project offers a fun and insightful SQL analysis of top-rated IMDB films, using Python and DuckDB. We will explore the secrets hidden within movie data, and you can follow along!
Unveiling Movie Trends using SQL: What You’ll Discover
This project is all about using SQL to analyze IMDB movie data. Here's what makes it engaging:
- SQL-Powered Insights: Extract valuable information about directors, genres, scores, and more with efficient SQL queries.
- Interactive Results: View your analysis in dynamic, sortable, and filterable HTML tables.
- Reproducible Workflow: The entire project is built with Quarto, ensuring easy replication and modification of the analysis.
Think of it as becoming a data-savvy film critic!
Project Highlights: DuckDB, Python, and Interactive Tables
This project leverages powerful tools to bring movie data to life:
- DuckDB: A fast and efficient in-process SQL database, perfect for analytical queries on the movies.db database.
- Python: Used to connect to DuckDB, execute SQL queries, and display results.
- Quarto: Combines code, output, and narrative into a single, easy-to-read document.
- Itables: Displays dataframes as interactive HTML tables within the Quarto report.
Get Started: Installation Guide
Ready to start your own IMDB movie analysis with SQL? Here's how to set up the project:
- Clone the Repository:
- Create a Virtual Environment: Use the uv package manager for faster performance. This isolates the project dependencies.
Usage: Running the Analysis
Transform the Quarto notebook into an interactive HTML report:
- Render to HTML:
- Interactive Preview: This will open the report in your browser.
Dataset Details
The core of this project is the movies.db SQLite database. Put it inside the data/ folder and ensure it contains a movies table with these columns:
- title
- director
- year
- rating
- genres
- runtime
- country
- language
- imdb_score
- imdb_votes
- metacritic_score
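Before rendering the report, it can help to confirm that your copy of movies.db has the layout listed above. The sketch below is one possible check, assuming the file really is in SQLite format as stated; the helper name `missing_columns` and the constant `EXPECTED_COLUMNS` are hypothetical and not part of the project.

```python
import sqlite3

# Column names taken from the dataset description above.
EXPECTED_COLUMNS = [
    "title", "director", "year", "rating", "genres", "runtime",
    "country", "language", "imdb_score", "imdb_votes", "metacritic_score",
]


def missing_columns(db_path, table="movies"):
    """Return the expected columns that are absent from `table` in a SQLite file.

    If the table does not exist at all, every expected column is reported.
    """
    con = sqlite3.connect(db_path)
    try:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        rows = con.execute(f'PRAGMA table_info("{table}")').fetchall()
    finally:
        con.close()
    present = {row[1] for row in rows}
    return [name for name in EXPECTED_COLUMNS if name not in present]
```

Calling `missing_columns("data/movies.db")` returns an empty list when every column is present, and otherwise lists the ones to add.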
License
This project is open-source under the MIT License. Feel free to use, modify, and share it!