Canadian Legal Problems Survey: Data Dashboard
The Canadian Legal Problems Survey is a national survey of Canadians’ experiences with the justice system. This post describes the construction of a data dashboard for the main CLPS dataset, using Streamlit and Altair.
The Two-Hour Job Search: A Book Summary
Steve Dalton’s ‘The Two-Hour Job Search’ rejects applying to online postings as a viable way to find a job. Instead, he offers an alternative strategy: target desired companies, identify contacts within those companies, and set up informational meetings so that those contacts can advocate for you. This blog post summarizes the Two-Hour Job Search method.
Canadian Legal Problems Survey: Codebook Extraction
The Canadian Legal Problems Survey is a national survey of Canadians’ experiences with the justice system. While the main data is available as a CSV, the codebook containing the mappings between survey responses and their codes is only available as a PDF. This post describes how I extracted data from the PDF codebook and built a simple web app for browsing the extracted data.
Exploring Boardgames Part Two: Exploratory Data Analysis
In this post, I describe exploratory data analysis of the dataset of 100,000 boardgames obtained in part one. I investigate the “geek rating” of a boardgame, which I use to narrow the scope of the analysis. Then, I give an overview of the general characteristics of this boardgame dataset, and answer the question “Are we in a golden age of boardgames?” by looking at how boardgame ratings have changed over time.
Exploring Boardgames Part One: Data Download and ETL
What is the golden age of board games? In this blog post, I describe code to fetch and clean data about 100,000 board games via the BoardGameGeek.com XML API, in preparation for analysis in part two.
Will It Complete? Predicting Starbucks Offer Completion Rates
Can we predict which customers will complete Starbucks rewards offers based on demographic data? I explore this question by training various machine learning models on a simulated dataset of 17,000 customers.
It’s mostly about bedrooms — a brief saunter into a month of Airbnb listings in Vancouver
What dictates the price of an Airbnb listing in Vancouver? I analyzed a dataset of Airbnb listings from April 2021 and fitted a linear model. Spoiler alert: it’s mostly about the number of bedrooms.