Hello! 😃

My name is Andrew K.O. Wong. I am a former cell biologist 🧫🔬🦠🧬 who is currently interested in data science 📊📈🤖. Below are posts about some of my personal data analysis/data science projects. Happy reading!

Canadian Legal Problems Survey: Data Dashboard

The Canadian Legal Problems Survey is a national survey of Canadians’ experiences with the justice system. This post describes the construction of a data dashboard for the main CLPS dataset, using Streamlit and Altair.

Photo: [2hourjobsearch.com](https://2hourjobsearch.com/)

The Two-Hour Job Search: A Book Summary

Steve Dalton’s ‘The Two-Hour Job Search’ rejects applying to online postings as a viable method of job searching. Instead, he offers an alternative strategy: target desired companies, identify contacts within those companies, and set up informational meetings with these contacts so that they can advocate for you. This blog post summarizes the Two-Hour Job Search method.

Canadian Legal Problems Survey: Codebook Extraction

The Canadian Legal Problems Survey is a national survey of Canadians’ experiences with the justice system. While the main data is available as a CSV, the codebook containing the mappings of survey responses and their codes is only available as a PDF. This post describes how I extracted data from the PDF codebook and built a simple web app for browsing the extracted data.

Exploring Boardgames Part Two: Exploratory Data Analysis

In this post, I describe exploratory data analysis on the dataset of 100,000 boardgames obtained in part one. I investigate the “geek rating” of a boardgame, which I use to narrow the scope of the analysis. Then, I give an overview of the general characteristics of this boardgame dataset, and answer the question “Are we in a golden age of boardgames?” by looking at how boardgame ratings have changed over time.

🔥🔥 A solo mode game of Terraforming Mars 🔥🔥

Exploring Boardgames Part One: Data Download and ETL

What is the golden age of board games? In this blog post, I describe code to fetch and clean data about 100,000 board games via the BoardGameGeek.com XML API, in preparation for analysis in part two.

Photo by [Asael Peña](https://unsplash.com/@asaelamaury?utm_source=medium&utm_medium=referral) on [Unsplash](https://unsplash.com/?utm_source=medium&utm_medium=referral)

Will It Complete? Predicting Starbucks Offer Completion Rates

Can we predict which customers will complete Starbucks rewards offers based on demographic data? I explore this question by training several machine learning models on a simulated dataset of 17,000 customers.

Photo by [Roberto Nickson](https://unsplash.com/@rpnickson?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText) on [Unsplash](https://unsplash.com/photos/fAa25CyYtrg?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText)

It’s mostly about bedrooms — a brief saunter into a month of Airbnb listings in Vancouver

What dictates the price of an Airbnb listing in Vancouver? I analyzed a dataset of Airbnb listings from the month of April 2021, and fitted a linear model. Spoiler alert: it’s mostly about the number of bedrooms.