Emily's Data Blog

Probability: Expectation, Variance, and Conditional Probability

Welcome back to my journey through prepping for the Google Product Analyst interview! Today we’ll dive into some basic probability concepts, like expectation and variance, and explore conditional probability. I’ll probably cover more probability concepts in another post, but for now we’ll start here, since these were the topics explicitly called out in the recruiter’s email.

Hypothesis Testing: Assumptions and Non-Parametric Tests

This post is the fourth of a four-part series on Hypothesis Testing. Part 1 here. Part 2 here. Part 3 here.

Hypothesis Testing: Proportion Tests Without Bootstrapping and Chi-Squared Tests

This post is the third of a four-part series on Hypothesis Testing. Part 1 here. Part 2 here. Part 4 here.

Hypothesis Testing: T-Tests, ANOVA, and Paired T-Tests

This post is the second of a four-part series on Hypothesis Testing. Part 1 here. Part 3 here. Part 4 here.

Hypothesis Testing: Testing Proportions and Types of Errors

This post is the first of a four-part series on Hypothesis Testing. Part 2 here. Part 3 here. Part 4 here.

Using LaTeX in Jekyll with MathJax

You may have noticed that my last blog post had some LaTeX-looking formulae formatted all nicely inline. This, sadly, took me the better part of the last few hours, which cut into my actual studying time, so I figured a blog post was in order to save everyone else the headache. To get it working, I combed through multiple Stack Overflow pages and blog posts, but what ultimately saved me was simply RTFM.

Sampling and Bootstrapping

The first topic I wanted to review was sampling methods, particularly resampling, since it was on the list of review topics provided to me. For the purposes of this post, I’ll be reviewing the entirety of the topics presented in the DataCamp course Sampling in Python, which touches on sampling, resampling (or sampling with replacement), selection bias, sample size and population parameters, the Central Limit Theorem, bootstrapping, standard error, and confidence intervals. The course is broken into four sections, which I will follow here; the most important is the final section, which focuses on bootstrapping.

Prepping for the Google Product Analyst Interview

Well, it’s certainly been a while (six years?!) since I’ve written in this blog. The last time I wrote, I was wrapping up my time at the data science bootcamp, Metis. Since then, I’ve worked at Spotify for five years, with a short stint at Universal Music Group after that. Now I’m interviewing for the Product Analyst: Data Science role at Google, and in preparation for the technical screen, I’d like to write down a little bit about the material I’ve been reviewing.

Iterative Improvements

One takeaway that I really gained first-hand from working on my most recent project is the concept of iterative improvements. In the past, I have always seen iterative software development as a barrier to accomplishment, because I struggled to consider work ‘finished’ until it was packaged in a convenient, ready-to-go takeaway. Palantir’s guiding principle states it nicely by quoting Frederick P. Brooks, ‘Successful software always gets changed.’ Iteration is not a symbol of incomplete or poorly constructed code, but rather indicative of a paradigm of constant improvement.

Playlistr

Today’s post is very important to me, because I’ll be talking about Playlistr, my application for creating playlists based on a given artistic sound. This project is my final data science “passion project” for Metis, but hopefully just the first of many data science investigations throughout my career. The past three months have been some of the most challenging, rewarding, and knowledge-filled of my life. But beyond that, I feel incredibly grateful to have had the opportunity to hone my data science skills surrounded by peers of extremely high caliber who are as passionate about their projects as I was about mine.

Modeling Refugee Crises

Over the past three weeks here at Metis, we’ve completed another machine learning project. The structure of this project was intended to expose us to classification algorithms, MySQL, and Amazon AWS. Although the project was first presented as a group project, we had the option to diverge into individual projects using the same dataset.

Film Sales in International Markets

The past two weeks have flown by. This Friday I finished a project on the percentage of a film’s total foreign gross that comes from Hungary. Although Hungary is a small country, I made the assumption that the foreign distribution of films has a long tail. The motivation for this project was to use machine learning to determine which smaller countries would be the best places to distribute a film in order to maximize foreign gross.

Week One Roundup

The first week of Metis Cohort 6 has come to a close! We’ve done so much in so little time, mainly starting (and finishing) Project Benson. The task was to use MTA turnstile data, along with other data sources, to make recommendations to a fictional nonprofit supporting women in technology on where to place its street teams. The purpose of the outreach was both financial and awareness-raising, so we had to think creatively about what to emphasize and the approach to take.

Hi there, I'm Emily.

I’m a data scientist with interests ranging from social networks to biostatistics to data science for social good. I am fascinated by the possibility of solving real-world problems at the intersection of coding, statistics, and mathematical modeling. This is the beginning of my journey documenting that experience.