John and Jane. Why are you so elusive? Where, oh where, art thou? In part 1, we used MTA turnstile data to find possible locations. That helped us narrow our search down to 24 stations, but I’m certain we can do better. Please show us the way, US Census demographics data.
The Hunt Goes On
These are our 24 prime suspects from part 1:
['86 ST', '34 ST-HERALD SQ', 'GRAND ST', 'CHAMBERS ST', '125 ST', 'JAMAICA CENTER', '42 ST-BRYANT PK', '82 ST-JACKSON H', '59 ST', '34 ST-PENN STA', 'JUNCTION BLVD', 'SPRING ST', 'BOWLING GREEN', 'JAY ST-METROTEC', '50 ST', 'FULTON ST', 'CROWN HTS-UTICA', 'CANAL ST', '34 ST-HUDSON YD', '14 ST', '23 ST', '72 ST', 'WORLD TRADE CTR', '167 ST']
Before we go any further though, let’s first learn about our coffee drinkers.
Who are J&J?
J&J are typical coffee drinkers. They are public transit commuters earning over $75,000. They are unmarried women younger than 34. They are never-married employees over the age of 16. They are people with master’s, professional, or doctoral degrees. How can the census help us find them? Surprise: all of these traits have specific codes in the census that we can query!
We can use these census codes to find J&J:
B01001_001E, B08119_036E, B13001_007E, B13001_008E, B12006_004E, B12006_009E, B15002_016E, B15002_017E, B15002_018E, B15002_033E, B15002_034E, B15002_035E, B19301_001E
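To make the querying step concrete, here is a minimal sketch of how these codes could be turned into a Census Data API request for every tract in one county. The helper names `build_acs_url` and `rows_to_records` are hypothetical, and the `year` default and the choice of the ACS 5-year dataset are assumptions, since the post does not say which vintage was used. State `36` and county `061` are the FIPS codes for New York State and New York County (Manhattan).

```python
from urllib.parse import urlencode

# Variable codes listed in the post.
CENSUS_VARS = [
    "B01001_001E", "B08119_036E", "B13001_007E", "B13001_008E",
    "B12006_004E", "B12006_009E", "B15002_016E", "B15002_017E",
    "B15002_018E", "B15002_033E", "B15002_034E", "B15002_035E",
    "B19301_001E",
]


def build_acs_url(variables, state_fips, county_fips, year=2015):
    """Build an ACS 5-year request URL covering every tract in one county.

    `year` is an assumed placeholder; swap in the vintage you actually need.
    """
    base = f"https://api.census.gov/data/{year}/acs/acs5"
    params = [
        ("get", ",".join(["NAME"] + list(variables))),
        ("for", "tract:*"),
        ("in", f"state:{state_fips}"),
        ("in", f"county:{county_fips}"),
    ]
    # Keep ':', '*' and ',' unescaped so the URL stays readable.
    return base + "?" + urlencode(params, safe=":*,")


def rows_to_records(payload):
    """Turn the API's list-of-lists response (header row first) into dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


url = build_acs_url(CENSUS_VARS, state_fips="36", county_fips="061")
print(url)
```

The sketch stops at building the URL so it runs without network access. Fetching it with any HTTP client should return JSON whose first row holds the column names, which is what `rows_to_records` expects.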
Pulling census data is pretty straightforward now. All we need to do is use the Google API to get coordinates for our stations, which can then be easily mapped to census tracts. Since the area around a station might span multiple tracts, we need to weight each tract's data in proportion to how much of it falls within that area.
The code to do this can be found here.
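The linked code is the real thing. As a rough illustration of the proportional weighting idea only, here is a sketch that assumes we already know what share of each tract's area falls inside a station's surrounding area. The function name `station_estimate`, the tract IDs, and all the numbers are made up for the example, and weighting by area share is an assumption rather than a description of the linked code.

```python
def station_estimate(tract_counts, overlap_fractions):
    """Estimate a count for the area around one station.

    tract_counts: {tract_id: count for one census variable}
    overlap_fractions: {tract_id: share of that tract's area (0 to 1)
        that falls inside the station's surrounding area}
    """
    total = 0.0
    for tract_id, fraction in overlap_fractions.items():
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"fraction for {tract_id} must be between 0 and 1")
        # A tract contributes only the share of its count that overlaps.
        total += tract_counts.get(tract_id, 0) * fraction
    return total


# Made-up numbers, for illustration only.
counts = {"tract_a": 1200, "tract_b": 800, "tract_c": 500}
overlap = {"tract_a": 0.5, "tract_b": 0.25}

estimate = station_estimate(counts, overlap)
print(estimate)  # 1200 * 0.5 + 800 * 0.25 = 800.0
```

This kind of scaling only makes sense for counts of people. A value like B19301_001E (per capita income) is not a count, so it would more likely need a weighted average than a weighted sum.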
We’re so close. Let’s rank our findings and get J&J their Joe already! Here’s the code to do just that: ranking.py
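I won't reproduce ranking.py here, but one simple way to rank stations across several census measures is to scale each measure to the 0 to 1 range and add up the scaled values. The sketch below assumes exactly that. The function `rank_stations`, the metric names, the station labels, and the numbers are all hypothetical, and the scoring scheme is not necessarily the one ranking.py uses.

```python
def rank_stations(station_metrics):
    """Rank stations by the sum of their min-max scaled metrics, best first.

    station_metrics: {station_name: {metric_name: value}}
    """
    metric_names = sorted(
        {name for metrics in station_metrics.values() for name in metrics}
    )
    scores = {station: 0.0 for station in station_metrics}
    for name in metric_names:
        values = [m.get(name, 0.0) for m in station_metrics.values()]
        low, spread = min(values), max(values) - min(values)
        if not spread:
            continue  # every station ties on this metric, so it adds nothing
        for station, metrics in station_metrics.items():
            scores[station] += (metrics.get(name, 0.0) - low) / spread
    return sorted(scores, key=scores.get, reverse=True)


# Made-up stations and numbers, for illustration only.
metrics = {
    "STATION A": {"transit_commuters": 100, "advanced_degrees": 50},
    "STATION B": {"transit_commuters": 300, "advanced_degrees": 10},
    "STATION C": {"transit_commuters": 200, "advanced_degrees": 90},
}

ranking = rank_stations(metrics)
print(ranking)  # ['STATION C', 'STATION B', 'STATION A']
```

Scaling first keeps a measure with big raw numbers from drowning out the others. Every measure counts equally in this version, so weights would be the next thing to add if some traits matter more than others.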
John and Jane
Welcome, welcome John and Jane! We’ve got the best coffee for you at 167 St., 86 St., and Crown Heights - Utica Stations. Come on over and don’t forget to bring Jake and Jill too :]