Tag Archives: Data

Four Years of Tracking my Metro Trips

Not much has changed, and yet everything has changed.

The last two years have been… unusual to say the least. With the pandemic, lockdown, and working from home, I overhauled my commute, just like many others. And my Metro data tells the tale:

Four years of Metro Trips, logged

You can see the exact moment when the world shut down. From mid-March 2020 through mid-January 2021, I took a grand total of seven unlinked Metro trips. By mid-March 2021, I had settled into a new commute pattern, mostly for the purposes of daycare drop-off.

Metro during the pandemic has been different. 92.4% of my trips in 2021 have been on a 7000 series train, and the remainder have been on 3000 series. I haven’t ridden either a 6000 or 2000 series train since the pandemic began.

Some trivialities:

  • The entire 7000 series fleet is in service, and I snagged a ride on the newest car (Number 7747) on August 17, 2021.
  • Of the current fleet, I’ve ridden 1137 of the 1284 cars in service (what counts as “in service” is a bit of a moving target), or 88.6%.
  • My most frequently ridden car is 7673, which I’ve caught ten times over four years.
  • The longest gap between rides on the same car that I’ve managed is 1,382 days, on car 7111. That’s 3.78 years of the current four-year timespan.
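The two percentages in the list above are simple ratios. A quick sketch of the arithmetic, using only the figures quoted in this post (the variable names are illustrative, and the 365.25-day year is an assumption):

```python
# Figures quoted in the list above; names are illustrative.
cars_ridden = 1137
cars_in_service = 1284
coverage_pct = round(100 * cars_ridden / cars_in_service, 1)

longest_gap_days = 1382
# Assumption: an average year length of 365.25 days.
gap_years = round(longest_gap_days / 365.25, 2)

print(coverage_pct, gap_years)
```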

With life changes, so too will my commute patterns. The ‘new normal’ remains uncertain for balancing working in the office vs. working at home. My kid is now attending a school within walking distance. I’ve felt comfortable on transit during the pandemic with a mask and with relatively sparse crowds, but the pandemic’s continued impact on travel demand is painfully uncertain.

Who knows what the next year will bring?

Two years of tracking my Metro trips

Two years ago, I started tracking my WMATA rides for extremely trivial reasons. After a while, my curiosity is now ingrained as a habit, a small bit of gamification of my commute (even if that game is basically Calvinball).

(Since I originally started doing this to see how quickly I would ride in the same car, I should note that on three occasions, I’ve ridden the same car twice in one day. The longest gap between rides on the same car is 707 days between rides on car 3230.)
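For anyone curious how the "longest gap" and "same car twice in one day" figures could be pulled out of a log like this, here is a minimal sketch. It assumes the log is a list of (timestamp, car number) pairs; the function names and the sample rides are made up for illustration and are not from the actual spreadsheet.

```python
from collections import Counter
from datetime import datetime

def longest_gap(rides):
    """Return (car, days) for the longest gap between consecutive rides on one car."""
    last_seen = {}
    best = (None, 0)
    for ts, car in sorted(rides):
        if car in last_seen:
            gap = (ts.date() - last_seen[car].date()).days
            if gap > best[1]:
                best = (car, gap)
        last_seen[car] = ts
    return best

def same_day_repeats(rides):
    """Return (date, car) pairs where the same car was ridden more than once that day."""
    seen = Counter((ts.date(), car) for ts, car in rides)
    return [key for key, n in seen.items() if n > 1]

# Hypothetical sample rides, not real data.
sample_rides = [
    (datetime(2018, 1, 2, 8, 5), 3001),
    (datetime(2018, 1, 2, 17, 40), 7010),
    (datetime(2018, 1, 12, 8, 10), 7010),
    (datetime(2018, 1, 12, 17, 30), 7010),
    (datetime(2018, 3, 3, 8, 0), 3001),
]

print(longest_gap(sample_rides))
print(same_day_repeats(sample_rides))
```

Gaps are measured in calendar days between consecutive rides on the same car, so a car ridden twice in one day shows a gap of zero.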

With two years of data in the books, I thought I’d share some highlights: 2,013 unlinked trips, using 986 unique railcars, covering about 73% of the current fleet. Seventy-five of those cars will never be ridden again, retired as part of the 5000 series. From the current fleet, there are 333 cars I’ve yet to ride, and an additional 113 that have been retired before I had a chance to ride. The fleet makeup is constantly evolving as Metro continues to accept new 7000 series cars, so the precise numbers change often.

My obsession has provided means to monitor the introduction of the newest members of the 7000 series, with 706 cars of the 748 ordered now in service.

Beyond changes in the Metro fleet, I’ve been able to document changes in my own life – different daycares, different jobs, and different commutes. I’ve also noticed how Metro changes their operations and railcar assignments as they take on major track work and as their fleet evolves.

Some charts:

Most of my trips are commute trips; red bars correspond to WMATA’s peak fare periods.
I’ve ridden all but 2 of the 2000 series cars that are in service; I rode 75 of the 5000 series before they were retired.
More than half of my rides have been on 7000 series trains.
My regular trips make frequent use of all services except for the Red Line. Most trips are still tied to the Orange/Silver/Blue lines, serving my home station.

I was also curious if I could put two years of tracked trips into one chart, so here’s an annotated version:

You can see the retirement of the 5000 series cars, the slowly increasing size of the 7000 series fleet, and the reassignment of cars around the system, particularly in response to this summer’s lengthy Platform Improvement Project shutdown. The retirement of most of the 5000-series fleet also shifted the 6000 series – previously common on the Green Line, but then shifted to the Orange/Blue/Silver lines.

You can also see how some cars tend to stick to certain portions of the system. This is easiest to see with the Red Line, since it’s both the most isolated from other lines and the line I ride the least. Before the PiP, you can see how most 3000 series cars with numbers above ~3175 were assigned to the Red Line. Likewise, most 7000 series trains between ~7150 and ~7300 are also isolated on the Red Line, except for a few weeks during this summer’s shutdown.

More charts from my obsessive Metro trip tracking

After fiddling with the spreadsheet where I track the individual Metro railcars I’ve ridden, I’ve got a few more charts to show my year-plus worth of Metro trips.

Part of the reason I didn’t have these charts before is that dealing with time as a field in Excel/Google Sheets is kinda a pain. A time value isn’t always stored as a plain number, but I was nevertheless able to sort it out.
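For what it’s worth, the same conversion is easy to sketch outside the spreadsheet. Sheets and Excel store a date-time as a serial number: whole days since December 30, 1899, with the fractional part giving the time of day. The helper names below are my own, not anything from the original workbook.

```python
from datetime import datetime, timedelta

# Day zero for spreadsheet serial dates.
SHEETS_EPOCH = datetime(1899, 12, 30)

def serial_to_datetime(serial):
    """Convert a spreadsheet serial date-time to a Python datetime."""
    return SHEETS_EPOCH + timedelta(days=serial)

def hour_of_day(serial):
    """Return the time of day as a fractional hour (e.g. 18.0 for 6 PM)."""
    return (serial % 1) * 24

print(serial_to_datetime(43101.5))
print(hour_of_day(43101.75))
```

A fractional hour like this is a convenient value to bin when charting trips by time of day.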

So, some charts:

Trip Distribution: What does my overall trip distribution look like? Surprise, surprise! It’s peak-heavy.

The red lines indicate the break points for WMATA’s fare changes. A few obvious trends emerge:

  • Most of my rides are during the peak, right around peak commuting times.
  • Most of my off-peak rides are mid-day, using Metro to attend out-of-office work meetings, etc.
  • My PM commute is bi-modal, often due to two separate trips as I usually do day-care pickup.
  • Very few evening trips (again, likely thanks to that day-care pickup)
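Sorting trips into peak and off-peak can be sketched as a simple time-window check. The break points below are assumptions for illustration only (weekday mornings until 9:30 AM and afternoons from 3 to 7 PM); substitute whatever WMATA’s fare tables say for the period in question.

```python
from datetime import datetime, time

# Assumed break points, for illustration only; check WMATA's published fare periods.
AM_PEAK = (time(5, 0), time(9, 30))
PM_PEAK = (time(15, 0), time(19, 0))

def is_peak(ts):
    """Return True if a timestamp falls inside an assumed weekday peak window."""
    if ts.weekday() >= 5:  # Saturday or Sunday
        return False
    t = ts.time()
    return AM_PEAK[0] <= t < AM_PEAK[1] or PM_PEAK[0] <= t < PM_PEAK[1]

print(is_peak(datetime(2018, 9, 24, 8, 15)))   # a Monday morning
print(is_peak(datetime(2018, 9, 24, 12, 0)))   # the same Monday at noon
```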

Railcar Distribution: One of the other observations was the unequal distribution of railcars across the system, particularly for the 3000 series.

I made these distribution charts for each rail car series. For example, here is the 6000 series:

You can see I’ve ridden cars across the entire 6000 series fleet. I’ve ridden in two of those cars five times each. As of the creation of this chart, I’ve ridden in 97 of the 192 cars in the 6000 series.

The distribution of my rides in the 7000 series is different:

The pattern is different, due to the continual expansion of the 7000 series fleet. The lower-numbered cars are older and have thus been around longer, giving me more chances to ride them. And the distribution reflects that (note that this chart goes up to the eventual size of the 7000 fleet, which is not yet in full service).

But if you look at the 3000 series, the pattern is different:

As you can see, I haven’t ridden many cars above number ~3150. The reason is that I seldom ride the Red Line, and most of those cars appear to be isolated on the Red Line:

(Apologies for the automatically adjusting vertical scale) Obviously, this is not a huge sample, but the only Red Line 3000-series trips I’ve taken are on the older half of the fleet.

The Red Line is the most isolated line on the system. Also, I ride it the least (and therefore am unlikely to pick up small changes to the fleet management practices).

The next big fleet milestone will be the arrival of the full set of 7000 series railcars, along with the retirement of the 5000 series. That will probably trigger the last round of shifting yard assignments for a given fleet until the arrival of the proposed 8000 series.

One year of tracking my Metro trips

Three months wasn’t enough for me; I needed to spend an entire year compulsively tracking my Metro rides. I know I’m an outlier on this, but it’s been a fun way to ‘gamify’ my commute.

Some fun facts from a year on the rails:

  • The newest car: 7547 (August 30, 2018)
  • The oldest car: 2000 (November 9, 2017)
  • Most frequent car: five trips on 5088, interestingly all of them Orange Line rides.
  • 882 total unlinked trips
  • 592 unique railcars; of which:
    • 373 I’ve ridden in once
    • 160 I’ve ridden in twice
    • 48 I’ve ridden in three times, etc…
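The "ridden once, twice, three times" breakdown is a count of counts. A minimal sketch, assuming the log boils down to a plain list of car numbers (the sample list here is invented, not the real data):

```python
from collections import Counter

def ride_frequency(car_numbers):
    """Map 'number of rides' to 'how many cars were ridden that many times'."""
    rides_per_car = Counter(car_numbers)
    return Counter(rides_per_car.values())

# Hypothetical sample log: one entry per unlinked trip.
sample_cars = [7001, 7001, 3050, 5088, 5088, 5088, 6010]

frequency = ride_frequency(sample_cars)
unique_cars = sum(frequency.values())
total_trips = sum(times * cars for times, cars in frequency.items())
print(dict(frequency), unique_cars, total_trips)
```

Summing the breakdown back up is a handy sanity check: the car counts should add up to the number of unique railcars, and rides-times-cars should add up to the total number of unlinked trips.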

Or, to put it in visual terms:

Given the ever-shifting total size of the fleet (thanks to both new car deliveries and old car retirements), my best guess is that I’ve ridden at least once in 50.8% of the WMATA fleet over the past year.

Some fun observations:

Lots of 5000 series cars are now out of service. Some reporting suggests only 62 out of the original 192 5000 series cars remain in service. I’ve recorded trips on 75 unique 5k cars, some of which are surely retired by now.

Railcar types are not evenly distributed across the network: Of my total rides (as opposed to unique railcars), 41% are on 7000 series trains, an increase from my trip share after 3 months. Some of that is surely due to 5k car retirements and ongoing 7k deliveries, but some might also be due to my changing commute and because the railcars are not evenly distributed across the entire network.

For example, a large portion of the 3000-series fleet appears to mostly stay on the Red Line. I’ve recorded lots of trips on cars in the 3000-3150 range (none of them on the Red Line), and far fewer on 3150+.

I ride the Red Line the least often, and thus don’t often encounter those cars, and given that the Red Line is the most isolated in the network, I’m not sure how frequently those cars ‘migrate’ to other rail yards.

Midway through this year, my toddler started at a new daycare located on the Green/Yellow lines. My old commute, both to/from work as well as daycare, was located entirely along the Orange/Blue/Silver trunk. And while the 7k cars are used all over the system, riding the Green Line more often sure seems to mean more rides on 6k trains (only 9.2% of my rides in December; compared to 16% now).
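A share-by-series figure like the 6k percentage above can be sketched in a few lines. This assumes the series is simply the car number rounded down to the nearest thousand (so car 6010 counts as 6000 series); the function name and sample list are illustrative, not the actual data.

```python
from collections import Counter

def series_share(car_numbers):
    """Return each series' share of rides as a percentage, keyed by series."""
    # Assumption: the series is the car number rounded down to the nearest 1000.
    series = Counter((car // 1000) * 1000 for car in car_numbers)
    total = sum(series.values())
    return {s: round(100 * n / total, 1) for s, n in sorted(series.items())}

# Hypothetical sample log: one entry per unlinked trip.
sample_cars = [7001, 7002, 6010, 3050]
shares = series_share(sample_cars)
print(shares)
```

Running it once on rides through December and once on the full log would give the before-and-after comparison.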

Unlinked trips by Line

Adding in more Green/Yellow trips subtly changes the fleet mix. My rides on the Green Line are almost exclusively on 6k or 7k trains. The only exception (for two trips) occurred during the concurrent Major Improvement Projects on both the Red Line and the BL/OR/SV lines from August 11 to 26, 2018, which surely scrambled all sorts of fleet practices.

Railcar share by Line; total trips by line at the top

I ride least frequently on the Red Line, but even that small sample shows a pattern of only riding 7k and 3k cars. Likewise, some of the 3000 series cars on the Red Line tend to stay there.

The vast majority of my trips are weekday trips. My weekend use has dramatically declined. Part of that is certainly my lifestyle, pushing a stroller around. But poor weekend service with extensive track work doesn’t help.

Still, fare policy impacts my rides. Since obtaining the SelectPass, I find myself far more likely to take incremental short trips. For example, a two stop rail trip just to beat the heat instead of walking? No problem.

Methodology:

I use a simple Google form to collect the data. I only collect two bits of information via the form: the car number and line color (I do also have an open-ended text field for any notes). Submitting data via the form adds a timestamp. This helps minimize the data input.

I considered adding additional data fields, such as origin/destination station, but opted not to do so. As a result, I don’t have any information on fares, delays, most frequent stations, etc. 

I collect data on unlinked trips, so any single journey with a transfer is recorded as two unlinked trips. I’ve also (occasionally) moved cars on a single trip due to a hot car, and those trips are recorded as unlinked trips.
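To make the data shape concrete, here is a rough sketch of how a form export along these lines could be loaded. The column names (Timestamp, Car, Line, Notes), the timestamp format, and the sample rows are all assumptions for illustration; the real sheet may look different.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Ride:
    timestamp: datetime  # added automatically when the form is submitted
    car: int             # railcar number
    line: str            # line color
    notes: str = ""      # optional free-text field

def load_rides(csv_text):
    """Parse a CSV export (hypothetical column names) into Ride records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        Ride(
            datetime.strptime(row["Timestamp"], "%m/%d/%Y %H:%M:%S"),
            int(row["Car"]),
            row["Line"],
            row["Notes"],
        )
        for row in reader
    ]

# Made-up sample: one journey with a transfer shows up as two unlinked trips.
sample = """Timestamp,Car,Line,Notes
09/24/2018 08:05:00,7001,Orange,
09/24/2018 08:25:00,6010,Green,transfer
"""
rides = load_rides(sample)
print(len(rides), rides[0].car, rides[1].line)
```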

Also, my riding habits are not random. Aside from my regular commuting routes, I’m often riding to or from daycare with my toddler. Traveling with a stroller puts open space at a premium, which means I’m more likely to pick the 7th and 8th cars on the train when riding with the stroller. This might not matter much with the older railcars, but might skew the data a bit with the 7000 series and the A/B cars. 

Populating DC

Things going up. CC image from flickr.

Some assorted Census/demographic items from recent days:

DC’s population is closing in on 600,000 residents.  One of Ryan Avent’s commenters (rg) notes the historical issues with the accuracy of the Census Bureau’s annual population estimates for cities and urban areas:

Building on what Eric wrote: throughout the late 1990s, the Census Bureau estimated that the District was hemorrhaging population, right up to the 1999 estimate. Lo and behold, when they actually conducted the Census in 2000, it turned out that the 1999 estimate was off by tens of thousands of people: in 1999 the Census Bureau estimated the District’s population was 519,000; the 2000 Census counted 572,000 people in the District!!! They were WAY OFF in 1999. I write this not to trash the Census Bureau but to note that their estimates can be quite suspect. In the case of urban areas, it seems that their methodology, at least in 1990s, was biased against urban areas. So, do not be surprised if the actual 2010 Census count is much higher than this 2009 estimate.

This is indeed true.  The 1990 Census put DC’s population at 606,900.  That same year, the population estimate for the city pegged the population at 603,814 (the decennial census is a snapshot of the nation on Census Day, April 1 of each 10th year – the population estimates are supposed to be a snapshot of July 1 of each year…), and things went downhill from there, at least in terms of the estimates:

Year    Population    Change
1990    603814
1991    593239    -10575
1992    584183    -9056
1993    576358    -7825
1994    564982    -11376
1995    551273    -13709
1996    538273    -13000
1997    528752    -9521
1998    521426    -7326
1999    519000    -2426

This decade hasn’t seen the same massive declines from year to year, yet it remains to be seen if the positive signs from the population estimates will translate into the same kind of bump seen from the 1999 estimate to the 2000 Census.  Compare the previous decade to this one:

Year    Population    Change
2000    571744
2001    578042    6298
2002    579585    1543
2003    577777    -1808
2004    579796    2019
2005    582049    2253
2006    583978    1929
2007    586409    2431
2008    590074    3665
2009    599657    9583
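The Change column in both tables is just each year’s estimate minus the previous year’s. A short sketch that reproduces it and compares the net change across the two decades, using only the estimates listed above (the function names are my own):

```python
# Annual population estimates copied from the two tables above.
estimates_1990s = {
    1990: 603814, 1991: 593239, 1992: 584183, 1993: 576358, 1994: 564982,
    1995: 551273, 1996: 538273, 1997: 528752, 1998: 521426, 1999: 519000,
}
estimates_2000s = {
    2000: 571744, 2001: 578042, 2002: 579585, 2003: 577777, 2004: 579796,
    2005: 582049, 2006: 583978, 2007: 586409, 2008: 590074, 2009: 599657,
}

def yearly_changes(estimates):
    """Return the year-over-year change for every year after the first."""
    years = sorted(estimates)
    return {y: estimates[y] - estimates[prev] for prev, y in zip(years, years[1:])}

def net_change(estimates):
    """Return the change from the first listed year to the last."""
    years = sorted(estimates)
    return estimates[years[-1]] - estimates[years[0]]

print(net_change(estimates_1990s))  # -84814
print(net_change(estimates_2000s))  # 27913
```

By the estimates alone, the 1990s show a net loss of 84,814 residents, while the 2000s show a net gain of 27,913.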

Either way, the 2010 Census effort will be vital for the city.

More is better: Various folks chime in on the new growth  – Loose Lips, taking note of the Post’s article, for example.

D.C. Council member Jack Evans (D-Ward 2), whose district stretches from Georgetown to Shaw, gave credit to former mayor Anthony A. Williams (D) for the city’s apparent population rebound. Williams, who was in office from 1999 to 2007, set a goal in 2003 of adding 100,000 residents in a decade. Williams invested heavily in development, improving city services and reducing crime.

“The whole image of the District of Columbia began to change from a dangerous, dirty, unsafe place to a very different city,” Evans said.

Council member Jim Graham (D-Ward 1) dates the changes to 2005, with the construction of thousands of downtown apartments. The ensuing influx, Graham said, changed the character of his ward, including neighborhoods near the Columbia Heights Metro station, 14th and U streets, and the eastern end of the U Street corridor.

“We’ve always felt that we were having this population growth, but it just wasn’t being reflected in the data,” Graham said.

Indeed – and the best way to get the data to reflect the on-the-ground reality is to have a strong showing for the 2010 Census.

Domestic Migrants: Ryan Avent and Matt Yglesias look at the primary cause in the uptick in DC’s population – domestic migration.  People are moving here, as a net positive, for the first time in a long time.

Data Types: Jarrett Walker notes some changes in the way detailed economic and transportation data will be collected and organized.

Overall, the neighborhood-level American Community Survey is going to be a great thing.  It will present data as rolling averages of the last five years, so it will show a bit of a lag, but it’s an important step.  You can’t fix what you can’t quantify.

That last sentence brings to mind one of the City Paper’s quotes of the year, from former City Administrator Dan Tangherlini:

Optimism without data is really just an emotion.