Metrorail Data Download, October 2014
This new data download from October 2014 includes ridership from the five new Silver Line stations.
Over the past few years we’ve been making ridership data available for download and analysis by the online community. We have received some requests for full origin-destination (O/D) data sets that include the new Silver Line ridership.
These data sets include ridership from October of 2014, and are available by period (AM Peak, midday, etc.) or by quarter-hour interval, for all stations including the five new Silver Line stations. Both sets include daily averages for weekdays, Saturdays, Sundays and Columbus Day.
Note: the quarter-hour data file is too big to open in Microsoft Excel.
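For readers who hit that limit, here is a minimal sketch of streaming the file row by row with Python's standard csv module rather than loading it into a spreadsheet. The column names ("Ent Station", "Riders") and the file name are placeholders, not the actual headers in the download, so check the header row of the file first.

```python
import csv
from collections import defaultdict

def total_by_column(rows, key_col, value_col):
    """Sum value_col grouped by key_col, consuming one row at a time."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_col]] += float(row[value_col])
    return dict(totals)

def totals_from_file(path, key_col, value_col):
    # DictReader yields rows lazily, so the whole file is never held in memory.
    with open(path, newline="", encoding="utf-8") as f:
        return total_by_column(csv.DictReader(f), key_col, value_col)
```

A call might look like totals_from_file("quarter_hour.csv", "Ent Station", "Riders"), with both the path and the column names swapped for the real ones.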
Have fun playing around with this data and let us know in the comments what you find. Make sure you check out the other assessments of Silver Line ridership we’ve done.
Jan 29, 2015, 10:00 AM Update: Files have been updated to include total and average travel times for each station pair.
Feb 02, 2015, 11:00 AM Update: Files have been updated to separate Columbus Day from Saturdays using a new column “Holiday”.
I remember when I was told years ago this data was not available due to security risk. Thanks for releasing it to the public.
Transparency is good. Lots of data to crunch or look at. Thanks Michael and Shyam.
Thanks for releasing this data, it is very helpful for researchers like myself. Can you explain the meaning, or difference in meaning between the Number Rider SUM field and the AvgRidership Field? This would be very helpful! Thanks!
@Mark F
We’re happy to post this data. Please make sure to share any revelations and/or data visualizations with us.
Number Riders SUM is the total monthly number of riders making that trip during that time period and service type. AvgRiders is the monthly AVERAGE, meaning the SUM divided by the number of days in the month of that service type. For example, if SUM = 20 and service type = Saturday, then Average = 5, as there were 4 Saturdays in Oct 2014.
Let us know if you have any additional questions.
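The relationship between the two fields can be sketched in a few lines of Python. The function name is illustrative and is not part of the data files; the numbers are the ones from the example above.

```python
def average_riders(monthly_sum, service_days):
    """Monthly SUM divided by the number of days of that service type."""
    if service_days <= 0:
        raise ValueError("service_days must be positive")
    return monthly_sum / service_days

# SUM = 20 on Saturdays, 4 Saturdays in October 2014
print(average_riders(20, 4))  # 5.0
```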
Is data available to determine the mode choice of getting to the station?
1. Bus (Smart Trip Reader)
2. Drive (Parking Lot Reader)
3. Other (Kiss/Ride, Bike/Ped, Cash Bus User, HOV)
In regards to a potential I-66 BRT system, I am interested in the %’s for the Vienna station
Mode of access is determined from our regular ridership survey, last conducted in 2012. For Vienna we see 26% Bus (including shuttles), 10% non-motorized and 64% private vehicle (including drop-off, taxi, park and ride, and riding with someone else).
For a visual breakdown by station, check out this post from 2013:
https://planitmetro.com/2013/09/30/how-do-metrorail-riders-get-to-their-station-in-the-morning/#more-6414
Thanks for this, it’s great! Can’t wait to dig into it. Is there any chance that we could get data broken down by not just entrance time but exit time too simultaneously?
@Asaf Reich
Hi, Asaf:
When working with this data on fifteen-minute intervals, including the exit interval could double or triple the size of the data source depending on how reliable the travel times were over the month. I could investigate including total and/or average travel minutes for each record. Would that be of some assistance?
@Steve, @Michael P
You’re quite welcome! Democratizing the information is part of good planning, and if you have any insights from the data (or anything else on PlanItMetro) do let us know!
SK
@Michael
Yes, I realize that including exit interval would make the data significantly bigger. Having just the average travel time for each current record would still be useful, though!
@Asaf Reich
I’ve updated the data files to include total and average travel times.
@Michael
Great, thanks :-)
The average ridership and total monthly ridership don’t seem to match up correctly. For weekdays, the average ridership is 1/22 the total ridership, but there were 23 weekdays in October 2014. (weekend data appears correct – there were 4 weekends and the Sat/Sun averages are each 1/4 of the total)
Can you advise on which number was taken directly from the raw-data number and which you derived, so that I can be sure to use the correct figures?
@SavetheBlueLine
Monday, Oct 13 was Columbus Day. Therefore, the monthly totals should get divided by 22 to get a daily average. Normally Monday holidays get a “Service Type” of “Saturday(Special)” but this time around Columbus Day got flagged as a Saturday. I’ll look into reposting the data with Columbus Day separated from Saturday. But 22 is the divisor for Weekday totals.
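To check the divisor, here is a small sketch that counts the days of each type in a month using Python's standard datetime module, with holidays set aside. The category labels are illustrative and may not match the exact "Service Type" values in the files.

```python
from datetime import date, timedelta

def count_service_days(year, month, holidays=()):
    """Count weekdays, Saturdays and Sundays in a month, with holidays counted separately."""
    counts = {"Weekday": 0, "Saturday": 0, "Sunday": 0, "Holiday": 0}
    day = date(year, month, 1)
    while day.month == month:
        if day in holidays:
            counts["Holiday"] += 1
        elif day.weekday() == 5:   # Monday is 0, so 5 is Saturday
            counts["Saturday"] += 1
        elif day.weekday() == 6:
            counts["Sunday"] += 1
        else:
            counts["Weekday"] += 1
        day += timedelta(days=1)
    return counts

# Columbus Day 2014 fell on Monday, October 13
october = count_service_days(2014, 10, holidays={date(2014, 10, 13)})
print(october)  # {'Weekday': 22, 'Saturday': 4, 'Sunday': 4, 'Holiday': 1}
```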
When entrance stations and exit stations are identical, I assumed that was when the rider left the station without riding. But the average travel times are a bit odd, often over 20 minutes, sometimes over an hour. What causes that?
And what was up with Anacostia? Over 7% of rides that began at Anacostia also ended there!
@Tree
Good question; I’ve been looking at same-station entries and exits myself. There are a few explanations:
1 – A lot of this is employees who are active in the station or the system, and who aren’t tapping out in the expected way.
2 – “Bailout” trips during an incident or disruption, when customers enter, see a disruption, and then exit to find another way.
3 – Station managers fixing inverted entries/exits, where cards get out of sync.
4 – Perhaps confused visitors or tourists. These trips tend to spike on weekends, and at stations like Arlington Cemetery, Smithsonian, and National Airport.
5 – Incomplete rail trips from a bus bridge during weekend trackwork.
6 – Surprisingly, some of it may actually be customers purposefully walking in one entrance and out the other. This data only shows station, but if you look at it by mezzanine you can see, for example, folks entering the north (parking) entrance at Anacostia, and exiting the south (neighborhood/bus) entrance in the morning, and reverse in the evening. Same at Dupont Circle. Maybe to get out of the rain?
7 – People who just change their minds, for whatever reason.
Hope this helps! But generally, they should represent less than 1% of riders, and I wouldn’t put much weight on the trip duration field for these transactions.
The system-wide average is around 1%, though there are a few that have averaged more than 2% in this period and the previous May. But Anacostia’s same-station rate is in a whole different league for October 2014.
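For anyone who wants to reproduce these same-station percentages, here is a rough sketch. It assumes the O/D rows have already been reduced to (entry station, exit station, riders) tuples; that layout is an assumption for illustration, not the file's actual format.

```python
from collections import defaultdict

def same_station_share(records):
    """records: iterable of (entry_station, exit_station, riders).
    Returns {entry_station: share of its riders who also exited there}."""
    total = defaultdict(float)
    same = defaultdict(float)
    for entry, exit_, riders in records:
        total[entry] += riders
        if entry == exit_:
            same[entry] += riders
    # Skip stations with no riders to avoid dividing by zero.
    return {s: same[s] / total[s] for s in total if total[s] > 0}
```

Sorting the result by share makes outliers such as Anacostia easy to spot. As noted earlier in the thread, employee taps are included in these counts.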
@Tree
Yes, and weeding out employees should cut this down even more.
This is great data, and we’re looking forward to including useful visualizations soon on our Gofairfax (GoFFX) Silver Line/Dulles corridor event mobile app and site.
Hi Michael,
Thanks for providing the data. I would like to get in touch with you directly to demonstrate some options. Please send me an email. Best regards, Peter
Metro entrances and exits by station across the whole of the system. Thanks for releasing these data! http://www.reddit.com/r/dataisbeautiful/comments/2ud8u9/washington_dc_metro_total_entrance_and_exits_by/
This is great – is it possible to also get data on average transfer times between lines and/or on average train travel times/speeds between adjacent stations?
@Jillian
Hi, Jillian:
We generally use 5 minutes for transfer times during peak periods: a few minutes walking and half a headway. During peaks, that’s actually generous. Off peak, the headways increase so you’d have to adjust your transfer times accordingly.
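As a quick sketch of that rule of thumb: half a headway is the average wait when riders reach the platform at random times relative to evenly spaced trains. The walking time and headways below are illustrative values only, not official Metro figures.

```python
def transfer_minutes(walk_minutes, headway_minutes):
    """Walking time plus half a headway (the average wait for evenly spaced trains)."""
    return walk_minutes + headway_minutes / 2

# Illustrative values: 2 minutes of walking
print(transfer_minutes(2, 6))   # 5.0 with a 6-minute headway
print(transfer_minutes(2, 12))  # 8.0 with a 12-minute headway
```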
The station to station travel times are best found in our GTFS distribution, available here:
http://wmata.com/rider_tools/developer_resources.cfm
Let us know if you have any additional questions.
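For those working from a GTFS feed, here is a minimal sketch of pulling scheduled run times between adjacent stops out of stop_times.txt. The field names come from the GTFS specification. The sketch assumes every row has both arrival_time and departure_time filled in, which a given feed may not guarantee.

```python
import csv
from collections import defaultdict

def to_seconds(hms):
    """Convert HH:MM:SS to seconds; GTFS hours may exceed 24 for trips past midnight."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s

def adjacent_run_times(rows):
    """rows: dicts with trip_id, stop_id, stop_sequence, arrival_time, departure_time.
    Returns {(from_stop, to_stop): [seconds, ...]} for consecutive stops of each trip."""
    trips = defaultdict(list)
    for r in rows:
        trips[r["trip_id"]].append(r)
    runs = defaultdict(list)
    for stops in trips.values():
        stops.sort(key=lambda r: int(r["stop_sequence"]))
        for a, b in zip(stops, stops[1:]):
            secs = to_seconds(b["arrival_time"]) - to_seconds(a["departure_time"])
            runs[(a["stop_id"], b["stop_id"])].append(secs)
    return dict(runs)
```

Passing csv.DictReader(open("stop_times.txt", newline="")) as rows should work for a standard feed. Averaging each list then gives a scheduled run time per station pair.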
Thanks Michael!
I was wondering if you had data for every month or if someone looked at the effects of the Stadium-Armory fire?
@Jeff
We published non-OD ridership (entries) for all stations by month for Fiscal Year 2015. Download the data through the Tableau links – does this help?
That data won’t take you up through the Stadium-Armory fire in Sept. 2015, however. We have taken a look at this incident from a ridership perspective and found a small dip, but it’s very small (and beyond Stadium-Armory station itself, the effect is hard to isolate amidst everything else that is happening). A post for the future perhaps?
@Justin
Oh, great point about the Tableau data, thanks.
Very interesting about the fire, that’s good the effect wasn’t much. Thanks Justin!