
Our first featured project is from one of the legends in congressional data and data journalism, Derek Willis of the Merrill College of Journalism at the University of Maryland. This spring, he released Congress Press, a spiffy update of the effort he first started at ProPublica to collect and share press releases from members of Congress.
Congress Press contains 26 years’ worth of data from more than 860 members of Congress, which makes for nearly 675,000 individual releases. As Willis explains on his website, users can download JSONL files by month that include full text, the member’s unique Bioguide ID, and other personal information. The press releases are gathered via web scrapers that Willis has optimized using Claude Code. The code and data for the project are on GitHub.
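For readers who want to poke at a monthly download, here is a minimal Python sketch of how a JSONL file (one JSON object per line) might be read, tallied by member, and searched by keyword. The field names `bioguide_id` and `text` are assumptions made for illustration, as are the sample records; check the project's documentation on GitHub for the actual schema.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_releases(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield json.loads(line)


def count_by_member(releases: Iterable[dict], id_field: str = "bioguide_id") -> Counter:
    """Count releases per member. The field name is an assumption."""
    return Counter(r[id_field] for r in releases if id_field in r)


def mentioning(releases: Iterable[dict], keyword: str, text_field: str = "text") -> list:
    """Return releases whose text contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [r for r in releases if needle in (r.get(text_field) or "").lower()]


# Made-up sample lines standing in for a downloaded monthly file.
SAMPLE_LINES = [
    '{"bioguide_id": "A000001", "text": "Statement on the farm bill."}',
    "",
    '{"bioguide_id": "B000002", "text": "New funding for rural broadband."}',
    '{"bioguide_id": "A000001", "text": "Broadband grants announced."}',
]

releases = list(iter_releases(SAMPLE_LINES))
counts = count_by_member(releases)
broadband = mentioning(releases, "broadband")
```

With a real download, the same functions should work on an open file handle, for example `with open(path, encoding="utf-8") as f: releases = list(iter_releases(f))`, where `path` is wherever the monthly file was saved.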
When we recently talked to Willis about the project, he said he initially set up the bulk download site to assist academics who study political communication and need data. He’s also heard from civic app developers building legislative information tools that will incorporate this information. Down the road, he’s hopeful that local journalists or civic groups could develop their own applications that filter by metropolitan area or topic.
Press releases, of course, are only one way members communicate with constituents, the press, and other members. They tend to be more formal and rhetorically even-keeled, Willis noted, than blast emails or social media posts. That contrast itself is interesting.
