Congressional Hackathon 7.0: Coding, Collaboration, and Culture Change on Constitution Day

The Congressional Hackathon that took place this past Constitution Day on September 17, 2025, is a rarity in today’s Washington. Civic-minded citizens joined with congressional staff in a day-long discussion inside the U.S. Capitol on strengthening Congress by improving and democratizing its technology. More information from the event, including video and a report, will be published on the Legislative Branch Innovation Hub.

The bipartisan nature of these events is a recurring theme, with senior Republicans and Democrats always offering opening remarks and providing space and support inside Congress. Indeed, the Speaker broke news at the Hackathon, announcing the House would be distributing up to 6,000 user licenses to congressional staff to use Microsoft’s M365 Copilot AI-powered chatbot for a year.

This is the seventh such Congressional Hackathon, with the first organized back in 2011. This iteration was also the fourth annual event in a row. I’ve been to them all, and this event brought forth a groundswell of energy. Some of it came from the new coding breakout session, where almost 90 participants worked side by side with congressional staff to build tools that provide insight and services to people in Congress, across government, and around the country. It reminded me of the civil society-organized hackathons of the 2010s that had so much energy and potential. We all owe a debt of appreciation to Speaker Johnson, Democratic Leader Jeffries, and CAO Catherine Szpindor for co-hosting.

Collaboration and culture change

One thing that has changed over the last fifteen years is the nature of congressional-public collaboration. In the beginning, some inside the congressional firmament were opposed to releasing data to the public. Even using the word “hack,” as in hackathon, was controversial. It took bipartisan leadership from Reps. Boehner, Cantor, McCarthy, Hoyer, Honda, Issa, and many others (including many unsung staff in political and non-political offices) to bend the arc. Now there is a culture shift, where many in the Legislative branch embrace modernization and the pockets of resistance have grown smaller and less influential.

Since the Hackathon became an annual event in 2022, presenters in its “lightning round” phase have steadily demonstrated more and more new projects and tools utilizing increasingly accessible legislative data. The culture change in Congress originated in a fight over whether data about legislation should be made publicly available. The result wasn’t just the publication of legislative data in bulk, but the creation of the Bulk Data Task Force, a like-minded group of internal and external stakeholders that has been meeting quarterly for 13 years, encouraging and supporting technological modernization. One outcome was the Library’s granting of a longstanding public request for publication of Congress.gov data as an API, which launched in October 2022. That API, its manager announced, has received 1.3 billion requests over its three-year existence.

The Hacking Session

Last week, the progress made in bringing Congress into the digital age allowed developers for the first time to work on projects collaboratively during a morning session of the Hackathon. Some participants utilized a nascent Legislative branch data map, the existence of which had been requested by Appropriators. The map is an effort to identify across the Legislative branch the different sources for congressional data and drew rhetorical inspiration from the 2013 executive order on making open and machine readable the new default for government information, the 2018 Open Government Data Act, and advocacy from public-interest minded groups.

Presentation on the Legislative Data Map, featuring Steve Dwyer and Daniel Schuman

The map data was seeded from my 2023 biased yet reliable guide to sources of information and data about congress and collaborations from governmental members of the Congressional Data Task Force, although its GitHub repository quickly drew pull requests from hackathon participants as well. In other words, anyone can suggest items to add to the list of Legislative branch data sources and that list can point to official and non-official sources for data. This is important because it’s very hard to find data about Congress and a lot of the data is published in ways that are hard to reuse.

The map improves the findability of information and makes it discoverable when someone has done the hard work to refine that information into a useful format. It was my pleasure to present, along with the House Digital Service’s Steve Dwyer, on this new map that will catalyze the ability of technologists to build tools for and about Congress.

The data map became immediately handy during the coding portion of the hackathon, where Steve and I highlighted it to the coders who were present in the side room.

Among the hacking projects presented at the end of the day were:

  • Income tax analysis by congressional district, looking at the changes in taxation resulting from abolishing the SALT deduction, from Policy Engine. (Click through to see the visualizations.)
  • CongressTrack, an effort to show the extent to which politicians keep their campaign promises.
  • Committee Companion, which uses AI to generate well-formatted transcripts from committee proceedings.
  • Capitol Voices, which creates AI-powered congressional hearing transcription and analysis. It also identifies speakers and themes and performs sentiment analysis.
  • Who is Talking in Congress, which breaks down the unique speakers identified in a transcript and their last names.
  • Congressional YouTube Dashboard, a Civic Tech DC congressional-tech project that identifies how often House committees remember to publish the event IDs in YouTube videos. It is only through these event IDs that the Library of Congress identifies that a particular YouTube video is associated with a particular hearing. Notably, this project shows which committees are good at adding these IDs and which ones are lagging. Click through this link to see the visualization of the best and worst performing committees.
  • Witness Witness, a proof of concept that creates a central tracker for all House and Senate hearing witnesses. They identified more than 55,000 unique congressional witnesses across 28,000 hearings. They wrote about their efforts on LinkedIn and posted this live demonstration. (Note: be patient while it loads… it’s worth it.)
  • Witness Visualizer, which creates an interactive knowledge graph visualizing the relationships between (House) witnesses, committees, topics, and organizations.
  • Disbursements, which moves House Disbursement (i.e. spending) data into a database, lets users search and filter the data, and lets them ask questions about the dataset via AI.
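
To make the event-ID idea behind the Congressional YouTube Dashboard concrete, here is a minimal Python sketch of how one might measure, per committee, the share of videos whose description carries an event ID. This is not the project's code. The `EventID=<digits>` pattern, the record fields, and the placeholder rows are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical pattern: assumes an event ID appears in the video
# description as "EventID=" followed by digits.
EVENT_ID_PATTERN = re.compile(r"EventID=(\d+)", re.IGNORECASE)


def event_id_coverage(videos):
    """Return {committee: fraction of its videos whose description has an event ID}."""
    totals = defaultdict(int)
    tagged = defaultdict(int)
    for video in videos:
        committee = video["committee"]
        totals[committee] += 1
        if EVENT_ID_PATTERN.search(video.get("description", "")):
            tagged[committee] += 1
    return {committee: tagged[committee] / totals[committee] for committee in totals}


# Placeholder records, not real committee data.
sample_videos = [
    {"committee": "Committee A", "description": "Full hearing. EventID=123456"},
    {"committee": "Committee A", "description": "Full hearing, no identifier."},
    {"committee": "Committee B", "description": "Markup. EventID=654321"},
]

coverage = event_id_coverage(sample_videos)
for committee, share in sorted(coverage.items()):
    print(f"{committee}: {share:.0%} of videos include an event ID")
```

Sorting the resulting shares is enough to rank committees from best to worst, which is roughly what the dashboard's visualization conveys.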

Additional projects. A few presentations during the lightning round also have great relevance to the development of technology tools and we want to highlight them here.

  • Hearings, presented by Abigail Haddad, is a great project that uses logic to identify the relevant YouTube video for a House hearing even when the committee has failed to include an event identifier. The live demo, focused on the House Energy and Commerce Committee, shows where the developer was able to match a hearing name with the correct YouTube video.
  • Craig Butler from the House Digital Service and Pat O’Brien from the Senate Democratic Steering and Policy Committee demonstrated an extension of the old Capitol Bells App to digitize the old clock alert system in buildings on the congressional campus. This new system retrofits congressional clocks to transmit a digital signal when vote alarms sound, pushing out notifications and creating new entries in HouseCal.
  • Bill Tracer, developed by Ruzanna Gaboyan and Philip Golczak, creates a track-changes comparison between the introduced and engrossed versions of a bill, with a particular focus on appropriations bills. (The House has built its own comparative print tool, but it’s not available to the public. GovTrack.us has a version of this, and Bill Tracer is working off GovTrack data.)
  • We already mentioned the presentation from the Library of Congress’s Andrew Weber on the Congress.gov API, but we would be remiss not to mention that the API now has some CRS reports and House Roll Call votes. If you want new features for the API, great places to request them are the Library’s GitHub repository, the Library’s survey form, the quarterly Congressional Data Task Force meetings, and the Congress.gov public forum on September 30th.
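
For readers curious what a track-changes comparison like the one Bill Tracer produces involves, here is a rough word-level sketch using Python's standard `difflib` module. It is not Bill Tracer's implementation, which we have not inspected. The `[-deleted-]` and `{+inserted+}` markers and the sample sentences are assumptions chosen for illustration, and the text is invented rather than drawn from any bill.

```python
from difflib import SequenceMatcher


def track_changes(old: str, new: str) -> str:
    """Mark word-level deletions as [-...-] and insertions as {+...+}."""
    old_words, new_words = old.split(), new.split()
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


# Invented sample text standing in for "introduced" and "engrossed" versions.
introduced = "The Secretary shall submit a report within 90 days"
engrossed = "The Secretary shall submit a report within 60 days"

print(track_changes(introduced, engrossed))
```

A real tool would compare structured bill XML rather than flat strings, so that a change can be tied to a specific section or appropriations account.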

Other presentations during the lightning round of the afternoon’s session demonstrated the innovative potential in using large language models and machine learning with the data available through the Congress.gov API. Two projects, including one developed by a pair of Fairfax County high school students, addressed accessibility issues with government publications for people with disabilities. The nonprofit Policy Engine introduced its system that extrapolates the economic impact of bills to the state and district level.
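
For anyone who has not used the Congress.gov API, the sketch below shows roughly what a request looks like in Python, using only the standard library. It assumes the v3 bill-listing endpoint and the `format`, `limit`, and `api_key` query parameters as we understand them from the Library's documentation. `DEMO_KEY` is a placeholder to replace with your own key, and the helper names are ours, not part of any official client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.congress.gov/v3"


def bill_list_url(congress: int, bill_type: str, api_key: str, limit: int = 20) -> str:
    """Build the URL for listing bills of one type in one Congress."""
    query = urlencode({"format": "json", "limit": limit, "api_key": api_key})
    return f"{BASE_URL}/bill/{congress}/{bill_type}?{query}"


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (performs a network request)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


# Building the URL needs no network access; fetching it does.
url = bill_list_url(119, "hr", "DEMO_KEY")
print(url)
```

Calling `fetch_json(url)` with a valid key should return a JSON document describing recent House bills. Check the response shape against the current API documentation before relying on specific fields.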

But Wait, There’s More

Congressional App Challenge Director Joe Alessi spoke about the program, in which middle and high school students from across the country build apps. The program is intended to foster an appreciation for computer science and STEM. Each congressional district chooses a winner, and the program culminates in an event, House of Code, in Washington, D.C.

There also were a series of breakout groups on various policy topics. Issues included modern committees, constituent services and communications, legislative data, AI, and cybersecurity. The groups met for an hour and reported out at the end of the conference. We were unable to collect all their recommendations, but you can learn more when the official hackathon report and video are released.

If you’re interested in prior hackathons, check out our write-ups:

Legislative Branch Data Map

The House Digital Service's Steve Dwyer and the Congressional Data Coalition's Daniel Schuman unveil version 0.1 of the Legislative Branch Data Map at the Congressional Hackathon 7.0 on September 17, 2025. Photo credit to Josh Tauberer.

One important project catalyzed by the 2025 Congressional Hackathon was the coming together of a Legislative branch data map, the existence of which had been requested by Appropriators. The map is an effort to identify across the Legislative branch the different sources for congressional data and drew rhetorical inspiration from the 2013 executive order on making open and machine readable the new default for government information, the 2018 Open Government Data Act, and advocacy from public-interest minded groups.

The map data was seeded from my 2023 biased yet reliable guide to sources of information and data about congress and collaborations from governmental members of the Congressional Data Task Force, although its GitHub repository quickly drew pull requests from hackathon participants as well. In other words, anyone can suggest items to add to the list of Legislative branch data sources and that list can point to official and non-official sources for data. The map improves the findability of information and makes it discoverable when someone has done the hard work to refine that information into a useful format.

Here is where you can find and suggest edits to the Legislative Branch Data Map, version 0.1.

Library of Congress Sets Congress.gov Forum for Sept. 30, 2025


The Library of Congress will hold its annual Congress.gov Public Forum on September 30th, 2025, from 1-3:30 PM ET. The hybrid forum will allow people to attend in person at the Library of Congress’s Jefferson building or online. Here is how to RSVP.


Congressional Hackathon 7.0 announced for September 17

The Seventh Congressional Hackathon will take place on Wednesday, September 17th, from 1-6 pm in the CVC Auditorium at the U.S. Capitol. The non-partisan event was jointly announced by Speaker Johnson and Democratic Leader Jeffries and will be co-hosted by the House Chief Administrative Officer. The event is open to the public, and pre-registration is required. From the announcement:


Save the date: Congressional Data Task Force Meeting Scheduled June 10, 2025

Congress has announced that the Congressional Data Task Force will next meet on June 10, 2025, from 2-4 pm. This will be a hybrid meeting, with the opportunity to join online or in person in Longworth B-248. Registration is required.


Library of Congress Publishes Some CRS Reports as HTML and via API

Today the Library of Congress began publishing (some) Congressional Research Service reports as HTML and making (some) reports accessible via API. See the announcement here.


Congressional Hackathon 6.0

Congressional Hackathon 6.0 took place on September 19, 2024 at the U.S. Capitol, co-hosted by Speaker Mike Johnson, Leader Hakeem Jeffries, and Chief Administrative Officer Catherine Szpindor. The event brought together congressional stakeholders to explore the role of digital platforms in the legislative process. After the event, organizers released video from the full proceedings as well as a highlights reel, and are expected to release a report summarizing the proceedings. You can find official resources on previous hackathons here.


Library of Congress Public Forum: September 18, 2024

The Library of Congress hosted a public discussion on Congress.gov on September 18, 2024, the fifth such forum it has held. You can watch video of the forum or read the Library of Congress’s summary of the discussion. We published a summary of prior forums from 2023, 2022, 2021, and 2020. They are held pursuant to direction from Congress, which required the Library of Congress to meet with the public concerning access to data from Congress.gov. More than fifty people attended in person and 400-500 people were expected to participate online.


Congressional Job Listings as Data

The House of Representatives publishes job postings for personal, committee, and leadership offices in the form of a weekly email sent to subscribers. This means that if you’re a job seeker, you must read each PDF each week to see whether there’s a job that might interest you. And if you’re interested in monitoring job postings generally, or sharing those postings with others, you’re out of luck: the information is not published in a data format that allows for re-sharing and analysis. (More on that in a minute.)


Recap of Congressional Data Task Force Meeting on December 12, 2024

The Congressional Data Task Force met for the first time in its typical forum setting since June, as the sixth Congressional Hackathon represented its fall quarterly meeting in September. The video of the event, presenter slides, and agenda are available at the Legislative Branch Innovation Hub. It’s worth noting that the Congressional Hackathon is now an annual event.

The congressional staff directory recommended by the House Select Committee on the Modernization of Congress received its first batch of users last month, representatives from the office of the Chief Administrative Officer shared. About 100 staff are participating in a private beta test of what’s called LegiDex, including its mass email function that makes it possible to send targeted emails to specific staffers organized by issue area, party, state, title, and so on. Directory data comes from multiple sources, including daily payroll data, and includes about 30 different role titles the team developed. Staff can manually edit their offices’ information, including issue areas covered by legislative staff. Integration of committee assignments is planned.

At this moment, LegiDex is only available to House users as the team needs personnel data from the Senate and some legislative branch agencies for full functionality Legislative branch-wide. Staff in the Senate and the Congressional Budget Office can see a limited demonstration of the platform, however, and will have full usability as they share their data. The intention is to make this available to everyone across the Legislative branch.

CAO also is migrating HouseNet to the AWS cloud and will turn off the old site in a few weeks. The move is intended to improve access from mobile devices and integrate better with other systems, including LegiDex. 

CAO also demonstrated Persona, a tool the House Digital Service developed internally to help it better understand its users’ needs, pain points, and context within the congressional system. CAO staff interviewed members and staff to develop sample profiles of the type of work done daily across member offices, leadership offices, and committees. It also displays organizational charts for personas within those offices, all to help legislative branch staff better understand the complexity and relationship networks within the House.

The Secretary of the Senate’s office is nearing completion of a report of a working group studying access and preservation of congressional video. They have settled on a cloud infrastructure platform and have developed frameworks for addressing long-term preservation of what will be considerable data. One of their intentions is to create an archive of past Senate floor proceedings and make it available to integrate into other information sources. There are no plans, however, for a repository for older videos. The working group has not decided whether to share the report publicly.

The Secretary’s office also completed converting the old Capitol Bells app, which alerts users to updates in the House and Senate legislative call systems run by the Architect of the Capitol, into an API. Users on the Senate intranet can see a description of what the bells mean and receive alerts on things like adjournment. It’s only available to the Senate at the moment, but the office intends to make it available to other congressional data partners and the public in time.

Display of roll call votes on Congress.gov is receiving a speed upgrade as the Clerk of the House has authorized the site to consume chamber vote data. This authorization also means that the same data will be in the Congress.gov API. The Clerk’s API updates every 15 minutes during votes, so the updates won’t come in real time, but within 30 minutes at most, the Clerk’s Office explained. 
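
One way to read the "within 30 minutes at most" figure is as two stacked 15-minute waits: one for the Clerk's feed to refresh and one for a downstream consumer to pick the refresh up. The Python sketch below works through that arithmetic. The 15-minute consumer polling interval is our assumption and was not stated at the meeting, and the function names are illustrative.

```python
def next_run(t: int, interval: int, offset: int = 0) -> int:
    """First scheduled minute >= t for a job running at offset, offset + interval, ..."""
    if t <= offset:
        return offset
    steps = -(-(t - offset) // interval)  # ceiling division on integers
    return offset + steps * interval


def upper_bound_minutes(publish_every: int, poll_every: int) -> int:
    """Simple bound: wait for the next publish, then for the next poll."""
    return publish_every + poll_every


def simulated_worst_delay(publish_every: int, poll_every: int) -> int:
    """Brute-force the worst delay at whole-minute resolution."""
    worst = 0
    for vote_time in range(publish_every):
        published = next_run(vote_time, publish_every)
        # The consumer's polling schedule may be shifted relative to the publisher's.
        for poll_offset in range(poll_every):
            seen = next_run(published, poll_every, poll_offset)
            worst = max(worst, seen - vote_time)
    return worst


bound = upper_bound_minutes(15, 15)      # assumed 15-minute poll on the consumer side
worst = simulated_worst_delay(15, 15)
print(f"bound: {bound} minutes, simulated worst case: {worst} minutes")
```

If the consumer polled more often than every 15 minutes, the bound would shrink accordingly, but the Clerk-side 15-minute refresh would still set the floor.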

The Clerk’s Office also is launching a new internal committee portal for tracking committee activity, including votes. The office is working first on a system of unique identifiers for individual committee votes for an electronic tally sheet for roll call votes. 

To help meet a statutory requirement to track and report annually on expired and expiring appropriations authorizations, the Congressional Budget Office has developed an LLM process to shorten an incredibly time-consuming task. Currently, CBO staff have to search both public laws and appropriations bills and track down individual appropriations manually to compile the report. A team leveraged the Clerk’s Comparative Print Suite to identify changes in the US Code and trained the LLM on sections relevant to authorization language to highlight relevant public laws for the report. The team is now working to develop a prototype, potentially with the help of Amazon’s Generative AI Innovation Center or universities. This project came directly from the second annual internal CBO hackathon last August.

The Government Publishing Office has passed the half-way point in digitizing the Congressional Serial Set, the nearly 16,000 bound volumes that collect the records of each Congress. GPO has broken these massive volumes into individual reports and other documents so users do not have to comb through hundreds of pages in a specific volume. Nearly 72,000 congressional reports and 36,000 documents, journals, rules, and manuals have been digitized from 8,500 volumes. It’s unclear, however, whether GPO can prioritize more recent volumes for digitization.

In December, GPO also released code to provide access to the US Statutes at Large from 2002 back to 1789 in USLM XML on GovInfo and via API. It also will start making XML and graphics files from the collection available on GovInfo. The process of posting all XML files will proceed incrementally to ensure quality control and likely will take a few years to complete.

GPO marked the one-year anniversary of its digital collection of congressionally mandated reports, which now includes 550 titles from 70 federal organizations. The Congressionally Mandated Reports Act requires reports mandated to Congress and specific committees to be submitted to GPO in a digital format. Because the Clerk does not receive and is not required to compile a list of reports required by committees, GPO is learning about the scope of the collection as it goes along.

A working group of House, Senate, Library of Congress, and GPO staff has launched a project to model House and Senate committee and conference reports in USLM XML going forward. It has created a sample data set and will post progress on schema and samples on a GPO GitHub repository.

GPO also announced it has launched the first user acceptance testing phase for its XPub bill drafting platform.

Finally, GPO shared that it has digitized the congressional pictorial directories dating back to 1951.

The Congress.gov team reported it is on track to meet the March 2025 mandate to publish Congressional Research Service reports on the website. HTML, PDF files, and metadata will be available for the reports in the Congress.gov API. This is the first time HTML will be released publicly.

The Library of Congress also announced a victory for legislative branch interoperability in the creation of links in congress.gov to the GovInfo collection of statute compilations and links to GPO files for public laws and statutes at large. 

The Library declined to say whether the report on the September public meeting would be made publicly available as appropriators directed.

The Clerk’s office indicated they would work to make information about new members of Congress for the start of the 119th Congress publicly available as soon as possible. Of particular note for data users, the subcommittee codes will be released in the XML file. Under the release schedule for unofficial member-elect data, the PDF of new members will be released tomorrow and the XML file the following week.

Congratulations and thank you to Wade Ballou, who is retiring from serving as the House’s legislative counsel.