The Government Publishing Office grabbed the spotlight at the Congressional Data Task Force meeting on December 13 by announcing that it is launching a Model Context Protocol server for artificial intelligence tools to access official GPO publication information. The MCP server lets AI tools like ChatGPT and Gemini pull in official GPO documents when answering questions.
Here’s why this matters: Large Language Models are trained on large collections of text, but that training is fixed at a point in time and can become outdated. As a result, an AI may not know about recent events or changes and may give confident but incorrect answers.
Technologies like an MCP server address this problem by allowing an AI system to consult trusted, up-to-date sources when it needs them. When a question requires current or authoritative information, the AI can request that information from the MCP server, which returns official data—such as publications from the Government Publishing Office—that the AI can then use in its response. Most importantly, the design of an MCP server allows for machine-to-machine access, helping ensure responses are grounded in authoritative sources rather than generated guesses.
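To make the machine-to-machine exchange concrete, here is a minimal sketch of what an AI client's request to an MCP server looks like on the wire. MCP uses JSON-RPC 2.0 as its message format and exposes server capabilities as "tools" invoked via a `tools/call` method; the tool name `search_publications` and its arguments below are hypothetical illustrations, not GPO's actual schema.

```python
import json

def build_mcp_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request for an MCP "tools/call" invocation.

    The envelope fields (jsonrpc, id, method, params) follow the MCP
    specification; the tool name and arguments are illustrative only.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)

# Example: an AI assistant asking a (hypothetical) GPO tool for a document.
payload = build_mcp_tool_call(
    "search_publications",
    {"query": "Congressional Record, December 13"},
)
print(payload)
```

The server would reply with a structured result the model can quote from, which is what grounds the response in the official record rather than the model's training data.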
Adding MCP creates another mechanism for the public to access GPO publications, alongside search, APIs, and bulk data access in XML and JSON. It is a good example of the legislative branch racing ahead to meet the public need for authoritative, machine-readable information.
GPO’s Mark Caudill said his office implemented the MCP server both to respond to growing demand for AI-accessible data and to avoid having to choose the “best” AI agent. This is in line with GPO’s mission of being a trusted repository of the official record of the federal government. With a wide range of AI tools in use, from general-purpose ones like ChatGPT and Gemini to more specialized ones geared toward legal research, GPO’s adoption of MCP allows it to be agnostic across that ecosystem.
A user would configure the LLM of their choice to connect to GovInfo’s MCP, allowing it to draw data from GPO publications rather than being limited to its training data. How well the model interprets those publications and returns quality answers to users is beyond GPO’s control.
GPO also has expanded access to data in ways that don’t involve AI, including an expanded set of customizable RSS feeds for users interested in specific types of documents or the latest data from specific federal offices or courts. The office also offers a robust set of APIs that support a number of search options and link to related documents like congressional hearings and the Congressional Record. GPO API keys are free.
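For readers who want to try the APIs, a request against GPO's GovInfo API is just a URL with a free key attached. The sketch below builds a request URL for the collections listing; the `api.govinfo.gov` base and `/collections` path follow GovInfo's published developer documentation, but verify both against the live docs before building on them.

```python
from urllib.parse import urlencode

# GovInfo's public API base; keys are free to request from GPO.
BASE_URL = "https://api.govinfo.gov"

def collections_url(api_key: str) -> str:
    """Build a request URL for the GovInfo collections listing.

    Endpoint path and parameter name are taken from GovInfo's developer
    documentation; double-check them before relying on this sketch.
    """
    return f"{BASE_URL}/collections?{urlencode({'api_key': api_key})}"

print(collections_url("DEMO_KEY"))
```

Fetching that URL returns JSON describing each collection (bills, the Federal Register, court opinions, and so on), which can then be walked with the same key.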
The House Digital Service team inside the Office of the Clerk recapped the launch of the taxonomy for its casework aggregator tool called CaseCompass, which was previewed at the Congressional Hackathon. The public can download the taxonomy JSON and leave comments on the GPO Innovation GitHub. Ultimately, the taxonomy will form the backbone of a platform House offices can use to track casework trends by agency, region, and specific issue area.
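A downloadable taxonomy like this is straightforward to work with programmatically. The sketch below parses a hypothetical slice of such a JSON file and looks up the issue areas listed under an agency; the field names (`agencies`, `id`, `issues`) are invented for illustration and will differ from the structure actually published on GitHub.

```python
import json

# A hypothetical slice of a casework taxonomy; the real published JSON
# will have its own structure and field names.
TAXONOMY_JSON = """
{
  "agencies": [
    {"id": "VA", "name": "Department of Veterans Affairs",
     "issues": ["disability claims", "GI Bill benefits"]}
  ]
}
"""

def issues_for_agency(taxonomy: dict, agency_id: str) -> list:
    """Return the issue areas the taxonomy lists under a given agency."""
    for agency in taxonomy["agencies"]:
        if agency["id"] == agency_id:
            return agency["issues"]
    return []

taxonomy = json.loads(TAXONOMY_JSON)
print(issues_for_agency(taxonomy, "VA"))  # ['disability claims', 'GI Bill benefits']
```

A shared taxonomy like this is what makes aggregation possible: once every office tags cases against the same agency and issue vocabulary, trends can be compared across offices.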
Building off the public launch of the legislative branch data map at the most recent Congressional Hackathon, House Digital Service’s Steve Dwyer and American Governance Institute’s Daniel Schuman demonstrated a prototype interface that contained more than 130 data sets. The data map, which is essentially a giant spreadsheet, describes the source of each data set, what it contains, its format, its provenance, and more. When launched, the data map interface will allow users to search, download, and provide feedback on the holdings. It is their hope that all stakeholders, inside and outside the government, will be able to contribute datasets. They hope to release the interface publicly in early 2026.
I also presented on a project I recently completed to provide more information about members of Congress. Over the last few months I built a dataset that connected representatives’ membership in the 10 major ideological factions in the House to their member bioguide data. In other words, I identified each member of Congress and the major ideological caucuses to which they belong, such as the New Democrats, Freedom Caucus, and so on. Most caucuses publish their rosters on their websites, but that information is not included in the Clerk’s public-facing official member data. Using the Wayback Machine and some secondary source websites, I was able to fill in members’ caucus membership going back to the 117th Congress.
As I explained on the Congressional Data Coalition blog, we have made this data available to GovTrack. We hope researchers will use it there to reveal things like the density of caucus membership on specific committees, how caucuses are represented among bill co-sponsors, and how caucus members voted on bills and amendments, shedding light on chamber political dynamics like caucus discipline, cross-caucus collaboration, and issue overlap. The data can be downloaded at this link.
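One of the analyses mentioned above, the density of caucus membership on a committee, reduces to a join between the caucus dataset and a committee roster keyed on bioguide IDs. The sketch below shows that join on invented records; the bioguide IDs, committee code, and field names are placeholders, not values from the actual dataset.

```python
from collections import Counter

# Illustrative records only: real bioguide IDs and rosters come from the
# Clerk's member data and the caucus dataset described above.
MEMBERS = [
    {"bioguide_id": "A000001", "caucuses": ["New Democrat Coalition"]},
    {"bioguide_id": "B000002", "caucuses": ["Freedom Caucus"]},
    {"bioguide_id": "C000003", "caucuses": ["New Democrat Coalition", "Problem Solvers"]},
]
COMMITTEE_ROSTER = {"HSAG": ["A000001", "C000003"]}  # committee code -> member IDs

def caucus_density(committee: str) -> Counter:
    """Count caucus affiliations among a committee's members."""
    lookup = {m["bioguide_id"]: m["caucuses"] for m in MEMBERS}
    counts = Counter()
    for member_id in COMMITTEE_ROSTER[committee]:
        counts.update(lookup.get(member_id, []))
    return counts

print(caucus_density("HSAG"))
```

The same join, keyed instead on co-sponsorship lists or roll-call votes, supports the co-sponsorship and voting analyses described above.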
The Office of the House Clerk is days away from launching its committee portal pilot project, which, when complete, will give committee staff a central place to manage much of their work. The pilot will include tracking bill referrals to committees and committee votes, and will provide vote information to the House Clerk. Eventually, the portal will be a mechanism for the public to submit written testimony and truth-in-testimony forms, and for the Clerk to publish aggregate vote and witness data, request Official Reporters, and so on. The committee portal has been a long time coming, with Appropriators originally expressing an interest in some aspects going back to FY 2021. The Clerk provided a major public update in the middle of last year.
The Clerk’s Office and the Office of the Secretary of the Senate also discussed their progress on a pilot for making the House’s comparative print suite platform available in the Senate, which has gathered positive user feedback so far. We hope to someday see the results of the comparative print suite—comparing introduced legislation against changes to the U.S. Code—included on Congress.gov.
The Secretary of the Senate has finalized a report on how to move forward with its portion of the Congressional Video Project. House and Senate partners are still working out a uniform way to publicly host floor video that will be easy to maintain. Both chambers and the Library of Congress have committed to providing historical floor proceedings on a player that can be embedded in Congress.gov, though video hosting and preservation issues remain to be resolved. We have also encouraged the inclusion of additional House and Senate committee videos on Congress.gov.
The Library of Congress discussed its ongoing user experience research project for Congress.gov, which it will use to improve the site. It completed a round of user interviews in November and December and will conduct an additional round in the early spring. People interested in participating in that next round can email user-research-info@loc.gov to sign up.
Notably, the Library will share the Congress.gov team’s annual report around the same time. That report is a big deal, and we are very pleased to see the Library willing to preliminarily discuss it in public at this meeting. The Library described breaking projects out into efforts in progress, those it is deferring to data partners, and those it will explore “resource permitting.” For its efforts in progress, we would strongly encourage the Library to provide a rough timeline for completion, even if only short term, medium term, and long term. For requested projects it is looking to data partners to fulfill, we would encourage the Library to explore what steps it could take to clean up already-existing data generated by those partners, or to draw on solutions civil society has already implemented.
