Next Steps in Congressional Openness: News from the May Bulk Data Task Force Meeting

The top news from last Wednesday’s congressional Bulk Access to Legislative Data public meeting was that THOMAS, the 21-year-old legislative information website, will be retired on July 5. That THOMAS was shutting down was not news, but the timing was.

While they didn’t generate stories in the press, two other developments are particularly important for how Congress engages the public. For the first time, the meeting was webcast, and panelists, who came from offices and agencies throughout the legislative branch, responded to questions from people inside and outside the room. This will soon become regular practice, and video will be available shortly. Even more striking, Congress is responding to technical comments made on GitHub about the data it releases, creating an ongoing, real-time conversation about public access to legislative information with all the relevant stakeholders. This is a big deal.

Four substantive issues were covered at the meeting.

First up was the mechanism by which Congress is publishing nearly all of the data behind THOMAS in a structured format that computers can process. It is this new form of publishing data that allows for THOMAS’s retirement. Many organizations that were scraping THOMAS for its data (in essence, arduously transforming information on its web pages into a database) can now pull the information from the new structured data repositories. The presentation centered on the Bill Status XML files, which describe where legislation stands in the legislative process, and the draft XML user guide that documents the data.
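To make the difference between scraping and structured data concrete, here is a minimal Python sketch of reading a bill’s most recent action from an XML record. The sample record and its element names (billType, billNumber, actions/item, actionDate, text) are illustrative assumptions, not the official Bill Status schema; the draft user guide is the place to check the actual field names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record. The element names are assumptions made for
# illustration and may not match the published Bill Status XML files.
SAMPLE = """<billStatus>
  <bill>
    <billType>HR</billType>
    <billNumber>1234</billNumber>
    <title>An Example Act</title>
    <actions>
      <item><actionDate>2016-03-01</actionDate><text>Introduced in House</text></item>
      <item><actionDate>2016-04-15</actionDate><text>Referred to committee</text></item>
    </actions>
  </bill>
</billStatus>"""


def latest_action(xml_text):
    """Return the bill label, title, and most recent action in the record."""
    bill = ET.fromstring(xml_text).find("bill")
    actions = [
        (item.findtext("actionDate"), item.findtext("text"))
        for item in bill.iterfind("actions/item")
    ]
    # ISO-formatted dates (YYYY-MM-DD) sort correctly as plain strings.
    date, text = max(actions, key=lambda action: action[0])
    return {
        "bill": f"{bill.findtext('billType')} {bill.findtext('billNumber')}",
        "title": bill.findtext("title"),
        "date": date,
        "action": text,
    }


if __name__ == "__main__":
    print(latest_action(SAMPLE))
```

A scraper has to infer these same fields from page layout and breaks whenever the layout changes; with structured data, the lookup is a few lines that depend only on the field names.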

Significant conversation focused on when Congress and GPO update the information they push out to the public. Early publication meets the needs of some, such as those who want to send alerts to their users, but risks being incomplete. Later publication is complete but not timely enough for some uses. Ultimately, the conversation settled on two publication times and the creation of an alert so that people know when the data update is finished. The issue of a public API also came up, with the answer that Congress is focusing on bulk access first and that the API is a more complex political question.

Second, the House Rules Committee made a presentation on its publication of the House Manual and Rules in a structured data format. Previously, the information was published only as a PDF, which makes it nearly impossible to reuse the contents of the document. The new publication format supports many new uses, including comparing how the Rules of the House of Representatives have changed over multiple Congresses. (The Committee has published the Rules for the 114th and 113th Congresses and will soon publish them for the 112th.)

The review process for transforming the PDFs into structured data was so detailed that errors in the House Manual were identified and corrected, and hopefully all the documents will be easier to maintain in the future. By way of background, the Rules Manual is authored by the Parliamentarian but published by the Rules Committee. The Committee is looking for other documents that it can publish; we suggested that the rules of the various committees should also be published in a structured data format. The Rules Committee deserves tremendous credit for transforming these crucial documents into data, which will better serve the needs of the House and the general public.

Third, I discussed the recommendations of the Congressional Data Coalition regarding where appropriators should focus their energies with respect to opening up Congress. Coming out of the discussion, we were invited to submit recommendations at the start of the new Congress geared towards the work of the Bulk Data Task Force, which will allow us to include recommendations that go beyond what appropriators usually can do. We are very fortunate to be able to help support a discussion on improving openness.

Finally, we were reminded that the Legislative Data and Transparency Conference is set for June 21. This is the fifth annual conference focusing on opening up information about the legislative process. (Register here.) The Committee on House Administration is taking suggestions for topics that should be discussed.

All in all, it was a productive meeting and an indication that Congress’s commitment to opening up, with particular emphasis in the House, is bearing fruit, to the benefit of Congress and the American people.