Report from the 2015 Legislative Data and Transparency Conference

The House of Representatives recently held its fourth annual Legislative Data and Transparency Conference. The conference provides a unique opportunity for members of Congress, congressional staff, legislative support office staff, legislative support agency staff, and the public to meet and discuss efforts to make congressional information more useful and more widely available to stakeholders inside and outside Congress. As in prior years, it was a significant success.

This blog post covers highlights of the conference, but if you want more, here are video, slides, and the agenda for the day’s activities.

To start, I would like to note that a representative of the Secretary of the Senate’s office was one of the presenters. Although the Senate has had staff attend in the past, this is the first time a representative participated in the panels. It is an indication of the Senate’s increasing involvement in efforts to open up legislative data.

For me, the most noteworthy presentations concerned modernization efforts around House committee hearing reports and the new “legislative lookup and link” tool. Other highlights included:

  • Amendment Impact Program. The House is continuing to work on its amendment impact program (demonstrated at the 2014 conference), which will show in real time how an amendment would change a bill and how a bill would change the law. When finished, this will transform for the better how everyone reads legislation and make it a lot harder to hide provisions in obscure language.
  • Streaming and archived committee video. The House is moving its video streaming/recordings from Ustream to YouTube, which will allow each committee to stream more than one hearing at the same time. In addition, the Library of Congress will start adding unique hearing IDs to its archive of House videos so that computers can programmatically connect video of a hearing with its identity (i.e., the topic of the hearing, who testified, when, etc.). When finished, videos will become a lot more shareable.
  • House Member Information. The House Clerk’s office is publishing significantly more information about each member of Congress in a machine-readable format, including committee assignments; the number of members that may be appointed by the majority or minority; committee/subcommittee addresses; and each member’s rank on the committee. A user guide is now available online. All of a sudden, it will be a lot easier for civic hackers to connect the dots.
  • Senate Bill Summary and Status Information. The Government Publishing Office is just about ready to release in bulk bill status and summary information for Senate legislation. Here’s the announcement from December 2014. This will make it much easier for civic hackers to gather basic information about congressional operations. GPO also now has a GitHub account.
  • Senate data. The Senate likely will next tackle publishing treaties in machine-readable formats.
  • The Law Library of Congress is moving to update Congress.gov on a more frequent basis, with recent updates including the addition of legislative alerts for particular bills. The appropriations-tracking tables also have been updated. The Library is still working on migrating THOMAS information to Congress.gov. Once finished, THOMAS can be retired, and Congress.gov is becoming more useful and user friendly for the general public.

A New Way of Compiling and Publishing Committee Reports

One of the unresolved weaknesses in transparency around congressional activities arises from committee hearing reports. These documents traditionally have been expensive to produce, and often there is a delay of more than a year between when a hearing takes place and when a report is published. Even worse, sometimes reports are never published. It is difficult to know, but it appears that producing a committee hearing report can be a serious burden on staff and an expensive proposition when it comes to typesetting and publishing as a paper document. (Go here for slides from the presentation.)

Hearing reports usually consist of several kinds of documents: a transcript of the proceedings, documents submitted for the record (including written testimony and documents entered into the record), and the committee’s findings. (A markup report for legislation would also likely include the bill text, any changes in the text, vote information, and other matters, although this was not specifically discussed at the conference and may be outside of the scope of the project.)

Gathering and proofing each of these documents for inclusion in a final document can be time-consuming. Often, transcripts are not verbatim transcriptions of what was said, and it may take a long time for staff to sign off on the remarks. Documents may be submitted for the record in multiple formats, which may be incompatible with advanced processing. As a result, submitted documents may be published as the equivalent of non-searchable picture files. And, most of the time, there is no table of contents at the front or index at the back. All of these documents must be compiled and sent to GPO, which apparently typesets them by hand.

The committee hearing modernization effort turns all of this on its head. In part, it allows committees to automatically compile and electronically typeset the report without having to involve GPO at that stage. Documents submitted for the record likely will be required to be in a certain format, so that they can easily be ingested into the system. Transcripts will become more like verbatim accounts, with corrections only for typographical and grammatical issues, allowing a much faster turn-around time. And with all of these documents properly formatted, printing turn-around time can be a matter of days, not months (or longer). Adding a table of contents, an index, and even links from the transcript to a corresponding video snippet all becomes feasible. And the documents contained in the report have an independent existence outside of the print document and can be accessed through other document management systems, like

GPO will still print the documents, but it will receive letter-perfect versions from the committees. Processing will be minimal.

Part of what became apparent during the presentation is that, in recent years, committees have not spent much time thinking about the way their reports look or how they meet the legislative and archival purposes for which they were created. Committees have been doing the same thing year after year because that is the way it always has been done.

By rethinking report publication in the electronic age, committees and congressional staff are reexamining the role and purpose of the reports themselves. Indeed, part of the Q&A following the presentation concerned how members of the public could more easily submit testimony and associate their comments with the reports. It also may become possible to more easily identify and track witnesses who have testified at different hearings.

What we saw was a work in progress, but the progress appears to be in the right direction: faster, cheaper, and more useful. The House clearly is at the beginning of a multistage process.

Legislative Lookup and Link

Ed Grossman, from the House’s Office of Legislative Counsel, gave a fascinating preview of a new tool to improve transparency for legislative references. It will first be made available to congressional staff and then ultimately to the public. Here are the presentation slides and demo website. (As a side note, it was a real pleasure hearing from the Office of Legislative Counsel, as much of its work is internal-facing so there’s less opportunity for public dialog.)

When legislation is drafted, it is essential to know what law is being amended… and to be able to see the text of that law being amended. For reasons I have discussed elsewhere, for the U.S. Congress, that is a difficult proposition.

The legislative lookup and link tool is designed to address several concerns. It makes it possible for staff to make connections between references in legislation and the provisions to which they refer. It addresses the House’s new rule that requires inclusion of citations for amendment and repeal. And it makes it possible to actually see and make use of well-formatted legislative text (i.e. the letter of the law).

The legislative lookup and link tool is composed of three parts. First, a parser identifies references in text to other provisions of law and expresses each one as a citation string. Second, the tool resolves the citation string generated by the parser into a standard way of referring to that provision of law. Third, it provides a hyperlink to the appropriate provision of law. (It also displays the full text, if available.)

The hyperlink will connect with multiple government databases, including the US Code, compilations of public laws, and bills.
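To make the three parts concrete, here is a minimal Python sketch of a parse, resolve, and link pipeline. It is not the House tool's code, which was not shown in detail. The regular expression, the canonical form usc/<title>/<section>, and the example.invalid base URL are all assumptions made for illustration, and the sketch handles only simple U.S. Code citations such as "42 U.S.C. 1983".

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern (an assumption, not the House tool's grammar):
# matches simple citations such as "42 U.S.C. 1983" or "5 USC § 552a".
USC_PATTERN = re.compile(
    r"(?P<title>\d+)\s+U\.?\s?S\.?\s?C\.?\s+(?:§+\s*)?(?P<section>\d+[A-Za-z]*)"
)

# Placeholder base URL; a real tool would point at government databases.
BASE_URL = "https://example.invalid/law/"


class Citation(NamedTuple):
    title: str
    section: str


def parse(text: str) -> Optional[Citation]:
    """Part 1: find the first U.S. Code reference in free text."""
    match = USC_PATTERN.search(text)
    if match is None:
        return None
    return Citation(match.group("title"), match.group("section").lower())


def resolve(citation: Citation) -> str:
    """Part 2: turn the parsed citation into one canonical string."""
    return f"usc/{citation.title}/{citation.section}"


def link(canonical: str) -> str:
    """Part 3: build a hyperlink from the canonical reference."""
    return BASE_URL + canonical


if __name__ == "__main__":
    found = parse("as provided in 42 U.S.C. 1983")
    if found is not None:
        print(link(resolve(found)))
```

Keeping the canonical reference separate from the link means the same reference could, in principle, be pointed at the US Code, a compilation, or a bill without re-parsing the text.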

One of the thorny issues arises from trying to resolve citations that are at the subsection level or below. Another challenge is handling multiple string citations.
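If a string citation is read as one reference that lists several provisions in a row, part of the difficulty is that a single phrase must expand into several separate references. The Python sketch below illustrates that expansion for one narrow phrasing ("sections 101, 102, and 103 of title 5"). The pattern, the function name expand_string_citation, and the usc/<title>/<section> output form are hypothetical, and citations at the subsection level or below are deliberately left out.

```python
import re

# Hypothetical pattern for one narrow phrasing of a string citation.
# Real drafting language is far more varied than this.
STRING_CITE = re.compile(
    r"sections\s+(?P<sections>.+?)\s+of\s+title\s+(?P<title>\d+)",
    re.IGNORECASE,
)


def expand_string_citation(text):
    """Expand a multi-section reference into one canonical string per section."""
    match = STRING_CITE.search(text)
    if match is None:
        return []
    # Pull out each section number, ignoring commas and the word "and".
    sections = re.findall(r"\d+[A-Za-z]*", match.group("sections"))
    title = match.group("title")
    return [f"usc/{title}/{section.lower()}" for section in sections]


if __name__ == "__main__":
    for ref in expand_string_citation(
        "sections 101, 102, and 103 of title 5, United States Code"
    ):
        print(ref)
```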

The tool is in beta. It currently:

  • Interprets text
  • Creates a canonical reference
  • Returns alternative citations (such as the law’s popular name)
  • Returns a link to the content itself
  • Shows the content in a preview window
  • Allows conversion between popular names and public law; and between public law and the US Code.

Outstanding issues include:

  • Being able to drill down inside PDFs
  • Creating a style sheet for some XML
  • Handling complex references
  • Identifying models for ongoing support and maintenance
  • Addressing/publishing compilations of the laws as amended and gaps in accessible public laws.

Once completed and publicly available, this new tool could address important issues of public access to the laws as passed (and amended) by Congress. One work-around, in the meantime, has been built by Joe Carmel at Legislink, which publishes all laws passed by Congress from 1789 forward and allows lookup by the various kinds of commonly used citations.


Below is the video from the conference.