Bulk Data Task Force Reports Major Strides at October 2019 Meeting

The Bulk Data Task Force (BDTF) is essentially the Justice League of legislative data.

The task force convenes each quarter, bringing together the people in charge of managing Legislative Branch data—like the House Clerk, Secretary of the Senate, GPO, and Library of Congress—as well as outside stakeholders. Together the group works to make legislative data freely accessible to all.

The task force convened last week at the Legislative Data and Transparency Conference.

Here are the highlights:


7th Annual House Legislative Data and Transparency Conference Announced

The seventh annual Legislative Data and Transparency Conference has been announced!

On Thursday, October 17, agencies, data users, and transparency advocates will come together to discuss Congress’s efforts to make legislative information available to the public as data.

The conference covers what’s working well and what’s not, and provides an opportunity to hear from and meet with the people working to make things better.

You can RSVP for the Thursday, October 17, 2019 event here.

You can find recaps of prior conferences and links to video from the conferences here:

Recap of the July 2019 Bulk Data Task Force Meeting

Last week the Bulk Data Task Force (BDTF) convened internal and external stakeholders to discuss, you guessed it, congressional data. 

Established in 2012, the BDTF brings together parties from across the legislative branch—including the House Clerk, the Secretary of the Senate, Government Publishing Office (GPO), Library of Congress (LOC), and more—as well as external expert groups to make congressional information easier to access and use.

Scroll down for a list of tools, both currently available and in the works, as well as announcements from the meeting. 

New Tools

In development phase

“Track changes” for legislation: The Clerk is working on a platform that will allow for comparing versions of legislation; staff will be able to see how an amendment changes a bill and how a bill changes a law. A version of the tool is already available to the House Office of Legal Counsel and a minimum viable product will be available to legislative counsels in August or September. The full version of the tool could be available at the end of next year, but it is still to be determined whether it will be for internal use only.

In research phase

Automated bill sponsorship tool. There are about 135,000 co-sponsorships on bills every Congress; the Clerk’s office currently spends five hours of each day in session collecting handwritten sponsor sheets and inputting names. The Clerk is examining the viability of creating an automated tool that provides a list of bills available for co-sponsorship online and, through secure means, allows Members to request that their names be added to a bill.

Unique identifiers for lobbyists. Currently, lobbyists are assigned unique identifiers (IDs) but those are not disclosed to the public; this makes tracking lobbyist activity very difficult. For example, if someone fills out their lobbying forms and there’s a typo, or they write their full name one year and a nickname the following year, there’s no way to tell that all these forms are covering one individual’s activity.
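To make the problem concrete, here is a minimal sketch of why a published ID matters. The records, field names, and ID values below are entirely hypothetical and do not reflect the format of any real lobbying disclosure data; the point is only that grouping by name splits one person's filings apart, while grouping by a stable ID keeps them together.

```python
from collections import defaultdict

# Hypothetical filings: names, years, and IDs are invented for illustration.
filings = [
    {"lobbyist_id": "L-001", "name": "Robert Smith", "year": 2017},
    {"lobbyist_id": "L-001", "name": "Bob Smith", "year": 2018},     # nickname
    {"lobbyist_id": "L-001", "name": "Robert Smtih", "year": 2019},  # typo
    {"lobbyist_id": "L-002", "name": "Maria Lopez", "year": 2019},
]


def group_filings(records, key):
    """Group filing years by the chosen field (for example, name or ID)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["year"])
    return dict(groups)


by_name = group_filings(filings, "name")
by_id = group_filings(filings, "lobbyist_id")

print(len(by_name))  # 4 apparent individuals when matching on name
print(len(by_id))    # 2 actual individuals when matching on ID
```

Matching on names alone would need fuzzy comparison and manual review to catch the nickname and the typo; with a disclosed ID, the grouping is a simple lookup.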

In discussion phase

A live feed of House floor votes. No plans have been made yet. 

Available now: 

An API for bill status in the Bulk Data Repository. You can find the GovInfo API here; to access it you will need a key from APIkey.data.gov.

Standardized committee witness forms in PDF format. Documents’ naming convention is “TTF” so if you’d like to look up witness truth in testimony forms you can go to Docs.House.Gov and search for “TTF.” 
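For readers who want to try the GovInfo API mentioned above, the following is a minimal sketch of building a request URL with an API key. It assumes the API is served from `https://api.govinfo.gov` and that a `collections` path lists the available collections; both should be confirmed against GPO's current API documentation. The helper names `build_url` and `fetch_json` are hypothetical, and `DEMO_KEY` stands in for a personal key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL for the GovInfo API; confirm against GPO's documentation.
BASE_URL = "https://api.govinfo.gov"


def build_url(path, api_key, **params):
    """Return a request URL with the API key passed as a query parameter."""
    query = urlencode({**params, "api_key": api_key})
    return f"{BASE_URL}/{path.strip('/')}?{query}"


def fetch_json(url):
    """Fetch a URL and decode the JSON response (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


# DEMO_KEY is a placeholder; substitute the key you were issued.
collections_url = build_url("collections", "DEMO_KEY")
print(collections_url)
```

Calling `fetch_json(collections_url)` would perform the actual request; it is left out of the example so the sketch runs without network access.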

Sites

The public can give feedback and submit requests for documents, data, and fixes at github.com/usgpo/bill-status.

Durable links to government information can be found at GPO’s link service.

RSS feeds for content and metadata can be found at govinfo.gov/feeds.

When in doubt, check out the Legislative Branch Innovation Hub, home base for legislative data.

The United States Web Design System is an open source site that brings together government engineers, content specialists and designers to make building government sites easier.

The Tech Timeline covers congressional tech history from the first House telephone in 1880 to the first House website in 1994, plus everything before and after.

New Sites

The Clerk’s Consensus Calendar tracks bills with 290 or more sponsors. Under a new House rule for this Congress, each week the Speaker must pick for floor consideration one of the bills that has had 290 or more sponsors for 25 legislative days.

HouseLive.gov is being moved to a beta version of Live.House.gov.

The in-house video clipping tool has been replaced by FloorClips.House.Gov.

In August, ClerkPreview.House.gov will move out of beta and become Clerk.House.gov. Scrapers using the site may be disrupted or broken. 

Announcements

The 2019 Data Transparency Conference will take place this fall; the specific date is still to be determined. Suggestions for topics and dates can be submitted on the GitHub innovation site.

FDsys is officially fully retired. The old Federal Digital System was replaced by GovInfo.gov, which has been online in beta since 2016 and out of beta since January 2018.

Thomas was retired in 2016 and its replacement Congress.gov has had several upgrades. For example, you can now sort search results by subcommittee and historic committee names will auto-populate. Looking ahead, the Library hopes to offer email notifications for committee hearings and meeting information. 

You can trust the Government Publishing Office: GPO was certified as an ISO 16363 trustworthy digital repository. It is the first U.S. organization to earn the certification, and the second in the world.

Save the Date: BDTF Meeting on July 9

The next Bulk Data Task Force meeting will be on Tuesday, July 9, from 11:00 to 12:00, in Cannon B03. If you cannot make it in person, it is possible to join remotely via a Zoom conference. (Contact the Clerk to make arrangements.)

On the agenda:
1. Introductions / BDTF Background
2. Project Updates
• GPO
• LOC – Congress.gov
• Clerk – Comparative print / New Clerk website / HouseLive
3. 2019 Data Transparency Conference
4. Questions / Discussion

The last meeting was in late October 2018, and there was a lot of news.

More information on the BDTF can be found on the Legislative Information Resource Hub, and don’t miss the recent Congressional Transparency Caucus event that featured 10 technology tools for legislating and oversight.

Transparency Caucus: innovative tools and technologies for Congress

On June 7, the Transparency Caucus of the U.S. House of Representatives hosted a remarkable forum inside the United States Capitol that featured ten presentations from government officials and members of civil society on innovative tools and technologies. Following is a rundown of who spoke and the services, tools, and projects they shared:

Video from the event is available below.

What You Need to Know About the November 2018 Bulk Data Task Force Meeting

This past Thursday’s Bulk Data Task Force meeting had a ton of info about technology in Congress. Here are some highlights:

Info about newly elected members of the House will be online in a structured data format from the House Clerk’s office by Nov. 13, and updated weekly thereafter, an amazing turnaround for data that historically has been hard to come by until January.

The Library is behind on publishing CRS reports. All “R series” reports should be up by the end of April, with the remaining reports expected by Sept. 30th. The statutory deadline was this past Sept. 18th. As of Sunday, CRSReports.Congress.gov had 1,251 reports. In the meantime, you can read all the current reports at EveryCRSReport.com.

A consolidated calendar for House and Senate meetings won’t be launched by the Library for the December 21st deadline, but a first phase will be completed in the first quarter (i.e. by Friday, March 29); it is expected to include information about all congressional proceedings. Integration of links to videos of proceedings may take longer. In the meantime, use GovTrack’s congressional committee calendar.

The refresh of Bioguide.Congress.gov (the website containing the names, photos, and biographies of every member of Congress) is on track. By the end of March 2019, the information will be published as structured data and put on a secure (HTTPS) website. The long-term goal is to create a publicly available API.

The House’s Truth in Testimony forms will become webforms. For now, they’ll generate PDFs that the committees will post individually to their webpages, but the long-term goal is for the data to go into a central repository for publication, making it possible to track people as they testify before multiple committees.

GPO released an initial set of 40 Statute Compilations as a pilot on govinfo last week. More are coming. The compilations include public laws that either do not appear in the U.S. Code or that have been classified to a title of the U.S. Code that has not been enacted into positive law.

Want to see how a draft bill would change the law in real time? The House is still working on it. The target date for making that tool available to Congressional staff is the end of 2019. And to the public? No date is set.

A few quick resources:

  • GPO’s Developer Hub is a great resource for data stored on govinfo.gov.
  • The Leg Branch Innovation Hub highlights tech-friendly leg branch activities. It includes info on bulk data task force meetings.
  • Want to hear as Congress.gov rolls out new features? They’ve got a listserv for that.

We should note that five of the projects described above were required in either the FY 2017 or FY 2018 legislative branch appropriations bills. When the video from the proceedings is available, we will include it below.

Recap of the 2018 Legislative Data and Transparency Conference

Congress held its sixth annual Legislative Data and Transparency Conference this past week.

Lest we forget, these conferences are extraordinary. Hosted by the House of Representatives Committee on House Administration, they pull together the vast majority of the internal and external Congressional stakeholders to talk in detail about the House and Senate’s operations. They provide a forum for candid questions and conversations with the people who are the decision-makers and the implementers, and changes are often made in response to the conversations.

Unlike prior years, I’m reluctant to do this write-up because so much has happened and I’m sure I will leave out important items. I’m publishing my real-time notes here, and I hope that you’ll forgive the impressionistic nature of this write-up.

Legislative Branch Innovation Hub

GPO has launched a legislative branch innovation hub website, built on GitHub, that “seeks to highlight Legislative branch activities that use technology to cultivate collaboration, foster data standardization, and increase transparency.” Wow, right?

GPO is encouraging the public to file pull requests and otherwise to use this as a platform to communicate with Congress about technology and transparency. But even more than that, the website crosses silos and will be used by multiple stakeholders inside Congress. For example, it seems to encompass the Clerk’s vision of a resource for where to find legislative branch data.

Comparative Prints Projects

The House Clerk is continuing its efforts to provide real-time comparison between documents. Its long-term vision is to deploy platforms to all House staff and others to create, on demand, visuals of changes in important documents. This includes changes between bill proposals and current law, and also how an amendment would change a bill. Here’s an example of a comparative print.

Updated Clerk’s website

The Clerk of the House is continuing to modernize its website, which can be viewed in alpha at Clerkpreview.house.gov. By the end of 2018, the Clerk intends to update the help and resources, member election stats, and disclosures; the Clerk also notes that there’s a new version of HouseLive.

API for GovInfo

GPO has launched an API for most of the information contained on GovInfo, its gigantic website of government reports and documents. This is a big deal, as the API will provide an incredibly useful complement to bulk access. Documents already published in bulk, like legislation, are not included at the moment, but GPO is taking requests for additional items to make available through the API.

Senate website

The Senate’s website is now mobile-friendly, and the Senate has completely redesigned the information architecture behind the site.

Ask Alexa

The Clerk’s office came up with a clever project to integrate their data holdings with Alexa, the voice assistant for Amazon’s Echo. You can ask whether the House is in session, who your representative is, and what meetings are happening today. Submit ideas for other questions Alexa should answer here.

USLM

A lot of what makes it possible to put legislative information to use is having it generated in a structured data format. The Congress has been working hard to make its information available in United States Legislative Markup (USLM).

The Congress has developed a new USLM schema (version 2.0) and is asking for feedback. It is also currently working on moving public bills and statutes into USLM, which will empower many of the new tools it wishes to develop (such as important updates to legislative drafting).

Congress.gov

The Library of Congress is continuing to make welcome incremental improvements in Congress.gov, such as efforts to track committee names as they’ve changed over time and improve saved alerts. (See their enhancements timeline.)

The Library didn’t address questions about integration of CRS reports onto the Library’s website, although they showed a few images and suggested a September 18, 2018 implementation date. (Their implementation plan suggests non-compliance with both the requirements of the law and best practices for creating web resources.) They didn’t get into the joint House/Senate committee calendar, which will shortly be required by law. They did give a presentation on their app challenge, won by a high school student for a neat visualization of treaty info, and some of the experiments being conducted by LC Labs.

Other Presentations

The conference featured a number of great presentations, but Ed Walters’ presentation on “9 ways the government can work with private publishers on public access” stood out. It’s worth watching on video.

Additional Resources

For more, the conference agenda is here; video of the conference is here; and my recaps of prior conferences are here.

House Legislative Data and Transparency Conference Announced

The House will be holding its sixth annual Legislative Data and Transparency Conference. If you haven’t been before, the conference focuses on the Congress’s efforts to make legislative information available to the public as data, and provides an opportunity to hear from and meet with the people working on making it happen.

To RSVP for the July 12, 2018 event, please go here, and for more information about the Conference, visit the Committee on House Administration’s website.

You can find recaps of prior conferences and links to video from the conferences here:

House Passes the Best Leg Branch Approps Bill in 8 Years

On Friday, the House of Representatives passed the best legislative branch appropriations bill since Republicans took power in 2010. Unlike many prior appropriations bills, which often undermined the House’s capacity to govern through deep budget cuts, this legislation contained provisions to strengthen the House and set the stage for further improvements. In addition, it was created in a bipartisan manner, drawing on the hard work of Reps. Kevin Yoder and Tim Ryan and their staff.

Greater transparency

The House included provisions to improve the transparency of its operations. (For more on these items, read the testimony of the Congressional Data Coalition.)

It required the Library of Congress to publish a unified calendar for hearings and markups. This will make it possible, at long last, for the general public to have a central place where it can see all the committee proceedings.

In addition, the House will make committee witness disclosure forms available online. These witness disclosure forms were initially created to track the activities of lobbyists, but the way they are gathered and published makes them unsuitable for that purpose. A central repository of electronic data about witnesses will help bring this disclosure provision to life.

The House will also begin to publish bioguide information as structured data, which will support civil society and others in tracking the work of members of Congress.

The bill also directs GPO to explore the costs of publishing the Statutes at Large in a digital format. The Statutes at Large contain all the laws enacted by Congress. Demand Progress/The Congressional Data Coalition was the first entity to publish a comprehensive set of the laws online, and the Library of Congress belatedly followed. But the text of the laws isn’t available as data, which we would need to be able to show how the laws have changed over time, or how a bill would change a law. (For more, read this primer from the Data Coalition.)

Capacity to Govern

The appropriations bill also sets the stage for the House to work better.

The House will commission a study on congressional staff pay and retention, including a comparison of congressional staff pay against the executive branch as well as an inquiry into whether staff are receiving equal pay for equal work. This look at the staff who work in the House is timely because it will help ensure that Congress has the staff necessary to do its job, and that some of the problems raised by the #metoo movement are appropriately ventilated and addressed. It should hopefully set the stage to address the House’s undercapacity and diversity problems. (For more, please read our testimony.)

The bill also includes a study by CRS on establishing a technology assessment office and identifying the resources available to members of Congress on science and technology. This change is sorely needed and long overdue, as the recent hearing on Facebook demonstrated. While the House did not include an amendment to restore $2.5 million in funding for the Office of Technology Assessment, the margin in favor improved, and the amendment had bipartisan support. (For more, read the testimony of the R Street Institute.)

Similarly, the GAO will conduct a study on avenues for whistleblowers to connect to the proper congressional offices. This could potentially lead to significant cost savings, as improved communications will help root out waste, fraud, abuse, and malfeasance. Ultimately, we believe the House should establish an office that provides internal support and external guidance for whistleblowers. (For more, read the testimony of the Government Accountability Project.)

Funding

In this bill, the House began to reinvest in its staff after a generation’s worth of harmful cutting. The very modest 1.7% increase in the Members’ Representational Allowance and the slightly larger increase in the account for House Salaries, Officers, and Employees are essential to the House fulfilling its duties, especially considering overall funding for the House of Representatives is down by 10% since FY 2010. This essential funding for the legislative branch, 0.1% of the total federal budget, is tiny compared to the enormous amounts spent by the executive branch, and this legislation will begin to restore a little balance to the branches.

What’s Missing

We are impressed by all that was packed into the legislative branch appropriations bill, but we should note a few items that we would have liked to see included:

  • Providing select staff with appropriate clearances to support congressional oversight of the intelligence community. (For more, see the testimony of Mandy Smithberger.)
  • Strengthening GAO’s hand when it comes to reviewing waste, fraud, and abuse in the Intelligence Community. (For more, see the testimony of Kel McClanahan.)
  • Improving lobbying disclosure by fixing how data is released to the public. (For more, see the testimony of Sheila Krumholz.)

What’s Next

This upcoming week, Senate Legislative Branch appropriators will consider their own appropriations bill. Demand Progress Action’s written testimony requests that they address the following items:

  • Just as the House has done, the Senate should review Legislative Branch salaries for parity with the executive branch as well as examine internal pay disparities by gender and race.
  • Publish the Senate’s Official Personnel and Official Expense Account Report as data, not just a PDF, as the House does with its Members’ Representational Allowance information. This will make it possible to easily follow how the Senate spends money on itself.
  • Create a website for the legal treatise known as the Constitution Annotated. The Constitution Annotated explains the US Constitution as it has been interpreted by the Supreme Court, but the way it is currently released to the public online makes that document virtually unreadable.
  • Create a Chief Data Officer for the legislative branch, to help facilitate the publication of Congressional information, provide support to offices, and serve as a point of contact for the public.

In addition, we join R Street’s call for a study into creating a technology assessment office in Congress. And, as a member of the Congressional Data Coalition, we strongly support its call for the Library of Congress to establish a Public Information Advisory Committee that would facilitate the Library working with public stakeholders on how it makes information available to the public.

This has been a remarkably productive subcommittee from a transparency perspective. Just last year it required the Library of Congress to publish CRS reports online, which is something we continue to monitor closely. With the departure of Rep. Yoder to another subcommittee, we will see what the 116th Congress will bring on the House side, and of course will be keeping an eye on the Senate.

Resources

(cross posted)

Save the Date: Bulk Data Task Force Meeting February 8

The House of Representatives will hold its next Bulk Data Task Force meeting on February 8 from 10:30 to 12:00. More information to come.