Blog Post: Highlights from the June 2023 Congressional Data Task Force Meeting

Introduction

The Congressional Data Task Force meeting held on June 22, 2023, brought several significant updates and announcements in the realm of congressional data management and accessibility. Video and slides are available here.

Key Personnel Changes

  • Clerk Johnson’s Resignation: Clerk Johnson announced her departure, effective June 30.
  • Deputy Clerk Kevin McCumber: Sworn in as acting Clerk, officially taking office in July.

Historical Insight

  • Workload Increase: The number of bills in the 117th Congress rose to 11,461, reflecting an increase of 1,000 bills over the previous congress.

Reports from the Library of Congress

Updates by Kimberly Fergusson, LOC:

  • Feedback: Encouraged feedback through the site’s feedback form.
  • Congress.gov API: Everything on Congress.gov will soon be available from the Congress.gov API. A changelog on GitHub walks through the API milestones and is a good place to submit updates.
  • Future Plans: Announced a public forum at the Library of Congress in September.
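
For readers who have not used the Congress.gov API, here is a minimal sketch of how a request for a single bill record might be put together. It assumes the v3 URL layout described in the API's GitHub documentation (`/v3/bill/{congress}/{type}/{number}`) and an API key you have signed up for. The helper name `bill_url` and the `YOUR_API_KEY` placeholder are ours, for illustration only.

```python
from urllib.parse import urlencode

# Assumed base path for the v3 API; confirm against the API documentation.
BASE_URL = "https://api.congress.gov/v3"


def bill_url(congress: int, bill_type: str, number: int, api_key: str) -> str:
    """Build the URL for one bill record, asking for a JSON response."""
    query = urlencode({"format": "json", "api_key": api_key})
    return f"{BASE_URL}/bill/{congress}/{bill_type.lower()}/{number}?{query}"


# Hypothetical example: a House bill from the 117th Congress.
print(bill_url(117, "HR", 3076, "YOUR_API_KEY"))
```

The sketch only builds the URL; fetching it with any HTTP client and a valid key should return the bill's metadata as JSON.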

Updates by Robert Brammer, LOC:

Government Accountability Office (GAO) Innovations

Andrew Kurtzman, Assistant Director, Innovation Lab:

  • Project Sia: A platform for Congressional Activity Monitoring, which consolidates information sources for easier accessibility and analysis. Specifically, surfaced GAO reports that are relevant to congressional hearings (identified the top 3), also identified emerging areas of congressional interest (especially when there were no matches). Lead: Andrew Kurtzman
  • Ran from September 2020 to 2022. Working to release a new version internally to GAO.
  • Issue: too many information sources – were scraping the committee websites manually. Needed a single information source that covered all committees.
  • Major use cases: what is Congress saying; gauge congressional interest; monitor hearings; help identify areas of emerging interest
  • Future Developments: Working on integrating the Congress.gov API for broader data access (it covers much but not all committee information, with limitations in information scope). Also interested in other information, such as press releases, which are unavailable on Congress.gov. Hope to parse PDF/XML versions of legislation to identify mandates for GAO to do work.
  • Took six data scientists over six months.
  • GAO currently publishes its reports online as PDFs, but may consider publishing them in another format, such as plain text, so that they can be searched.
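
To make the "top 3 relevant reports" idea concrete, here is a minimal sketch of one way such matching could work. This is not GAO's method, which was not described in detail. It is a simple word-overlap (Jaccard) ranking, and the function names and report titles are made up for illustration.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word set, ignoring words of three letters or fewer."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two word sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_matches(hearing: str, reports: dict[str, str], k: int = 3) -> list[str]:
    """Return ids of up to k reports with nonzero overlap, best match first."""
    h = tokens(hearing)
    scored = [(jaccard(h, tokens(title)), rid) for rid, title in reports.items()]
    scored = [(score, rid) for score, rid in scored if score > 0]
    # Highest score first; ties broken by report id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rid for _, rid in scored[:k]]


# Made-up titles for illustration only; these are not real GAO reports.
reports = {
    "R1": "Federal cybersecurity risks in agency networks",
    "R2": "Military housing conditions and oversight",
    "R3": "Cybersecurity workforce shortages across federal agencies",
    "R4": "Aviation safety inspections",
}
print(top_matches("Hearing on federal cybersecurity oversight", reports))
print(top_matches("Hearing on quantum computing", reports))
```

The first call ranks R1, R3, and R2 in that order. The second returns an empty list, which is the "no matches" case the presentation flagged as a possible signal of emerging congressional interest.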

Government Publishing Office (GPO) Updates

Lisa LaPlant and Amanda Dunn, GPO:

  • GovInfo Expansion: Added 700 House and Senate hearings from 1946-1982 and House reports from the 94th Congress Serial Set (1975-1976).
  • Coming soon: GPO API and USLM 2.0.x: GPO has had a public API for years, and is currently developing a search service API – the ability to perform searches and get reports back in JSON format. Hope to get a sample working this summer. For example, could do bill comparisons and get machine readable search results back.
  • USLM: USLM 2.0.x schema is moving out of draft status. Look for sample amendment files soon.
  • XPUB: Upcoming release of XPUB system – will have congressional bills and public laws. Responsive HTML format for bills is coming soon.

Access to Congressionally Mandated Reports Act (ACMRA)

Presentation by Amanda Dunn, ACMRA’s project manager

  • ACMRA Guidance recently released; agencies must certify they are complying and provide an agency point of contact
  • Timeline for implementation is 180 days after enactment; that deadline fell on June 21, 2023
  • Expect to have the portal live by 12/23/23 and available to the public
  • Reports will be required as PDFs as well as open formats such as XLS and TXT
  • Using GPO’s content management system, askGPO, to manage the receipt of the documents
  • Working on a new CMR collection on govinfo
  • Working to identify early adopters to validate the process
  • Outreach? Working on a table of reports to track agency submissions. Unclear whether GPO will reach out to agencies that do not submit on time; still considering how outreach would work.
  • API? As GPO designs what metadata to collect, that information will be available through GPO’s API.
  • A lot of functionality will become possible as the suite of publications moves into an XML-based workflow.
  • Said an agency could choose to withhold a report or redact it (note: this is an incorrect answer).
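
The two dates in the timeline above are consistent with each other, which a quick date calculation confirms. Counting back 180 days from June 21, 2023 gives the enactment date, and one year after that date matches the 12/23/23 portal target. Reading the portal date as a one-year mark is our inference from the dates, not something stated in the presentation.

```python
from datetime import date, timedelta

# The 180-day mark reported in the meeting.
guidance_deadline = date(2023, 6, 21)

# Work backward to the enactment date implied by that deadline.
enactment = guidance_deadline - timedelta(days=180)
print(enactment)  # 2022-12-23

# One year after enactment (our assumption about where 12/23/23 comes from).
portal_target = enactment.replace(year=enactment.year + 1)
print(portal_target)  # 2023-12-23
```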

Working Groups and Projects

Congressional Staff Directory – Steve Dwyer:

  • Exploring the creation of a comprehensive legislative branch staff directory, including legislative issue coverage.
  • We view this as a data problem: trying to find or gather data that does not exist
  • Will soon provide a report on implementation of this project
  • Broad vision with an incremental plan: create a modern data graph to make better use of this data for us and (hopefully) for the public as well

Digitization of Congressional Documents – Kimberly Fergusson:

  • Efficiency in Digitization: Collaboration efforts to digitize historical materials while avoiding duplication of work. Digitization of historical materials is labor intensive

Congressional Video Preservation – Arin Shapiro:

  • Video Accessibility: Efforts to standardize the transmittal of congressional proceedings and enhance video accessibility.
  • Have a draft report – hope to finalize it in the next few months
  • Getting close in the Senate to delivering committee URLs to the LC for videos

Legislative Branch XML Technical Working Group – Kirsten Gullickson:

  • Progress in XML integration for legislative data.

Senate Update – Arin Shapiro:

  • Getting close in the Senate to delivering committee URLs to the LC for videos.
  • There are multiple sources for committee meeting proceedings. The Daily Digest will add additional information to its source, which allows us to replace our internal calendar of committee events and get back to a single calendar as the one source of information.
  • New video: Just released a new player for floor proceedings with enhanced closed captioning capabilities; also updated for committees
  • Closed captioning: Exploring ways to make closed captioning available for all hearings. Trying to make unofficial closed captioning available in near real time. This is not the same as transcripts, but should be available the same day. The more modern video feed allows for the separation of text from the video, but separating the data feeds requires coding skills. The Senate has no plan to make the text from the feed available.

Notable Developments and Discussions

  • Clerk Report (Kirsten Gullickson): Ongoing projects include the LIMS project and the comparative suite. Hope to release the comparative print House-wide soon. Still working on a centralized committee portal, but nothing to report at the moment. Still talking with the Senate about lobbyist disclosures and unique IDs.
  • House Digital Service (Ken Ward): Introduction of a committee deconfliction tool for scheduling. Officially launched to all 32 committees at the end of March. Looking to add caucus events. Floor schedule information is coming from an internal API – note there is no authoritative source for what’s scheduled on the House floor.
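
The core of any scheduling deconfliction check is spotting events that overlap in time. We do not know how the House tool is built, so the sketch below is only an illustration of that idea. The `Event` type, the `conflicts` function, and the sample schedule are all hypothetical.

```python
from datetime import datetime
from itertools import combinations
from typing import NamedTuple


class Event(NamedTuple):
    committee: str
    start: datetime
    end: datetime


def conflicts(events: list[Event]) -> list[tuple[str, str]]:
    """Return pairs of committees whose events overlap in time."""
    found = []
    for a, b in combinations(events, 2):
        # Two intervals overlap when each one starts before the other ends.
        if a.start < b.end and b.start < a.end:
            found.append((a.committee, b.committee))
    return found


def at(hour: int, minute: int = 0) -> datetime:
    """Helper for building times on a single made-up meeting day."""
    return datetime(2023, 6, 22, hour, minute)


# Hypothetical schedule for one member's committee assignments.
events = [
    Event("Committee A", at(10), at(12)),
    Event("Committee B", at(11, 30), at(13)),
    Event("Committee C", at(13), at(14)),
]
print(conflicts(events))
```

Here only Committee A and Committee B collide. Committee C starts exactly when Committee B ends, so back-to-back events are not treated as a conflict.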

Public presentation on committee.report by Daniel at Demand Progress Education Fund

  • Automatically transforms committee reports into ePUB formats
  • Available here

Prior Meetings for which we’ve published a summary

2023: March 2023 CDTF Meeting | June CDTF Meeting | September LC Virtual Public Forum | September Hackathon 5.0 | December CDTF Meeting (scheduled)

2022: December 2022 | September CDTF Meeting | September LC Virtual Public Forum | June CDTF | March BDTF | April Hackathon

2021: July BDTF | September LC Virtual Public Forum

2020: September LC Virtual Public Forum

2019: July BDTF | October BDTF

2018: February 2018 (available upon request) | June LDTC | November BDTF

2017: April BDTF (available upon request) | June BDTF (available upon request) | December Hackathon

2016: May BDTF | June LDTC (and this)

2015: May LDTC | October Hackathon

2014: February BDTF | June LDTC | December BDTF

2013: February BDTF | May LDTC

2012: April LDTC