Congressional Hackathon 5.0: September 14, 2023

The Speaker of the House, Minority Leader of the House, and the CAO co-hosted Congressional Hackathon 5.0 on September 14, 2023. Video is available for part 1 and part 2 of the event, along with a summary video, and the official report has just been released. The following is our recap of the event.

Congressional Data Task Force Meeting Set for March 19, 2024

The Congressional Data Task Force announced it will hold its next public meeting on March 19, 2024, from 2-4pm ET. This hybrid meeting will take place both in person and virtually.

Congressional Calendar 2024 for Google and iCal

The House of Representatives and Senate follow their own calendars for when they are in session. They each make their respective information publicly available, but generally not in a digital format. The print PDFs that they publish are likely hard to read for people with visual disabilities. And there’s no official combined version.

In light of these omissions, I’ve gone ahead and published a combined House and Senate calendar in a digital format. You can use the following link to access the calendar for your Google calendar account, access a public-facing iCal version, and view the calendar from a web browser.
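
As a rough illustration of what a combined calendar in iCal form involves, here is a minimal Python sketch that merges two lists of session days into one iCalendar (RFC 5545) text. It is not the method used to produce the published calendar. The function name, the PRODID and UID values, and the sample dates in the usage note are all hypothetical placeholders, not actual session dates.

```python
from datetime import date, datetime, timedelta, timezone


def build_combined_calendar(house_days, senate_days, stamp=None):
    """Return iCalendar text with one all-day event per chamber per session day."""
    stamp = stamp or datetime.now(timezone.utc)
    stamp_text = stamp.strftime("%Y%m%dT%H%M%SZ")
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//combined-congress-calendar//EN",  # hypothetical identifier
    ]
    sessions = [("House", d) for d in house_days] + [("Senate", d) for d in senate_days]
    # Sort by date first, then by chamber name, so the output is stable.
    for chamber, day in sorted(sessions, key=lambda pair: (pair[1], pair[0])):
        next_day = day + timedelta(days=1)  # DTEND is exclusive for all-day events
        lines += [
            "BEGIN:VEVENT",
            f"UID:{chamber.lower()}-{day:%Y%m%d}@example.invalid",  # hypothetical UID scheme
            f"DTSTAMP:{stamp_text}",
            f"DTSTART;VALUE=DATE:{day:%Y%m%d}",
            f"DTEND;VALUE=DATE:{next_day:%Y%m%d}",
            f"SUMMARY:{chamber} in session",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # RFC 5545 content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"
```

Calling `build_combined_calendar([date(2024, 1, 9)], [date(2024, 1, 8), date(2024, 1, 9)])` with these placeholder dates yields three all-day events that a calendar client should be able to import. Keeping each chamber as its own event, rather than merging them into a single "Congress in session" entry, preserves the days when only one chamber meets.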

Congressional Data Task Force Meeting on December 19, 2023

The Congressional Data Task Force held its third quarterly meeting on December 19, 2023, in the Longworth House Office Building. The agenda, video, and slides are available here. Next year’s meetings are tentatively scheduled for: March 19, June 6, and December 12, 2024.

Highlights and Key Takeaways

Congressional Data Task Force Meeting Set For December 19, 2023

The next Congressional Data Task Force Meeting is set for December 19, 2023 from 2:00 – 4:00 pm EST.

Library of Congress Virtual Public Forum: September 13, 2023

The Library of Congress held its Virtual Public Forum on Congress.gov on September 13, 2023. Video of the proceedings is available here.

Congressional Hackathon 5.0 Set for Sept. 14, 2023

Today Speaker McCarthy and Minority Leader Jeffries announced Congressional Hackathon 5.0, set for September 14, 2023 at the US Capitol. The official announcement is here. Here’s how they describe it:

“This event will bring together a bipartisan group of Members of Congress, Congressional staff, Legislative Branch agency staff, open government and transparency advocates, civic hackers, and developers from digital companies to explore the role of digital platforms in the legislative process. Discussions will range from data transparency, to constituent services, public correspondence, artificial intelligence, cybersecurity, committee hearings, and the broader legislative process.”

Blog Post: Highlights from the June 2023 Congressional Data Task Force Meeting

Introduction

The Congressional Data Task Force meeting held on June 22, 2023, brought several significant updates and announcements in the realm of congressional data management and accessibility. Video and slides are available here.

Key Personnel Changes

  • Clerk Johnson’s Resignation: Announced her departure, effective June 30.
  • Deputy Clerk Kevin McCumber: Sworn in as acting Clerk, officially taking office in July.

Historical Insight

  • Workload Increase: The number of bills in the 117th Congress rose to 11,461, reflecting an increase of 1,000 bills over the previous Congress.

Reports from the Library of Congress

Updates by Kimberly Fergusson, LOC:

  • Feedback: Encouraged feedback through the site’s feedback form.
  • Congress.gov API: Everything on Congress.gov will soon be available from the Congress.gov API. There’s a changelog on GitHub that talks through the API milestones and is a good place to submit updates.
  • Future Plans: Announced a public forum at the Library of Congress in September.
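
For readers who have not used the Congress.gov API, the sketch below shows one way to build a request URL for a single bill in Python. It assumes the v3 base URL, the `bill/{congress}/{type}/{number}` path, and the `api_key` and `format` query parameters as we understand them from the public documentation; please confirm against the current docs before relying on it. The function name `bill_url` and the key value are placeholders.

```python
from urllib.parse import urlencode

# Assumed base URL for version 3 of the API; confirm against the API documentation.
API_ROOT = "https://api.congress.gov/v3"


def bill_url(congress, bill_type, number, api_key):
    """Build the URL for one bill, e.g. H.R. 1 in the 117th Congress."""
    query = urlencode({"api_key": api_key, "format": "json"})
    return f"{API_ROOT}/bill/{congress}/{bill_type.lower()}/{number}?{query}"
```

The resulting URL can be passed to `urllib.request.urlopen` or any HTTP client, provided you supply your own API key in place of the placeholder.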

Updates by Robert Brammer, LOC:

Government Accountability Office (GAO) Innovations

Andrew Kurtzman, Assistant Director, Innovation Lab:

  • Project Sia: A platform for Congressional Activity Monitoring, which consolidates information sources for easier accessibility and analysis. Specifically, it surfaced GAO reports relevant to congressional hearings (identifying the top 3) and identified emerging areas of congressional interest (especially when there were no matches). Lead: Andrew Kurtzman
  • Ran from Sept 2020 to 2022. Working to release a new version internally to GAO.
  • Issue: too many information sources – were scraping the committee websites manually. Needed a single information source that covered all committees.
  • Major use cases: what is Congress saying; gauge congressional interest; monitor hearings; help identify areas of emerging interest
  • Future Developments: Working on integrating the Congress.gov API for broader data access (it covers much but not all committee information, with limitations noted in information scope). Interested in other information, such as press releases, which are unavailable on Congress.gov. We hope to parse PDF/XML versions of legislation to identify mandates for GAO to do work.
  • Took six data scientists over six months.
  • GAO currently publishes its reports online as PDFs, but may consider publishing them in another format, such as plain text, to be able to conduct searches.

Government Publishing Office (GPO) Updates

Lisa LaPlant and Amanda Dunn, GPO:

  • GovInfo Expansion: Added 700 House and Senate hearings from 1946-1982, plus House reports from the 94th Congress Serial Set (1975-1976)
  • Coming soon: GPO API and USLM 2.0.x: GPO has had a public API for years, and is currently developing a search service API – the ability to perform searches and get reports back in JSON format. Hope to get a sample working this summer. For example, users could do bill comparisons and get machine-readable search results back.
  • USLM: USLM 2.0.x schema is moving out of draft status. Look for sample amendment files soon.
  • XPUB: Upcoming release of XPUB system – will have congressional bills and public laws. Responsive HTML format for bills is coming soon.

Access to Congressionally Mandated Reports Act (ACMRA)

Presentation by Amanda Dunn, ACMRA’s project manager

  • ACMRA Guidance recently released; agencies must certify they are complying and provide an agency point of contact
  • Timeline for implementation is 180 days after enactment, a deadline that fell on June 21, 2023
  • Expect to have the portal live by 12/23/23 and available to the public
  • Reports will be required as PDFs as well as open formats such as XLS and TXT
  • Using GPO’s content management system, askGPO, to manage the receipt of the documents
  • Working on a new CMR collection on govinfo
  • Working to identify early adopters to validate the process
  • Outreach? Working on table of reports to track agency submissions. Unclear whether GPO will reach out to agencies that do not submit on time. Still considering how outreach would work
  • API? As we design what metadata we collect, we will have the info available through GPO’s API
  • A lot of functionality will be possible as we move the suite of publications into an XML-based workflow.
  • Says agency could choose to withhold a report or redact it (note: this is an incorrect answer)

Working Groups and Projects

Congressional Staff Directory – Steve Dwyer:

  • Exploring the creation of a comprehensive legislative branch staff directory, including legislative issue coverage.
  • We view this as a data problem: trying to find or gather data that does not exist
  • Will soon provide a report on implementation of this project
  • Broad vision with an incremental plan: create a modern data graph to make better use of this data for us and (hopefully) for the public as well

Digitization of Congressional Documents – Kimberly Fergusson:

  • Efficiency in Digitization: Collaboration efforts to digitize historical materials while avoiding duplication of work. Digitization of historical materials is labor intensive

Congressional Video Preservation – Arin Shapiro:

  • Video Accessibility: Efforts to standardize the transmittal of congressional proceedings and enhance video accessibility.
  • Have a draft report – hope to finalize it in the next few months
  • Getting close in the Senate to delivering committee URLs to the LC for videos

Legislative Branch XML Technical Working Group – Kirsten:

  • Progress in XML integration for legislative data.

Senate Update – Arin Shapiro:

  • Getting close in the Senate to delivering committee URLs to the LC for videos.
  • There are multiple sources for committee meeting proceedings. Daily Digest will agree to add additional information to its source, which lets us replace our internal calendar of committee events and get back to a single source of information.
  • New video: Just released a new player for floor proceedings with enhanced closed captioning capabilities; also updated for committees
  • Closed captioning: Exploring ways to make closed captioning available for all hearings. Trying to make unofficial closed captioning available in near real time. This is not the same as transcripts, but should be available same day. The more modern video feed allows for the separation of text from the video, but requires coding skills to separate the data feeds. No plan for the Senate to make the text available for the feed.

Notable Developments and Discussions

  • Clerk Report (Kirsten Gullickson): Ongoing projects include the LIMS project and the comparative suite. Hope to release comparative print House-wide soon. Still working on a centralized committee portal, but nothing to report at the moment. Still talking with the Senate about lobbyist disclosures and unique IDs.
  • House Digital Service (Ken Ward): Introduction of a committee deconfliction tool for scheduling. Officially launched to all 32 committees at the end of March. Looking to add caucus events. Floor schedule information is coming from an internal API – note there is no authoritative source for what’s scheduled on the House floor.

Public presentation on committee.report by Daniel at Demand Progress Education Fund

  • Automatically transforms committee reports into ePUB formats
  • Available here

Prior Meetings for which we’ve published a summary

2023: March 2023 CDTF Meeting | June CDTF Meeting | September LC Virtual Public Forum | September Hackathon 5.0 | December CDTF Meeting (scheduled)

2022: December 2022 | September CDTF Meeting | September LC Virtual Public Forum | June CDTF | March BDTF | April Hackathon

2021: July BDTF | September LC Virtual Public Forum

2020: September LC Virtual Public Forum

2019: July BDTF | October BDTF

2018: February 2018 (available upon request) | June LDTC | November BDTF

2017: April BDTF (available upon request) | June BDTF (available upon request) | December Hackathon

2016: May BDTF | June LDTC (and this)

2015: May LDTC | October Hackathon

2014: February BDTF | June LDTC | December BDTF

2013: February BDTF | May LDTC

2012: April LDTC

A Biased Yet Reliable Guide to Sources of Information and Data About Congress

Big Picture

1/ There are big gaps in the data story

2/ Even when there’s data, it may not tell the whole story

  • Info about Congress isn’t entirely reliable, even when it is official, e.g., the Congressional Record (“revise and extend”)
  • Congress historically is a paper-based institution, driven by people with agendas, and it has inconsistent archival practices, e.g. GPO established in 1860, National Archives created in 1934
  • Its institutions are built to solve a particular problem, not work for all time. Plus there’s a lot of turf wars, e.g., the former THOMAS.gov
  • Analyses, even by experts, can be unreliable because of the source data or unexpected actions. See, e.g. CRS report on the number of staff in an office (done by counting phone numbers) or the various supplementals

3/ The people who dogfood the data, such as Josh Tauberer at GovTrack, Derek Willis formerly of ProPublica, and OpenSecrets, are often forced to build more reliability and usability into the data than is available from official sources.

4/ This presentation is idiosyncratic and focuses on particular use cases. Major topics include:

  • Federal spending information
  • Oversight and accountability
  • Legislation
  • Congressional committees
  • Information about Congress
  • Money in politics and ethics
  • Other interesting and important stuff

House Publishes More Earmarks Request Data, Which We Enhance

At the end of last week, the House Appropriations Committee published all earmark requests for FY 2024 on the committee’s website, including publishing them as a spreadsheet. This is great and welcome news. For the first time, the appropriations spreadsheet separated member names into different columns and included state, district, party, and recipient address. This makes the information significantly more usable. Thank you.

In fact, it’s so usable, we spent a little time over the weekend making it even more robust. We enhanced their spreadsheet by adding bioguide IDs for each member, appropriations subcommittee codes, a standardized recipient address (with help from ChatGPT), and extracted the recipient state and zip code. We have been playing around with using the AI to categorize whether the recipient entity is a non-profit or a governmental entity. We can imagine a lot of use cases for this cleaned-up data.

The spreadsheet is available online here. We are continuing to tinker with it.
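
To give a sense of the state and zip code extraction step, here is a minimal Python sketch using a regular expression. It is not the process we used (the address standardization relied on ChatGPT), and it assumes each recipient address ends with a two-letter state code followed by a 5-digit ZIP or ZIP+4. The function name and the sample addresses in the usage note are hypothetical.

```python
import re

# Two-letter state code, then a 5-digit ZIP (optionally ZIP+4), at the end of the string.
STATE_ZIP = re.compile(r"\b([A-Z]{2})[ ,]+(\d{5})(?:-\d{4})?\s*$")


def extract_state_zip(address):
    """Return (state, zip5) from the end of an address, or (None, None) if absent."""
    match = STATE_ZIP.search(address.strip().upper())
    if match is None:
        return None, None
    return match.group(1), match.group(2)
```

For a made-up address such as `"123 Main St, Springfield, IL 62701"`, the function returns the state and the five-digit ZIP as separate values; addresses that do not fit the assumed pattern come back as `(None, None)` so they can be reviewed by hand rather than guessed at.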
