Congressional data: A primer for non-geeks

When the legislative branch started publishing bills and resolutions online in 1995, it was heralded as a revolution in government transparency. The public at last had easy and immediate access to the text of legislation and citizens could better hold their elected officials accountable.

But technology has progressed quite a distance since then, and the internet has become a far more dynamic platform for information. Given the massive quantity of content produced on a daily basis by the government alone, transparency can no longer be defined by the existence of information as text. Instead, transparency should be defined as the accessibility and usability of information as data.

So what exactly is the difference between text and data? Text is a series of static characters and words formatted to be presented in a certain way on a page. Humans are able to derive meaning from text by reading and understanding it, but computers can do little more than store it or display it for a human to read. Data is different. Data consists of pieces of information linked to identifiers and variables. When data is published in certain formats, computers are able to automatically find and compile specific, identified pieces of information. This saves humans countless hours and improves the accuracy and completeness of the information they gather.
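
For readers who want to see the distinction in practice, here is a minimal Python sketch. Everything in it is made up for illustration: the sentence, the field names (`bill_id`, `action`, `date`, `votes`), and the helper `margin` are assumptions, not an official congressional format. It shows the same fact once as text and once as data, and how a program can pick out identified pieces of the latter.

```python
import json

# The same (invented) fact expressed two ways.

# As text: a human can read it, but a program sees only a string of characters.
as_text = "The committee reported the bill favorably on March 3 by a vote of 12 to 5."

# As data: each piece of information is attached to a named identifier.
# These field names are hypothetical, chosen only for this example.
as_data = json.loads("""
{
  "bill_id": "example-bill-1",
  "action": "reported",
  "date": "2014-03-03",
  "votes": {"yea": 12, "nay": 5}
}
""")


def margin(record):
    """Return the yea-minus-nay margin from a structured record."""
    return record["votes"]["yea"] - record["votes"]["nay"]


# A program can look up specific fields directly, with no reading required.
print(as_data["date"])
print(margin(as_data))
```

Getting the same answer out of `as_text` would require writing rules that guess where the numbers and the date sit in the sentence, and those rules would break as soon as the wording changed.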

To simplify, think about the difference between text and data in terms of information in the news. When a major weather event occurs, media outlets typically release articles about how the weather anecdotally impacted local individuals. These articles are text. Media outlets also release numbers and charts that show the temperatures and wind speeds over the same period of time. This is data. While the text may detail the effect of the event through an (implicitly or explicitly) editorialized lens, the data provides concrete information that consumers can apply widely and compile with other datasets to derive new meaning.

So what would this text-to-data transformation mean for Congress? It would mean releasing official documents, membership information, committee reports, and other pieces of legislative information in formats ready for computer processing; establishing authoritative identifiers for the many entities involved in governmental processes; and making that content available for download from a machine-crawlable location.

Despite notable effort in some quarters, Congress unfortunately has not kept up with these data demands. As of now, legislative data is difficult to access, outdated, and of low quality. Private developers who build software around this data must inefficiently pull from multiple sources, which increases the potential for errors and inconsistencies. While recent improvements in data accessibility have been encouraging (including various new bulk data downloads from the Government Printing Office), they do not adequately support today’s data processing capabilities.

Not only do developers deserve better, the public deserves better.

The Congressional Data Coalition seeks to bring together programmers, data scientists, activists, and policy experts of all ideologies to encourage Congress to improve the accessibility and usability of legislative data. This will allow developers and civic hackers to provide consumers with more authoritative, reliable, and timely information on the goings-on of Congress, leading to a more transparent government and a better-informed public.

It is time for Congress to free its data.


The original text of the Freedom of Information Act

The Freedom of Information Act was enacted twice, and the one that we know and celebrate is, technically, not the one that became law. This early history of FOIA provides an interesting case study in the complexities of the codification of our federal statutes.

What we commonly consider the Freedom of Information Act, S. 1160 in the 89th Congress, was signed by President Johnson on July 4, 1966. It became Pub.L. 89–487 / 80 Stat. 250. Its effective date was one year later on July 4, 1967, and in fact it never became law: it was repealed before its effective date. More on that below.

More on counting laws and discrepancies in the Resume of Congressional Activity

After my last post yesterday about Congress incorrectly counting the new laws in 2013, Daniel Schuman (of CREW) suggested that I look at previous installments of the Resume of Congressional Activity to see if there were other long-standing discrepancies in these historical counts of the number of laws passed by each Congress.

I went through each of the PDFs listed at… and compared the totals by Congress (a Congress is a two-year period of legislative activity), and then compared those totals to other sources.

Timeline of US legislative documents and data

  • The message, “What hath God wrought?” sent later by “Morse Code” from the old Supreme Court chamber in the United States Capitol to Samuel Morse’s partner in Baltimore, officially opened the completed telegraph line on May 24, 1844. (1)
  • The private firm, Little, Brown, and Company, began publishing the Statutes at Large under authority granted by a joint resolution of the 28th Congress. (1)
  • Charles Lanman, an author and former secretary to Daniel Webster, assembled the first collection of biographies of former and sitting Members for his Dictionary of Congress. (1)

Transparency and liberty

John McGinnis has some kind words for work I oversee at the Cato Institute in a recent blog post of his entitled: “The Internet–A Technology for Encompassing Interests and Liberty.”

As he points out, the information environment helps determine outcomes in political systems because it controls who is in a position to exercise power.

The history of liberty has been in no small measure the struggle between diffuse and encompassing interests, on the one hand, and special interests, on the other. Through their concentrated power, special interests seek to use the state to their benefit, while diffuse…