December 21, 2010

As I had foreshadowed, the US Census released its first set of data today: population counts for the several states.  The country as a whole had a 9.7% increase in population, and Massachusetts (where I was counted) went from 6.35 million to 6.55 million people.  Although the Census hasn’t released more specific geographic data yet, I would guess that Cambridge is still the 4th or 5th most populous city in the Commonwealth, after Boston, Worcester, and Springfield.  The District of Columbia finally reversed its decades of population loss by gaining 29,664 people since 2000.  Unsurprisingly, the biggest gainers were in the South and West, with Nevada showing a 35.1% increase.  California now has a population that exceeds 37 million, though it is still dwarfed by Uttar Pradesh, which is the most populous subnational entity.  The only state to lose population was Michigan.  I wonder if Dave Bing can do anything about that; maybe he could bring back Eric Devendorf to promote the state along with other prominent residents.  As one would expect, the political implications of the newly released census data have already been analyzed by Nate Silver.

Although the US Census has been a source of data for hundreds of years, and indeed its tabulation was one of the launching points for IBM, the world is increasingly awash in “big data”.  (As a side note, I’ve been becoming more and more of a fan of Canadian conventions w.r.t. punctuation outside of quotation marks.)  As an example sticking to government, the CIO of the United States, Vivek Kundra, has been big on releasing data through data.gov.  As he points out in a recent report, he is also big on cloud computing and cloud-based Infrastructure-as-a-Service (IaaS) offerings.  Both definitely seem like very interesting areas for research.

Indeed, I have a feeling that the connection between information technology and information theory might grow in the future.  You might recall Ronald Kline’s article on the emergence of “Information Technology” as a keyword.  Now that Google has put up a source of big data and an analytics tool, one might even try to roughly test Kline’s basic argument in an empirical way.  This is what you get when you plug in terms such as “information theory” and “information technology”:

Figure: Google Books Ngram View

Just as Kline describes, there was a blossoming of discourse around “information technology” in the mid-1960s when prominent humanists and social scientists proclaimed the advent of a new type of society based on the processing of information.  Moreover, as Kline describes, by the early 1980s, several discourse communities including policy analysts, business writers, managers, information scientists, and social scientists had adopted the term.

Of course, one can also observe the great flowering of the term “information theory” after 1948 and its seeming decline after 1967.  Not only might big data be a boon for historians, as I have suggested here, it might also be useful for textual analysis of a different kind.

As you know, I had once suggested that the phrase “arbitrarily small probability of error” is a formula in information theory, just like there are formulae in epics and Greek mathematics.  Though a rather weak test, let us see what the data shows:

Figure: Google Books Ngram View

There is seemingly some connection, though it is perhaps not too strong.
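For readers who want to try this kind of phrase counting on a corpus of their own, here is a minimal sketch of the idea: for each year, count how often a phrase occurs and divide by the total number of word sequences of the same length in that year.  This is only my assumption about roughly the kind of normalization a tool like the Ngram Viewer performs, not a description of Google's actual pipeline.  The function name `phrase_share_by_year` and the tiny `toy_docs` corpus are hypothetical and purely illustrative.

```python
from collections import Counter


def phrase_share_by_year(docs, phrase):
    """Return {year: share of n-grams in that year equal to `phrase`}.

    `docs` is an iterable of (year, text) pairs. Tokenization is a naive
    lowercase whitespace split, which is an illustrative simplification.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    hits, totals = Counter(), Counter()
    for year, text in docs:
        tokens = text.lower().split()
        # All consecutive n-word windows in this document.
        grams = list(zip(*(tokens[i:] for i in range(n))))
        totals[year] += len(grams)
        hits[year] += sum(1 for gram in grams if gram == target)
    return {year: hits[year] / totals[year]
            for year in sorted(totals) if totals[year] > 0}


# Hypothetical toy corpus, NOT real data.
toy_docs = [
    (1950, "information theory is new"),
    (1950, "the theory of information"),
    (1985, "information technology and information theory"),
]

theory_share = phrase_share_by_year(toy_docs, "information theory")
technology_share = phrase_share_by_year(toy_docs, "information technology")
```

Plotting the resulting shares against the year would give a crude homemade version of the curves above, though a real corpus would need proper tokenization and far more text per year for the shares to mean anything.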

In what I just did, it was nice that an analytics tool was provided with the data.  But what should one do when there is no tool provided?  Our colleagues at the Watson Research Center have developed a tool called Many Eyes that is something of a general analytics platform.  So let’s see what it can do: I’ll take some data from data.gov and plug it into Many Eyes and see what happens.  I took data from November 1983 on state-by-state percentage of households with telephone service and got this:

Figure: Telephone Penetration by State, November 1983 (Many Eyes)

Unfortunately, there is no “imagesc” feature for Many Eyes, so it isn’t too useful on the present data set.  I did the same thing for November 2009 data and got:

Figure: Telephone Penetration by State, November 2009 (Many Eyes)

Again, it isn’t too useful as presented.  Separately, however, one can compute the change and plot that to get:

Figure: Telephone Penetration by State, Change from 1983 to 2009 (Many Eyes)

Now you can see something; applying analytics does require a modicum of skill.  Somehow Maryland, Illinois, and Montana have less telephone penetration in 2009 than in 1983.  I wonder why.  Excluding Montana, regions that are typically thought of as rural seem to have had the biggest gains, places like West Virginia and the Deep South.  Now if only this were put together with real historical and sociological research, perhaps I would have a story to go along with the statistics on the adoption of this technology.  I don’t think it is viral, but perhaps you have more insight into how to use data analytics to study the history of science and technology.
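The change computation behind that last plot is simple enough to sketch.  The snippet below takes two state-to-percentage tables, subtracts them for the states present in both, and sorts the result so that any declines show up first.  The function name `penetration_change` is hypothetical, and the numbers are made-up placeholders with anonymous state labels, not the actual data.gov figures.

```python
def penetration_change(before, after):
    """Return (state, change) pairs sorted from largest decline to largest gain.

    `before` and `after` map state name -> percentage of households with
    telephone service. Only states present in both tables are compared.
    """
    common = before.keys() & after.keys()
    changes = {state: round(after[state] - before[state], 1) for state in common}
    return sorted(changes.items(), key=lambda item: item[1])


# Placeholder values for illustration only, NOT the real 1983/2009 data.
pct_1983 = {"State A": 90.0, "State B": 85.5, "State C": 93.0}
pct_2009 = {"State A": 96.5, "State B": 95.0, "State C": 92.0}

ranked = penetration_change(pct_1983, pct_2009)
decliners = [state for state, change in ranked if change < 0]
```

With the real tables loaded in place of the placeholders, `decliners` would list the states whose penetration fell over the period, and `ranked` could be fed to whatever plotting tool is at hand.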
