Archive for March, 2011


Just Right

March 22, 2011

Señor Jorge Luis Borges, light has always been, and I think will always be, a metaphor for wisdom (or knowledge or information or data).  From that perspective, the observation that everyone (current and historical) spends about the same proportion of income on lighting can be related to your interest area of information overload.  Just as there is a ‘right’ amount of light based on the technologies and resources available to a person, maybe there’s a ‘right’ amount of information as well, and maybe overload is just a transient situation.  We don’t need the example of Funes the Memorious to tell us that infinite memory and infinite information processing are a burden that no one wants.  However, as stated in this article:

“Humans make errors. We make errors of fact and errors of judgment. We have blind spots in our field of vision and gaps in our stream of attention. Sometimes we can’t even answer the simplest questions. Where was I last week at this time? How long have I had this pain in my knee? How much money do I typically spend in a day? These weaknesses put us at a disadvantage. We make decisions with partial information. We are forced to steer by guesswork. We go with our gut.

“That is, some of us do. Others use data.”

This data is coming from all types of sensors and other modern technologies.  You may ask: if there is a human need for a fixed amount of information and we didn’t have all these sensors back in the day, how was the need for information satisfied?  My feeling is that the need was first satisfied by looking out for tigers, wolves and poisonous mushrooms, and later by the quest for self-consciousness.

“Samkhya marked a new stage in self-consciousness.  People in India were becoming aware of a self that was obscured by the confusion of daily life, hidden in our bodies, fettered by our instincts, and only dimly aware of itself.” (p. 229-230, The Great Transformation: The Beginning of Our Religious Traditions by Karen Armstrong)

“Long before Freud and Jung developed the modern, scientific search for the soul, the yogins of India had already begun to explore and analyze the unconscious realm with unprecedented vigor.  These vrittis and vasanas had to be annihilated, ‘burned up.’  Only then could the self detach itself from the chaos of its psychic life, throw off the toils of nature, and experience the bliss of moksha.  And this herculean feat could be achieved only by sheer mental force.” (p. 232)

Let me finish up this post with an extended quotation from my magazine article about business analytics.  I promised I would elaborate more on the article and I will in due time.

“Baniya merchants of the Mughal Empire, burgher merchants of the Swedish Empire, and chonin merchants of the Tokugawa Shogunate had the same questions on their mind as businesspeople do today. To which townspeople should I sell my wares? Of folks that buy from me, are there any that might stop buying from me? Which groups buy which goods? Which saris should I show Ranna Devi to make as much money as I can? How much timber will people want in the coming weeks and months?

“The world has changed over the centuries with globalization, rapid transportation, instantaneous communication, expansive enterprises, and an explosion of data and signals along with ample computation to process them. In this new age, many continue to answer the aforementioned and other critical business questions in the old-fashioned way, i.e. based on intuition, gut instinct, and personal experience. In our globalized world, however, this is not sufficient anymore and it is essential to replace the businessperson’s gut instinct with science.

“That science is business analytics.”


Vincent Yan Fu Tan

March 15, 2011

We start off the interviews with Vincent Tan, currently a postdoctoral researcher at the University of Wisconsin and formerly a co-blogger of mine on the LIDS blog.  This interview has been edited for length.

Kush: You have done quite a bit of theoretical analysis on learning probabilistic graphical models (Markov random fields), especially in the asymptotic regime. Have any of your results been counterintuitive or surprising? Do you think that counterintuitive results advance the field more than intuitive ones?

Vincent Tan: Most of the results described in my PhD are intuitively pleasing. However, one exception is a result that I developed in collaboration with Choi, Anandkumar and Willsky in the project in which we endeavoured to learn latent tree-structured graphical models. Such models are characterized by the fact that only a subset of variables is observed; the other variables are hidden. The use of latent variables in modelling has been shown to be an extremely useful construct. For example, in computer vision, a very active area of research is the use of the scene context, which can be regarded as a hidden variable. Indeed, if one knows that an image is that of an airport runway, then one would expect to find airplanes and certainly not animals. We were able to show that there are simple, intuitive algorithms that can consistently recover the latent tree structure under relatively mild assumptions on the underlying model. This result is somewhat counterintuitive because it seems impossible to identify probabilistic models consistently given only partial information, yet we are able to demonstrate that this can indeed be done with low sample and computational complexity.

Counterintuitive results challenge the scientist to examine prevailing assumptions and to think deeply about the consequences of the results. For example, our project on learning latent trees raises the question of whether we can further relax the existing assumptions or develop new algorithms to learn latent graphical models which have loops and hence have greater modeling power. However, greater modeling power does not equate to better predictive power, due in part to overfitting. Of course, counterintuitive results also have the ability to raise eyebrows in the community and to make other scientists take notice of one’s work (which can only bode well for one’s citation index). I believe that both intuitive and counterintuitive results are of value, but care has to be exercised in the application of the latter to ensure that all assumptions made are valid.

Lav: In San Diego last month, Sergio Verdú presented Shannon’s inequality $P_e \ge \frac{1}{6} \frac{H(X|Y)}{\log M + \log \log M - \log H(X|Y)}$. Do you think it is good, bad, or ugly?

VT: I thought that the talk was very interesting. It’s a little-known inequality that the information theory community ought to know about. Having said that though, this inequality seems to be of limited utility given that Fano’s inequality can do the job equally well in many scenarios. If Prof. Verdú had given a concrete example of how “Shannon’s inequality” can be used in a “real-life scenario” (and yield better results than Fano’s), then the information theory community would stand up and take notice. Otherwise, it’s as good as a homework problem. Come to think of it, it should be made a homework problem (with copious amounts of hints though).
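
As a rough numerical aside on this exchange (not something from the talk or the interview), the bound quoted in the question can be set next to a weakened form of Fano’s inequality, P_e ≥ (H(X|Y) − log 2) / log M. The sketch below assumes natural logarithms throughout, entropies in nats, and 0 < H(X|Y) ≤ log M; the function names and the test values are made up for illustration.

```python
import math


def shannon_bound(h_cond: float, m: int) -> float:
    """Lower bound on P_e from the inequality quoted in the question.

    Assumes natural logs, h_cond in nats, m >= 2 and 0 < h_cond <= log(m).
    """
    log_m = math.log(m)
    if not 0.0 < h_cond <= log_m:
        raise ValueError("need 0 < H(X|Y) <= log M")
    denom = log_m + math.log(log_m) - math.log(h_cond)
    return h_cond / (6.0 * denom)


def weak_fano_bound(h_cond: float, m: int) -> float:
    """Weakened Fano bound, clipped at zero (same units as above).

    Follows from H(X|Y) <= h(P_e) + P_e log(M - 1) <= log 2 + P_e log M.
    """
    return max(0.0, (h_cond - math.log(2)) / math.log(m))


if __name__ == "__main__":
    m = 16  # hypothetical alphabet size
    for h_cond in (0.1, 0.5, 1.0, 2.5):  # hypothetical H(X|Y) values in nats
        print(f"H={h_cond:4.1f}  Shannon={shannon_bound(h_cond, m):.4f}  "
              f"weak Fano={weak_fano_bound(h_cond, m):.4f}")
```

In this toy setting the weakened Fano bound is vacuous whenever H(X|Y) is below log 2, while the quoted bound stays positive; at the larger values of H(X|Y) tried here the weakened Fano bound is the larger of the two. The unweakened Fano inequality is not compared here.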

K: You have a keen interest in public affairs as evidenced by your many letters in The Straits Times. How was it that you chose to enter into electrical and information sciences rather than pursue other academic interests?

VT: I was supposed to go into public service as a diplomat or a high school teacher after my undergraduate studies at Cambridge. The Public Service Commission of Singapore sponsored my undergraduate studies and expected me to work in the civil service upon graduation. However, I chose to do research in part because I enjoyed the mathematical content as an undergraduate and hoped to do more as a PhD student and later on as a researcher. Hence I decided to take up the A*STAR PhD fellowship to pursue my PhD at MIT. I am not ruling out a return to the public service in future.


The Heart of Lightness

March 12, 2011

असतोमा सद्गमय।

तमसोमा ज्योतिर् गमय।

मृत्योर्मामृतं गमय॥

Viscount Marlow, I know you have some interviews in the works, but I’ve been hearing from certain elements of the informational rabble that posting gaps have been negatively impacting procrastination activities.  As such, let me put up a post (though my guess is that the level of procrastination is independent of the quality of procrastination activities).

Building on the curation/creation debate that has been going on, what do you think of using curation as a springboard for new creation: a kind of curation-based synthesis?  As one example, a paper on the C. elegans neuronal network that was recently published (and that has an interesting metrics framework) started out as a work of data curation, but led to new discoveries.  Connecting culturomics with Zipf’s Law is another example where large scale curation led to new discoveries.  But both of these examples are data analytics-based.

More in line with your take on curation/creation, e.g. by your mention of the Huffington Post, let me see if what I do next will be considered sponging or new synthesis.

So over the past few days, I’ve read three things.

One is a news article, the next is a blog entry, and the last is a book review, but all three are essentially about light bulbs.  The news article talks about the new laws that will strongly limit the sale of incandescent light bulbs, ending what one might call the era of Edison.  The book review mentions how significant the advent of the Edison era was, saying “More than any other breakthrough, the combination of electric power and lighting has changed the course of humanity for the better in a briefer span than all the advances before it, going back to the advent of fire itself.”  Having not gone to electrical engineering school at the beginning of the Edison era, I know very little about traditional lighting.  Though having gone to school as the Edison era ended, I did learn a little bit about nanophotonics.

Anyway, the most interesting among the three is the blog post, which talks across eras and says:

“Consider what’s happened with lighting over the past three centuries. As people have switched from candles to oil-powered lamps to incandescent bulbs and beyond, the amount of energy needed to produce a unit of light has plummeted. Yet people have found so many new places to light that today we spend the same proportion of our income on light as our much poorer ancestors did in 1700.”

More specifically, this statement is based on a paper that shows historical data is “consistent with a simple expression in which per-capita consumption of artificial light varies linearly with the ratio between per-capita gross domestic product and cost of light.”  The authors make no serious attempt to explain the origin of the observed phenomenon.  They do conjecture, however, that similar phenomena might be found in historical studies of other things with environmental impact, such as transportation.  These are related to what is often called the rebound effect.
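
To see why that expression implies the constant spending share in the blog post quoted above, write per-capita light consumption as beta * gdp / cost_of_light.  Spending on light is then consumption * cost_of_light = beta * gdp, so the share of income going to light is beta no matter how cheap light becomes.  Here is a small sketch of that arithmetic; the constant beta, the era labels, and all the numbers are made-up placeholders, not values from the paper.

```python
import math


def light_consumption(gdp_per_capita: float, cost_of_light: float, beta: float) -> float:
    """Per-capita light consumption under the linear relation (beta is hypothetical)."""
    return beta * gdp_per_capita / cost_of_light


def income_share_on_light(gdp_per_capita: float, cost_of_light: float, beta: float) -> float:
    """Fraction of income spent on light: consumption times unit cost, over income."""
    consumption = light_consumption(gdp_per_capita, cost_of_light, beta)
    return consumption * cost_of_light / gdp_per_capita


if __name__ == "__main__":
    beta = 0.01  # hypothetical proportionality constant
    # (label, income per person, cost per unit of light), all in made-up units
    scenarios = [
        ("candles", 1_000.0, 50.0),
        ("oil lamps", 3_000.0, 5.0),
        ("incandescent", 30_000.0, 0.01),
    ]
    for label, gdp, cost in scenarios:
        used = light_consumption(gdp, cost, beta)
        share = income_share_on_light(gdp, cost, beta)
        print(f"{label:>12}: light used = {used:12.1f}, income share = {share:.4f}")
        assert math.isclose(share, beta)
```

Consumption rises by orders of magnitude across the made-up scenarios while the income share stays pinned at beta, which is the pattern the quote describes.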

Let me put forth a potential explanation for this phenomenon.  Light is a well-known metaphor for truth: tamaso mā jyotir gamaya (from darkness, lead me to light), and its derivative Dark Side.  But moreover it can be thought of as a tool for shaping moral spaces.

Interestingly, there have been several studies in several areas showing a kind of psychological homeostasis and regulatory balancing with respect to things like risk or morality.

So the explanation I am putting forth is that expenditures on light are rather like altruistic behaviors and are held in balance through moral regulation.  Not only that, but as lighting becomes more efficient, social norms on what is meant by “well-lit” tend to rise.  Thus the dynamic interaction between morality and social norms leads to a consistent fraction of income being spent on light over time.

I wonder how psychophysical laws play into things.