Information Theorists: The 'Physicists' of Data Flow

Somewhere out there, a computer is trying to predict what you'll do next. 

"Information in Small Bits" was distributed to all participants at the ITA Workshop. The book combines comics with narration to introduce the basic notions of information theory, the theory that deals with how information is measured, processed, stored and compressed.

Put like that, it sounds ominous, but data mining and prediction models are what get you a ride any time you need one or help you find the love of your life. There’s a demand for this kind of thing, and where there’s demand, there’s data.
 
Picture electrical engineers, mathematicians and computer scientists with hard hats and wrenches and you get a metaphorical idea of what information theorists do. Information theory – the back-end “plumbing” of network computing – is what makes data flow. Information theory is, as it sounds, theoretical, but the Information Theory and Applications workshop, held each spring in San Diego (and a huge draw, typically with more than 500 participants), also focuses on the ways these theories can be applied, be it in business, finance, computer science or robotics.
 
The data, in the case of business and finance for example, are a user’s behavioral tendencies, her prior history or demographic information. Instead of flowing through pipes, the data flow through network channels, compressed and encrypted, encoded and decoded, diverted and redirected here and there by “gates” born of mathematical algorithms and, increasingly, machines that have learned to navigate the tides of data themselves.

Knowing the “physics” of this data flow is crucial for making sense of it, and for mining data to build the consumer prediction models that underlie everything from Uber to Amazon. Speed and efficiency are key in answering the classic question of information theory: What is the minimal amount of information we need to solve the problem?
 
And these are no mere mathematical exercises. Participants at this year’s ITA conference – which is organized by the Information Theory and Applications Center at UC San Diego’s Qualcomm Institute – have enabled some of the most successful hedge funds, helped Uber develop its price-based scheduling algorithm and, well, ever hear of Amazon keyword search? They did that, too.
 
Five to 10 years ago, information theorists were focused on how electrical and computer engineers could support the physical hardware needed for data flow, such as wireless antennae and cables for high-speed networking (where the electrical engineers come in).
 
“Now the problem is: We have all this data … how can we mine it?” says Massimo Franceschetti, a professor of Electrical Engineering at UC San Diego, a QI affiliate and an organizer of the conference. “Big data has led to a resurgence of interest in optimization, of knowing how to use that data.”
 
The more data you have, the more data you have to work with. Patterns emerge, and thus, predictions. Increasingly, machines can be programmed to do this kind of work through machine learning, the underlying framework of artificial intelligence. With image recognition, for example, machines are being taught to autonomously see patterns in the data that humans can’t.
 
ITA plenary session speaker Richard Baraniuk of Rice University discussed the grand challenge of developing computational algorithms that match or outperform a human’s ability to recognize what he sees. For a computer to “recognize” a visual object requires it to infer the unknown object position, orientation, and scale – something humans do spontaneously. Similarly, recognizing speech involves knowing voice pronunciation, pitch, and speed. Baraniuk noted that “a new breed” of algorithms has emerged for such tasks that compare with human capabilities, or even exceed them.
 
“But a fundamental question remains,” noted Baraniuk. “Why do they work?”
 
That's where the mathematicians come in. Understanding this – by way of a new probabilistic framework – provides "insights into their successes and shortcomings, a principled route to their improvement, and new avenues for exploration," says Baraniuk.
 
Deep learning – a much-talked-about class of machine-learning methods – makes artificial intelligence even more powerful by enabling a machine to learn certain features of data and then act, rather than merely learning how to use algorithms to complete specific tasks.
 
Plenary session speaker Sanjeev Arora of Princeton University presented an overview of the open questions concerning deep learning. Still “under construction” are methods for optimization, generalization and the ability of deep learning networks to represent interesting distributions – in other words, to make sense of chaotic data.
 
Although Baraniuk and Arora both addressed deep learning, they are from diverse fields – an effort, by design, to provide ITA participants with complementary views of related topics. Other topics at the conference included resource scarcity, active learning, information theory & learning, and “the future of money.”
 
It’s that last category that puts the limitations of information theory into starkest relief. Financial markets are, to a large degree, unpredictable information systems, chiefly because they are social. The way information flows is driven by human behavior, and humans, well... who knows what we'll do. And what others will do because of it.
 
“Everybody wants to predict human behavior,” says Franceschetti. “But if you win and make some money, it also means someone else is losing. And you’d better keep your prediction method to yourself. It can’t work if everybody is using it; or if someone finds a better one. The system, the variables, they're always changing.”