Ronnie05's Blog

Big Data: Controlling the beast by its horns

Posted in Mobile Data & Traffic by Manas Ganguly on February 20, 2012

(This is the fourth in a series of posts on Big Data and the Internet of Things. Read the first, second and third posts here.)

“I look for hot spots in the data, an outbreak of activity that I need to understand. It’s something you can only do with Big Data.” – Jon Kleinberg, a professor at Cornell

Researchers have found a spike in Google search requests for terms like “flu symptoms” and “flu treatments” a couple of weeks before there is an increase in flu patients coming to hospital emergency rooms in a region (and emergency room reports usually lag behind visits by two weeks or so).

Global Pulse, a new initiative by the United Nations, wants to leverage Big Data for global development. The group will conduct so-called sentiment analysis of messages in social networks and text messages, using natural-language deciphering software, to help predict job losses, spending reductions or disease outbreaks in a given region. The goal is to use digital early-warning signals to guide assistance programs in advance, for example to prevent a region from slipping back into poverty.
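To make the flu example concrete, here is a minimal sketch of how a lead of this kind could be measured: shift one weekly series against the other and see which shift gives the strongest correlation. The numbers are invented for illustration (they are not Google or hospital data), and the helper names `pearson` and `lag_scores` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lag_scores(leading, lagging, max_lag):
    """Correlate leading[t] with lagging[t + lag] for each lag in 0..max_lag."""
    scores = {}
    for lag in range(max_lag + 1):
        early = leading[: len(leading) - lag]
        late = lagging[lag:]
        scores[lag] = pearson(early, late)
    return scores


# Invented weekly search counts, for illustration only.
searches = [10, 12, 11, 15, 30, 55, 80, 60, 35, 20, 14, 12]
# Hypothetical emergency room visits that echo the search curve two weeks later.
visits = [5, 5] + [s // 2 for s in searches[:-2]]

scores = lag_scores(searches, visits, max_lag=4)
best_lag = max(scores, key=scores.get)
print(f"searches lead visits by about {best_lag} week(s)")
```

Real search and hospital data would be far noisier, so a single best lag like this should be read as a hint rather than a finding.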

In economic forecasting, research has shown that rising or falling volumes of housing-related search queries in Google are a more accurate predictor of house sales in the next quarter than the forecasts of real estate economists.

Big Data has its perils, to be sure. With huge data sets and fine-grained measurement, statisticians and computer scientists note, there is increased risk of “false discoveries.” The trouble with seeking a meaningful needle in massive haystacks of data is that “many bits of straw look like needles.”

Data is tamed and understood using computer and mathematical models. These models, like metaphors in literature, are explanatory simplifications. They are useful for understanding, but they have their limits. Privacy advocates warn that a model might spot a correlation in online searches and draw a statistical inference that is unfair or discriminatory, affecting the products, bank loans and health insurance a person is offered.
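The “false discoveries” risk is easy to reproduce. The sketch below is an illustration under assumed settings (1,000 random series, 20 data points each, a cutoff of 0.5), not a description of any study mentioned here. It compares a random target with series that are unrelated to it by construction and counts how many still look correlated. The function name `count_spurious` is hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_candidates, n_points, threshold, seed=0):
    """Count random series whose |correlation| with a random target exceeds threshold."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    hits = 0
    for _ in range(n_candidates):
        # Each candidate is pure noise, generated independently of the target.
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        if abs(pearson(target, candidate)) > threshold:
            hits += 1
    return hits


hits = count_spurious(n_candidates=1000, n_points=20, threshold=0.5)
print(f"{hits} of 1000 unrelated series look correlated with the target")
```

Every hit here is a piece of straw that looks like a needle. The more candidate series are searched, the more of them turn up, which is why large data sets call for stricter tests before a correlation is trusted.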

Despite the caveats, there seems to be no turning back. Data is in the driver’s seat. It’s there, it’s useful and it’s valuable, even hip. It’s a revolution. We’re really just getting under way. But the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched.

Channelizing and Structuring Big Data: Data First Thinking

Posted in Mobile Data & Traffic by Manas Ganguly on February 16, 2012

(This is the third in a series of posts on Big Data and the Internet of Things. Read the first and second posts here.)

There is plenty of anecdotal evidence of the payoff from data-first thinking. The best-known is still “Moneyball,” the 2003 book by Michael Lewis, chronicling how the low-budget Oakland A’s massaged data and arcane baseball statistics to spot undervalued players. Heavy data analysis had become standard not only in baseball but also in other sports, including English soccer, well before last year’s movie version of “Moneyball,” starring Brad Pitt.

Artificial-intelligence technologies can be applied in many fields. For example, Google’s search and ad business and its experimental robot cars, which have navigated thousands of miles of California roads, both use a bundle of artificial-intelligence tricks. Both are daunting Big Data challenges, parsing vast quantities of data and making decisions instantaneously.

The wealth of new data, in turn, accelerates advances in computing – a virtuous circle of Big Data. Machine-learning algorithms, for example, learn on data, and the more data, the more the machines learn. Take Siri, the talking, question-answering application in iPhones, which Apple introduced last fall. Its origins go back to a Pentagon research project that was then spun off as a Silicon Valley start-up. Apple bought Siri in 2010, and kept feeding it more data. Now, with people supplying millions of questions, Siri is becoming an increasingly adept personal assistant, offering reminders, weather reports, restaurant suggestions and answers to an expanding universe of questions.

Google searches, Facebook posts and Twitter messages, for example, make it possible to measure behavior and sentiment in fine detail and as it happens. In business, economics and other fields, decisions will increasingly be based on data and analysis rather than on experience and intuition.

Retailers, like Walmart and Kohl’s, analyze sales, pricing and economic, demographic and weather data to tailor product selections at particular stores and determine the timing of price markdowns. Shipping companies, like U.P.S., mine data on truck delivery times and traffic patterns to fine-tune routing. Police departments across the country, led by New York’s, use computerized mapping and analysis of variables like historical arrest patterns, paydays, sporting events, rainfall and holidays to try to predict likely crime “hot spots” and deploy officers there in advance. In one study, companies that adopted “data-driven decision making” achieved productivity gains that were 5 percent to 6 percent higher than other factors could explain.
