Ronnie05's Blog

Big Data: Controlling the beast by its horns

Posted in Mobile Data & Traffic by Manas Ganguly on February 20, 2012

(This is the fourth in a series of posts on Big Data and the Internet of Things. Read the first, second and third posts here.)

“I look for hot spots in the data, an outbreak of activity that I need to understand. It’s something you can only do with Big Data.” – Jon Kleinberg, a professor at Cornell

Researchers have found a spike in Google search requests for terms like “flu symptoms” and “flu treatments” a couple of weeks before there is an increase in flu patients coming to hospital emergency rooms in a region (and emergency room reports usually lag behind visits by two weeks or so). Global Pulse, a new initiative by the United Nations, wants to leverage Big Data for global development. The group will conduct so-called sentiment analysis of messages in social networks and text messages – using natural-language deciphering software – to help predict job losses, spending reductions or disease outbreaks in a given region. The goal is to use digital early-warning signals to guide assistance programs in advance, for example to prevent a region from slipping back into poverty.
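The lead-lag relationship in the flu-search example can be checked with a lagged correlation: shift one series against the other and see which shift lines them up best. The sketch below is only an illustration on synthetic numbers. The series, the two-week delay built into them, and the helper names `pearson` and `best_lead` are all assumptions made for this example, not data or methods from the researchers.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_lead(leading, lagging, max_lag):
    """Return (lag, r) for the shift, in weeks, that best aligns the two series."""
    scores = {}
    for lag in range(max_lag + 1):
        # Pair leading[t] with lagging[t + lag].
        xs = leading[: len(leading) - lag]
        ys = lagging[lag:]
        scores[lag] = pearson(xs, ys)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Synthetic data (an assumption for illustration, not real measurements):
# a seasonal bump in weekly searches, echoed two weeks later in ER visits.
random.seed(0)
WEEKS = 52
TRUE_LAG = 2
searches = [
    100 + 80 * math.exp(-((w - 20) ** 2) / 18) + random.gauss(0, 3)
    for w in range(WEEKS)
]
visits = [
    0.5 * searches[max(w - TRUE_LAG, 0)] + random.gauss(0, 2)
    for w in range(WEEKS)
]

lag, r = best_lead(searches, visits, max_lag=6)
print(f"searches lead visits by {lag} week(s), r = {r:.2f}")
```

On real data the peak is rarely this clean, and a strong lagged correlation still says nothing about cause.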

In economic forecasting, research has shown that trends in increasing or decreasing volumes of housing-related search queries in Google are a more accurate predictor of house sales in the next quarter than the forecasts of real estate economists.

Big Data has its perils, to be sure. With huge data sets and fine-grained measurement, statisticians and computer scientists note, there is increased risk of “false discoveries.” The trouble with seeking a meaningful needle in massive haystacks of data is that “many bits of straw look like needles.”

Data is tamed and understood using computer and mathematical models. These models, like metaphors in literature, are explanatory simplifications. They are useful for understanding, but they have their limits. Privacy advocates warn that a model might spot a correlation in online searches and draw a statistical inference that is unfair or discriminatory, affecting the products, bank loans and health insurance a person is offered.
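The “false discoveries” risk is easy to reproduce. The sketch below is a minimal simulation, not anyone's published method. It correlates 1,000 columns of pure random noise against a random target and counts how many clear a conventional 5 percent significance bar. The sample size, the number of columns and the critical value 0.361 (the approximate two-sided 5 percent cutoff for a correlation on 30 observations) are assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(1)
N_SAMPLES = 30      # observations per variable (assumed)
N_FEATURES = 1000   # unrelated candidate "signals" (assumed)
R_CRITICAL = 0.361  # approx. two-sided 5% cutoff for |r| when n = 30

target = [random.gauss(0, 1) for _ in range(N_SAMPLES)]

false_hits = 0
for _ in range(N_FEATURES):
    # Each column is independent of the target by construction,
    # so any "significant" correlation is a false discovery.
    noise = [random.gauss(0, 1) for _ in range(N_SAMPLES)]
    if abs(pearson(noise, target)) > R_CRITICAL:
        false_hits += 1

print(f"{false_hits} of {N_FEATURES} pure-noise variables look 'significant'")
```

Roughly one noise column in twenty passes the test by chance alone, which is why analysts who scan many variables usually tighten the threshold or validate findings on fresh data.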

Despite the caveats, there seems to be no turning back. Data is in the driver’s seat. It’s there, it’s useful and it’s valuable, even hip. It’s a revolution. We’re really just getting under way. But the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched.

Channelizing and Structuring Big Data: Data First Thinking

Posted in Mobile Data & Traffic by Manas Ganguly on February 16, 2012

(This is the third in a series of posts on Big Data and the Internet of Things. Read the first and second posts here.)

There is plenty of anecdotal evidence of the payoff from data-first thinking. The best-known is still “Moneyball,” the 2003 book by Michael Lewis, chronicling how the low-budget Oakland A’s massaged data and arcane baseball statistics to spot undervalued players. Heavy data analysis had become standard not only in baseball but also in other sports, including English soccer, well before last year’s movie version of “Moneyball,” starring Brad Pitt.

Artificial-intelligence technologies can be applied in many fields. For example, Google’s search and ad business and its experimental robot cars, which have navigated thousands of miles of California roads, both use a bundle of artificial-intelligence tricks. Both are daunting Big Data challenges, parsing vast quantities of data and making decisions instantaneously.

The wealth of new data, in turn, accelerates advances in computing – a virtuous circle of Big Data. Machine-learning algorithms, for example, learn on data, and the more data, the more the machines learn. Take Siri, the talking, question-answering application in iPhones, which Apple introduced last fall. Its origins go back to a Pentagon research project that was then spun off as a Silicon Valley start-up. Apple bought Siri in 2010, and kept feeding it more data. Now, with people supplying millions of questions, Siri is becoming an increasingly adept personal assistant, offering reminders, weather reports, restaurant suggestions and answers to an expanding universe of questions.

Google searches, Facebook posts and Twitter messages, for example, make it possible to measure behavior and sentiment in fine detail and as it happens. In business, economics and other fields, decisions will increasingly be based on data and analysis rather than on experience and intuition.

Retailers, like Walmart and Kohl’s, analyze sales, pricing and economic, demographic and weather data to tailor product selections at particular stores and determine the timing of price markdowns. Shipping companies, like U.P.S., mine data on truck delivery times and traffic patterns to fine-tune routing. Police departments across the country, led by New York’s, use computerized mapping and analysis of variables like historical arrest patterns, paydays, sporting events, rainfall and holidays to try to predict likely crime “hot spots” and deploy officers there in advance. Companies that adopted “data-driven decision making” have been found to achieve productivity gains that were 5 percent to 6 percent higher than other factors could explain.

Big Data and the Internet of Things.

Posted in Mobile Data & Traffic by Manas Ganguly on February 15, 2012

(This is the second in a series of posts on Big Data and the Internet of Things. Read the first post here.)

With an 18-fold increase expected over the next five years, data is a new class of economic asset, like currency or gold. With a growing multiplicity of data sources, Big Data has the potential to be “humanity’s dashboard,” an intelligent tool that can help combat poverty, crime and pollution. Privacy advocates take a dim view, warning that Big Data is Big Brother in corporate clothing.

What is Big Data? A meme and a marketing term, for sure, but also shorthand for advancing trends in technology that open the door to a new approach to understanding the world and making decisions. There is a lot more data, all the time, growing at 50 percent a year, or more than doubling every two years, estimates IDC. It’s not just more streams of data, but entirely new ones. For example, there are now countless digital sensors worldwide in industrial equipment, automobiles, electrical meters and shipping crates. They can measure and communicate location, movement, vibration, temperature, humidity, even chemical changes in the air.
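The growth figures are simple compound-growth arithmetic, and a short check shows how they fit together. In the sketch below the function names are made up for illustration. The inputs are the 50 percent annual rate cited above and the 18-fold, five-year increase mentioned at the start of this post.

```python
import math


def growth_factor(annual_rate, years):
    """Total multiple after compounding `annual_rate` for `years` years."""
    return (1 + annual_rate) ** years


def doubling_time(annual_rate):
    """Years needed to double at a constant compound annual rate."""
    return math.log(2) / math.log(1 + annual_rate)


def implied_annual_rate(total_factor, years):
    """Constant annual rate that yields `total_factor` over `years` years."""
    return total_factor ** (1 / years) - 1


two_year_multiple = growth_factor(0.50, 2)    # 1.5 * 1.5
years_to_double = doubling_time(0.50)
rate_for_18x = implied_annual_rate(18, 5)

print(f"50% a year for two years: x{two_year_multiple:.2f}")
print(f"doubling time at 50% a year: {years_to_double:.2f} years")
print(f"18-fold in five years implies about {rate_for_18x:.0%} a year")
```

At 50 percent a year the total grows 2.25 times in two years, so “more than doubling every two years” holds, with the doubling point reached after about 1.7 years. An 18-fold rise in five years implies a faster pace, roughly 78 percent a year, so the two figures presumably describe different kinds of data.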

Linking these communicating sensors to computing intelligence gives rise to what is called the Internet of Things or the Industrial Internet. Improved access to information is also fueling the Big Data trend. For example, government data – employment figures and other information – has been steadily migrating onto the Web. In 2009, Washington opened the data doors further by starting Data.gov, a Web site that makes all kinds of government data accessible to the public.

Data is not only becoming more available but also more understandable to computers. Most of the Big Data surge is data in the wild – unruly stuff like words, images and video on the Web and those streams of sensor data. It is called unstructured data and is not typically grist for traditional databases. But the computer tools for gleaning knowledge and insights from the Internet era’s vast trove of unstructured data are fast gaining ground. At the forefront are the rapidly advancing techniques of artificial intelligence like natural-language processing, pattern recognition and machine learning.
