There has been plenty of movement on the financial side of the big data ecosystem, with recent IPOs such as Tableau's and myriad companies attracting new venture capital funding. But the market only truly moves when users have options and when big data becomes ubiquitous across platforms.
Technology standards wars typically have only two camps, usually composed of major vendors and their flagship customers pushing for one standard or another: TDMA vs. CDMA, or ATM vs. IP. But it is different this time.
The New York Times today profiled Cisco CEO John Chambers' new strategy for big data and, in the process, put a fine point on the most important question for the big data market, one being answered with four different approaches by some of the biggest names in networking and computing: "The question could ultimately be whether the center of the system is in the data, as EMC thinks, or in H.P.'s servers, IBM's software, or Cisco's network."
Michael F. Whiting, program director of systematics and biodiversity science at the National Science Foundation, said this week in Futurity that scientists are grappling with how best to detect the signature of evolutionary history in a deluge of genetic data and how to resolve conflicts between studies that show different lineages for certain organisms.
There seems to be no end to the ways people can position big data, whether with praise as the next high-tech savior or derision as the latest scam. Those who dismiss the technology as a scam or get hung up on nomenclature should get over themselves and find some other itch to scratch, but meaningful debate about the best approaches to big data is helpful and should be encouraged.
Google is betting the way to users' hearts is through their photos.
Even the White House has policies around "open data standards" and has developed a set of guiding principles for implementation. But these so-called standards are in no way a set of specifications for the industry.
By running the tools, languages, and libraries normally used to build single-machine applications across multiple machines instead, Ubalo was able to cut image-processing tasks from eight hours down to five minutes.
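Ubalo's own implementation isn't detailed here, but the underlying idea, keeping the familiar shape of single-machine code while fanning the work out across workers, is easy to sketch. The Python example below is a minimal illustration using the standard concurrent.futures interface; the process_image function and the input folder are hypothetical stand-ins, and a service in Ubalo's mold would replace the local process pool with a pool of remote machines.

```python
# A minimal sketch: parallelize a familiar single-machine image-processing
# loop without rewriting it for a distributed framework. `process_image`
# and the "images" folder are hypothetical stand-ins for real work.
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def process_image(path: Path) -> str:
    # Placeholder for real per-image work (resize, filter, extract features).
    data = path.read_bytes()
    return f"{path.name}: {len(data)} bytes processed"

def main() -> None:
    images = sorted(Path("images").glob("*.jpg"))  # hypothetical input set
    # The serial version is just: [process_image(p) for p in images].
    # Mapping the same function over a worker pool keeps that code shape,
    # which is the appeal of scaling single-machine idioms outward.
    with ProcessPoolExecutor() as pool:
        for result in pool.map(process_image, images):
            print(result)

if __name__ == "__main__":
    main()
```

The design point is that the developer's mental model never changes: the same function, the same loop, only the executor differs between one machine and many.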
Big data analytics and data collection don't require atomic-scale storage and processing yet, but if the current pace of networked data growth continues, they soon will. And IBM can't do it alone.
With so many varieties of data and so many variables to consider in building an effective predictive model, the first task should be to assess the viability of the data and confirm each variable's relevance before going further.
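One common way to run that kind of relevance check is a quick screening pass before any model is built. The sketch below is one possible approach, assuming pandas and scikit-learn are available and numeric features; the dataset, column names, and thresholds are all hypothetical, and mutual information is just one of several reasonable relevance measures.

```python
# A minimal pre-modeling screen: flag candidate variables that are too
# sparse to trust or that carry no measurable signal about the target.
# Dataset, column names, and thresholds here are hypothetical.
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

def screen_features(df: pd.DataFrame, target: str,
                    max_missing: float = 0.3) -> pd.DataFrame:
    """Report missingness and target relevance for each numeric feature."""
    features = [c for c in df.columns if c != target]

    # Viability check: how much of each variable is actually populated?
    missing_frac = df[features].isna().mean()

    # Relevance check: mutual information estimates how much each feature
    # tells us about the target, independent of any particular model.
    complete = df.dropna()
    mi = mutual_info_classif(complete[features], complete[target],
                             random_state=0)

    report = pd.DataFrame({
        "missing_frac": missing_frac.round(3),
        "mutual_info": pd.Series(mi, index=features).round(3),
    })
    report["keep"] = ((report["missing_frac"] <= max_missing) &
                      (report["mutual_info"] > 0.0))
    return report

# Hypothetical usage:
# df = pd.read_csv("customers.csv")
# print(screen_features(df, target="churned"))
```

Screening this way is cheap relative to model training, and it surfaces dead-weight variables early, which is exactly the ordering the paragraph above argues for.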