One of the most important things I noted at the conference: a considerable uptick in attendance of business people at an event previously thought to be a geek thing. There were a number of interesting announcements and achievements at Strata and you'll find many of them here. But if you were to ask me for a single winner in the battle of big data ideas, I would say it was...
Silk, a data publisher and collaborative platform, provides data visualizations on the current status of the Ebola outbreaks, representing a blend of CDC and WHO data.
How to identify fake big data products
Here are a few of the product announcements made at Strata, to give you an overview of the direction in which products and partnerships are now trending, or at least leaning.
Underscoring yet again that business users are essential to driving both the data-driven business and the big bucks to big data vendors, Platfora's Ben Werther said in his keynote that big data's "center of gravity is shifting to the business analyst and that's a really healthy thing because the person who's analyzing the data should be much more in control of the data. But that's leading to multi-structured questions and new stack requirements are emerging."
The Department of Energy's Energy Sciences Network, or ESnet, is deploying four new high-speed transatlantic links to deliver a total capacity of 340 gigabits per second.
Microsoft's idea of creating a data science marketplace takes best of show. It is the best idea in a sea of great ideas at Strata NY this year.
Put all three types of analytics together and you get an entire picture of the situation and possible solutions.
Louis Columbus has a chart of the Top 100 Enterprise Analytics Startups of 2014 in his post in Forbes.
Another area where jobs are springing up for data scientists is in mobile analytics startups. Of course, mobile companies of all types are searching high and low for more data scientists, too.
The survey of device owners revealed widespread misunderstanding and inherent distrust in data usage but "a willingness to share data if it will aid areas such as healthcare and education."
Savi announced today that new predictive and prescriptive analytics-based scenarios have been added to Savi Insight for the purpose of uncovering previously undetected operational and supply chain patterns.
Predixion Software, a developer of cloud-based predictive analytics software, released Predixion Insight 4.0, a predictive analytics platform, at Strata.
Revolution Analytics introduced Revolution R Open and Revolution R Plus, two new offerings that support the open-source R community.
IPsoft's Amelia, a "learning cognitive agent," is already gaining high-level work skills that its creators say will put it, or others like it, at the top C-level position one day. And a Hong Kong-based venture capital company has already appointed an algorithm to its board of directors. Will we all work for machines soon--if we can find work at all?
"Commercial banks, credit card companies and credit bureaus have dived into big data, too, mainly for marketing and fraud protection," writes John Lippert in the Washington Post. "They've mostly left advances in the field of credit scoring to upstarts." That, it turns out, is a huge mistake for banks and the consumer credit industries.
The U.S. National Institutes of Health (NIH) made multi-institute awards totaling nearly $32 million for 2014 under its Big Data to Knowledge (BD2K) initiative. One of the awards, worth $11 million, went to the University of Pittsburgh, with Carnegie Mellon University, the Pittsburgh Supercomputing Center and Yale as partners.
It's not all that unusual for a General Electric (GE) business to hit the billion-dollar mark, but as Quentin Hardy put it, GE's Internet of Things, or IoT, software business is "probably the fastest a GE business has hit the $1 billion mark." You can expect GE to pull in even more from this business arm because it has one heck of a great IoT strategy.
Yahoo!'s previous world record was 70 minutes using a large, open-source Hadoop cluster of 2,100 machines for data processing. Databricks, founded by the creators of Apache Spark, completed the Daytona GraySort, which is a distributed sort of 100 TB of on-disk data, in 23 minutes using 206 machines with 6,592 cores during this year's Sort Benchmark competition.
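To put those sort numbers in perspective, here is a rough back-of-the-envelope comparison in Python. It uses only the figures quoted above and assumes, for illustration, that the earlier record also sorted 100 TB (the input size for that run is not stated here). The function and variable names are made up for this sketch.

```python
# Back-of-the-envelope comparison of the two sort records quoted above.
# ASSUMPTION: both runs sorted 100 TB; the size of the earlier run is not
# given in the text, so treat the per-machine figures as approximate.
DATA_TB = 100


def sort_stats(minutes, machines, data_tb=DATA_TB):
    """Return (cluster TB per minute, GB per minute per machine)."""
    cluster_rate = data_tb / minutes
    per_machine_rate = cluster_rate * 1000 / machines  # decimal: 1 TB = 1000 GB
    return cluster_rate, per_machine_rate


hadoop_cluster, hadoop_per_machine = sort_stats(minutes=70, machines=2100)
spark_cluster, spark_per_machine = sort_stats(minutes=23, machines=206)

speedup = 70 / 23                # how much faster the new run finished
machine_ratio = 2100 / 206       # how many fewer machines it used
per_machine_gain = spark_per_machine / hadoop_per_machine

print(f"Wall-clock speedup:      {speedup:.2f}x")
print(f"Machine count reduction: {machine_ratio:.1f}x")
print(f"Per-machine throughput:  {per_machine_gain:.1f}x")
```

Under that assumption, the newer run finished about three times faster on roughly a tenth of the machines, which works out to around 31 times the sorting throughput per machine. The comparison ignores differences in hardware between the two clusters, so it says more about the overall setup than about the software alone.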