Cloudera has announced the general availability of Cloudera Data Science Workbench, a self-service tool for data scientists. The solution, first announced in beta at Strata + Hadoop World San Jose 2017, gives businesses fast, easy and secure self-service data science.
“We are entering the golden age of machine learning, and everything is focused on data. However, data scientists continue to struggle to create and test new analytics projects at the desired speed, especially in large environments,” said Charles Zedlewski, senior vice president of Products at Cloudera. “Data Science Workbench is a self-service tool that accelerates the ability to create, scale, and deploy machine learning solutions using the most powerful technologies. This means that data scientists now have the freedom to collaborate, share and manage their data in the way that best suits them and their business, making it easier and quicker to get to production.”
With Python, R and Scala directly in the web browser, Cloudera Data Science Workbench offers a self-service data science experience, giving users the ability to download and experiment with the latest libraries and frameworks in customizable project environments. Cloudera Data Science Workbench is secure and supports Hadoop authentication, authorization, encryption and governance.
The Office for National Statistics (ONS), the UK's largest independent producer of official statistics, aims to use Cloudera Data Science Workbench to create repeatable, accurate and transferable statistical research. “We have seen less time spent on model development and greater visibility when tracking progress and results,” says Simon Sandford-Taylor, Chief Technology Officer. “We believe that Cloudera Data Science Workbench has the potential to speed up our release schedule and help us share best practices more effectively.”
Cloudera Data Science Workbench integrates easily with many deep learning frameworks, including BigDL, a deep learning library for Apache Spark that was open sourced by Intel. Built to run on distributed Spark/Hadoop infrastructure, with performance optimized for Intel Xeon processors (using the Intel Math Kernel Library), BigDL works directly within Cloudera Data Science Workbench.
“Enterprise customers require a logical platform to scale their analytics solutions and maximize their investment. BigDL’s native integration with Apache Spark brings the world of deep learning into the Apache Spark ecosystem and delivers greater value for business customers,” said Michael Greene, vice president and general manager of System Technologies and Optimization, Software and Services Group, Intel Corporation. “The BigDL framework will help business customers make better use of existing investments to build their own analytics capabilities with optimized performance on Intel architecture.”
The benefits of integrating BigDL into Data Science Workbench include the ability to use deep learning libraries and techniques on CPU architecture without additional hardware or separate environments. The combination provides a convenient way to build Spark data science pipelines natively and integrate them with the deep learning library (BigDL) and other Spark/Hadoop components on Cloudera Data Science Workbench.