As part of our ongoing series on Productionizing Hadoop and Spark in the cloud, we explore performance optimization and how companies scale and tune for the best performance. We also discuss what’s required for production-grade deployments, an often-underestimated part of the process.
Leaders of data science and advanced analytics groups keep raising one interesting theme: all are focused on making their teams as productive as possible. Talent for these teams is notoriously hard to find. So, naturally, team leaders want to ensure that these scarce, highly skilled workers have everything they need to be efficient. Here are the most common pitfalls we hear about. Do you agree?
Over the past few years, I have observed a deepening organizational divide in large data-driven companies. On one hand, IT and data owners have their hands full managing their current data infrastructure and platforms.
Japanese rock gardens, or zen gardens, were first constructed centuries ago at temples as aids to meditation. Also called “dry landscapes,” zen gardens are designed as miniature models of natural landscapes. This practice of artfully modeling the world in miniature seemed like a beautiful analogy to launch our new Data Science Sandbox as a Service…
In the past, protecting and securing enterprise data was simpler, handled mainly through basic perimeter-based devices like firewalls and intrusion protection services. As more and more enterprises look to migrate or augment their big data clusters with the cloud, the number of access points to their data continues to increase exponentially. For the modern enterprise, perimeters are almost gone. Thorough security and compliance measures for this newly distributed data are now a top priority for CISOs and security teams, and are well covered in several recent articles around the web.
For our upcoming webinar, we’re proud to feature guest speaker Mike Gualtieri, Forrester VP and principal analyst, an industry favorite. Why do we like him – especially on these topics? Well, as an industry analyst, Mike has a fascinating coverage area (bio), which includes big data and IoT strategy, Hadoop/Spark, predictive analytics, streaming analytics, prescriptive analytics, machine learning, data science, Artificial Intelligence, and emerging technologies.
In an upcoming webinar, we'll get up to date on Cloud Data Warehousing with guest speaker Noel Yuhanna, Forrester principal analyst; Prat Moghe, Cazena founder and former Netezza SVP; and... you! Noel will present a succinct overview of cloud data warehousing, a market he's tracked as an analyst since its inception. Then, we've planned lots of time for your questions and an interactive discussion, moderated by yours truly.
This week Cazena made a major announcement, not coincidentally timed with Strata + Hadoop World. We seriously enhanced our Data Lake as a Service, which is based on Cloudera Enterprise, runs on Microsoft Azure or AWS, and includes many new features for data science. Read more here. It’s been exciting to see the momentum in the Big Data as a Service category, and I loved sharing the news at Strata. Walking through the buzzy hum of expo-floor conversations, I overheard the same terms over and over.