DATA LAKE AS A SERVICE
TURNKEY HADOOP SOLUTIONS FOR AZURE AND AWS
 
Cazena's Data Lake as a Service

Cazena's Data Lake, or Data Repository, as a Service helps companies consolidate high-volume data sources, such as sensor and IoT data, social media, logs and streaming data, into a single environment. It supports use cases as diverse as historical analysis, archiving and cross-channel customer behavior analytics.

It’s a simple, cost-effective way to store large volumes of data in the cloud while keeping those datasets available for analysis. Cazena Data Lakes can easily be used in conjunction with an enterprise data warehouse to stage and pre-process incoming data, or to archive historical data cost-efficiently, with enterprise-grade security, scalability and cloud economics.

Our Data Lake as a Service is powered by Cloudera Hadoop, Spark and other components, so you have peace of mind knowing that your data lake is built on industry-leading best practices and best-of-breed technologies.

What is a Data Lake as a Service?

“Data lake” is a recently popular term for a data repository. This service is typically configured with lots of storage, because companies use data lakes to consolidate data, collect near-real-time data, or otherwise capture and store large volumes of data. A data lake as a service is delivered in the cloud; in Cazena's case, on our big data platform as a service.

The data lake service is often used as a cloud repository to collect diverse data from a variety of sources - capture everything inexpensively (and outside the firewall!), then figure out what to do with it later.

But a data lake or analytic data repository is much more than just storage. Analytic and data transformation capabilities are critical, allowing analysts to use SQL or build the lake into data pipelines and workflows.

Cazena’s Data Lake as a Service includes all capabilities for big data processing in the cloud, delivered through an easy-to-use set of web interfaces. The Data Lake is deployed on Cazena's platform as a service, which can be hosted on Amazon Web Services or Microsoft Azure. It’s simple: handle big data projects with no complex cloud components to configure.

How Companies use Cazena's Data Lake as a Service

Cazena is the fastest way to get a production-grade Cloudera Hadoop or Spark deployment in the cloud, which means that companies can drive value from big data faster. 

Consolidate data from a variety of sources to improve access and share multi-channel insights across organizations.

Build flexible analytic applications that can quickly incorporate new data sources and revised and updated analytic models.

Collect large volumes of data from IoT sensors and other data sources in the data lake. Then push subsets of that data into other Cazena services – or back to a data warehouse or datacenter.

Accelerate data engineering processes and reduce DevOps costs. Transform data sets for analysis. Stage data for batch processing.

Migrate data warehouse workloads to the cloud for lower costs. Free up expensive data warehouse resources that are dedicated to strict SLAs. Archive and store historical data where it can still easily be accessed and queried for analytics. Quickly and cost-effectively augment your existing enterprise data pipelines to take advantage of new data sources.

Why use Cazena for Cloudera Hadoop and Data Lakes in the cloud?

Cazena’s Data Lake as a Service is simple and easy to access -- but it's the deployment time that most enterprises get excited about. Cazena deploys production-grade Data Lakes based on Cloudera Hadoop on AWS or Azure in four weeks or less. That includes working with your enterprise security and networking people to ensure secure data movement to and from the cloud, and seamless connections from your tools and data sources. Everything is pre-built, integrated, fully-managed and optimized for the cloud, presented behind a simple abstraction layer. Many data lake projects take 6 to 12 months, and have a hard time showing value. With Cazena, you can launch fast and focus efforts on driving adoption and value from the new environment. 

Easy to use, fully-managed. Access your cloud-based Cazena Data Lake with just a few clicks. That means you can move, load and manage big data in a Cloudera Hadoop environment from a web browser, without any special skills. Collect and move data from new cloud-based sources such as SaaS applications, or set up regular streaming from IoT sensors. The entire platform is fully-managed, so we're always here to help!

Get the benefits of a hybrid cloud architecture. Cazena's unique Gateway technology and security model mean that the platform integrates with the enterprise technology you already use, from BI tools to data warehouses and ETL platforms. You won't have to build or figure out integration. Cazena ensures that Cloudera Hadoop drops into your existing data flows easily. You can even securely load and move data between the Cazena Data Lake and a traditional data warehouse, in either direction. Have the best of both worlds.

Reduce costs intelligently. With a Cloudera Hadoop Data Lake from Cazena, you'll save time and reduce labor and project management costs. You might also be able to lower data warehousing and analytics costs, and cut the time and money spent on big data pilots that never go anywhere. Cazena uses intelligence, automation and cloud smarts to give you the best price-performance, often half the price of alternative approaches.

How it Works

Read more about how the data lake utilizes Cazena’s core capabilities, or click over to the “What is Cazena?” page to learn more. 

Interface: Interact with the Data Lake through Cazena's web interface or APIs. We’ll train you on unfamiliar terms, and have worked hard to make our service simple and intuitive. Cazena’s Data Lake is easy enough for a tech-savvy analyst or business user. Now you don’t need a whole tech staff to collect and share data, even lots and lots of data, for any kind of project.

Intelligent Provisioning: Cazena’s Workload Intelligence process automatically configures the data lake based on the workload’s data type, volume, analytics and price-performance requirements. Cazena maintains comprehensive benchmarking data to give you the best configuration in the cloud. Cazena Data Lake as a Service components typically include Cloudera Spark, ecosystem components (Oozie, YARN, etc.) and Impala, which allows for SQL processing. But the interface is Cazena, so it’s simple (and supported). Data Lakes are often configured for lots of storage and batch processing, but it all depends on your needs.

Data Movers: This function allows Cazena’s Data Lake as a Service to become part of an analytic pipeline. Data ingestion is part of our Cazena Gateway function, which can be used to set up near-real-time streaming to a data lake, or regular data loads from almost any source. When moving data into a Cazena Data Lake, the Cazena Data Mover automatically handles any basic data transformations and flags discrepancies that require a human decision. That makes it super easy and convenient to load data.
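
For readers who want a concrete picture of the "transform, then flag" pattern described above, here is a minimal Python sketch. It is an illustration only, not Cazena's Data Mover: the function name `load_records`, the `LoadResult` type and the `sensor_id` and `reading` fields are all hypothetical, and the rules shown (trim whitespace, normalize field names, require a numeric reading) are assumed examples of basic transformations.

```python
from dataclasses import dataclass, field


@dataclass
class LoadResult:
    """Records ready to load, plus records set aside for a human decision."""
    loaded: list = field(default_factory=list)
    flagged: list = field(default_factory=list)


def load_records(records, required=("sensor_id", "reading")):
    """Apply basic transformations and flag records that cannot be fixed automatically.

    Hypothetical illustration; the field names and rules are assumptions.
    """
    result = LoadResult()
    for raw in records:
        # Basic transformation: normalize field names and trim string values.
        rec = {
            key.strip().lower(): (value.strip() if isinstance(value, str) else value)
            for key, value in raw.items()
        }

        # Discrepancy: a required field is absent or empty.
        missing = [name for name in required if rec.get(name) in (None, "")]
        if missing:
            result.flagged.append((rec, "missing: " + ", ".join(missing)))
            continue

        # Basic transformation: convert the reading to a number.
        try:
            rec["reading"] = float(rec["reading"])
        except (TypeError, ValueError):
            # Discrepancy: the value cannot be converted without guessing.
            result.flagged.append((rec, "reading is not numeric"))
            continue

        result.loaded.append(rec)
    return result
```

Calling `load_records` on a batch returns the cleaned records in `loaded` and, in `flagged`, each problem record paired with a short reason, so a person can decide what to do with it.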

Security: Like all Cazena services, the Data Lake as a Service is delivered on a secure datacloud, with certifications, industry-leading expertise and 24x7 white-glove monitoring. Safely and confidently choose Cazena for new projects, client data or migration to the cloud. Spark and Hadoop security and compliance in the cloud for data lakes and data repositories are notoriously challenging, especially with large implementations. Let Cazena handle all of it for you with automation, expertise and monitoring.

Fully-Managed: Running a big data cluster or data lake is a big job – one that Cazena handles with our intelligent platform, automation and incredible in-house experience. Cazena handles everything, from backup and restore, to platform upgrades – even automatic service optimization. Do you want to manage a Data Lake, or use a Data Lake? Don’t be a data lake statistic or build a data swamp, bog or another bad water pun. Focus on driving adoption and using the data. Don’t spend all day administering Hadoop.

One Simple Bill, One Call Support: Cazena’s platform is sold as a monthly, quarterly or annual subscription. We give you a subscription price and expansion unit pricing upfront, so you can plan your cloud budget with no end-of-month surprises. All Cazena services include cloud infrastructure, database licenses, data movers, Cazena Gateway software and the fully-managed platform. For one price. Pretty cool. Data Lakes sometimes seem cheap, because there’s a lot of open source involved. But free software can require a lot of time, care and feeding. Cazena is the most efficient way to manage your platform and make the best use of your time.

Need more info? Contact us at info@cazena.com.