In this webinar, learn about recent advances in cloud data lakes that simplify and accelerate the enterprise journey. Explore the transcript below or watch the complete event.
Moderator: Thank you very much for sharing all of that insight and that timeline. We do have a few questions here.
If you were starting out now, which deployment model would you recommend and why?
Matt Aslett, 451 Research: I think the caveat is that it would depend on where your data is today. Obviously, if you had a lot of data on-premises, you might lean towards that. But Gordon illustrated the ease with which CWT was able to move data between clouds.
As we see, an increasing volume of data is being generated in the cloud from applications that are cloud-native, so I think most organizations would look to start there. Indeed, our survey data shows a lot of organizations are looking to, if not start there, then rapidly move in that direction.
Moderator: How do you decide which use case to start with? Gordon, I think you had a bit of a similar situation. How do you make that determination about who gets the first pilot or the first projects?
Gordon Coale, CWT: That's a good question. The first thing you've got to do is find something you can ring-fence as a small, achievable piece of work that delivers a real business wow factor.
You need to get an early win under your belt. And Cazena can help you do that, because you get the quick start of a Data Lake in a box, if you like, or a Data Lake SaaS. But pick something achievable. Pick something with strong business sponsorship. And pick something that you can build on incrementally in traditional agile fashion.
It doesn't matter too much what it is, as long as you've got the right business stakeholder behind you, someone who's then going to champion you for your next three or four deployments.
Moderator: Cazena sees a lot of Data Lake deployments. Are there any specific use cases that you think are more successful than others or places where you'd recommend people start?
Prat Moghe, Cazena: It's actually a pretty horizontal problem, these Data Lakes. Clearly, with CWT, you heard Gordon's very interesting use cases around travel. But we have seen other use cases like preventive maintenance, especially in manufacturing. We've seen digital cars, IoT use cases, customer analytics, marketing analytics -- so it's a fairly broad set of use cases. Fraud is a very common one in insurance and financial services.
In general, the common theme is IT and data leaders on one side, who have access to data, being able to very quickly share that data in a targeted way with the business -- where data scientists and data engineers are being hired for very specific business outcomes. So, we see these SaaS Data Lakes as being a bridge between IT and data on one side, and the business on the other side.
The key is not to compromise on principles like security and compliance, which Gordon pointed out, and at the same time to leverage the assets that you have and the teams you have. It's about enabling these new teams that are being hired around data engineering and data science. They all look different, but they all want to run fast, and it's all about some top-line initiative or about improving customer stickiness.
What is interesting is that the SaaS model ends up building on top of previous innovations like PaaS and IaaS, going back to Matt's picture that he had of these layers.
We are seeing these companies with leaders like Gordon who have done this before. They have scars from that experience, and so now they’re approaching it in a very mature way. That probably contributes just as much to the success of these deployments. Data leaders are as responsible for this transformation as technology like the SaaS Data Lake.
Moderator: If we want to migrate lots of different analytic workloads, are there any best practices for migration? What should you think about if you're migrating from on-premises into the cloud?
Matt Aslett, 451 Research: Well, “plan” would definitely be the best advice! One of the things we've seen is that we talk a lot about data having gravity. The scale of that is becoming more and more obvious. I actually try to be very careful, when we talk about data workloads being deployed in the cloud, not to talk necessarily about data moving to the cloud. In many cases, it's not necessarily that data is being moved, because clearly the larger the volumes of data you have, the greater the gravity is and the more difficult and costly it can be to move that. Which is not to say it cannot be done.
But think very carefully about which workloads should be deployed in which locations and for which purpose. And there'll be multiple aspects that go along with that. Obviously, whether the services are available, but also the amount of data that you have in that environment in the first place, and the broader business considerations and business relationships of the company beyond that.
It's definitely not something that should be taken lightly. Historically, people don't change their database unless they've got a very, very good reason to do so. It's the same thing with this idea of moving data workloads around. It's one thing to think about doing in theory. It's quite another thing to actually do in practice. Though, as Gordon said earlier, it can be done, with a lot of forethought and good partners to help you do it.
Moderator: Thank you very much, Matt Aslett, Gordon Coale, and Prat Moghe for joining us today.
I'm going to take that segue from Matt's last answer there and remind you that a great way to test this out and see if a Data Lake works for you is to use a Cazena SaaS Data Lake pilot. This is a program which can really quickly show outcomes.
You'll get access to a Data Lake right away, and you'll be able to see how a Data Lake in the cloud looks and how people can interact with it. You can learn more about that at info.cazena.com/saas-pilot.
Thank you so much for tuning in today, thank you for supporting Cazena, and we hope to see you on another webinar soon.