Data mesh in practice: how to set up a data-driven organization: Interview with Max Schultze

Hyperight interviews Max Schultze to discuss how organizations can be better driven by data.

The Data Mesh paradigm presents a huge potential to replace the centralised data lake and data warehouse as the dominant architectural patterns in data and analytics, describes Max Schultze, Data Engineering Manager at Zalando.

Backed by his personal experience of applying the Data Mesh concept in practice and by dedicated field research, Max is joining us at the 6th edition of the Data Innovation Summit to reveal the most common pain points at different stages of the journey and battle-tested approaches to overcoming them. He is also bringing both technical and organisational insights ranging from companies that are just starting to promote a mindset shift of working with data, to companies that are already in the process of transforming their data infrastructure landscape, to advanced companies that are working on federated governance setups for a sustainable data-driven future.

As a segue to his talk, Max shared his knowledge on the core principles of Data Mesh, the idea behind domain-driven data products, his lessons learnt from applying Data Mesh and his piece of advice for moving towards Data Mesh architecture.

Hyperight: Hi Max, I’m very excited to welcome you as a speaker to the 6th edition of the Data Innovation Summit. What would you tell us about yourself as an intro to our discussion?

Max Schultze, Data Engineering Manager at Zalando

Max Schultze: Hi Ivana, thanks a lot for inviting me, I am very excited to be here this year. Data Innovation Summit has a reputation for bringing practice-proven ideas in the data space to a broader audience, and I am very happy to take part in the 6th iteration of it to share more insights into the topic “Data Mesh in Practice”. I am currently a Data Engineering Manager at Zalando, Europe’s biggest online platform for fashion, and have had the opportunity to experience innovations and challenges in the data space first-hand by leading the team responsible for the storage layer of a multi-petabyte data lake.

Driven by that, I started to get involved with the Data Mesh idea at the end of 2019 and soon realized that many of the presented concepts are very close to the things we had discovered and tried to address on our own. That realization put me in a position to start talking publicly about the practical parts of the topic. Since then I have followed up with several conference talks, an introductory O’Reilly training on the topic, and a soon-to-be-released industry report.

Hyperight: At the Data Innovation Summit 2021, you are going to present on Data Mesh in Practice: How to set up a data-driven organization. Data Mesh is one of the latest trends in data analytics promoting distributed domain-driven architecture that holds promise to replace centralised data lakes and data warehouses. What are the core principles of Data Mesh that make Data Mesh a better architecture than a centralised one?

Max Schultze: First and foremost, Data Mesh is trying to address the way we think about data. For many years data has been merely a side product of the production processes we are operating in our companies. Inside data warehouses we did try to address issues of data quality, but that work usually fell to a few central teams, and we had to realize that this approach ultimately does not scale with the ever-growing amount and variety of data we are producing today. The data lake seemed to be our saviour for a while, as through new technologies and the shift to the cloud, we were introduced to virtually infinite storage and processing capacity. Unfortunately, the question quickly became “What data can we store?” instead of “What data should we store?”, and our ambitions to create a well-maintained data lake of high data quality quickly turned into data swamps of unclear ownership and responsibility.

This is where Data Mesh comes in and tries to address the mess many of us are facing right now. By introducing the idea of data products we attempt to turn previously unmaintained datasets into valuable assets with a clear purpose and defined stakeholders. At the same time, we do so in a distributed, domain-driven way by ensuring that the ownership and responsibility of such data products lie with those who know the data best. To make such a distributed setup truly scalable, it becomes necessary to provide a self-serve, domain-agnostic data infrastructure platform. Lastly, to ensure distributed data products do not turn into disconnected domain silos, we are introducing the concept of federated computational governance.

Hyperight: Domain-driven data products are the key concept of Data Mesh. Could you please explain to us the idea behind them?

Max Schultze: Treating a dataset as a product means that a team developing such a data product needs to have product management that defines a roadmap for that dataset, manages requested features, and ultimately understands the requirements of the data product’s customers, i.e. its internal users. In return, the team also gets resources and management support based on the success of its data product. For instance, if more internal users adopt the data product, or if more data products are built on top of it, this is valued in the same way as building a successful digital product for external customers.

To define decentralized ownership for such data products, data mesh applies domain-driven design. From an architectural perspective this means that instead of using systems, technologies, or process stages as the guiding criteria for structuring ownership, business domains or their subdomains should be used to define boundaries of ownership. The idea here is to build up domain expertise and then give domain experts both the authority to make the important decisions and the capabilities to implement these decisions (and deal with the consequences) that are necessary to generate the most value from the data that belongs to their domain.
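The two ideas above, a dataset managed as a product and ownership drawn along business domains, can be made concrete with a small sketch. This is only an illustration under stated assumptions, not something described in the interview: the `DataProduct` fields, the `adoption` and `by_domain` helpers, and the example domain and team names are all hypothetical. The adoption count mirrors the success signals mentioned above (internal users plus other data products built on top).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor for one data product (all field names are assumptions)."""

    name: str
    domain: str      # business domain that owns the product
    owner_team: str  # team with the authority and responsibility for it
    consumers: set[str] = field(default_factory=set)  # internal users of the product
    upstream: set[str] = field(default_factory=set)   # data products this one builds on


def adoption(products: list[DataProduct]) -> dict[str, int]:
    """Per product: number of direct consumers plus number of products built on top of it."""
    score = {p.name: len(p.consumers) for p in products}
    for p in products:
        for parent in p.upstream:
            if parent in score:
                score[parent] += 1
    return score


def by_domain(products: list[DataProduct]) -> dict[str, list[str]]:
    """Group product names by owning business domain rather than by system or technology."""
    grouped: dict[str, list[str]] = {}
    for p in products:
        grouped.setdefault(p.domain, []).append(p.name)
    return grouped


# Illustrative data only; the domains and teams are made up.
orders = DataProduct("orders", "checkout", "team-checkout",
                     consumers={"finance", "marketing"})
returns = DataProduct("returns", "logistics", "team-logistics",
                      consumers={"finance"}, upstream={"orders"})

print(adoption([orders, returns]))
print(by_domain([orders, returns]))
```

Here `orders` counts two internal consumers plus one downstream data product, which is the kind of signal a team could use to argue for resources. How such metadata is actually stored and measured will depend on a company’s own platform.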

Hyperight: What are your lessons learnt from applying the Data Mesh concept in practice?

Max Schultze: Applying the Data Mesh concept in practice is a long and tedious journey. Ultimately we are trying to change the way we work with data in a broader organizational scope. Pushing for an organizational rethinking, however, does not mean that you cannot play your part. As with many big changes, the first steps are small, and it is important to foster a local culture and build the first successful MVPs before attempting a company-wide rollout.

Personally, my biggest learning came from the data infrastructure platform side of things. While it is absolutely possible and even necessary to build the right tooling for your data mesh to scale, it is not about the technology itself. It is more important to follow the underlying principles of building self-service infrastructure in a domain-agnostic way. There are many tools to get you there, and the specifics will depend heavily on your company’s setup.

Hyperight: One of the key points in your talk will be the main pain points. What are some of the biggest challenges when implementing Data Mesh?

Max Schultze: Especially when getting started, it is easy to hit early roadblocks that might seem insurmountable, but being aware of some of them can clear a path. Don’t overload your people. Existing teams, and product managers in particular, often already have the skills needed to start building initial data products. However, it is important to factor in not only skills but also capacity. Don’t place the seed project that is supposed to change your company in a team that is already overloaded with its day-to-day business without allocating additional time and resources to take it on.

Another important challenge arises when taking on data infrastructure responsibilities. Without conscious decisions about which capabilities to provide, it is easy to take on central responsibility for data again, and with that run into the same scalability issues we were originally trying to overcome.

Hyperight: What would you advise companies that are thinking about starting their Data Mesh journey? What are some best practices to follow?

Max Schultze: Start small but with commitment. The companies that I have seen be most successful in moving towards a data mesh architecture neither planned a company-wide program to introduce a data mesh nor secured lots of resources for a big data infrastructure project. But they also did not decide on a whim to try out a little data mesh experiment in some lab. Instead, the most successful approach is to carefully select a meaningful use case with a limited but valuable impact, and then to provide all the support you have to make this first data product a success that can be demonstrated and learned from.

Amber Donovan-Stevens

Amber is a Content Editor at Top Business Tech
