The relevance of a Chief Data Officer and their organization relies heavily on how that organization views its data. If data is a ‘by-product’ of process and systems, then the CDO organization will only be another function that just adds unnecessary costs to the overall run-rate of the real business. If the organization views its data as one of its assets and ‘crown jewels,’ in a similar fashion to its critical processes and applications, the CDO will play a much more prominent role in shaping the data culture. Data Science has significantly challenged the traditional role of a CDO and the world needs to rethink the real value a CDO can add in the world of analytics.
The Data Mesh by Zhamak Dehghani challenges organizations to think differently about their data by treating it like a product to enable analytics and data science. The Data Mesh can be leveraged to radically rethink the CDO organization. A traditional CDO can approach the Data Mesh in one of two ways. The first views a Mesh as a threat to the status quo, something that would disrupt the normal operations of the CDO and associated data management groups. The second views a Mesh as an exciting opportunity that has the potential to fundamentally change the data narrative in the organization, and therefore as something that should be explored with pace. Both viewpoints are correct. Done well, a Data Mesh will dramatically change the operations of a CDO, but this shift should be a positive disruption. It allows the CDO not just to remain relevant but to become a more critical part of the business and a true leader in embedding a data culture into the organization.
To understand how the CDO has to evolve to support a Data Mesh, one needs to understand how the CDO has evolved over time.
The CDO awakens
CDOs emerged in enterprises to help manage the data the organization generates and uses. Their structure, and the role they play, often reflects the way data is logically organized within the enterprise.
A fairly typical evolution is illustrated below:
The first state on the evolutionary ladder can be politely termed “decentralized.” In practice it was the wild west: everyone did as they pleased, with zero re-use and no common standards. At this time, individual parts of the business were responsible for their own data, which was almost universally transactional in nature. Each business unit would store and manage its own data in its own way: a bank’s Deposits business would hold information related to client deposits, the Investments business would hold information related to client investments, and so on. This approach led to multiple pools of transactional data scattered across the organization and aligned to functional business units.
Where data management activities took place, they were largely focused on data quality and integrity for the transactional data they held. Data standards existed but tended to focus very much on integration — defining file formats for sharing data between internal and external parties. In general, there was little or no strategy attempting to coordinate data activities across the organization, and data was seen very much as a ‘byproduct’ of business activities.
Unsurprisingly, data control and usage were, where they occurred, locally optimized with little synergy across organizational units. There was a high degree of flexibility inherent in this approach, with each business unit having the freedom to operate and act on its data as it saw fit, but grander aspirations to operate data effectively across the organization suffered accordingly. This quickly became a problem as the value of cross-enterprise data started to become apparent. As demand grew to use data effectively across the organization, so too did demand for some form of central control to help manage and coordinate data activities, which resulted in the birth of the CDO.
The next evolution was the diametric opposite of the previous one. As companies began to understand the value of the data they held and the importance of keeping it well-controlled, they began to look at technologies and governance mechanisms to ease this process. As a result, we see the emergence of two mutually supportive developments: the data warehouse and the CDO / DMO. The data warehouse became powerful by centralizing data from across the enterprise and maintaining it according to strict governance controls. By now, organizations had determined that data was truly valuable, and so CDOs created numerous mechanisms to ensure data was of the highest quality. The pendulum had swung completely, and the wild west mentality was replaced with a well-organized, centrally managed, government-like machine, drawbacks included.
The focus of the CDO in this period was to ensure data was accurately copied to the data warehouse and that it was well-governed from source to destination. Data Stewards and Data Custodians became roles intertwined with the extensive data management / governance processes that were implemented.
With the emergence of Data Lakes, another form of central repository, CDOs everywhere simply expanded their mandate and continued their operations as before. Now data went to both the lake and the warehouse, but the way data flowed around the organization (normally batch-driven) and the way it was managed and governed changed very little.
Hi, I’m a Data Mesh
As we have moved closer to the current day, expectations of data have grown enormously: the volumes, the richness, the demand, and the use-cases have all increased massively. The centralized model began creaking under the weight of these demands, and progressive data and technology thought-leaders began defining a different way to look at the problem. And thus came the Data Mesh.
Zhamak Dehghani created the Data Mesh, a domain-based paradigm for managing data assets for better analytics outcomes, and it drastically impacts the centralized structures built up by the CDO.
In summary, a Data Mesh is a federated architecture where data is made available from logical domains rather than from a centralized store. These domains provide the data for which they are responsible, and consumers pull this data from one or more domains to meet their purposes. It’s a form of data ecosystem where data is provided to consumers from those who best understand it and are best able to ensure it is of the highest quality.
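To make the idea of consumers pulling from domains concrete, here is a minimal Python sketch. It is only an illustration, not an implementation of any Data Mesh product: the `Domain` and `Mesh` classes, the product names `balances` and `positions`, and the sample records are all hypothetical, borrowing the Deposits and Investments businesses mentioned earlier as example domains.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Domain:
    """A business domain that owns and serves its own data products (hypothetical)."""
    name: str
    products: Dict[str, Callable[[], List[Record]]] = field(default_factory=dict)

    def publish(self, product: str, reader: Callable[[], List[Record]]) -> None:
        # The domain decides how the product is produced; consumers only see the result.
        self.products[product] = reader

    def read(self, product: str) -> List[Record]:
        return self.products[product]()


class Mesh:
    """A registry of domains; there is no central store holding the data itself."""

    def __init__(self) -> None:
        self._domains: Dict[str, Domain] = {}

    def register(self, domain: Domain) -> None:
        self._domains[domain.name] = domain

    def pull(self, domain: str, product: str) -> List[Record]:
        return self._domains[domain].read(product)


def client_holdings(mesh: Mesh) -> Dict[str, float]:
    """A consumer that combines products from two domains for its own purpose."""
    totals: Dict[str, float] = {}
    for domain, product in [("deposits", "balances"), ("investments", "positions")]:
        for row in mesh.pull(domain, product):
            client = str(row["client_id"])
            totals[client] = totals.get(client, 0.0) + float(row["amount"])
    return totals


# Made-up sample data, served by the domains that own it.
deposits = Domain("deposits")
deposits.publish("balances", lambda: [
    {"client_id": "c1", "amount": 100.0},
    {"client_id": "c2", "amount": 50.0},
])
investments = Domain("investments")
investments.publish("positions", lambda: [{"client_id": "c1", "amount": 250.0}])

mesh = Mesh()
mesh.register(deposits)
mesh.register(investments)
holdings = client_holdings(mesh)
```

The point of the sketch is the direction of responsibility: each domain publishes and maintains its own product, and the consumer composes what it needs without a central team copying the data first.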
This new architectural style brings with it implications for any CDOs comfortably established to operate in the common centralized model. Perhaps the most fundamental aspect of the data mesh architecture is that the ownership and accountability of the data are federated out to the business domains that understand, create and manage the data itself: the experts. Attempting to follow the more traditional model of data governance espoused by many CDOs will only result in blurred lines, grey areas, gaps, and unnecessary friction between parties.
Building a CDO for the future world of analytics enabled by the Data Mesh
The next generation of CDO must adapt to reflect the new paradigm of enterprise data. The data mesh requires an effective CDO function in order to reach optimal effectiveness, but this CDO must be a very different entity than the generations before it. Where previously there was a strong focus on management of data, there must now be a greater emphasis on data leadership. Leadership vs Management is a topic that has kept management consultancies and think tanks busy for many years, so it’s not something I will dwell on but, for me, it all stems from the level of proactivity and agency the CDO has.
Essentially, CDOs must bring the Federal perspective to the Federated architecture. They must become the group that ensures the domains become more than just the sum of their parts — that there is some additive value the CDO brings to allow the data mesh to operate at a higher level than would otherwise be possible. In many ways, it means the CDO must be prepared to do more, rather than less, despite the fact that some of their current responsibilities will be incorporated by the domains (e.g. domain-level data quality, domain-level data governance etc).
Be strategic, transformational thinkers driving the data agenda forward
It is no longer simply about ensuring that each data asset has the required management controls in place. Instead, CDOs must be big-picture thinkers who understand the value of data to the organization and work tirelessly to innovate and encourage the use of enterprise data to identify, enable or even become new business opportunities.
CDOs must have team members who are strategically minded, are passionate about data and have the gravitas and communication skills necessary to work alongside business leaders and explore opportunities together. Simply adding “strategy” to someone’s job title will not be enough!
Be facilitators rather than doers
CDOs must realize that their continued success is not about what they do — it’s about what they make happen. In a mesh-style architecture, the role of the CDO becomes helping to lead the federation of domains. The CDO must become the “secret sauce” that makes the domains more effective both individually and when working in concert.
CDOs should be positioned to support domain data teams with their activities — bringing their expertise and authority to bear to help deliver domain agendas. But they must also be willing to let the domains lead these activities and to play more of a supporting and guiding role. If needed, CDOs should send coaches into domains to facilitate activities and encourage new ideas in much the same way Agile Coaches helped organizations adapt to agile ways-of-working.
Where CDOs should lead is with regard to interoperability across domains, enabling a mesh to become a federated architecture as opposed to a purely decentralized one. CDOs should actively steer domains to work together and introduce lightweight standards, frameworks and other tooling to make it easier for them to do so. This process could be done formally, with standards definitions groups, or informally, with a community of peers, but it must always be done pragmatically and without stifling innovation. CDOs must remember that when the domains succeed, so do they, and that enterprise data now has shared ownership.
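As one possible reading of a “lightweight standard,” the sketch below shows a small check that a CDO team might offer to domains so their data products can be combined. Everything in it is an assumption made for illustration: the descriptor fields, the `check_descriptor` function, and the choice of `client_id` as a shared key are hypothetical and not taken from any published Data Mesh specification.

```python
from typing import Dict, List

# Hypothetical minimum that every domain data product is asked to declare.
REQUIRED_FIELDS = ("domain", "name", "owner", "schema")
# Hypothetical shared key that lets products from different domains be joined.
SHARED_KEY = "client_id"


def check_descriptor(descriptor: Dict[str, object]) -> List[str]:
    """Return a list of problems; an empty list means the descriptor passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in descriptor]
    schema = descriptor.get("schema")
    if schema is not None and not isinstance(schema, dict):
        problems.append("schema must map column names to types")
    elif isinstance(schema, dict) and SHARED_KEY not in schema:
        problems.append(f"schema lacks shared key: {SHARED_KEY}")
    return problems


# Made-up descriptors from two domains.
good = {
    "domain": "deposits",
    "name": "balances",
    "owner": "deposits-data-team",
    "schema": {"client_id": "str", "amount": "float"},
}
bad = {
    "domain": "investments",
    "name": "positions",
    "schema": {"amount": "float"},
}

good_problems = check_descriptor(good)
bad_problems = check_descriptor(bad)
```

The standard here is deliberately thin: it says nothing about how a domain builds or stores its product, only what it must declare so that others can find it and join it with their own data.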
Be champions of a data culture
One of the most important tasks a CDO function must undertake is that of championing a data culture across the organization. All too often, data is considered very much a secondary concern, a by-product of genuine business activities, rather than the actual asset, or product, it is. Therefore, it is vital that it is the CDO leading the efforts to embed a culture of “loving the data” into the organization — to spur teams to generate more and better data, to leverage it to make better decisions, and to explore new and innovative ways to generate value from the data.
This creation of a data culture is actually made somewhat easier by virtue of the very structure of the data mesh. In a mesh, the domain experts become curators of their own data rather than having a central body responsible for all data. As a result, there are multiple groups across the enterprise who are working day-to-day with data, treating it as a product, innovating and evolving it based on needs. In such a model, these domain data teams will play a role in further spreading the culture of data and acting as champions for their own area in particular and inter-related domains via the mesh. This is, in many ways, analogous to having teams of influencers each affecting particular groups within the organization. It is therefore the role of the CDO to harness these efforts, to provide direction and be the exemplars of a new, pervasive data culture.
Actually understand data
When establishing CDOs and DMOs, the focus has often been on the successful operation of mechanistic processes related to managing the data; as a result, there has been a tendency to recruit process operators or owners rather than people with a true understanding of the data. Whilst that may have been satisfactory before, it is important that CDOs are resourced with people who understand data and the value it brings rather than simply knowing how to manage the various governance processes that have been established over the years. Whilst processes are still important and will remain, there will be an expectation that CDOs are true experts on data — even if they defer to domains for specialist subject matter expertise.
Look beyond the enterprise
The data mesh model, whereby data is made available via discrete and logical domains, provides a great opportunity for organizations to get greater value from the data assets they hold. But what about the data assets they don’t hold? A strong CDO can come to the forefront in this scenario. An effective data mesh relies on amalgamating data from across multiple domains and finding new ways to build composite value. It therefore lends itself naturally to adding more sources of domain data, this time from external sources, as the governance structures, technologies, and ways of working are already largely in place. This has the potential to further increase the value of data for the enterprise — by extending its own reach externally and becoming part of a broader data ecosystem.
In summary, the CDO has to move away from a Governance function into one that enables analytics and data culture.
Notes from the Author:
Thank you for Reading!
I really appreciate you reading through the entire article — this is my second article on the impacts of the Data Mesh on existing organizations and I hope to write more.
If you have any questions/thoughts or want to share any constructive criticism, you are more than welcome to reach out to me on Medium or LinkedIn.
Dehghani, Zhamak (2019, May 20). How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh. MartinFowler.com.
Article Originally Published on Towards Data Science