Data Mesh: Enable Employees to Develop Business Predictions
The data analytics infrastructure of most organizations typically consists of data producers on one end, data consumers on the other, and in the middle, a data engineering and analytics team.
In most cases, this team is understaffed and tries to keep up with an ever-growing web of data pipelines to meet the growing demands of management and all operational, frontline functions.
With today’s growing data demands, this approach can prove inefficient. A number of problems cause data consumers (e.g., marketing, sales, and CSMs) to wait too long for answers, which delays critical decision-making and significantly slows their ability to execute.
Problem #1: Data is a Monolith
This inefficiency stems from the fact that data lakes and warehouses are centralized structures, and they inevitably become too large and complex to easily maintain.
Old tables and views get left abandoned, file naming conventions change, governance lapses occur, and procedures get built on top of each other until no one remembers how data gets from staging to reporting.
Problem #2: Pipelines Are Fragile
Let’s not forget that pipelines and ETL processes are quite fragile. Sure, some pipelines are well defined, and the data moving through them is consistent and reliable, but often that’s not the case.
The demand for data is always shifting, and the time frames for building pipelines are short. Sources are unreliable, business rules keep changing, and dynamic pipelines are pushed to their limits to accommodate unexpected data.
The result is a complex web of processes that requires constant babysitting, which makes it very difficult to scale.
Problem #3: Data Experts Need to Also be Domain Experts
Data teams are often expected to be experts in each domain’s data as well, despite never working with that data on a daily basis.
For instance, analysts and data scientists are not salespeople, but they need to think like them in order to answer the questions a data consumer (e.g., an Account Executive or Sales Director) might have about the Q4 sales pipeline.
Enter the Data Mesh Approach
Data mesh is an architectural approach designed to drive and support agile, efficient data architectures. It represents a significant shift in thinking about how data should work in a company.
The first shift is the idea that domains (e.g., marketing, sales, customer success) own their data as a product. Let’s assume your organization has a Sales platform that produces information about the organization’s sales efforts.
In this case, the data mesh approach suggests that Sales will now own the Sales domain data, and treat Sales data as a product for the company. This is a huge change in thinking, as most Sales departments don't really view themselves as owners of Sales data sets, which need to be distributed to the rest of the company.
Keep in mind that the Sales domain may include several data sources. Let's say there is a Sales Development app and a Sales CRM for account executives. The Sales team would manage all of this data and generate data sets and business predictions to be consumed by other users within the department but also outside of it.
They will also own the governance of the data in their domain, and the same will be done by other domains around the company (i.e., Marketing, Customer Success, and so on). In addition to the data generating domains, there may still be a need for analytics and aggregations of data, and these could exist as their own domains.
Data Teams Are Not Obsolete
Let's make one thing clear. Data teams are important. The analytics group still actively receives data from the different domains and uses it for the organization's needs.
For instance, they can aggregate data from the Sales and Marketing domains to generate a data set for measuring customer acquisition cost (CAC), based on Marketing spend and Marketing & Sales performance. This data set can then be made available to those who need it, such as executive management planning next quarter’s spending.
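As a minimal sketch of what such a cross-domain aggregation might look like, the function below computes CAC from figures that would come out of the Marketing and Sales domain data sets. The field names and numbers are purely illustrative assumptions, not from any real data set.

```python
# Hypothetical cross-domain CAC aggregation. All names and figures
# here are illustrative assumptions.

def customer_acquisition_cost(marketing_spend, sales_spend, new_customers):
    """CAC = total acquisition spend / customers acquired in the period."""
    if new_customers == 0:
        raise ValueError("no customers acquired in this period")
    return (marketing_spend + sales_spend) / new_customers

# Quarterly figures sourced from the Marketing and Sales domain data sets
q4_cac = customer_acquisition_cost(
    marketing_spend=120_000,  # from the Marketing domain
    sales_spend=80_000,       # from the Sales domain
    new_customers=50,         # from the Sales CRM
)
print(q4_cac)  # 4000.0
```

The point is not the arithmetic but the sourcing: each input is owned and published by a different domain, and the analytics group only combines them.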
The Role of Engineering Teams
The next big shift is in the engineering team’s role. Their focus is no longer on building pipelines that consume data, modeling that data, and becoming knowledge experts in each domain.
Instead, they are responsible for building and supporting the infrastructure needed for all of these domains to produce and share their data.
This could of course vary based on specific needs. For example, the data engineering team might set up a data lake as the infrastructure, along with a metadata layer for governance and for cataloging which data sets are stored.
Next, domain teams are enabled to produce their data sets and publish them to the data lake storage. The metadata layer would include things like access controls, and possibly a key mapping to help integrate data between domains.
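To make the metadata layer idea concrete, here is a small sketch of what a catalog entry for a domain data set could hold: the owning domain, storage location, a simple access-control list, and a key mapping to a shared company-wide identifier. Every name here (the "sales" domain, the `account_id` key, the storage path) is a hypothetical assumption for illustration only.

```python
# Illustrative sketch of a metadata catalog for domain data sets.
# Domain names, paths, and keys are hypothetical.

from dataclasses import dataclass, field

@dataclass
class DataSetEntry:
    domain: str        # owning domain, e.g. "sales"
    name: str          # data set name within the lake
    path: str          # storage location in the data lake
    allowed_roles: set # simple access-control list
    # maps local columns to shared, company-wide keys for cross-domain joins
    key_columns: dict = field(default_factory=dict)

catalog = {}

def register(entry: DataSetEntry):
    catalog[(entry.domain, entry.name)] = entry

def can_read(domain, name, role):
    entry = catalog.get((domain, name))
    return entry is not None and role in entry.allowed_roles

register(DataSetEntry(
    domain="sales",
    name="pipeline_q4",
    path="s3://lake/sales/pipeline_q4/",
    allowed_roles={"sales", "analytics"},
    key_columns={"account_id": "company.account_id"},
))

print(can_read("sales", "pipeline_q4", "analytics"))  # True
print(can_read("sales", "pipeline_q4", "marketing"))  # False
```

In practice this role would be filled by a real catalog and governance tool; the sketch only shows why access controls and shared keys belong in the metadata layer rather than inside any single domain.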
Generating Business Predictions in a Self-Service Fashion
From there, domain expert teams can develop business predictions in a self-service fashion, using a no-code predictive analytics layer like Forwrd.ai. This removes dependencies and allows Sales, Marketing, and Customer Success managers to access and analyze the data as they need.
Forwrd's approach essentially borrows concepts from the application development world, where microservices are a popular way to break up monolithic applications. With microservices, apps are built as connected, single-function modules. If one module becomes obsolete, it can be removed, rewritten, and rebuilt without affecting the others.
With Forwrd, frontline Go-to-Market teams can build containers, known as Decision Bases, in which they define the metrics they wish to monitor (e.g., MQL, SQL, Closed Won) and continuously run their corporate data against those metrics to generate rich business predictions aimed at accelerating growth.
Business predictions can then be shared with team members, or any other member in the organization, via email, Slack, or by pushing business predictions directly into the business applications that Go-to-Market teams use daily (e.g., Salesforce, Hubspot, Zendesk) to accelerate their efforts by orders of magnitude.
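As a generic illustration of the underlying idea, not Forwrd's actual implementation, a "container" of metrics can be thought of as a definition that records are continuously scored against. The scoring rule, weights, and field names below are toy assumptions.

```python
# Toy illustration (NOT Forwrd's real product) of defining metrics
# and scoring records against them. Weights and fields are invented.

decision_base = {
    "metrics": ["MQL", "SQL", "Closed Won"],
    # toy rule: each metric applies a weight to an engagement score
    "weights": {"MQL": 0.2, "SQL": 0.5, "Closed Won": 0.9},
}

def predict(record, metric):
    """Return a 0-1 likelihood that `record` reaches `metric`."""
    weight = decision_base["weights"][metric]
    return round(min(record["engagement"] * weight, 1.0), 2)

lead = {"name": "Acme Corp", "engagement": 1.5}
print(predict(lead, "SQL"))  # 0.75
```

A real predictive layer would learn these scores from historical data rather than use fixed weights; the sketch only shows the shape of the workflow: define the target metrics once, then score incoming records against them continuously.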
The result is decentralized data storage and processing with a centralized infrastructure, and decentralized ownership with centralized governance, along with company-wide standards for interoperability between domain data sets.
The data mesh approach is clearly aimed at tech-minded, growth companies that understand the concept of treating data as a product and have appreciation for agility and the democratization of data.
In old-fashioned organizations, with highly centralized IT, it would likely be nearly impossible to decentralize data with this proposed approach.
However, if yours is a tech-savvy company where departments tend to implement and manage their own data-generating applications, and where the demand for data far outpaces what an engineering team can handle, then data mesh is an approach to consider seriously, especially if your end goal is to enable frontline teams to accelerate decision-making and business execution.