In a connected world, collaboration and sharing are key principles. The faster our networks and the better our connectivity, the more an organization benefits from information sharing and operational collaboration. In turn, as we share more information with our partners, our connectedness is enhanced as well. It is not just systems that work better together: the people managing those systems forge better working relationships, leading to more effective management of the business and, ultimately, to competitive advantage. However, the more we share information, the more we realize that years of distributing computing power and business applications into vertical lines of business have led to “islands of information coherence.” Data architectures designed to support operational processes within each business application area require their own definitions, dictionaries, structures, and so on, all defined from the perspective of that particular business application.
The result is that the enterprise is composed of multiple, sometimes disparate sets of data that are intended to represent the same, or similar, business concepts. Yet, to exploit that information for both operational and analytical processes, an organization must be able to clearly define those business concepts, identify the different ways that data sets represent those concepts, integrate that data, and then make it available across the organization. This need has introduced a significant opportunity for organizational information integration, management and sharing. In this white paper, we will discuss how these processes comprise a master data management (MDM) program, looking at the origins and definition of master data, what master data management entails, architectural approaches to MDM, and the organizational challenges that accompany it.
The Origins of Master Data
The introduction of workgroup computing coupled with desktop applications ushered in an era of information management distribution. Administrative control of a business application, along with its required resources, brought a degree of freedom and agility to a business manager. However, by virtue of that distribution, the managers of a line of business could dictate the processes and constraints associated with developing the vertical applications that run their own operations, leading to variance in the ways that business concepts and objects are defined.
Not only that, the increase in both power and functionality at the desktop has engendered an even finer granularity of data distribution, allowing an even greater freedom in describing and modeling business information. And whether it is in the mainframe files, the database server, or in the spreadsheets on your desktop, we start to see a confusing jumble of concepts, along with creative ways of implementing those concepts.
Over the past ten years or so, the pendulum has swung back to centralized computing (such as data warehousing) for applications that help improve the business, with the intention of consolidating the organization’s data into an information asset to be mined for actionable knowledge. And while the centralization of information for analysis and reporting has great promise, this in turn introduces a different challenge. As data sets are integrated and transformed for analysis and reporting, cleansing and corrections applied at the warehouse imply that the analysis and reports may no longer be synchronized with the source data. In essence, this just clarifies the benefit of creating a single source of truth for all enterprise applications – not just for analysis or reporting – and this embodies the concept of master data.
For example, consider your company’s customers. Each customer may participate in a number of business operations: sales, finance, customer service. Some of your customers may even participate in different contexts, perhaps as vendors or suppliers. In each of these contexts, the customer may play a different role, and in turn, the business may value some attributes over others depending on the context. But you clearly want to ensure that your business processes don’t fail because the customer appears multiple times in different data sets. In addition, you want to be confident that the customer’s activities are accurately portrayed in management reports.
In other words, different business applications record transactions or analysis regarding entities and their activities. And it is desirable for all the business applications to agree on what those entities and activities are. We can summarize two objectives:
- Integrate the multiple variations of the same business entities into a single (perhaps virtualized) source of truth.
- Enable enterprise applications to share that same view of the business objects within the enterprise.
Defining Master Data
So far we have used terms such as “business concepts” or “business entities” when referring to master data, but what are the characteristics of master data? Master data objects are those core business objects that are used in the different applications across the organization, along with their associated metadata, attributes, definitions, roles, connections and taxonomies. Master data objects are those “things” that we care about – the things that are logged in our transaction systems, measured and reported on in our reporting systems, and analyzed in our analytical systems. Common examples of master data include:
- Customers
- Suppliers
- Parts
- Products
- Locations
- Contact mechanisms
Master data tends to exist in more than one business area within the organization, so the same customer may show up in the sales system as well as the billing system. Master data tends to be relatively static; unlike transactional data, it changes infrequently. Master data objects may be classified within a hierarchy.
For example, we may have a master data category of “party,” which in turn is comprised of “individuals” or “organizations.” Those parties may also be classified based on their roles, such as “prospect,” “customer,” “supplier,” “vendor,” or “employee.” While we may see a natural hierarchy across one dimension, the taxonomies that are applied to our data instances may actually cross multiple hierarchies. For example, a party may simultaneously be an individual, a customer and an employee.
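As a rough illustration of how such crossing classifications might be modeled, here is a minimal Python sketch; the class and attribute names are hypothetical, not drawn from any particular MDM product:

```python
from dataclasses import dataclass, field

# Hypothetical party model: the individual/organization split is one
# hierarchy; roles form a second, independent classification.
PARTY_TYPES = {"individual", "organization"}
PARTY_ROLES = {"prospect", "customer", "supplier", "vendor", "employee"}

@dataclass
class Party:
    party_id: str
    name: str
    party_type: str                          # one value from PARTY_TYPES
    roles: set = field(default_factory=set)  # any subset of PARTY_ROLES

    def add_role(self, role: str) -> None:
        if role not in PARTY_ROLES:
            raise ValueError(f"unknown role: {role}")
        self.roles.add(role)

# A single party can simultaneously be an individual, a customer and an employee.
party = Party(party_id="P-1001", name="Pat Smith", party_type="individual")
party.add_role("customer")
party.add_role("employee")
```

Keeping the type dimension and the role dimension independent is what allows one instance to participate in several hierarchies at once.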
In turn, the same master data categories and their related taxonomies are used for analysis and reporting. For example, the headers in a monthly sales report may be derived from the master data categories (such as sales by customer by region by time period). Enabling the transactional systems to refer to the same data objects as the subsequent reporting systems ensures that the analysis reports are consistent with the transaction systems.
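To make this concrete, here is a minimal sketch (illustrative data and field names) in which a sales report derives its dimensions from the same master customer records that the transaction system references:

```python
from collections import defaultdict

# Master customer records shared by the transactional and reporting systems.
customers = {
    "P-1001": {"name": "Pat Smith", "region": "East"},
    "P-1002": {"name": "Acme Corp", "region": "West"},
}

# Transactions refer to customers only by their master key.
transactions = [
    {"customer_id": "P-1001", "period": "2024-01", "amount": 250.0},
    {"customer_id": "P-1002", "period": "2024-01", "amount": 900.0},
    {"customer_id": "P-1001", "period": "2024-02", "amount": 125.0},
]

# Sales by region by period: the dimensions come from master data, so the
# report is consistent with the transaction system by construction.
report = defaultdict(float)
for t in transactions:
    region = customers[t["customer_id"]]["region"]
    report[(region, t["period"])] += t["amount"]

for (region, period), total in sorted(report.items()):
    print(f"{region} {period}: {total:.2f}")
```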
What is Master Data Management?
Rather than being a technology or a shrink-wrapped product, master data management (MDM) comprises a mixture of business applications, methods and tools. Together, these implement the policies, procedures and infrastructure that support the capture, integration, and subsequent shared use of accurate, timely, consistent and complete master data. An MDM program will typically:
- Assess the use of core information objects, data value domains and business rules in the range of applications across the enterprise
- Identify core information objects relevant to business success that are used in different application data sets and would benefit from centralization
- Instantiate a standardized model to manage those key information objects in a shared repository
- Manage collected and discovered metadata as an accessible, browsable resource, and use it to facilitate consolidation
- Collect and harmonize unique instances to populate the shared repository (a simple consolidation is sketched after this list)
- Integrate the harmonized view of data object instances with existing and newly developed business applications via a service-oriented approach
- Institute the proper data governance policies and procedures at the corporate or organizational level to ensure the continuous maintenance of the master data repository
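The collect-and-harmonize step referenced above might look, in a deliberately simplified form, like the following sketch; the matching logic and survivorship rule are stand-ins for far more robust identity-resolution techniques, and all names and data are illustrative:

```python
# Collect unique instances from two application systems and harmonize
# them into a single shared (golden) record per matched entity.

def match_key(name: str) -> str:
    """Crude match key: lowercase alphanumeric tokens, sorted."""
    tokens = "".join(c if c.isalnum() else " " for c in name.lower()).split()
    return " ".join(sorted(tokens))

sales_system = [{"name": "Pat Smith", "phone": "555-0100"}]
billing_system = [{"name": "SMITH, PAT", "address": "1 Main St"}]

master = {}
for record in sales_system + billing_system:
    golden = master.setdefault(match_key(record["name"]), {})
    for attr, value in record.items():
        golden.setdefault(attr, value)  # survivorship rule: first value wins

# master now holds one harmonized instance carrying both phone and address.
```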
Numerous technologies have, in the past, been expected to address parts of this problem through customer master tables or industry-specific consolidated product management systems. But these applications have been criticized (perhaps unfairly) as a result of the organizational approach to their implementation: largely IT-driven, presumed to be usable out of the box, lacking enterprise integration and suffering from limited business acceptance. Resolving the issues behind that criticism suggests several considerations for implementing a successful master data management program:
- Effective technical infrastructure for collaboration
- Organizational preparedness
- “Round-trip” enterprise acceptance and integration
- Measurably high data quality
- Data governance
Architectural Approaches to MDM
Let’s consider an MDM infrastructure. In the best of all possible worlds, we desire a service-oriented environment that can support new applications while allowing for incremental integration with legacy applications. There may be many ways to develop a master data repository. Here, we explore three conceptual architectural approaches to developing this capability, each addressing different organizational needs. It is interesting to note, though, that once the MDM concept is accepted within the organization, it is relatively easy to transition between solution frameworks.
In a Central Master Data System, for each data “domain,” a set of core attributes associated with each master data model is defined and managed within a single master system. The master repository is the source for managing these core master data objects, which are subsequently published out to the application systems. Within each dependent system, application-specific attributes are managed locally, but are linked back to the master instance via a shared global primary key. In this approach, new data instances may be created in each application, but those newly-created records must be synchronized with the central system.
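A minimal sketch of this arrangement, with hypothetical class and attribute names, might look like:

```python
import itertools

class CentralMaster:
    """Single master system: owns core attributes and the global key."""
    def __init__(self):
        self._records = {}
        self._ids = itertools.count(1)

    def create(self, core_attrs: dict) -> str:
        """Register a new master record and return its global primary key."""
        global_id = f"MDM-{next(self._ids):06d}"
        self._records[global_id] = dict(core_attrs)
        return global_id

    def core(self, global_id: str) -> dict:
        """Publish the core attributes for a given global key."""
        return dict(self._records[global_id])

class ApplicationSystem:
    """Dependent system: manages application-specific attributes locally,
    linked back to the master instance by the shared global key."""
    def __init__(self, master: CentralMaster):
        self.master = master
        self.local_attrs = {}

    def create_customer(self, core_attrs: dict, local_attrs: dict) -> str:
        # Records created here are synchronized with the central system,
        # which assigns the shared global key.
        global_id = self.master.create(core_attrs)
        self.local_attrs[global_id] = dict(local_attrs)
        return global_id

sales = ApplicationSystem(CentralMaster())
gid = sales.create_customer({"name": "Pat Smith"}, {"sales_rep": "J. Doe"})
```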
In a Mapped Master System, an existing application system is selected to be the master, and other systems become dependent on that system as the main repository. New data instances are created in the master, which are then propagated to other systems. In this approach, different data objects are not necessarily linked by a global primary key, so it may be necessary to define mappings to link objects between systems. When new objects are created, they are distributed out to other applications, and similar to the central master approach, the other applications may not modify core attribute values but may introduce and modify their own application-specific attributes.
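Since objects are not linked by a global primary key, a cross-reference table is one plausible way to implement those mappings; the sketch below uses illustrative system names and identifiers:

```python
# The billing application acts as the master; a mapping table links its
# keys to each dependent system's native keys.
billing_master = {"B-77": {"name": "Pat Smith", "status": "active"}}

# Cross-reference: (dependent system, local key) -> master key.
key_map = {("crm", "C-123"): "B-77", ("support", "S-9"): "B-77"}

def master_record(system: str, local_key: str) -> dict:
    """Resolve a dependent system's local key to the master record."""
    return billing_master[key_map[(system, local_key)]]

print(master_record("crm", "C-123")["name"])  # -> Pat Smith
```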
In a Hub Repository, a single repository is used to manage the core master system, and data is not replicated to other systems. Applications request information from the central hub and provide updates to the central hub. Since there is only one copy, all applications are modified to interact directly with the hub.
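A bare-bones sketch of such a hub interface (a hypothetical API, not any particular product's) might look like:

```python
class MasterDataHub:
    """The only copy of the master data lives here; every application
    reads and writes through this interface."""
    def __init__(self):
        self._store = {}

    def get(self, key: str) -> dict:
        """Applications request information from the central hub."""
        return dict(self._store[key])  # hand back a copy, so no replica persists

    def update(self, key: str, attrs: dict) -> None:
        """Applications provide their updates directly to the hub."""
        self._store.setdefault(key, {}).update(attrs)

hub = MasterDataHub()
hub.update("MDM-000001", {"name": "Pat Smith"})
record = hub.get("MDM-000001")
```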
There are other approaches as well, yet all of them share these characteristics:
- Relevant core master data is managed within a single repository
- Dependent applications rely on publication of master information via a service-based approach
- An investment must be made in integrating data from across the application systems to identify the best sources of high-quality master data as well as that data’s use in dependent applications
Regardless of the approach, once the MDM system is deployed, a corresponding data quality assurance and data governance framework must be in place to ensure ongoing synchronization.
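As one illustration of what that assurance might include, the following hypothetical sketch reconciles a dependent application's core attributes against the master copy and reports any drift:

```python
# Compare the core attributes held by a dependent application against
# the master and return the discrepancies for stewardship review.
CORE_ATTRS = ("name", "status")

def find_drift(master: dict, dependent: dict) -> list:
    """Return (key, attribute) pairs where a dependent copy disagrees."""
    drift = []
    for key, master_rec in master.items():
        local_rec = dependent.get(key, {})
        for attr in CORE_ATTRS:
            if attr in local_rec and local_rec[attr] != master_rec.get(attr):
                drift.append((key, attr))
    return drift

master = {"MDM-000001": {"name": "Pat Smith", "status": "active"}}
dependent = {"MDM-000001": {"name": "Pat Smith", "status": "inactive"}}
print(find_drift(master, dependent))  # -> [('MDM-000001', 'status')]
```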
Organizational Challenges and Master Data Management
A key success factor in creating a master data management program is an early understanding of the transition of responsibilities for data oversight and governance that will impact your organization. Years of distribution of business applications into vertical lines of business have led to discrete islands of information. As a result, the IT and data management structures associated with those lines of business have erected barriers to collaboration.
In addition, the politics of information ownership and management have created artificial fiefdoms overseen by individuals for whom centralization holds no incentive. And lastly, consolidating master data into a single repository transfers the responsibility and accountability for information management from the lines of business to the organization. Because of this, some of the greatest challenges to success are not technical – they are organizational.
It is largely because of these issues that MDM should be considered as a “program” and not as an application. Focusing on these directives will distinguish a successful MDM program from one destined for failure:
Organizational Preparedness: Anticipate that a rapid transition from a loosely-coupled confederation of vertical silos to a more tightly-coupled collaborative framework will ruffle a number of feathers. Assess the kinds of training sessions and individual incentives that must be established in order to create a smooth transition.
Data Governance: As management responsibility and accountability transition to the enterprise team, it is important to define the policies and procedures governing the oversight of master data. By distributing these policies and seeking proactive comments from across the different application teams, you have an opportunity to create the stewardship framework through consensus while preparing the organization for the transition.
Technology Integration: As is often the case, new technologies that are dependent on application packages are like the tail wagging the dog. Recognize that the objective is to integrate technology to support the process, rather than to develop process around the technology.
Anticipating Change: As with any paradigm shift, proper preparation and organization will subtly introduce change to the way that people think and act. Since the nature of a master data management program is to integrate data for competitive advantage, encourage individuals to begin to explore new ways to exploit master data for improving business productivity.