{"id":16121,"date":"2013-07-03T14:51:00","date_gmt":"2013-07-03T21:51:00","guid":{"rendered":"https:\/\/devwww.3cloudsolutions.com\/post\/avoiding-data-quality-pitfalls-when-reconciling-multiple-sources-2\/"},"modified":"2023-08-22T10:57:33","modified_gmt":"2023-08-22T17:57:33","slug":"avoiding-data-quality-pitfalls-when-reconciling-multiple-sources","status":"publish","type":"post","link":"https:\/\/3cloudsolutions.com\/resources\/avoiding-data-quality-pitfalls-when-reconciling-multiple-sources\/","title":{"rendered":"Avoiding Data Quality Pitfalls when Reconciling Multiple Sources"},"content":{"rendered":"<div class=\"hs-migrated-cms-post\">\n<p><img decoding=\"async\" id=\"img-1372863302291\" class=\"alignRight\" style=\"float: right;\" src=\"https:\/\/3cloudsolutions.com\/wp-content\/uploads\/2022\/11\/vendor_entity_reconciliation1.jpg\" alt=\"Vendor Reconciliation\" border=\"0\" \/>In my <a title=\"previous article\" href=\"http:\/\/www.blue-granite.com\/blog\/bid\/283568\/Business-Analytics-less-Data-Quality-equals-Bad-Decisions\" target=\"_self\" rel=\"noopener\">previous article<\/a>, I touched on some high-level actions you can take after you&#8217;ve discovered you have data quality issues in your business analytics solution.\u00a0 In this article, I want to delve a little deeper into the causes of data quality issues.\u00a0 Specifically, I\u2019d like to discuss the data quality symptoms you might encounter when multiple source systems contain the same information.<\/p>\n<p><!--more--><\/p>\n<p>Understanding the root causes of data quality issues is essential to any data quality initiative, because you can stop the flow of bad data only by fixing what causes it.\u00a0 If root causes aren\u2019t addressed, then you\u2019ll be forced to clean up the same data issues indefinitely.\u00a0 In the case of multiple source systems, data quality issues can be avoided through careful planning during the design phase of a business 
analytics solution.\u00a0 Additionally, discrepancies could be addressed through a master data management initiative.<\/p>\n<h2><b>Data Quality Issues Caused by Multiple Source Systems<\/b><\/h2>\n<p>Many companies these days have multiple ERP systems to store information for different regions or for different lines of business.\u00a0 Whatever the reason, it is very likely that the data contained within these systems has some level of redundancy, which leads to problems when you attempt to build a business analytics platform.<\/p>\n<h3>The Complexities of Transaction Attribution<\/h3>\n<p>There are two big problems when dealing with data in multiple systems.\u00a0 The first is that transactions cannot be properly attributed to a single entity because that entity exists independently in each system.\u00a0 For the sake of example, let\u2019s assume that a company has the same vendor listed in two separate ERP systems.\u00a0 In the first system, it\u2019s listed as Vendor A, and in the second it\u2019s listed as Vendor 3.<\/p>\n<p>When the data is consolidated into a single analytics solution, if specific reconciliation steps aren\u2019t taken, then this vendor will be listed as two completely separate companies.\u00a0 The result is that there may be a complete picture of the overall business, but the values for any specific vendor cannot be trusted as they may be significantly understated.<\/p>\n<p>Imagine the additional negotiating leverage you would have with a vendor if you could tell them that your company purchases 1 million total units of a specific part from them annually, rather than just the 335,000 units listed in the first ERP system.\u00a0 Very likely the price per unit could be reduced, which would have a significant impact on the bottom line.<\/p>\n<h3>Reconciling Contradicting Information<\/h3>\n<p>The second problem that occurs when dealing with multiple source systems is contradicting information.\u00a0 In the vendor example above, the 
transactions would be distinct by line of business or by geography (depending on the ERP implementation), so combining them is a safe activity.\u00a0 However, what about descriptive information about the vendor?<\/p>\n<p>The address, for example, is not guaranteed to be the same.\u00a0 The vendor could have numerous site addresses, and very likely the site address in one system won\u2019t match up with the site address in the other system.<\/p>\n<p>Even if the site addresses were entered identically, what happens when the vendor moves and the address changes?\u00a0 Perhaps one of the ERP data input teams is very diligent and the other is less so.\u00a0 In this scenario, one of the ERP addresses would be updated while the other would not be.<\/p>\n<h3>Compounding Data Quality Issues<\/h3>\n<p>Notice that you likely cannot solve the attribution problem without first addressing the contradicting information issue.\u00a0 The address will very likely be needed to tie the two vendor records together, so that the transactions for both can be attributed to the combined vendor record, providing the correct comprehensive view of vendor activity.<\/p>\n<p>The issue with this, as we already established, is that the address information is very likely not in sync and therefore will not allow the vendor records to be easily aligned.\u00a0 The real effort will be in cleaning up the address information to ensure that entity alignment can take place cleanly and effectively.<\/p>\n<p>Parts of the address cleanup activity could be automated, but there will be situations where manual review will be required.\u00a0 Additionally, expect this process to grow more complicated as more systems are added.\u00a0 The greater the number of variables, the greater the likelihood that you won\u2019t be comparing apples to apples, and the larger the cleanup effort that will be required.<\/p>\n<h3>Conclusion<\/h3>\n<p>In summary, storing the same 
information in multiple source systems can cause several data quality problems in downstream business analytics solutions.\u00a0 There are two key data quality problems that must be considered when a business analytics solution is being created.<\/p>\n<p>The first is properly attributing transactions to the same entity across multiple source systems.\u00a0 The second is identifying common entities across multiple source systems in the face of contradicting records.<\/p>\n<p>The two problems are closely related but distinct, and each must be carefully considered.\u00a0 Depending on the magnitude of the issues, a complete master data management solution may be required.<\/p>\n<p>At a minimum, the business analytics development team will need to take steps to properly reconcile the systems at design time.\u00a0 One thing is certain: by properly addressing incorrect entity information, you can address the transaction attribution problem as well.<\/p>\n<p>Reconciling data between multiple source systems can be a complex activity.\u00a0 Success depends on fully understanding the issues at hand, and properly researching and planning before building a business analytics solution.\u00a0 Take care to address these issues in your business analytics solution and you can avoid these data quality pitfalls before they occur.<\/p>\n<p>This is part 1 of a 3-part series on data quality that I&#8217;ll present this summer. Want to <a href=\"\/get-started\/\">get started today<\/a>? 
Reach out to us!<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Discuss how to avoid data quality issues that arise when combining data from multiple source systems into a business analytics solution.<\/p>\n","protected":false},"author":21,"featured_media":15034,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[297],"tags":[304],"class_list":["post-16121","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-platform","tag-modern-data-platform","topics-blog"],"acf":[],"_links":{"self":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/16121","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/comments?post=16121"}],"version-history":[{"count":0,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/16121\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media\/15034"}],"wp:attachment":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media?parent=16121"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/categories?post=16121"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/tags?post=16121"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}