The Data Quality Problem That’s Plaguing IT Procurement

By Mahesh Kumar   |   October 16, 2013 10:56 am   |   1 Comment

Mahesh Kumar of BDNA.

The familiar purchase order—often resented by people making purchases—is actually a mother lode of valuable data about IT. The single most critical business record for procurement teams, the purchase order (PO) feeds many business processes, ranging from purchasing and ERP systems, IT asset, software and vendor management to risk mitigation and financial controls. Any problems with PO data quality quickly become larger issues that can have a negative impact on audits, financial decisions and reputational risk.

While ordinary purchasing processes have their problems (rogue purchasing, inaccurate data), technology procurement is significantly more complicated. Vendors offer complex and varied licensing models and change them frequently. Licensing compliance may be tied to the IT infrastructure, which itself is in flux. Technology maintenance contracts further complicate the procurement processes, with many variations. And the technology itself has “end of life” dates that must be tracked over time. Other than perishable products (say, in the food industry), few other procurement products have a similar “end of life” component.

Over the past two years, BDNA analyzed 7,500 purchase orders from four leading financial institutions to gain a better understanding of the data problems plaguing POs. One might assume that financial institutions, given their background in risk and cost controls, would have their technology procurement processes in order. But the analysis reveals that 60 percent of these POs are incomplete or incorrect.

Clearly, the people in these institutions are capable of filling out a form correctly! But, the complexity inherent in technology procurement makes errors and inaccuracies almost inevitable. Some of the data quality problems we discovered include:

• Product naming variations. How many ways can you refer to SAP? We found nine different ways in these POs, each arguably correct in its own way. Consider the ubiquitous Adobe Acrobat. Acrobat version 8 alone has a near-infinite variety of possible name variations, including: Adobe Acrobat 8.0 Standard – English, Acrobat, Acrobat 8 Pro, Acrobat 8 Professional, Acrobat 8 Standard, Acrobat Professional, Acrobat Professional 8.1.1 (R1), Acrobat Professional 8.1.5 (R1), Acrobat Standard 8.1.2 (R1), Acrobat Standard 8.1.5 (R1), AcrobatProfessional, AcrobatProfessional 08 and AcrobatProfessional [AIS] 08.00.0000.0101.

• Licensing variations. Vendors are almost infinitely creative in their use of licensing models. Software may be licensed by CPU, by CPU processor, by concurrent users, by seat, for Internet access only, or for standby servers, among other terms. In the POs we examined, a single product was procured under 11 different licensing models.

• Maintenance variations. As an annual, recurring cost, maintenance should be a separate line item from the one-time license costs. But in nine percent of the analyzed POs, maintenance costs and license costs were bundled in the same line item—clearly diluting the quality of the PO data. Here’s one example from the data of a stealth maintenance contract: “IBM Retention Policy Framework Connection License + SW Subscription & Support 12 Months.”

• Incorrect data entry. In 10 percent of the POs analyzed, product quantities were embedded in the text-based product description field. For example, a PO might list a bundle as including 10 licenses in the description field (such as Tivoli Asset Discovery–10 pack). But when you try to determine how many licenses you have for that software, the quantity field shows one, which might lead you to believe you have a single license and to overprovision in the future.
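A data-quality pass can often recover these embedded quantities automatically. Here is a minimal sketch; the field names and the "N pack" pattern are assumptions for illustration, not an actual PO schema:

```python
import re

# Pattern for quantities embedded in free-text descriptions,
# e.g. "Tivoli Asset Discovery-10 pack" (assumed format).
PACK_PATTERN = re.compile(r"(\d+)\s*[- ]?pack", re.IGNORECASE)

def effective_quantity(description: str, quantity_field: int) -> int:
    """Return the true license count: the PO quantity field multiplied
    by any pack size found in the description text."""
    match = PACK_PATTERN.search(description)
    pack_size = int(match.group(1)) if match else 1
    return quantity_field * pack_size

print(effective_quantity("Tivoli Asset Discovery-10 pack", 1))  # 10
print(effective_quantity("Acrobat 8 Standard", 5))              # 5
```

A real pipeline would accumulate such patterns from the descriptions it actually sees, but the principle is the same: treat the description field as data, not just text.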

This incomplete, inconsistent and redundant data makes it nearly impossible for IT and procurement organizations to accurately determine their software and hardware entitlements. If a vendor shows up for a software audit, it is difficult for the purchasing function to state accurately how much of that software the company is legally entitled to use.

The financial and operational downsides of this data quality problem include paying for more or less technology than a company uses—over-provisioning and over-licensing, or under-provisioning—and the associated performance or availability impacts. Poor PO data quality also contributes to poor forecasting about IT needs and system demands, and increased time spent on license compliance and audits.

If the world’s leading financial institutions have these PO problems, the chances are good that other enterprises do, too. The real question is how to fix the problem.

Recognizing Data as the Real Source of the Problem
The core of the problem isn’t broken purchasing processes—it’s bad data.

It’s tempting to believe that process improvements will solve the problem. “If only people would fill out the POs correctly and follow policies, then the data would be accurate,” some might suggest. But policies and processes are never enough to fix the data, for two important reasons.

First, a close analysis of the actual data in POs demonstrates that many of the problems come from factors outside a company’s control, such as inconsistent product naming and licensing policies. These problems originate with vendors.

Second, technology itself is working against IT procurement processes. The trends in IT include expanding virtualization and decentralization through mobile devices and cloud computing. The whole category of mobile app purchases didn’t exist even a few years ago. And departments can easily sign up with Amazon to spin up virtual servers in the public cloud. Technology purchasing is only getting more fragmented and complex, not less so.

Best Practices for Addressing PO Data Quality
Technology procurement teams need to be nimble to adapt to this constantly changing environment. While it’s important to be responsive to business unit needs, they also need to recognize that the problem requires more than “better policies” and “more training.” They need to focus on finding ways to turn the incomplete, inconsistent PO data into clean data that can support both IT and procurement decisions. Steps to take include the following:

Analyze the imperfect data already there. Don’t take PO data at face value. Recognizing that the data is flawed, look more deeply into both the structured and unstructured fields to find the data that is already there. Restructure the Extract-Transform-Load (ETL) processes that handle PO data to analyze both structured and unstructured fields. This will uncover situations in which product quantity is embedded in the unstructured ‘description’ field.

Normalize the data. Align PO records with a common taxonomy that’s shared across purchasing and IT (deployment) to assess all products across all technology vendors accurately and consistently. BDNA’s Technopedia is one such taxonomy, although companies can create their own to fit their specific business needs.
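At its simplest, normalization is a lookup from observed name variants to a canonical catalog entry. The sketch below is illustrative only; the variant strings and canonical tuples are made up for this example, and a real taxonomy such as Technopedia covers far more dimensions:

```python
# Illustrative variant-to-canonical map. A production taxonomy would
# model vendor, product, version, edition, and licensing metadata.
CANONICAL = {
    "acrobat 8 pro": ("Adobe", "Acrobat", "8", "Professional"),
    "acrobat 8 professional": ("Adobe", "Acrobat", "8", "Professional"),
    "acrobatprofessional 08": ("Adobe", "Acrobat", "8", "Professional"),
    "adobe acrobat 8.0 standard - english": ("Adobe", "Acrobat", "8", "Standard"),
}

def normalize(raw_name: str):
    """Map a raw PO product string to (vendor, product, version, edition),
    or None if the variant is not yet known to the taxonomy."""
    key = " ".join(raw_name.lower().split())  # lowercase, collapse whitespace
    return CANONICAL.get(key)

print(normalize("Acrobat 8 Pro"))  # ('Adobe', 'Acrobat', '8', 'Professional')
```

The unmatched-variant case (returning None) matters as much as the matches: it is the queue of new naming variations that the taxonomy must absorb over time.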

Get outside intelligence. No enterprise is an island, particularly when it comes to technology. Look for ways to integrate constantly changing external technology data (product name changes, acquisitions, end-of-life dates) with internal data.
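Integrating such a feed can be as simple as joining normalized internal inventory against an external reference keyed on the canonical product identity. A sketch, assuming the normalized tuples from the previous step; the end-of-life dates below are invented for illustration:

```python
from datetime import date

# External reference feed keyed on canonical (vendor, product, version).
# These dates are illustrative, not actual vendor EOL dates.
EXTERNAL_EOL = {
    ("Adobe", "Acrobat", "8"): date(2011, 11, 3),
    ("Microsoft", "Windows Server", "2003"): date(2015, 7, 14),
}

def flag_end_of_life(inventory, as_of):
    """Annotate each normalized inventory record with its end-of-life
    date from the external feed and whether it has passed."""
    report = []
    for product in inventory:
        eol = EXTERNAL_EOL.get(product)           # None if feed has no entry
        expired = eol is not None and eol <= as_of
        report.append((product, eol, expired))
    return report

for product, eol, expired in flag_end_of_life(
        [("Adobe", "Acrobat", "8")], as_of=date(2013, 10, 16)):
    print(product, eol, "EXPIRED" if expired else "supported")
```

Because the external feed changes constantly, the join should be re-run on a schedule rather than treated as a one-time cleanup.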

Close the loop without disrupting business. Having now created clean, updated procurement data, feed that data back into the systems that rely on PO data as a data source. In this way, all of the people and processes that depend on the data will continue to operate as usual, but gain the value of clean data.

By tackling the purchase order mess as an enterprise data problem, enterprises can potentially improve many business processes. Procurement teams will be better able to fulfill their core objectives, while IT teams will have the foundation for better technology forecasting and alignment.

Mahesh Kumar is the CMO of BDNA, a data-as-a-service company. Mahesh oversees all marketing and business development efforts at BDNA. Prior to BDNA, Mahesh held senior leadership positions at HP, Loudcloud and Autodesk. He can be reached via LinkedIn. Follow him on Twitter: @maheshSFO.

Home page image of purchase order via ThinkStock.

One Comment

  1. Travis Pearl
    Posted December 26, 2013 at 7:17 pm | Permalink

    Great article Mahesh. We analyze public-sector procurement documents (government bids, RFPs, etc.), and normalization and data quality are always key goals for us.

    We just released our Onvia Vendor Center product for competitive intelligence on government contractors. We collected 6 million vendor names from three years’ worth of government procurement documents, normalized those names into 1 million vendor records, and consolidated those records into a collection of 170,000 unique vendor profiles that capture holding-company and subsidiary relationships.

    Your example of PO normalization for product names/SKUs is just the tip of the iceberg. There are massive cleanliness & duplication issues in most large scale PO systems when it comes to customer names, customer/buyer contacts, parts suppliers and vendor names.

    Thanks for detailing the importance and value of data cleanliness in PO and procurement documents.
