Data Woes

Every business transaction creates data.  Ensuring easy and efficient transaction entry is at the top of everyone’s list when evaluating a new financial system.  However, information must not only get into a system, it must also get out again.  The two main methods of retrieving data are canned reports and dumping data into Excel for ad hoc analysis.

The Dangers of Canned Reports

1.  The report was developed for a specific purpose and is now being used generally

A report may have been developed for a specific reporting requirement, and its criteria may relate to the quirks of the one agency that requires the data to be selected in an unusual way.  The report might then be used for other purposes.  For example, a particular grant may require a calculation of “teacher experience” that includes all years of experience as a credentialed teacher, including experience at other agencies.  Subsequently, users might run the report thinking that it shows years teaching at this agency, credentialed or not.
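To make the ambiguity concrete, here is a minimal Python sketch of the two competing definitions; the record structure and numbers are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class ServiceRecord:
    years: int
    credentialed: bool
    at_this_agency: bool

# Hypothetical service history for one teacher.
history = [
    ServiceRecord(years=4, credentialed=True,  at_this_agency=False),  # prior district
    ServiceRecord(years=2, credentialed=False, at_this_agency=True),   # emergency permit
    ServiceRecord(years=5, credentialed=True,  at_this_agency=True),
]

# The definition the grant requires: all credentialed years, at any agency.
grant_experience = sum(r.years for r in history if r.credentialed)

# The definition a casual user assumes: all years at this agency, credentialed or not.
local_experience = sum(r.years for r in history if r.at_this_agency)

print(grant_experience)  # 9 -- what the report actually computes
print(local_experience)  # 7 -- what the user thinks it shows
```

Same teacher, same report, two different answers, and nothing on the printed page says which definition was used.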

2.  The report pulls a subset of data and users believe the report pulls all relevant data

This often results from hard-coding certain data elements into the report.  A user may require a report that contains certain pay codes relating to a specific type of payroll.  Subsequent to the creation of the report, a new pay code is added.  The report is now incomplete.
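A minimal Python sketch of this failure mode; the pay codes and amounts are hypothetical:

```python
# Pay codes hard-coded when the report was written.
CERT_PAY_CODES = {"REG", "SUB", "STIP"}

payroll = [
    {"employee": 101, "pay_code": "REG",  "amount": 4200.00},
    {"employee": 101, "pay_code": "STIP", "amount": 150.00},
    {"employee": 102, "pay_code": "CELL", "amount": 50.00},  # code added after the report was built
]

# The filter silently drops the new "CELL" code.
report_total = sum(r["amount"] for r in payroll if r["pay_code"] in CERT_PAY_CODES)
system_total = sum(r["amount"] for r in payroll)

print(report_total, system_total)  # 4350.0 vs 4400.0 -- the report is now incomplete
```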

3.  The report contains inaccurate calculations

Reports are developed to calculate class sizes, graduation rates, and longevity, to name just a few.  These calculations are often based on contracts, statutes, or a generally accepted understanding of the term.  Sometimes the complexity of the calculation or the statutory requirements are not well understood, even by the reporting department.  For example, a report may calculate longevity per the requirements of an employee contract.  The underlying calculation may simply be wrong from the get-go, or it may have been correct until a subsequent change to the contract caused a once-accurate report to become inaccurate.  After several personnel changes and a number of elapsed years, an internal report on the percentage of students “on track to graduate” may end up being used to publish the district’s “graduation rate”, simply due to a poor report naming convention.

4.  The report references wrong data

Similarly named data fields may be confused by the report writer. For example, there are numerous dates associated with an employee, such as the original hire date, the current hire date (if re-hired) and several others.  The report writer might simply choose the wrong data field when building the report.

5.  The report is inadequately tested

The reporting department receives a draft version of the report.  They run it and “spot check” various data points.  The report delivers accurate data for the checked items.  The report is signed off.  Even veterans with a good technical background have fallen victim to the “spot checking” trap.  Perhaps only one in one thousand records is wrong; spot checking is unlikely to reveal such errors.  If a report pulls payroll data, for example, run it for an entire payroll (not one or two employees) and compare the totals to known totals.  If the totals don’t match, something is wrong.  The report cannot be put into production until the error is identified and corrected.
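A totals-based check can be folded into report sign-off.  Here is a minimal Python sketch of the idea; the function and field names are hypothetical:

```python
def reconcile(report_rows, system_total, key="amount", tolerance=0.005):
    """Compare a report's total to a known control total before sign-off.

    Spot checking individual rows can miss a one-in-a-thousand error;
    a totals comparison cannot.
    """
    report_total = sum(row[key] for row in report_rows)
    if abs(report_total - system_total) > tolerance:  # allow only for rounding
        raise ValueError(
            f"Report total {report_total:,.2f} != system total {system_total:,.2f}; "
            "do not put this report into production."
        )
    return report_total
```

If the draft report passes against a known payroll total, it can move toward production; if not, the error has to be found first.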

Departments must take ownership of their data.  This includes how the system is structured.  Upon system implementation, thought must be given to how data will be pulled and used.  Data fields must be established with this in mind.  Naming and numbering conventions can promote or hinder reporting.  As an example, a purchasing system may have the ability to use commodity codes to track various types of purchases.  Should the code be random, or be developed so that reports can be pulled with meaningful ranges?  A department may ask that all its purchases be given one commodity code for ease of pulling up all of its purchase orders.  Acquiescing to this request may destroy the ability to report on types of purchases entity-wide.  It probably also means that purchasing staff do not know their system well enough to show the departmental user how to pull up their own orders (functionality that already exists within the system).

When there is a low level of technical expertise in a department, poor report specifications are easily written, reports are inadequately tested, and reports are not reviewed when the law changes.  Staff and, disappointingly, even departmental management may regard a report as belonging to the IT department rather than their own department.
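To make the commodity-code point concrete, here is a hypothetical Python sketch of a numbering convention in which code ranges carry meaning, so entity-wide reports can select by range rather than by a hand-maintained list:

```python
# Hypothetical convention: the leading digit of a commodity code carries meaning,
# so entity-wide reports can select by range instead of by a hand-maintained list.
COMMODITY_RANGES = {
    "instructional supplies": range(4000, 5000),
    "technology":             range(5000, 6000),
    "maintenance":            range(6000, 7000),
}

purchase_orders = [
    {"po": "PO-0001", "commodity_code": 4110, "amount": 300.00},
    {"po": "PO-0002", "commodity_code": 5120, "amount": 1200.00},
    {"po": "PO-0003", "commodity_code": 4220, "amount": 85.00},
]

# Reporting by type of purchase works only because the codes were structured.
for category, code_range in COMMODITY_RANGES.items():
    total = sum(po["amount"] for po in purchase_orders if po["commodity_code"] in code_range)
    print(f"{category}: {total:,.2f}")
```

Give one department a single catch-all code and the range-based breakdown above stops meaning anything.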

Increasingly, technical expertise cannot merely be the realm of the IT department.  There need to be savvy users at the departmental level.  But does that mean that savvy users should just bypass the canned report altogether, pull “raw” data into Excel, and create their own reports that way?  The pitfalls with this approach are unfortunately just as great, or even greater.  With this approach there is usually no independent review of the report output.

The Dangers of Excel

1.  Raw data is not what you think it is

We used a report to pull employee data into Excel.  The report included employee number, job, pay range, step, base salary, and salary add-ons (such as longevity and mileage stipends).  It also included benefits such as Social Security, Medicare, retirement system payments, health and welfare, and so forth.  The report was great.  Any time there was a question, such as what nurses cost, an analyst would just download the report and build a quick pivot table selecting for nurses.  Except that recently I decided to compare the report to employee costs in the system, and they did not agree.  I found two basic errors.  One had been there since the report was created ten years ago; the other was that cell phone stipends (a new item in the past couple of years) were not included.  All that “on the fly” analysis was wrong.  Not horribly wrong, but wrong.  Because the “raw data” download was still a report that selected certain data from the system and excluded other data.  The thousands of detailed records pulled into the report led to a kind of blindness to the fact that this “data dump” was still created by a report with built-in selection criteria.
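The same totals reconciliation applies to a “raw data” download.  A minimal sketch using pandas, with hypothetical column names and a made-up control total:

```python
import pandas as pd

# Hypothetical extract downloaded from the "raw data" report.
extract = pd.DataFrame({
    "employee":   [101, 102, 103],
    "job":        ["Nurse", "Teacher", "Nurse"],
    "total_comp": [78_000.00, 92_000.00, 81_000.00],
})

# Known control total from the system itself (e.g., a payroll summary screen).
# Here it includes the cell phone stipends the report omits.
system_total = 251_250.00

extract_total = extract["total_comp"].sum()
print(f"extract {extract_total:,.2f} vs system {system_total:,.2f} "
      f"(difference {system_total - extract_total:,.2f})")
# Any difference means the "data dump" is itself a report with selection
# criteria, and every pivot table built on it inherits the gap.
```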

2.  Calculations are done on the fly and are (generally) not verified

Just as a canned report can have erroneous calculations, so can an Excel spreadsheet.  Except there is usually no independent testing and verification.  The report’s creator is also the report’s tester.  The accuracy of the report depends on the level of a user’s Excel expertise, as well as their understanding of business practices.  A deficiency in either area can lead to poor results.

The Solution

1.  Document all reports and review them on a set schedule

Both canned reports and the most heavily used Excel reports need to be documented and reviewed periodically.  This needs to be included in someone’s job description, even if the formal position of “Data Analyst” or “Business Analyst” does not exist within the organization.

2.  Know your data, and own it

Departmental management must be tech savvy.  Both my former CBO and I “apprenticed” in positions in the IT departments of our organizations.  I recently urged an up-and-coming accountant in her twenties to take a position in her IT department.  Management has to be engaged with the data.  It is not someone else’s responsibility.  Imagine if attendance data were significantly incorrect and significant dollars needed to be returned to the state.  There is no way a Finance Director or Chief Business Official could avoid responsibility by blaming IT for a bad report.

In areas outside finance, it is common for clerical staff to be given responsibility for data with almost no management oversight.  Recently, while teasing apart data from another department, I discovered numerous errors and was told repeatedly that each was the “clerk’s error”.  The manager neither reviewed the report prior to submission nor took responsibility for the bad data in it.  Vigilance in the business office stopped bad data from being reported to the state.  Had this bad data been reported, the Finance Director or CBO would not have been able to disavow responsibility.  The Business department “should have known better”.

Recently I was validating the hourly rates used in a work order system.  The Business department had provided straight time and overtime rates for maintenance staff to the Maintenance department.  The overtime rate was only about a dollar per hour higher than the straight time rate.  This was correct, because there is no additional health and welfare cost on overtime, and no additional retirement system payment.  The clerk in the Maintenance department could not believe this, so she simply multiplied the straight time rate by 1.5 when entering rates into the work order system.  This was done with no subsequent management check in her own department.  When questioned, departmental management did not know it had been done.
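The arithmetic behind the counterintuitive rates is simple.  A sketch with illustrative numbers (the real figures come from the contract and benefit plans):

```python
# Illustrative numbers only; real figures come from the contract and benefit plans.
base_wage = 30.00        # straight-time wage per hour
hw_per_hour = 8.00       # health & welfare: a flat premium spread over regular hours
retirement_rate = 0.20   # employer retirement contribution, charged on regular pay only

# The fully loaded straight-time rate carries the benefit overhead.
straight_time = base_wage + hw_per_hour + base_wage * retirement_rate  # 44.00

# Overtime is 1.5x the wage, but adds no health & welfare and no retirement.
overtime = base_wage * 1.5                                             # 45.00

# The clerk's version: 1.5x the fully loaded straight-time rate.
clerk_rate = straight_time * 1.5                                       # 66.00

print(f"straight time {straight_time:.2f}, overtime {overtime:.2f}, "
      f"clerk's entry {clerk_rate:.2f}")
```

With numbers like these, overtime really is only about a dollar an hour more than straight time, and the clerk’s 1.5 multiplier overstates it by some twenty dollars an hour.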

When it comes to data, the Business department has to poke around in other departments’ business. The best approach is to question and verify all data.  Be nice about it and let them know “I’d like to help”.  But in the end “The Buck Stops in Business”.
