Archive for January 2007

Classifying data in Excel

Two common audit procedures are to classify data by their frequency of occurrence and to determine if there are any duplicate items. Often, this data may reside within an Excel spreadsheet. Both procedures can be done manually within Excel - for example by first sorting the data, doing subtotals, etc. However, this procedure can be both time consuming as well as error prone.

Here I discuss how this can be done very quickly, and I provide example procedure steps and screen shots of the results. The data I use to illustrate the concepts is the list of debarred medical providers in the State of New Jersey. This list can either be downloaded directly, or else you can use the spreadsheet I have already prepared. The reason for using this list is that it contains elements common in many audit tests - e.g. State code, dates, reason codes, etc. Also, the data is public record. The data elements that I will focus on in this exercise are: state code, effective date and action taken. Four analytical procedures will be performed:

1. Classify state codes by frequency

2. Determine if there are duplicate state codes.

3. Classify action taken by type,

4. Determine if there were any duplicate effective dates.

A simple, yet efficient, way to CMA sample data on Excel sheets

A common audit procedure is to select a random sample using the cumulative monetary amount (also known as dollar unit sampling) in order to audit transactions from a population. Often, that data may reside within an Excel spreadsheet.The underlying process is determine the materiality and the confidence level required. The purpose of this audit exercise is not to explain the methodology for the process, but instead to lay out the procedural steps needed to perform a CMA sample from data contained on an Excel spreadsheet. (Note that the same technique can be applied to other data sources as well, provided it is in the tab separated value (TSV) format. There is a white paper on converting data to tab separated value format.

For purposes of this exercise, we have taken data published by the World Bank and used this data as the basis for the sample. The data consists of both debit and credit transactions (although credits are not sampled). We have placed the data into an Excel workbook which can be downloaded, as well as the formatted data. In order to follow this exercise, the following steps will need to be performed:

1. Download the Excel workbook and select the first sheet, which is named WB.

2. There is some “clean-up” work which needs to be done, as some data is missing or else improperly formatted. (However, this has already been done, and you can just download the formatted data). This includes ensuring that all the columns have headers which consist of a single word (i.e. no embedded blanks).

3. After clean-up, select the data range to be sampled, and copy it to the clipboard (Ctrl-C or Edit|Copy).(Skip this step if you have downloaded the formatted data WB.tab).

4. Open Word Pad and “paste” the data into it.Save this file as “WB.tab” and remember the location where it was saved. (You can skip this step if you have downloaded the formatted data).

5. Open EZ-R Stats for Windows, and select the menu item Sampling | CMA Sample. If you do not have EZ-R Stats for Windows, it can be downloaded and installed without cost.

6. In EZ-R Stats, specify the input file (WB.tab), the output file where the sample extract is to be written (e.g. cma.tab), the “R” factor (e.g. 2), the J factor (e.g. 5000) and a random start (e.g.347). The random start must be between 0 and J / R.

7. Click the column name to be sampled (e.g. SaleAmount).

8. Click the “Run” bottom at the bottom of the dialog.

9. Once the sample is run, the sample selections will be written to the file specified, and a sample reconciliation will be written to the same file name, but with the file suffix of “.rec”.

10. Document the results in the work-papers, including the data source, sample reconciliation, etc.

See the screen shot of the process… (or view the step-by-step process with results).

The example sample reconciliation can be viewed here.

In this instance, there were 13 transactions in excess of J, (so they were selected) and 55 under J.

Below is the key portion of the sample reconciliation document:

population debits 325884.40
population credits -9.10
value of samples under J 33688.20
value of samples over J 55414.90
Number of samples under J 55
Number of samples over J 13
Number of samples under J is 55 times J of 5000.00 gives 275000.00
Add starting RS of 347.00
Subtract ending RS of -4877.50
Add amount of samples over J 55414.90
population debits 325884.40
Difference (s/b 0) is 0.00

Note that the same methodology can also be used for data files, which may contain considerably more rows than the Excel limitation of approximately 65,000.


A simple, yet effective, way to test Benford’s law

This example explains how to perform a simple, yet effective, test to determine if a data range in Excel conforms with that expected under Benford’s law. The example data used is that provided in the Excel data sheet provided in the 26th annual report to Congress. This is an audit test that should take no more than a few minutes, yet provides a good test of fit for the numbers in the range, and potentially an easy test for possible fraud or made-up data. The step-by-step procedure is detailed. The example provided is the test of the first digit. Further tests of Benford’s law, such as the first 2 or 3 digits, the last 1 or 2 digits or the second digit, can be performed using EZ-R Stats for Windows (no cost). For the curious, there is an Excel workbook with all the formulae…

|