Statistical Sampling in Federal Tax Filings

By Mary Batcher, Ph.D., and Wendy Rotz, Ernst & Young, Washington, D.C.

Editor: Valrie Chambers, Ph.D., CPA

Procedure & Administration

Statistical sampling is pervasive in daily life. A physician drawing a small amount of blood for testing and someone tasting a small bite of a new food are both sampling. These methods provide valuable information so long as the sample is representative. The physician does not need to drain all the blood from a body: blood is well mixed, so a small sample is representative. Food samples are less certain. If a soup is stirred, it is generally well mixed, and a small taste is representative; if it is not well mixed, the taste could be atypical. Thus, a good sample is one that properly reflects the variations that occur in the whole “population” or “universe.”

Government regulators, including the IRS, the Federal Communications Commission, and the Department of Health and Human Services, increasingly rely on sampling. Numerous tax agencies—state, federal, and foreign—have published guidelines for taxpayer use of sampling to estimate values reported or claimed on tax returns. The AICPA has also published guidance, including Statement on Auditing Standards No. 39, Audit Sampling.

Sampling Pros and Cons

Advantages of sampling include:

  1. The ability to reasonably assess accuracy or quantify an amount using fewer resources.

  2. Less disruption of normal business operations with fewer documentation requests.

  3. The ability to thoroughly review a smaller number of documents rather than cursorily review all of them. Extrapolating from a sample of thoroughly reviewed and corrected records is often more accurate than relying on the whole database subjected to only high-level checks or minimal review.

  4. Increasing regulatory and judicial acceptance.

Disadvantages may include:

  1. Inaccuracies stemming from random chance, unrepresentative samples, or an ill-conceived approach.

  2. Potential mistrust of the value of sample results by people unfamiliar with sampling.

  3. Failure to recognize the limitations of a sample and, as a result, overextending the sample results.

Statistical and Judgment Sampling

The basic means of sample selection give rise to two forms of samples: judgment and statistical (or probability) samples. With judgment samples, a knowledgeable person chooses representative or illustrative test cases or, sometimes, problematic cases. In either situation, human judgment is used to choose the items to include in the sample. Haphazard sampling, e.g., taking a few from the top, middle, and bottom of a list, is another form of judgment sampling, as is convenience sampling, where the sample consists of available cases, as when magazines ask readers to fill out a questionnaire.

With statistical or probability samples, random chance is used to select the specific records to audit. Typically, this is done with computer-generated random numbers, available in most software packages used for managing or analyzing data. The primary advantages of statistical samples are that the accuracy of their estimates can be measured, predicted in advance, and controlled by increasing the sample size or using a better sample design. Another advantage is their acceptance by regulatory agencies and courts. Statistical samples have scientifically proven mathematical measures of accuracy; judgment samples do not and must rely on trial-and-error assessments of accuracy.
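The two advantages described above can be illustrated together: random selection with computer-generated random numbers, and a measurable margin of error for the resulting estimate. The sketch below uses a hypothetical population of 10,000 expense records with made-up dollar amounts; the population, sample size, and 95% confidence level are all illustrative assumptions, not figures from this article.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical population: 10,000 expense records with dollar amounts.
population = [round(random.uniform(50, 500), 2) for _ in range(10_000)]

N = len(population)
n = 400  # illustrative sample size

# Draw a simple random sample using computer-generated random numbers.
sample = random.sample(population, n)

# Estimate the population total from the sample mean.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)
estimated_total = N * sample_mean

# Measured accuracy: standard error of the estimated total,
# with a finite-population correction, at a 95% confidence level.
se_total = N * (sample_sd / math.sqrt(n)) * math.sqrt(1 - n / N)
margin = 1.96 * se_total

print(f"Estimated total: {estimated_total:,.2f} +/- {margin:,.2f}")
```

Because the margin of error shrinks as the sample size grows, the accuracy of the estimate can be predicted in advance and controlled by choosing `n`, which is exactly the property judgment samples lack.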

In addition to being randomly selected, every record within the scope of the study (the “population” or “universe”) must have a known, nonzero chance of selection. This means a statistical sample cannot be extrapolated to records that had no chance of selection, which precludes, for example, applying a sample drawn from one year to later years.

Complex sample designs are highly effective in solving common logistical problems. For example, two-stage samples are often used when records are decentralized across so many sites that it is not feasible to travel to all sites to inspect the records. In the first stage of selection, sites are randomly chosen; in the second stage, records within each chosen site are randomly drawn and reviewed.
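The two-stage design described above can be sketched in a few lines. The site names, record counts, and stage sizes below are all hypothetical, chosen only to show the structure: sites are drawn at random first, then records are drawn at random within each chosen site.

```python
import random

random.seed(7)  # fixed seed so the illustration is reproducible

# Hypothetical decentralized records: 50 sites, each holding its own records.
sites = {
    f"site_{i}": [f"site_{i}_rec_{j}" for j in range(random.randint(200, 800))]
    for i in range(50)
}

# Stage 1: randomly choose a subset of sites to visit.
chosen_sites = random.sample(list(sites), 8)

# Stage 2: within each chosen site, randomly draw records to review.
reviewed = {site: random.sample(sites[site], 25) for site in chosen_sites}

for site, recs in sorted(reviewed.items()):
    print(site, "records drawn:", len(recs))
```

Note that each record's overall chance of selection is the product of its site's stage-1 probability and its own stage-2 probability within that site, and both must be accounted for when extrapolating.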

Another common approach is the stratified sample design. Records are grouped into categories, or “strata,” and each stratum is sampled separately. The selection probabilities in these complex designs must be accounted for when extrapolating the sample results, or biases can occur. Samples are often stratified by dollar amount, with small-value records separated from medium- and large-value ones. Usually the smaller records are sampled less heavily than the larger ones, and often all of the largest records are selected. Stratification makes samples more efficient, meaning a good estimate can be made from a smaller sample.
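The stratified design above, including the certainty stratum for the largest records and the inverse-probability weighting needed for an unbiased extrapolation, can be sketched as follows. The dollar cutoffs, stratum sample sizes, and simulated record amounts are all illustrative assumptions, not values from this article.

```python
import random

random.seed(11)  # fixed seed so the illustration is reproducible

# Hypothetical population of 5,000 expense records with skewed dollar amounts.
records = [round(random.expovariate(1 / 800), 2) for _ in range(5_000)]

# Stratify by dollar amount; the largest records are taken with certainty.
strata = {
    "small":  [r for r in records if r < 500],
    "medium": [r for r in records if 500 <= r < 5_000],
    "large":  [r for r in records if r >= 5_000],  # 100% (certainty) stratum
}
sample_sizes = {"small": 100, "medium": 200}  # "large": take every record

estimated_total = 0.0
for name, stratum in strata.items():
    if not stratum:
        continue  # skip an empty stratum
    # Certainty stratum: sample size equals the stratum size.
    n = min(sample_sizes.get(name, len(stratum)), len(stratum))
    sample = random.sample(stratum, n)
    # Weight each sampled record by the inverse of its selection probability
    # (n / stratum size), so the extrapolation is unbiased.
    weight = len(stratum) / n
    estimated_total += weight * sum(sample)

print(f"Estimated total: {estimated_total:,.2f}")
```

Because the smaller records are sampled lightly and the largest are taken in full, most of the review effort goes where the dollars are, which is why a well-stratified sample can match the accuracy of a much larger unstratified one.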

Statistical Sampling in Tax

Statistical sampling is very common in federal tax settings. It has been used for a long time by the IRS as an audit tool during examinations of corporate tax filings. It allows close examination of a representative subset of company records to determine the accuracy of claimed deductions or credits. The results of the review of sampled records are then applied to the population from which the sample was selected to determine an adjustment amount.

Historically, the IRS has issued revenue procedures that authorize taxpayers to use statistical sampling in limited settings. However, in March 2002, it issued the Field Directive on the Use of Estimates From Probability Samples, which broadly recognized the appropriateness of statistical sampling by taxpayers when the time and cost of analyzing the entire population are prohibitive and a better answer is not available in the company's books and records. The field directive further specified which statistical sampling and estimation methods the IRS considered appropriate, but it did not close the door on other valid statistical methods. It did not recognize judgment sampling as an acceptable approach.

The 2002 field directive was followed by revenue procedures authorizing the use of statistical samples in specified applications, including meals and entertainment deductions and the domestic production activities deduction. The Field Directive on the Use of Estimates From Probability Samples was revised and reissued on Nov. 3, 2009 (LMSB-4-0809-032). All of these issuances led to increased use of statistical sampling by companies, but there was still concern among taxpayers about the appropriateness of statistical sampling in circumstances other than those explicitly authorized in the revenue procedures.

In September 2011, the IRS issued Rev. Proc. 2011-42, which provided guidance to taxpayers “regarding the use and evaluation of statistical samples and sampling estimates.” The guidance in Rev. Proc. 2011-42 is similar to the guidance in the 2009 field directive, but it explicitly authorizes the broad use of statistical sampling by taxpayers with the added authority of a revenue procedure, thereby resolving the question of its acceptance by the IRS.

Since then, three revenue procedures consistent with Rev. Proc. 2011-42 have been issued: Rev. Procs. 2011-43, 2012-19, and 2012-20. The IRS seems to be signaling a clear acceptance of statistical sampling as an appropriate, or even desirable, method for taxpayers to use in tax filings, provided the sampling and estimation are consistent with the broad guidelines in Rev. Proc. 2011-42. Statistical sampling is a win for taxpayers, who can claim deductions they otherwise could not reasonably substantiate because of the large volume of records; a win for the IRS, whose examinations can proceed more efficiently; and a large win for both because it reduces the burden of filing and examining credits and deductions.

Statistical sampling has many applications beyond tax. It can be a less costly, quicker, and more reliable approach to estimation when the number of cases is large and the cost of examining each item is high. Consider statistical sampling in these instances; sometimes the IRS has the right idea.


Valrie Chambers is a professor of accounting at Texas A&M University–Corpus Christi in Corpus Christi, Texas. Mary Batcher is the national director of Statistics and Sampling in Ernst & Young’s National Tax Department in Washington, D.C. Wendy Rotz is a senior manager at Ernst & Young in Washington, D.C. Prof. Chambers is a member of the AICPA Tax Division IRS Practice and Procedures Committee. For more information about this column, contact Prof. Chambers at
