# bootstrap_diff

Bootstrap mean differences from two samples. Available in version 6.4.0 and later.

## Prototype

```
function bootstrap_diff (
    x         : numeric,
    y         : numeric,
    nBoot     : integer,
    nDim  [*] : integer,
    opt       : logical
)

return_val [ variable of type 'list' containing multiple estimates ]
```

## Arguments

x

A numeric array of up to four dimensions: x(NX), x(NX,:), x(NX,:,:), x(NX,:,:,:). 'NX' represents the original sample size.

y

A numeric array of up to four dimensions: y(NY), y(NY,:), y(NY,:,:), y(NY,:,:,:). 'NY' represents the original sample size. NOTE: NX and NY may be different.

nBoot

An integer specifying the number of bootstrap data samples to be generated.

nDim

The dimension(s) of x and y on which to calculate the statistic. Most commonly this is set to (/0,0/); if the dimension is the same for both x and y, the scalar 0 may be used instead.

opt

A logical scalar to which optional attributes may be attached. If opt=False, or if opt=True and no optional attributes are present, all default values are used. If opt=True, the following attributes are recognized:

• opt@sample_size_x=nx and opt@sample_size_y=ny: allow the user to specify the sample sizes used to estimate the respective means, where (nx.le.NX) and (ny.le.NY). Either or both may be set. The defaults are opt@sample_size_x=NX and opt@sample_size_y=NY. When these options are used, nx and ny are typically 10-25% of NX and NY.

• opt@rseed1=rseed1: allows user to set the first random seed integer value. Default is to use the system initial random seed. (See: random_setallseed)
• opt@rseed2=rseed2: allows user to set the second random seed integer value. Default is to use the system initial random seed. (See: random_setallseed)
• opt@rseed3="clock": tells NCL to use the 'date' clock to set the two random seeds. (See: random_setallseed)
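
As a brief illustration of the attributes listed above, the following sketch requests subsampling and fixed seeds, which should make repeated runs draw the same resamples. The numeric values are arbitrary placeholders chosen for illustration, and x, y and nBoot are assumed to be defined already.

```ncl
opt               = True
opt@sample_size_x = 25        ; nx: placeholder value, must not exceed NX
opt@sample_size_y = 12        ; ny: placeholder value, must not exceed NY
opt@rseed1        = 12345     ; first  random seed (arbitrary choice)
opt@rseed2        = 67890     ; second random seed (arbitrary choice)

BootStrap = bootstrap_diff(x, y, nBoot, 0, opt)
```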

## Return value

A variable of type 'list'. Members of a list can be accessed directly. However, it is clearer if the members are explicitly extracted and given meaningful names.

```
                                ; typeof(BootStrap) is 'list'
BootStrap = bootstrap_diff(x, y, nBoot, 0, opt)
dBoot     = BootStrap[0]        ; bootstrapped differences in ascending order
dBootAvg  = BootStrap[1]        ; Average of the bootstrapped differences
dBootStd  = BootStrap[2]        ; Std. Deviation of the bootstrapped differences
delete(BootStrap)               ; no longer needed
```

## Description

Bootstrapping is a statistical method that uses data resampling with replacement (see: generate_sample_indices) to estimate the properties of nearly any statistic. It is particularly useful when dealing with small sample sizes. A key feature is that bootstrapping makes no a priori assumption about the distribution of the sample data.
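
To make the resampling idea concrete, here is a minimal hand-rolled sketch for two 1-D samples. It is not the source of bootstrap_diff. It assumes the difference is taken as mean(x) minus mean(y), and the synthetic data, sample sizes and variable names are illustrative only.

```ncl
; Synthetic 1-D samples (illustrative values only)
NX    = 40
NY    = 25
x     = random_normal(10.0, 2.0, NX)
y     = random_normal( 9.0, 2.0, NY)

nBoot = 1000
dBoot = new(nBoot, typeof(x))

do nb = 0, nBoot-1
   ix = generate_sample_indices(NX, 1)     ; 1 = sample with replacement
   iy = generate_sample_indices(NY, 1)
   dBoot(nb) = avg(x(ix)) - avg(y(iy))     ; assumed sign convention: x minus y
end do

qsort(dBoot)                               ; sort in place, ascending
dBootAvg = avg(dBoot)                      ; average of the resampled differences
dBootStd = stddev(dBoot)                   ; their standard deviation
```

Each pass draws NX and NY indices with replacement, so some original values appear more than once in a resample and others not at all.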

References:

```
Computer Intensive Methods in Statistics
P. Diaconis and B. Efron
Scientific American (1983), 248:116-130
doi:10.1038/scientificamerican0583-116
http://www.nature.com/scientificamerican/journal/v248/n5/pdf/scientificamerican0583-116.pdf

An Introduction to the Bootstrap
B. Efron and R.J. Tibshirani, Chapman and Hall (1993)

Bootstrap Methods and Permutation Tests: Companion Chapter 18 to the Practice of Business Statistics
Hesterberg, T. et al (2003)
http://statweb.stanford.edu/~tibs/stat315a/Supplements/bootstrap.pdf

Climate Time Series Analysis: Classical Statistical and Bootstrap Methods
M. Mudelsee (2014) Second edition. Springer, Cham Heidelberg New York Dordrecht London
ISBN: 978-3-319-04449-1, e-ISBN: 978-3-319-04450-7
doi: 10.1007/978-3-319-04450-7
xxxii + 454 pp; Atmospheric and Oceanographic Sciences Library, Vol. 51
```

## Examples

Please see the Bootstrap and Resampling application page.

Example 1: Let x(NX); y(NY)

```
nBoot       = 1000                ; user set
nDim        = 0                   ; same as (/0,0/): x and y use the same dimension
opt         = False

BootStrap   = bootstrap_diff(x, y, nBoot, nDim, opt)
diffBoot    = BootStrap[0]        ; All the bootstrapped differences
diffBootAvg = BootStrap[1]        ; Average of the bootstrapped differences
diffBootStd = BootStrap[2]        ; Std. Dev. of the bootstrapped differences
delete(BootStrap)                 ; no longer needed

diffBootLow = bootstrap_estimate(diffBoot, 0.025, False)   ;  2.5% lower confidence bound
diffBootMed = bootstrap_estimate(diffBoot, 0.500, False)   ; 50.0% median of bootstrapped estimates
diffBootHi  = bootstrap_estimate(diffBoot, 0.975, False)   ; 97.5% upper confidence bound

printVarSummary(diffBoot)   ; information only
printVarSummary(diffBootMed)

```
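
One common way to read the percentile bounds from Example 1 is to check whether the interval between them contains zero. This is a sketch of a standard percentile-interval reading, not something bootstrap_diff does itself. The all() calls are there only so that the test stays well defined whatever shape bootstrap_estimate returns.

```ncl
if (all(diffBootLow.gt.0) .or. all(diffBootHi.lt.0)) then
    print("95% percentile interval excludes zero")
else
    print("95% percentile interval includes zero")
end if
```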

Example 2: Let x(NX,:,:); y(NY,:,:) where NX=100 and NY=50. Use subsampling:

```
nBoot       = 2000                ; user set
nDim        = 0
opt         = True
opt@sample_size_x = 30            ; nx.le.NX
opt@sample_size_y = 10            ; ny.le.NY

BootStrap   = bootstrap_diff(x, y, nBoot, nDim, opt)
```
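
Continuing Example 2, the list members can be extracted in the same way as for 1-D input. The expectation that the bootstrap dimension takes the place of the sample dimension as the leftmost one is an assumption here; printVarSummary reports the actual shapes.

```ncl
diffBoot    = BootStrap[0]        ; bootstrapped differences
diffBootAvg = BootStrap[1]        ; average of the bootstrapped differences
diffBootStd = BootStrap[2]        ; std. dev. of the bootstrapped differences
delete(BootStrap)                 ; no longer needed

diffBootLow = bootstrap_estimate(diffBoot, 0.025, False)   ;  2.5% lower bound
diffBootHi  = bootstrap_estimate(diffBoot, 0.975, False)   ; 97.5% upper bound

printVarSummary(diffBoot)         ; check the returned dimensions
printVarSummary(diffBootLow)
```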