escorc_n

Computes the (Pearson) sample linear cross-correlations at lag 0 only, across the specified dimensions.

Available in version 6.2.1 and later.

Prototype

	function escorc_n (
		x          : numeric,  
		y          : numeric,  
		dims_x [*] : integer,  
		dims_y [*] : integer   
	)

	return_val  :  numeric

Arguments

x

An array of any numeric type or size. The dimension specified by dims_x is usually time.

y

An array of any numeric type or size. The dimension specified by dims_y is usually time, and its size must match the size of the dims_x dimension of x.

dims_x

A scalar integer indicating which dimension of x to do the calculation on. Dimension numbering starts at 0.

dims_y

A scalar integer indicating which dimension of y to do the calculation on. Dimension numbering starts at 0.

Description

Computes sample linear cross-correlations (Pearson) at lag 0 only. If lagged correlations are required, use esccr. Missing values are allowed. This function can also be used to determine a "one-point correlation map", where one point is cross-correlated with all other points (see Example 5 below).

Algorithm:

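     cor = SUM [(X(t)-Xave)*(Y(t)-Yave)]/((N-1)*Xstd*Ystd)

where the sum is over the N paired values, and Xstd and Ystd are the sample standard deviations of X and Y.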
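
As a rough check on the algorithm, the following sketch computes the Pearson correlation by hand and compares it with escorc_n. It uses the sample data from Example 1; the variable names (dx, dy, r_manual, r_func) are illustrative only. The hand calculation is written with sums of squares, which is algebraically equivalent to dividing the covariance by the product of the standard deviations and avoids any dependence on how a standard deviation routine is normalized. This is a sketch of the formula, not a description of how escorc_n is implemented internally.

```ncl
; Sample data taken from Example 1
x  = (/ 0.20, 1.88, -0.76, 0.42, 0.32, -0.56, 1.55, -1.21, -0.66, -0.96, -0.21 /)
y  = (/ 0.18, 0.54, -0.49, 0.92, 0.22,  0.75, 0.66, -2.65, -0.51,  0.47, -0.09 /)

dx = x - avg(x)                              ; deviations from the mean of x
dy = y - avg(y)                              ; deviations from the mean of y

; Pearson r = SUM(dx*dy) / sqrt( SUM(dx^2) * SUM(dy^2) )
r_manual = sum(dx*dy)/sqrt(sum(dx^2)*sum(dy^2))
r_func   = escorc_n(x,y,0,0)

print("r_manual="+r_manual+"  r_func="+r_func)
```

Both values should agree to within floating-point rounding, and both should be close to the r reported in Example 1.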
     
The dimension size(s) of the returned array c are a function of the dimension sizes of the x and y arrays. Type double is returned if x or y is double, and float otherwise. The following illustrates the dimensioning, with N as the dimension being correlated:
        x(N), y(N)          c
        x(N), y(K,M,N)      c(K,M)
      x(I,N), y(K,M,N)      c(I,K,M)
    x(J,I,N), y(L,K,M,N)    c(J,I,L,K,M)
    
Special case when all dimensions of x and y are identical:
    x(J,I,N), y(J,I,N)      c(J,I)
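
The dimensioning rules above can be checked with a short sketch that uses random data. The sizes N, K, M and I and the array names are hypothetical and chosen only for illustration; the shapes noted in the comments are what the table above suggests, not output captured from a run.

```ncl
; Hypothetical sizes chosen only for illustration
N  = 20
K  = 3
M  = 4
I  = 2

x1 = random_normal(0.0, 1.0, N)            ; x1(N)
x2 = random_normal(0.0, 1.0, (/I,N/))      ; x2(I,N)
y3 = random_normal(0.0, 1.0, (/K,M,N/))    ; y3(K,M,N)

c1 = escorc_n(x1, y3, 0, 2)                ; table row x(N),   y(K,M,N)
c2 = escorc_n(x2, y3, 1, 2)                ; table row x(I,N), y(K,M,N)
c3 = escorc_n(y3, y3, 2, 2)                ; special case: identical dimensions

print(dimsizes(c1))                        ; the table suggests (K,M)
print(dimsizes(c2))                        ; the table suggests (I,K,M)
print(dimsizes(c3))                        ; the table suggests (K,M)
```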
    

The Pearson linear correlation coefficient (r) for n pairs of independent observations can be tested against the null hypothesis (i.e., no correlation) using the statistic

    t = r*sqrt[ (n-2)/(1-r^2) ]
This statistic has a Student-t distribution with n-2 degrees of freedom. See Example 1.

The confidence interval for r may also be estimated. However, since the sampling distribution of Pearson's r is not normal, r is first converted to Fisher's z-statistic and the confidence interval is computed in z space. An inverse transform is then used to return to r space (-1 to +1). This approach is also demonstrated in Example 1.

Specifically, the confidence interval for the Pearson correlation may be obtained via use of:

  • the Fisher z-transformation: z = 0.5*log((1+r)/(1-r))
  • the standard error of the z-transformation: z_se = sqrt(1.0/(N-3))
  • the inverse of the Fisher transform: ri = (exp(2*z)-1)/(exp(2*z)+1)
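
The three steps listed above can be wrapped in a small helper function. The sketch below is one possible arrangement, not part of NCL: the function name pearson_ci95 is hypothetical, the 95% multiplier 1.96 is hard-coded (use 2.58 for 99%), and r and n are assumed to be scalars with n greater than 3 and r strictly between -1 and +1.

```ncl
undef("pearson_ci95")
function pearson_ci95(r[1]:numeric, n[1]:integer)
; Hypothetical helper: approximate 95% confidence limits for a Pearson r
local z, se, zlow, zhi, ci
begin
  z    = 0.5*log((1+r)/(1-r))               ; Fisher z-transformation
  se   = sqrt(1.0/(n-3))                    ; standard error of z
  zlow = z - 1.96*se                        ; lower limit in z space
  zhi  = z + 1.96*se                        ; upper limit in z space
                                            ; inverse transform back to r space
  ci   = (/ (exp(2*zlow)-1)/(exp(2*zlow)+1), \
            (exp(2*zhi )-1)/(exp(2*zhi )+1) /)
  return(ci)                                ; ci(0)=lower limit, ci(1)=upper limit
end

; Usage with the r and n values from Example 1
ci = pearson_ci95(0.559956, 11)
print("rlow="+ci(0)+"  rhi="+ci(1))
```

With the Example 1 inputs, the limits should come out near -0.06 and 0.87, matching the rlow and rhi values reported there.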

See Also

escorc, esacv, esacr, esccr, esccv, escovc, pattern_cor, rtest, student_t

Examples

Example 1

The following will calculate the cross-correlation for two one-dimensional arrays x(N) and y(N).

        r = escorc_n(x,y,0,0)   ; r is a scalar
     
The following example calculates the cross-correlation and its associated confidence limits.

     ;; http://www.unt.edu/UNT/departments/CC/Benchmarks/sprsum97/resamp.htm:

        x    = (/ 0.20, 1.88, -0.76, 0.42, 0.32, -0.56, 1.55, -1.21, -0.66, -0.96, -0.21 /)
        y    = (/ 0.18, 0.54, -0.49, 0.92, 0.22,  0.75, 0.66, -2.65, -0.51,  0.47, -0.09 /)
        r    = escorc_n(x,y,0,0)          ; Pearson correlation
                                          ; r=0.559956
    ;---Compute correlation confidence interval

        n    = dimsizes(x)                ; n=11
        df   = n-2
                                          ; Fisher z-transformation
        z    = 0.5*log((1+r)/(1-r))       ; z-statistic
        se   = 1.0/sqrt(n-3)              ; standard error of z-statistic

                                          ; low  and hi z values
        zlow = z - 1.96*se                ; 95%  (2.58 for 99%)
        zhi  = z + 1.96*se
                                          ; inverse z-transform; return to r space (-1 to +1)
        rlow = (exp(2*zlow)-1)/(exp(2*zlow)+1)
        rhi  = (exp(2*zhi )-1)/(exp(2*zhi )+1)

        print("r="+r)                     ;  r=0.559956
        print("z="+z+"  se="+se)          ;  z=0.63277  se=0.353553
        print("zlow="+zlow+"  zhi="+zhi)  ;  zlow=-0.0601951  zhi=1.32573
        print("rlow="+rlow+"  rhi="+rhi)  ;  rlow=-0.0601225  rhi=0.868203

Since the 95% confidence interval for r includes 0.0, the calculated r is not significant at the 95% level.

An alternative for testing significance is:

        t    = r*sqrt((n-2)/(1-r^2))
        p    = student_t(t, df)
        psig = 0.05                       ; test significance level
        print("t="+t+"  p="+p)            ; t=2.02755  p=0.0732238
        if (p.le.psig) then
            print("r="+r+" is significant at the 95% level")
        else
            print("r="+r+" is NOT significant at the 95% level")
        end if

Example 2

The following will calculate the cross-correlation for one three-dimensional array y(lat,lon,time) and one one-dimensional array x(time).

     ccr = escorc_n(x,y,0,2)      ; ccr(nlat,mlon)
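
A correlation map like this is often followed by a significance test at every grid point. The sketch below uses random data so that it is self-contained; the sizes ntim, nlat and mlon are hypothetical. It assumes that rtest (listed under See Also) accepts the arguments (r, Nr, opt) with opt=0 and returns a probability for which small values indicate significance; check the rtest documentation before relying on this. It also assumes the time samples are independent.

```ncl
; Hypothetical sizes and random data, for illustration only
ntim = 30
nlat = 4
mlon = 5
x    = random_normal(0.0, 1.0, ntim)                  ; x(time)
y    = random_normal(0.0, 1.0, (/nlat,mlon,ntim/))    ; y(lat,lon,time)

ccr  = escorc_n(x,y,0,2)                              ; ccr(nlat,mlon)

; Assumed usage of rtest: probability at each grid point
prob = rtest(ccr, ntim, 0)                            ; prob(nlat,mlon)

siglvl = 0.05                                         ; test significance level
print("grid points with prob < "+siglvl+": "+num(prob.lt.siglvl))
```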
     
Example 3

The following will calculate the cross-correlations between x3(time,lat,lon) and y3(time,lat,lon), and between x4(time,lev,lat,lon) and y4(time,lev,lat,lon).

     ccr3 = escorc_n(x3,y3,0,0)      ; ccr3(nlat,mlon)
     ccr4 = escorc_n(x4,y4,0,0)      ; ccr4(klev,nlat,mlon)
     
Example 4

Consider x(neval,time) and y(lat,lon,time)

     ccr = escorc_n(x,y,1,2)      ; ccr(neval,nlat,mlon)
     
Example 5

Consider ya(time,lat,lon) and yb(lat,lon,time), where nl and ml are scalar integers (grid indices) specified by the user. The result is a "one-point correlation pattern": a specific point is correlated with all other points. NOTE: NCL reduces ya(:,nl,ml) and yb(nl,ml,:) to one-dimensional arrays. Hence, the dimension number for time in the first argument is 0 in both calls.

     nl   = 32 ; for example
     ml   = 64
     ccra = escorc_n(ya(:,nl,ml),ya,0,0)   ; ccra(lat,lon)
     ccrb = escorc_n(yb(nl,ml,:),yb,0,2)   ; ccrb(lat,lon)