geocat.comp.stats.eofunc_pcs

geocat.comp.stats.eofunc_pcs(data, npcs=1, time_dim=0, pcscaling=0, weights=None, center=True, ddof=1, meta=False)

Computes the principal components (time projection) in the empirical orthogonal function analysis.

Note: eofunc_pcs performs the analysis that was previously done with the NCL function eofunc_ts.

However, it differs from the NCL workflow in a few ways:

  1. Only np.nan is supported as the missing value.

  2. EOFs are computed only from the covariance matrix; computation from the correlation matrix is not supported.

  3. The percentage of non-missing points that must exist at any single point is no longer an input.
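
The first two differences can usually be handled by preparing the input before the call. The following is a minimal NumPy sketch, not part of geocat.comp: FILL_VALUE, raw, and series are hypothetical names and values chosen for illustration. It converts a sentinel fill value to np.nan, and it shows that standardizing each point's time series makes the covariance matrix equal to the correlation matrix of the original data.

```python
import numpy as np

# Hypothetical sentinel, standing in for an NCL-style _FillValue attribute.
FILL_VALUE = -999.0

raw = np.array([[1.0, 2.0, FILL_VALUE],
                [4.0, FILL_VALUE, 6.0]])

# Replace the sentinel with np.nan, the only supported missing-value marker.
data = np.where(raw == FILL_VALUE, np.nan, raw)

# Hypothetical (time, points) series without missing values.
series = np.array([[1.0, 10.0],
                   [2.0, 30.0],
                   [4.0, 20.0],
                   [7.0, 60.0]])

# Standardize each point: remove the time-mean, divide by the sample std.
standardized = (series - series.mean(axis=0)) / series.std(axis=0, ddof=1)

# The covariance of the standardized series is the correlation of the original.
cov_of_standardized = np.cov(standardized, rowvar=False)
corr_of_original = np.corrcoef(series, rowvar=False)
```

Passing standardized input of this kind is one way to approximate a correlation-based analysis, under the assumption that no point has zero variance in time.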

This implementation uses the eofs package, which is based on the following study: Dawson, Andrew, “eofs: A library for EOF analysis of meteorological, oceanographic, and climate data,” Journal of Open Research Software, vol. 4, no. 1, 2016. Further information about this package is available in the eofs documentation.

On top of the eofs package, this implementation provides a few conveniences, which are described in the Parameters section below.

Parameters:
  • data (xarray.DataArray or numpy.ndarray or list) – Should contain numbers, with np.nan representing missing values. It must be at least a 2-dimensional array.

    When the input data is an xarray.DataArray, the eofs.xarray interface assumes that the leftmost dimension (i.e. dim_0) is the time dimension. In this case, that dimension should be named “time”.

    When the input data is a numpy.ndarray or list, this function still assumes by default that the leftmost dimension is the time dimension (the number of observations). In this case, however, the user may specify otherwise: if the leftmost dimension of the input is not the time dimension, pass time_dim=x to indicate which dimension should be treated as time.

  • npcs (int, optional) – A scalar integer that specifies the number of principal components (i.e. eigenvalues and eigenvectors) to be returned. This is usually less than or equal to the smaller of the number of observations and the number of variables. Defaults to 1.

  • time_dim (int, optional) – An integer defining the time dimension if it is not the leftmost dimension. When the input data is an xarray.DataArray, this is ignored (the xarray.DataArray is assumed to have its leftmost dimension named exactly ‘time’). It must be between 0 and data.ndim - 1, or -1 to indicate the last dimension. Defaults to 0.

    Note: The time_dim argument makes it possible to perform the EOF analysis that was previously done with the NCL function eofunc_ts_n.

  • pcscaling (int, optional) –

    (From eofs package): Sets the scaling of the retrieved PCs. The following values are accepted:
    • 0 : Un-scaled PCs (default).

    • 1 : PCs are divided by the square-root of their eigenvalues.

    • 2 : PCs are multiplied by the square-root of their eigenvalues.

  • weights (array_like, optional) – (From eofs package): An array of weights whose shape is compatible with that of the input array dataset. The weights can have the same shape as dataset or a shape compatible with an array broadcast (i.e., the shape of the weights can match the rightmost parts of the shape of the input array dataset). If the input array dataset does not require weighting, then the value None may be used. Defaults to None (no weighting).

  • center (bool, optional) – (From eofs package): If True, the mean along the first axis of dataset (the time-mean) will be removed prior to analysis. If False, the mean along the first axis will not be removed. Defaults to True (mean is removed).

    The covariance interpretation relies on the input data being anomaly data with a time-mean of 0. Therefore this option should usually be set to True. Setting this option to True has the useful side effect of propagating missing values along the time dimension, ensuring that a solution can be found even if missing values occur in different locations at different times.

  • ddof (int, optional) – (From eofs package): ‘Delta degrees of freedom’. The divisor used to normalize the covariance matrix is N - ddof where N is the number of samples. Defaults to 1.

  • meta (bool, optional) – If set to True and the input array is an xarray.DataArray, the metadata from the input array will be copied to the output array. Defaults to False.
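
To make the roles of time_dim, center, weights, ddof, and pcscaling concrete, here is a NumPy-only sketch of a PC computation. It is an illustration under stated assumptions, not the code of geocat.comp or eofs: it assumes the common SVD-based formulation of EOF analysis, assumes centering is applied before weighting, and pcs_sketch is a hypothetical name. The sign of each PC from an SVD is arbitrary, so results may differ in sign from other implementations.

```python
import numpy as np


def pcs_sketch(data, npcs=1, time_dim=0, pcscaling=0, weights=None,
               center=True, ddof=1):
    """Illustrative PC computation; hypothetical helper, not a library API."""
    # time_dim: move the chosen axis to the front so time is leftmost.
    arr = np.moveaxis(np.asarray(data, dtype=float), time_dim, 0)

    # center: remove the time-mean at every point. A NaN anywhere in a
    # point's time series makes its mean NaN, so the whole series becomes NaN.
    if center:
        arr = arr - arr.mean(axis=0)

    # weights: broadcast against the rightmost (non-time) dimensions.
    if weights is not None:
        arr = arr * np.asarray(weights, dtype=float)

    # Flatten to (time, space) and drop points with any missing value.
    n_time = arr.shape[0]
    flat = arr.reshape(n_time, -1)
    flat = flat[:, ~np.isnan(flat).any(axis=0)]

    # SVD: flat = u @ diag(s) @ vt. Covariance eigenvalues use N - ddof.
    u, s, _ = np.linalg.svd(flat, full_matrices=False)
    eigenvalues = (s ** 2 / (n_time - ddof))[:npcs]
    pcs = (u * s)[:, :npcs]  # un-scaled PCs (pcscaling=0)

    if pcscaling == 1:
        pcs = pcs / np.sqrt(eigenvalues)   # divide by sqrt(eigenvalue)
    elif pcscaling == 2:
        pcs = pcs * np.sqrt(eigenvalues)   # multiply by sqrt(eigenvalue)
    elif pcscaling != 0:
        raise ValueError("pcscaling must be 0, 1, or 2")
    return pcs, eigenvalues


# Synthetic (lat, lon, time) field: time is the LAST axis, so time_dim=-1.
rng = np.random.default_rng(0)
field = rng.standard_normal((4, 3, 20))

# Hypothetical latitudes; sqrt(cos(lat)) weights with shape (4, 1)
# broadcast across the longitude axis.
lat = np.array([-30.0, -10.0, 10.0, 30.0])
wgts = np.sqrt(np.cos(np.deg2rad(lat)))[:, np.newaxis]

pcs, eigenvalues = pcs_sketch(field, npcs=2, time_dim=-1,
                              pcscaling=1, weights=wgts)
print(pcs.shape)  # (20, 2)
```

In this sketch, pcscaling=1 yields PCs with unit sample variance, while pcscaling=0 yields PCs whose sample variance equals the corresponding eigenvalue (both with the default ddof=1 and center=True).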

See also

Related NCL Functions: eofunc_ts, eofunc_ts_Wrap, eofunc_ts_n, eofunc_ts_n_Wrap