geocat.comp.eofunc.eofunc_eofs

geocat.comp.eofunc.eofunc_eofs(data, neofs=1, time_dim=0, eofscaling=0, weights=None, center=True, ddof=1, vfscaled=False, meta=False)

Deprecated since version 2022.10.0: The eofunc module is deprecated. eofunc_eofs has been moved to the stats module for future use. Use geocat.comp.eofunc_eofs or geocat.comp.stats.eofunc_eofs for the same functionality.

Computes empirical orthogonal functions (EOFs), also known as principal component analysis (PCA).

This implementation uses the eofs package, which is built upon the following study: Dawson, Andrew, “eofs: A library for EOF analysis of meteorological, oceanographic, and climate data,” Journal of Open Research Software, vol. 4, no. 1, 2016. Further information about this package can be found at: https://ajdawson.github.io/eofs/latest/index

This implementation provides a few conveniences on top of the eofs package; these are described in the Parameters section below.

Note

eofunc_eofs performs the EOF analysis that was previously done via the NCL function eofunc. However, it differs from the NCL workflow in a few ways:

  1. Only np.nan is supported as the missing value.

  2. EOFs are computed only from the covariance matrix; computation from the correlation matrix is not supported.

  3. The percentage of non-missing points that must exist at any single point is no longer an input.
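
Because only np.nan is recognized as a missing value, data that uses a sentinel fill value has to be converted before the call. The following is a minimal sketch using NumPy; the fill value -999.0 and the array contents are made up for illustration and are not part of this API.

```python
import numpy as np

# Hypothetical input: 3 time steps x 2 grid points, with -999.0 marking missing data.
fill_value = -999.0
raw = np.array([[1.0, 2.0],
                [fill_value, 4.0],
                [5.0, 6.0]])

# Replace the sentinel with np.nan so the missing point is treated as missing.
data = np.where(raw == fill_value, np.nan, raw)
```

The resulting array can then be passed as the data argument.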

Parameters
  • data (xarray.DataArray or numpy.ndarray or list) – Should contain numbers, with np.nan representing missing values. It must be at least a 2-dimensional array. If the input is an xarray.DataArray, the time dimension must be named “time”. If the input is a numpy.ndarray or a list, the left-most dimension is assumed to be the time dimension (the number of observations) unless a different dimension is specified with the time_dim parameter.

  • neofs (int, optional) – A scalar integer that specifies the number of empirical orthogonal functions (i.e. eigenvalues and eigenvectors) to be returned. This is usually less than or equal to the smaller of the number of observations and the number of variables. Defaults to 1.

  • time_dim (int, optional) – An integer defining the time dimension. When the input data is an xarray.DataArray, this is ignored. Otherwise, it must be between 0 and data.ndim - 1, or -1 to indicate the last dimension. Defaults to 0.

    Note: The time_dim argument makes it possible to perform the EOF analysis that was previously done via the NCL function eofunc_n.

  • eofscaling (int, optional) – (From eofs package): Sets the scaling of the EOFs. The following values are accepted:

    0 : Un-scaled EOFs (default).
    1 : EOFs are divided by the square-root of their eigenvalues.
    2 : EOFs are multiplied by the square-root of their eigenvalues.

  • weights (array_like, optional) – (From eofs package): An array of weights whose shape is compatible with that of the input array dataset. The weights can have the same shape as dataset or a shape compatible with an array broadcast (i.e., the shape of the weights can match the rightmost parts of the shape of the input array dataset). If the input array dataset does not require weighting then the value None may be used. Defaults to None (no weighting).

  • center (bool, optional) – (From eofs package): If True, the mean along the first axis of dataset (the time-mean) will be removed prior to analysis. If False, the mean along the first axis will not be removed. Defaults to True (mean is removed).

    The covariance interpretation relies on the input data being anomaly data with a time-mean of 0. Therefore this option should usually be set to True. Setting this option to True has the useful side effect of propagating missing values along the time dimension, ensuring that a solution can be found even if missing values occur in different locations at different times.

  • ddof (int, optional) – (From eofs package): ‘Delta degrees of freedom’. The divisor used to normalize the covariance matrix is N - ddof where N is the number of samples. Defaults to 1.

  • vfscaled (bool, optional) – (From eofs package): If True, scale the typical errors (see northTest under Returns) by the sum of the eigenvalues. This yields typical errors with the same scale as the values returned by Eof.varianceFraction. If False, no scaling is done. Defaults to False.

  • meta (bool, optional) – If set to True and the input array is an xarray.DataArray, the metadata from the input array will be copied to the output array. Defaults to False.
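
To make the effect of time_dim, center, weights, ddof and eofscaling concrete, the following is a minimal NumPy sketch of an SVD-based EOF computation. It is an illustration under stated assumptions, not the implementation used by this function or by the eofs package: the helper name eof_sketch, the synthetic field and the square-root-of-cosine latitude weights are all hypothetical, and missing values are not handled.

```python
import numpy as np


def eof_sketch(data, neofs=1, time_dim=0, eofscaling=0, weights=None,
               center=True, ddof=1):
    """Illustrative EOF computation via SVD (hypothetical helper, no NaN handling)."""
    # Move the time dimension to the front, which is the role time_dim plays.
    x = np.moveaxis(np.asarray(data, dtype=float), time_dim, 0)
    ntime, space_shape = x.shape[0], x.shape[1:]

    # center=True: remove the time-mean so the covariance interpretation holds.
    if center:
        x = x - x.mean(axis=0)

    # Weights broadcast against the rightmost (spatial) dimensions.
    if weights is not None:
        x = x * weights

    # Flatten space and decompose; rows of vt are unit-length, unscaled EOFs.
    flat = x.reshape(ntime, -1)
    _, s, vt = np.linalg.svd(flat, full_matrices=False)

    # Eigenvalues of the covariance matrix, normalized by N - ddof.
    eigenvalues = s[:neofs] ** 2 / (ntime - ddof)
    eofs = vt[:neofs]

    # eofscaling: 0 leaves the EOFs as they are, 1 divides and 2 multiplies
    # by the square root of the eigenvalues.
    if eofscaling == 1:
        eofs = eofs / np.sqrt(eigenvalues)[:, np.newaxis]
    elif eofscaling == 2:
        eofs = eofs * np.sqrt(eigenvalues)[:, np.newaxis]

    return eofs.reshape((neofs,) + space_shape), eigenvalues


# Synthetic (time, lat, lon) field, made up for illustration.
rng = np.random.default_rng(0)
field = rng.standard_normal((20, 4, 5))

# Hypothetical latitude weights of shape (4, 1), which broadcast over longitude.
lat = np.linspace(-60.0, 60.0, 4)
wgts = np.sqrt(np.cos(np.deg2rad(lat)))[:, np.newaxis]

eofs, eigenvalues = eof_sketch(field, neofs=3, eofscaling=2, weights=wgts)
```

In this sketch the result has shape (3, 4, 5): the time dimension is removed and a leading dimension of size neofs is added, matching the layout described under Returns.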

Returns

  A multi-dimensional array containing EOFs. The returned array will be of the same size as data with the leftmost dimension removed and an additional dimension of size neofs added.

  The return variable will have the following attributes associated with it:

  • eigenvalues – A one-dimensional array of size neofs that contains the eigenvalues associated with each EOF.

  • northTest – (From eofs package): Typical errors for eigenvalues.

    The method of North et al. (1982) is used to compute the typical error for each eigenvalue. It is assumed that the number of times in the input data set is the same as the number of independent realizations. If this assumption is not valid then the result may be inappropriate.

    Note: The northTest attribute makes it possible to perform the error analysis that was previously done via the NCL function eofunc_north.

  • totalAnomalyVariance – (From eofs package): Total variance associated with the field of anomalies (the sum of the eigenvalues).

  • varianceFraction – (From eofs package): Fractional EOF mode variances.

    The fraction of the total variance explained by each EOF mode, with values between 0 and 1 inclusive.
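
The relationships between these attributes can be sketched with NumPy as follows. This is an illustration, not the library's code: the synthetic anomalies and variable names are made up, and the typical-error formula is the North et al. (1982) rule of thumb, eigenvalue times sqrt(2 / N), which is assumed here to be the form the eofs package applies.

```python
import numpy as np

# Synthetic anomalies: 50 time steps x 6 spatial points (made up for illustration).
rng = np.random.default_rng(1)
anomalies = rng.standard_normal((50, 6))
anomalies -= anomalies.mean(axis=0)

ntime = anomalies.shape[0]
ddof = 1

# All eigenvalues of the covariance matrix, largest first.
s = np.linalg.svd(anomalies, compute_uv=False)
eigenvalues = s ** 2 / (ntime - ddof)

# Sum of the eigenvalues: total variance of the anomaly field.
total_anomaly_variance = eigenvalues.sum()

# Fraction of the total variance explained by each mode.
variance_fraction = eigenvalues / total_anomaly_variance

# Rule-of-thumb typical error, with N taken as the number of time steps
# (assumed equal to the number of independent realizations).
north_errors = eigenvalues * np.sqrt(2.0 / ntime)

# Dividing by the sum of the eigenvalues (the vfscaled=True case) puts the
# errors on the same scale as variance_fraction.
north_errors_vfscaled = north_errors / total_anomaly_variance
```

If consecutive time steps are strongly correlated, N overstates the number of independent realizations and the typical errors computed this way will be too small.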

See also

Related NCL Functions: eofunc, eofunc_Wrap, eofunc_north, eofunc_n, eofunc_n_Wrap