geocat.comp.polynomial.ndpolyfit

geocat.comp.polynomial.ndpolyfit(x, y, deg, axis=0, **kwargs)

An extension of the numpy.polyfit function that supports multi-dimensional arrays, Dask arrays, and missing values.

Parameters

x (array_like) – X-coordinate, an iterable object of shape (M,), (M, 1), or (1, M), where M = y.shape[axis]. It must not contain NaN or missing values.
y (array_like) – Y-coordinate, an iterable containing the data. It can be a list, numpy.ndarray, xarray.DataArray, Dask array, or any iterable convertible to numpy.ndarray. A Dask array may be chunked, but chunking along the axis given by axis is not recommended.
deg (int) – Degree of the fitting polynomial.
axis (int, optional) – Axis to fit the polynomial to. Default is 0.
kwargs (dict, optional) – See below
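
To make the role of axis concrete, here is a minimal sketch of fitting along one axis of an N-dimensional array using only numpy.polyfit. The helper name polyfit_along_axis is hypothetical and this is not necessarily how ndpolyfit is implemented; it only illustrates the idea of moving the fitting axis to the front and fitting every remaining column independently.

```python
import numpy as np

def polyfit_along_axis(x, y, deg, axis=0):
    """Hypothetical helper: fit a polynomial along one axis of an N-d array."""
    y = np.asarray(y, dtype=float)
    # Bring the fitting axis to the front, then flatten the rest into columns.
    y_front = np.moveaxis(y, axis, 0)
    rest_shape = y_front.shape[1:]
    y_2d = y_front.reshape(y_front.shape[0], -1)
    # numpy.polyfit accepts a 2-D y of shape (M, K) and fits each column.
    coeffs = np.polyfit(np.ravel(x), y_2d, deg)  # shape (deg + 1, K)
    # Restore the remaining dimensions and put the coefficient axis back.
    coeffs = coeffs.reshape((deg + 1,) + rest_shape)
    return np.moveaxis(coeffs, 0, axis)

x = np.arange(10, dtype=float)
y = 4 * x * x + 3 * x + 2
y_md = np.tile(y.reshape(1, 10, 1, 1), [2, 1, 3, 4])  # shape (2, 10, 3, 4)
p = polyfit_along_axis(x, y_md, deg=2, axis=1)
print(p.shape)
```

In this sketch the fitted axis of length M is replaced by an axis of length deg + 1 holding the coefficients; whether ndpolyfit lays out its output the same way is an assumption.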

Keyword Arguments

rcond (float, optional) – Relative condition number of the fit. Refer to numpy.polyfit for further details.
full (bool, optional) – Switch determining nature of return value. Refer to numpy.polyfit for further details.
w (array_like, optional) – Weights applied to the y-coordinates of the sample points. Refer to numpy.polyfit for further details.
cov (bool, optional) – Determines whether to return the covariance matrix. Refer to numpy.polyfit for further details.
missing_value (number or nan, optional) – The value to be treated as missing. Default is numpy.nan.
meta (bool, optional) – If set to True and the input y is of type xarray.DataArray, the attributes associated with the input are transferred to the output.
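
The effect of missing_value can be sketched with plain NumPy for the one-dimensional case: samples equal to the sentinel, and NaN samples, are dropped before the fit. The helper polyfit_skip_missing is hypothetical and is only an illustration of the concept under that assumption, not the library's actual code path.

```python
import numpy as np

def polyfit_skip_missing(x, y, deg, missing_value=np.nan):
    """Hypothetical helper: drop missing samples, then call numpy.polyfit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # NaN never compares equal to itself, so it needs np.isnan.
    missing = np.isnan(y)
    if not np.isnan(missing_value):
        # A non-NaN sentinel can be found with an ordinary comparison.
        missing |= (y == missing_value)
    return np.polyfit(x[~missing], y[~missing], deg)

x = np.arange(10, dtype=float)
y = 4 * x * x + 3 * x + 2
y[7:] = 999      # sentinel marking missing data
y[2] = np.nan    # NaN is treated as missing as well
coeffs = polyfit_skip_missing(x, y, deg=2, missing_value=999)
print(coeffs)
```

One consequence of this approach is that a sentinel such as 999 must never coincide with a legitimate data value, otherwise valid samples are silently discarded.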

Returns

coefficients (xarray.DataArray or numpy.ndarray) – An array containing the coefficients of the fitted polynomial.
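
The examples on this page return coefficients ordered from the highest power to the constant term (4x² + 3x + 2 gives [4., 3., 2.]), which matches the numpy.polyfit convention. Assuming a one-dimensional result, the fitted polynomial can therefore be evaluated with numpy.polyval. In the sketch below the coefficient array is typed in by hand as a stand-in; for an xarray.DataArray result one would pass its .values instead.

```python
import numpy as np

# Stand-in for the values of a deg=2 fit of y = 4*x**2 + 3*x + 2.
coeffs = np.array([4.0, 3.0, 2.0])

x = np.arange(10, dtype=float)
# numpy.polyval expects the highest-degree coefficient first.
y_fit = np.polyval(coeffs, x)
print(y_fit[:3])
```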

Examples

Fitting a line to a one-dimensional array:

>>> import numpy as np
>>> from geocat.comp.polynomial import ndpolyfit
>>> x = np.arange(10, dtype=float)
>>> y = 2*x + 3
>>> p = ndpolyfit(x, y, deg=1)
>>> print(p)
<xarray.DataArray (dim_0: 2)>
array([2., 3.])
Dimensions without coordinates: dim_0
Attributes:
    deg:             1
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False

Fitting a second-degree polynomial to a one-dimensional array:

>>> y = 4*x*x + 3*x + 2
>>> p = ndpolyfit(x, y, deg=2)
>>> print(p)
<xarray.DataArray (dim_0: 3)>
array([4., 3., 2.])
Dimensions without coordinates: dim_0
Attributes:
    deg:             2
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False

Fitting a polynomial with missing values: By default, only NaN is treated as a missing value. This example uses a different value to mark missing data.

>>> # Let's introduce some missing values:
>>> y[7:] = 999
>>> p = ndpolyfit(x, y, deg=2)
>>> print(p)
<xarray.DataArray (dim_0: 3)>
array([ 21.15909091, -62.14090909,  20.4       ])
Dimensions without coordinates: dim_0
Attributes:
    deg:             2
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False
>>> # As you can see, we got different coefficients
>>> # Now let's define 999 as the missing value
>>> p = ndpolyfit(x, y, deg=2, missing_value=999)
>>> print(p)
<xarray.DataArray (dim_0: 3)>
array([4., 3., 2.])
Dimensions without coordinates: dim_0
Attributes:
    deg:             2
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False
>>> # Now we get the coefficients we were looking for

Fitting a polynomial with NaN as the missing value: NaN is always treated as a missing value, whether or not missing_value is set.

>>> import numpy as np
>>> from geocat.comp.polynomial import ndpolyfit
>>> x = np.arange(10, dtype=float)
>>> y = 4*x*x + 3*x + 2
>>> y[7:] = np.nan
>>> print(y)
[  2.   9.  24.  47.  78. 117. 164.  nan  nan  nan]
>>> p = ndpolyfit(x, y, deg=2)
>>> print(p)
<xarray.DataArray (dim_0: 3)>
array([4., 3., 2.])
Dimensions without coordinates: dim_0
Attributes:
    deg:             2
    provided_rcond:  None
    full:            False
    weights:         None
    covariance:      False
>>> # As you can see, the coefficients are calculated correctly even though NaN was not specified as the missing value

Fitting a second-degree polynomial to a multi-dimensional array:

>>> # Recompute y so that it no longer contains missing values
>>> y = 4*x*x + 3*x + 2
>>> y_md = np.tile(y.reshape(1, 10, 1, 1), [2, 1, 3, 4])
>>> y_md.shape
(2, 10, 3, 4)
>>> print(y)
[  2.   9.  24.  47.  78. 117. 164. 219. 282. 353.]
>>> print(y_md[1, :, 1, 1])
[  2.   9.  24.  47.  78. 117. 164. 219. 282. 353.]
>>> p = ndpolyfit(x, y_md, deg=2, axis=1)