Utility classes and functions

There are a number of functions and classes that may be useful for working with data outside the hoggorm package. They are provided here for convenience.

Functions in hoggorm.statTools module

The hoggorm.statTools module provides some functions that can be useful when working with multivariate data sets.

hoggorm.statTools.center(arr, axis=0)

This function centers an array column-wise or row-wise.

Parameters:
  • arr (numpy array) – A numpy array containing the data
  • axis (int) – Direction of centering: 0 centers column-wise (default), 1 centers row-wise.
Returns: Mean centered data.
Return type: numpy array

Examples

>>> import hoggorm as ho
>>> # Column centering of array
>>> centData = ho.center(data, axis=0)
>>> # Row centering of array
>>> centData = ho.center(data, axis=1)
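
To make the two centering directions concrete, the following is a minimal NumPy sketch of mean centering. The helper name center_sketch is hypothetical and this is not hoggorm's own code; it only assumes the axis convention described above (0 for columns, 1 for rows).

```python
import numpy as np


def center_sketch(arr, axis=0):
    """Hypothetical helper: subtract the mean along `axis` (0 = columns, 1 = rows)."""
    arr = np.asarray(arr, dtype=float)
    # keepdims=True keeps the reduced axis so the means broadcast back correctly
    return arr - arr.mean(axis=axis, keepdims=True)


data = np.array([[1.0, 2.0],
                 [3.0, 6.0]])
col_centered = center_sketch(data, axis=0)  # every column now has mean 0
row_centered = center_sketch(data, axis=1)  # every row now has mean 0
```

After column centering each column sums to zero; after row centering each row does.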

hoggorm.statTools.matrixRank(arr, tol=1e-08)

Computes the rank of an array/matrix, i.e. the number of linearly independent variables. This is not the same as numpy.rank() (replaced by numpy.ndim() in current NumPy releases), which only returns the number of ways (2-way, 3-way, etc.) an array/matrix has.

Parameters:
  • arr (numpy array) – A numpy array containing the data
  • tol (float) – Tolerance used when determining the rank (default 1e-08).
Returns: Rank of matrix.
Return type: scalar

Examples

>>> import hoggorm as ho
>>>
>>> # Get the rank of the data
>>> ho.matrixRank(myData)
8
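
The difference between the rank of a matrix and its number of ways can be illustrated with plain NumPy. This sketch uses numpy.linalg.matrix_rank and numpy.ndim rather than hoggorm itself; whether hoggorm's matrixRank performs the same computation internally is not stated here, so treat it as an illustration of the concept only.

```python
import numpy as np

# The second column is twice the first, so only one column is linearly independent
arr = np.array([[1.0, 2.0],
                [2.0, 4.0],
                [3.0, 6.0]])

n_ways = np.ndim(arr)                         # number of array dimensions (2-way here)
rank = np.linalg.matrix_rank(arr, tol=1e-08)  # count of singular values above tol
```

Here the array is 2-way, but its rank is 1 because the columns are linearly dependent.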

hoggorm.statTools.ortho(arr1, arr2)

This function orthogonalises arr1 with respect to arr2 and returns the orthogonalised array arr1_orth.

Parameters:
  • arr1 (numpy array) – A numpy array containing the data to be orthogonalised
  • arr2 (numpy array) – A numpy array containing the data with respect to which arr1 is orthogonalised
Returns:

A numpy array holding the orthogonalised version of arr1.

Return type:

numpy array

Examples

some examples
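
One common way to orthogonalise one array with respect to another is to remove from arr1 everything that can be explained by a least-squares fit on the columns of arr2. The sketch below does that with NumPy. The helper name ortho_sketch is hypothetical, and it is an assumption that this matches what hoggorm's ortho computes; it is meant to show the idea, not the library's implementation.

```python
import numpy as np


def ortho_sketch(arr1, arr2):
    """Hypothetical helper: remove from arr1 the part explained by the columns of arr2."""
    arr1 = np.asarray(arr1, dtype=float)
    arr2 = np.asarray(arr2, dtype=float)
    # Least-squares coefficients of arr1 regressed on arr2
    coeffs, *_ = np.linalg.lstsq(arr2, arr1, rcond=None)
    # Subtracting the fitted part leaves a residual that is orthogonal to arr2
    return arr1 - arr2 @ coeffs


arr1 = np.array([[1.0, 2.0],
                 [3.0, 1.0],
                 [0.0, 4.0],
                 [2.0, 2.0]])
arr2 = np.ones((4, 1))  # a single constant column
arr1_orth = ortho_sketch(arr1, arr2)
```

With a constant column as arr2, the result equals the column-centered arr1, and arr2.T @ arr1_orth is zero up to rounding.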

hoggorm.statTools.standardise(arr, mode=0)

This function standardises the input array either column-wise (mode = 0) or row-wise (mode = 1).

Parameters:
  • arr (numpy array) – A numpy array containing the data
  • mode (int) – An integer indicating whether standardisation should happen column-wise (0) or row-wise (1).
Returns:

Standardised data.

Return type:

numpy array

Examples

>>> import hoggorm as ho
>>> # Standardise array column-wise
>>> standData = ho.standardise(data, mode=0)
>>> # Standardise array row-wise
>>> standData = ho.standardise(data, mode=1)
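
As a rough illustration of column-wise versus row-wise standardisation, the NumPy sketch below centers the data and scales it to unit standard deviation along the chosen direction. The helper name standardise_sketch is hypothetical. Both the mean centering and the use of the sample standard deviation (ddof=1) are assumptions of this sketch and may differ from what hoggorm does.

```python
import numpy as np


def standardise_sketch(arr, mode=0):
    """Hypothetical helper: mode=0 standardises columns, mode=1 standardises rows."""
    arr = np.asarray(arr, dtype=float)
    axis = 0 if mode == 0 else 1
    mean = arr.mean(axis=axis, keepdims=True)
    # ddof=1 (sample standard deviation) is an assumption of this sketch
    std = arr.std(axis=axis, ddof=1, keepdims=True)
    return (arr - mean) / std


data = np.array([[1.0, 10.0, 4.0],
                 [2.0, 30.0, 9.0],
                 [4.0, 20.0, 2.0]])
col_std = standardise_sketch(data, mode=0)  # each column: mean 0, std 1
row_std = standardise_sketch(data, mode=1)  # each row: mean 0, std 1
```

A column or row with zero variance would lead to division by zero in this sketch, so such variables need to be handled separately.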

Cross validation classes in hoggorm.cross_val module

The hoggorm classes PCA, PLSR and PCR use a number of classes for computation of the models. These classes are found in the hoggorm.cross_val module.

The cross validation classes in this module are used inside the multivariate statistical methods and are selected through the cvType input parameter of these methods. They are not intended to be used outside the multivariate statistical methods, even though this is possible. They are shown here to illustrate how the different cross validation options work.

The code in this module is based on the cross_val.py module from scikit-learn 0.4. It is adapted to work with hoggorm.

Authors:

Alexandre Gramfort <alexandre.gramfort@inria.fr>

Gael Varoquaux <gael.varoquaux@normalesup.org>

License: BSD Style.

class hoggorm.cross_val.KFold(n, k)

K-Folds cross validation iterator: Provides train/test indexes to split data into train and test sets
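
The idea behind K-fold splitting can be sketched in a few lines of NumPy. The generator name k_fold_indices is hypothetical, and this is not the hoggorm class. It assumes contiguous folds and integer index arrays, whereas the actual iterator may represent the split differently (for example as boolean masks).

```python
import numpy as np


def k_fold_indices(n, k):
    """Hypothetical generator: yield (train, test) index arrays for k folds over n samples."""
    indices = np.arange(n)
    # array_split tolerates n not being divisible by k (fold sizes differ by at most one)
    for test in np.array_split(indices, k):
        train = np.setdiff1d(indices, test)
        yield train, test


folds = list(k_fold_indices(n=6, k=3))
first_train, first_test = folds[0]
```

Each of the n samples ends up in the test set of exactly one fold and in the training set of the remaining k - 1 folds.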

class hoggorm.cross_val.LeaveOneLabelOut(labels)

Leave-One-Label-Out cross validation iterator: Provides train/test indexes to split data into train and test sets
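
Leave-one-label-out holds out, in turn, all samples that share the same label. A minimal sketch, assuming one label per sample and integer index arrays as output (the generator name leave_one_label_out is hypothetical and is not the hoggorm class):

```python
import numpy as np


def leave_one_label_out(labels):
    """Hypothetical generator: hold out all samples of one label at a time."""
    labels = np.asarray(labels)
    for label in np.unique(labels):
        test = np.flatnonzero(labels == label)   # samples carrying this label
        train = np.flatnonzero(labels != label)  # all remaining samples
        yield train, test


labels = [1, 1, 2, 2, 3]
lolo_splits = list(leave_one_label_out(labels))
```

With three distinct labels there are three splits, and the test sets have as many samples as each label has members.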

class hoggorm.cross_val.LeaveOneOut(n)

Leave-One-Out cross validation iterator: Provides train/test indexes to split data into train and test sets

class hoggorm.cross_val.LeavePOut(n, p)

Leave-P-Out cross validation iterator: Provides train/test indexes to split data into train and test sets
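
Leave-P-out enumerates every possible test set of p samples out of n. The sketch below shows this with itertools.combinations. The generator name leave_p_out is hypothetical and this is not the hoggorm class; the output format (integer index arrays) is an assumption.

```python
from itertools import combinations

import numpy as np


def leave_p_out(n, p):
    """Hypothetical generator: yield (train, test) index arrays for every size-p test set."""
    indices = np.arange(n)
    for combo in combinations(range(n), p):
        test = np.array(combo)
        train = np.setdiff1d(indices, test)
        yield train, test


lpo_splits = list(leave_p_out(n=4, p=2))  # "4 choose 2" = 6 splits
loo_splits = list(leave_p_out(n=4, p=1))  # p=1 reduces to leave-one-out
```

The number of splits grows combinatorially with n and p, so this scheme is practical only for small data sets or small p.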

hoggorm.cross_val.split(train_indexes, test_indexes, *args)

For each arg, returns the train and test subsets defined by the indexes provided in train_indexes and test_indexes
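
A minimal sketch of this kind of helper, assuming that every arg can be indexed like a NumPy array and that the subsets are returned as a flat list in the order train, test, train, test, and so on. The name split_sketch is hypothetical and the return order is an assumption, not a documented property of hoggorm's split.

```python
import numpy as np


def split_sketch(train_indexes, test_indexes, *args):
    """Hypothetical helper: return [train, test] subsets for each array in args."""
    subsets = []
    for arg in args:
        arg = np.asarray(arg)
        subsets.append(arg[train_indexes])  # rows selected for training
        subsets.append(arg[test_indexes])   # rows selected for testing
    return subsets


X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([10, 20, 30, 40])
X_train, X_test, y_train, y_test = split_sketch([0, 1, 2], [3], X, y)
```

Applying the same indexes to all arrays keeps the rows of X and y aligned in both subsets.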