Utility classes and functions
There are a number of functions and classes that might be useful for working with data outside the hoggorm package. They are provided here for convenience.
Functions in hoggorm.statTools module
The hoggorm.statTools module provides some functions that can be useful when working with multivariate data sets.
hoggorm.statTools.center(arr, axis=0)
This function centers an array column-wise or row-wise.
Parameters: - arr (numpy array) – A numpy array containing the data
- axis (int) – 0 for column-wise centering (default), 1 for row-wise centering
Returns: Mean centered data.
Return type: numpy array
Examples
>>> import hoggorm as ho
>>> # Column centering of array
>>> centData = ho.center(data, axis=0)
>>> # Row centering of array
>>> centData = ho.center(data, axis=1)
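To make the two axis settings concrete, here is a minimal NumPy sketch of mean centering. It is an illustration only, not hoggorm's implementation; the helper name center_sketch and the sample array are made up for this example.

```python
import numpy as np

def center_sketch(arr, axis=0):
    """Illustrative mean centering; not hoggorm's own code."""
    arr = np.asarray(arr, dtype=float)
    # keepdims=True keeps the means broadcastable against arr
    return arr - arr.mean(axis=axis, keepdims=True)

data = np.array([[1.0, 10.0],
                 [3.0, 30.0],
                 [5.0, 50.0]])

col_centered = center_sketch(data, axis=0)  # every column now has mean 0
row_centered = center_sketch(data, axis=1)  # every row now has mean 0
```

With axis=0 the column means are subtracted, so each variable is centered across samples; with axis=1 each sample is centered across its variables.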
hoggorm.statTools.matrixRank(arr, tol=1e-08)
Computes the rank of an array/matrix, i.e. the number of linearly independent variables. This is not the same as numpy.rank(), which only returns the number of ways (dimensions: 2-way, 3-way, etc.) an array/matrix has.
Parameters: arr (numpy array) – A numpy array containing the data
Returns: Rank of matrix.
Return type: scalar
Examples
>>> import hoggorm as ho
>>>
>>> # Get the rank of the data
>>> ho.matrixRank(myData)
8
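A common way to compute a numerical rank is to count the singular values that exceed a tolerance. The sketch below assumes that this is the role of tol; the text above does not describe the parameter, so treat this as an illustration and not as hoggorm's implementation. The name rank_sketch and the sample matrix are hypothetical.

```python
import numpy as np

def rank_sketch(arr, tol=1e-8):
    """Count singular values larger than tol (assumed meaning of tol)."""
    arr = np.asarray(arr, dtype=float)
    singular_values = np.linalg.svd(arr, compute_uv=False)
    return int(np.sum(singular_values > tol))

# The fourth column is the sum of the first two, so only
# three of the four columns are linearly independent.
my_data = np.array([[1.0, 0.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0, 1.0],
                    [0.0, 0.0, 1.0, 0.0],
                    [1.0, 1.0, 1.0, 2.0]])

rank = rank_sketch(my_data)   # number of linearly independent columns
n_ways = np.ndim(my_data)     # number of array dimensions ("ways")
```

Here the rank and the number of ways differ, which is the distinction the description above draws.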
hoggorm.statTools.ortho(arr1, arr2)
This function orthogonalises arr1 with respect to arr2 and returns the orthogonalised array arr1_orth.
Parameters: - arr1 (numpy array) – The array to be orthogonalised
- arr2 (numpy array) – The array with respect to which arr1 is orthogonalised
Returns: A numpy array holding the orthogonalised version of arr1.
Return type: numpy array
Examples
some examples
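As a hedged illustration, one standard reading of "orthogonalise arr1 with respect to arr2" is to remove from each column of arr1 its least-squares projection onto the column space of arr2. Whether hoggorm computes it exactly this way is an assumption; ortho_sketch is a hypothetical name and the data are random.

```python
import numpy as np

def ortho_sketch(arr1, arr2):
    """Remove from arr1 the part explained by the columns of arr2."""
    arr1 = np.asarray(arr1, dtype=float)
    arr2 = np.asarray(arr2, dtype=float)
    # Least-squares coefficients that best reproduce arr1 from arr2
    coeffs, *_ = np.linalg.lstsq(arr2, arr1, rcond=None)
    # Subtract the projection of arr1 onto the column space of arr2
    return arr1 - arr2 @ coeffs

rng = np.random.default_rng(0)
arr1 = rng.normal(size=(6, 2))
arr2 = rng.normal(size=(6, 3))

arr1_orth = ortho_sketch(arr1, arr2)
# Inner products between columns of arr2 and arr1_orth (numerically zero)
cross = arr2.T @ arr1_orth
```

Both arrays must have the same number of rows. The result has the same shape as arr1.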
hoggorm.statTools.standardise(arr, mode=0)
This function standardises the input array either column-wise (mode = 0) or row-wise (mode = 1).
Parameters: - arr (numpy array) – A numpy array containing the data
- mode (int) – An integer indicating whether standardisation should happen column-wise (0) or row-wise (1).
Returns: Standardised data.
Return type: numpy array
Examples
>>> import hoggorm as ho
>>> # Standardise array column-wise
>>> standData = ho.standardise(data, mode=0)
>>> # Standardise array row-wise
>>> standData = ho.standardise(data, mode=1)
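Standardisation usually means subtracting the mean and dividing by the standard deviation along the chosen direction. The sketch below follows that definition; the use of the sample standard deviation (ddof=1) is an assumption, and standardise_sketch is a hypothetical helper, not hoggorm's code.

```python
import numpy as np

def standardise_sketch(arr, mode=0, ddof=1):
    """Illustrative standardisation: (x - mean) / std along one axis."""
    arr = np.asarray(arr, dtype=float)
    axis = 0 if mode == 0 else 1  # 0: column-wise, 1: row-wise
    mean = arr.mean(axis=axis, keepdims=True)
    # ddof=1 gives the sample standard deviation (an assumption here)
    std = arr.std(axis=axis, ddof=ddof, keepdims=True)
    return (arr - mean) / std

data = np.array([[1.0, 2.0, 9.0],
                 [4.0, 6.0, 2.0],
                 [7.0, 1.0, 5.0]])

col_std = standardise_sketch(data, mode=0)  # columns: mean 0, std 1
row_std = standardise_sketch(data, mode=1)  # rows: mean 0, std 1
```

A column (or row) with zero variance would lead to division by zero in this sketch, so constant variables need to be handled separately.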
Cross validation classes in hoggorm.cross_val module
The hoggorm classes PCA, PLSR and PCR use a number of classes, found in the hoggorm.cross_val module, for computation of the models.
The cross validation classes in this module are used inside the multivariate statistical methods and may be called upon using the cvType
input parameter for these methods. They are not intended to be used outside the multivariate statistical methods, even though it is possible.
They are shown here to illustrate how the different cross validation options work.
The code in this module is based on the cross_val.py module from scikit-learn 0.4. It is adapted to work with hoggorm.
Authors:
Alexandre Gramfort <alexandre.gramfort@inria.fr>
Gael Varoquaux <gael.varoquaux@normalesup.org>
License: BSD Style.
class hoggorm.cross_val.KFold(n, k)
K-Folds cross validation iterator: provides train/test indexes to split data into train and test sets.
__init__(n, k)
K-Folds cross validation iterator: provides train/test indexes to split data into train and test sets.
Parameters: - n (int) – Total number of elements
- k (int) – Number of folds
Examples
>>> import hoggorm as ho
>>> from hoggorm import cross_val
>>> X = [[1, 2], [3, 4], [1, 2], [3, 4]]
>>> y = [1, 2, 3, 4]
>>> kf = ho.KFold(4, k=2)
>>> for train_index, test_index in kf:
...     print("TRAIN:", train_index, "TEST:", test_index)
...     X_train, X_test, y_train, y_test = cross_val.split(train_index, test_index, X, y)
TRAIN: [False False  True  True] TEST: [ True  True False False]
TRAIN: [ True  True False False] TEST: [False False  True  True]
Notes
All folds have size trunc(n/k), except the last one, which holds the remaining elements.
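The fold sizes described in the note can be illustrated with a small generator that yields boolean train/test masks in the same style as the example output. This is a sketch under the stated fold-size rule, not the module's actual code; kfold_masks is a hypothetical name.

```python
import numpy as np

def kfold_masks(n, k):
    """Yield (train_mask, test_mask) pairs of boolean arrays."""
    fold_size = n // k  # trunc(n/k) for positive integers
    for i in range(k):
        test_mask = np.zeros(n, dtype=bool)
        start = i * fold_size
        # The last fold runs to the end and so absorbs the remainder
        stop = n if i == k - 1 else start + fold_size
        test_mask[start:stop] = True
        yield ~test_mask, test_mask

folds = list(kfold_masks(7, k=3))
test_sizes = [int(test.sum()) for _, test in folds]  # [2, 2, 3]
```

With n=7 and k=3, the first two folds hold trunc(7/3) = 2 elements each and the last fold holds the remaining 3.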
class hoggorm.cross_val.LeaveOneLabelOut(labels)
Leave-One-Label-Out cross validation iterator: provides train/test indexes to split data into train and test sets.
__init__(labels)
Leave-One-Label-Out cross validation iterator: provides train/test indexes to split data into train and test sets.
Parameters: labels (list) – List of labels
Examples
>>> import hoggorm as ho
>>> from hoggorm import cross_val
>>> X = [[1, 2], [3, 4], [5, 6], [7, 8]]
>>> y = [1, 2, 1, 2]
>>> labels = [1, 1, 2, 2]
>>> lolo = ho.LeaveOneLabelOut(labels)
>>> for train_index, test_index in lolo:
...     print("TRAIN:", train_index, "TEST:", test_index)
...     X_train, X_test, y_train, y_test = cross_val.split(train_index, test_index, X, y)
...     print(X_train, X_test, y_train, y_test)
TRAIN: [False False  True  True] TEST: [ True  True False False]
[[5 6]
 [7 8]] [[1 2]
 [3 4]] [1 2] [1 2]
TRAIN: [ True  True False False] TEST: [False False  True  True]
[[1 2]
 [3 4]] [[5 6]
 [7 8]] [1 2] [1 2]
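The idea behind leave-one-label-out can be sketched in a few lines: each distinct label in turn defines the test set, and all other samples form the training set. The generator below is an illustration under that assumption, with a hypothetical name, and is not the module's code.

```python
import numpy as np

def leave_one_label_out_masks(labels):
    """Yield (train_mask, test_mask), holding out one label at a time."""
    labels = np.asarray(labels)
    for label in np.unique(labels):
        test_mask = labels == label  # all samples carrying this label
        yield ~test_mask, test_mask

labels = [1, 1, 2, 2, 3]
splits = list(leave_one_label_out_masks(labels))
n_splits = len(splits)  # one split per distinct label
```

This is useful when samples sharing a label are not independent (for example replicates), since such samples never end up on both sides of a split.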
class hoggorm.cross_val.LeaveOneOut(n)
Leave-One-Out cross validation iterator: provides train/test indexes to split data into train and test sets.
__init__(n)
Leave-One-Out cross validation iterator: provides train/test indexes to split data into train and test sets.
Parameters: n (int) – Total number of elements
Examples
>>> import hoggorm as ho
>>> from hoggorm import cross_val
>>> X = [[1, 2], [3, 4]]
>>> y = [1, 2]
>>> loo = ho.LeaveOneOut(2)
>>> for train_index, test_index in loo:
...     print("TRAIN:", train_index, "TEST:", test_index)
...     X_train, X_test, y_train, y_test = cross_val.split(train_index, test_index, X, y)
...     print(X_train, X_test, y_train, y_test)
TRAIN: [False  True] TEST: [ True False]
[[3 4]] [[1 2]] [2] [1]
TRAIN: [ True False] TEST: [False  True]
[[1 2]] [[3 4]] [1] [2]
class hoggorm.cross_val.LeavePOut(n, p)
Leave-P-Out cross validation iterator: provides train/test indexes to split data into train and test sets.
__init__(n, p)
Leave-P-Out cross validation iterator: provides train/test indexes to split data into train and test sets.
Parameters: - n (int) – Total number of elements
- p (int) – Size of the test sets
Examples
>>> import hoggorm as ho
>>> from hoggorm import cross_val
>>> X = [[1, 2], [3, 4], [5, 6], [7, 8]]
>>> y = [1, 2, 3, 4]
>>> lpo = ho.LeavePOut(4, 2)
>>> for train_index, test_index in lpo:
...     print("TRAIN:", train_index, "TEST:", test_index)
...     X_train, X_test, y_train, y_test = cross_val.split(train_index, test_index, X, y)
TRAIN: [False False  True  True] TEST: [ True  True False False]
TRAIN: [False  True False  True] TEST: [ True False  True False]
TRAIN: [False  True  True False] TEST: [ True False False  True]
TRAIN: [ True False False  True] TEST: [False  True  True False]
TRAIN: [ True False  True False] TEST: [False  True False  True]
TRAIN: [ True  True False False] TEST: [False False  True  True]
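Leave-P-Out enumerates every way of choosing p test elements out of n, which is why the example above produces six splits for n=4 and p=2. A minimal sketch of that enumeration, assuming boolean masks as in the output above, is shown here; leave_p_out_masks is a hypothetical name and not the module's code.

```python
from itertools import combinations
from math import comb

import numpy as np

def leave_p_out_masks(n, p):
    """Yield (train_mask, test_mask) for every size-p test set."""
    for test_idx in combinations(range(n), p):
        test_mask = np.zeros(n, dtype=bool)
        test_mask[list(test_idx)] = True
        yield ~test_mask, test_mask

splits = list(leave_p_out_masks(4, 2))
n_splits = len(splits)  # equals comb(4, 2)

# Leave-One-Out is the special case p = 1: one split per element
loo_splits = list(leave_p_out_masks(3, 1))
```

The number of splits grows as the binomial coefficient C(n, p), so this scheme becomes expensive quickly for larger n and p.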
hoggorm.cross_val.split(train_indexes, test_indexes, *args)
For each arg, returns a train and a test subset defined by the indexes provided in train_indexes and test_indexes.
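The behaviour described above, together with the return order seen in the earlier examples (X_train, X_test, y_train, y_test), can be sketched as follows. This assumes boolean mask indexes and conversion of each argument to a NumPy array; split_sketch is a hypothetical stand-in, not the module's implementation.

```python
import numpy as np

def split_sketch(train_indexes, test_indexes, *args):
    """Return [a_train, a_test, b_train, b_test, ...] for each arg."""
    result = []
    for arg in args:
        arr = np.asanyarray(arg)  # lists become arrays so masks can index them
        result.append(arr[train_indexes])
        result.append(arr[test_indexes])
    return result

X = [[1, 2], [3, 4], [5, 6], [7, 8]]
y = [1, 2, 3, 4]
train_mask = np.array([False, False, True, True])
test_mask = ~train_mask

X_train, X_test, y_train, y_test = split_sketch(train_mask, test_mask, X, y)
```

Any number of arrays can be passed, as long as each has one entry per element of the masks.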