htmd.clustering.regular module

class htmd.clustering.regular.RegCluster(radius=None, n_clusters=None)

Bases: sklearn.base.BaseEstimator, sklearn.base.ClusterMixin, sklearn.base.TransformerMixin

Class that performs regular clustering of a given data set.

RegCluster accepts either a radius or an approximate number of clusters. If a number of clusters is passed, KCenter clustering is used to estimate the necessary radius. RegCluster randomly chooses a point and assigns all points within the radius of this point to the same cluster. It then proceeds to the nearest point that is not yet assigned to a cluster, puts all unassigned points within the radius of that point into the next cluster, and so on.
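The procedure described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not HTMD's implementation: the function name `regular_cluster_sketch`, the `seed` argument, the use of Euclidean distance, the inclusive `<=` radius test, and the reading of "nearest point" as nearest to the current center are all assumptions.

```python
import numpy as np

def regular_cluster_sketch(data, radius, seed=0):
    """Illustrative regular clustering; hypothetical helper, not HTMD code."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    labels = np.full(len(data), -1)          # -1 marks "not yet assigned"
    center_frames = []                       # indices of the chosen centers
    current = int(rng.integers(len(data)))   # random starting point
    while True:
        # Distance of every point to the current center (Euclidean: assumption).
        dist = np.linalg.norm(data - data[current], axis=1)
        members = (dist <= radius) & (labels == -1)
        labels[members] = len(center_frames)
        center_frames.append(current)
        unassigned = np.flatnonzero(labels == -1)
        if unassigned.size == 0:
            break
        # Continue with the unassigned point nearest to the current center.
        current = int(unassigned[np.argmin(dist[unassigned])])
    return labels, center_frames

# Two well-separated pairs of points end up in two clusters.
points = np.array([[0.0, 0.0], [0.5, 0.0], [10.0, 0.0], [10.2, 0.1]])
labels, center_frames = regular_cluster_sketch(points, radius=1.0)
```

Which pair receives label 0 depends on the random starting point, so only the grouping, not the numbering, is stable across seeds.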

Parameters
  • radius (float) – radius of clusters

  • n_clusters (int) – desired number of clusters

Examples

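>>> from htmd.clustering.regular import RegCluster
>>> # data: np.ndarray of data points to cluster (see fit)
>>> cluster = RegCluster(radius=5.1)
>>> cluster.fit(data)

cluster_centers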

list of the points that are the centers of the clusters

Type

list

centerFrames

list of indices of center points in data array

Type

list

labels_

list with the cluster number of each frame

Type

list

clusterSize_

list with number of frames in each cluster

Type

list
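To make the relationship between these attributes concrete, the snippet below uses made-up values rather than output from RegCluster. It assumes that cluster numbers start at 0 and that the center indices refer to rows of the data array; the variable names are placeholders.

```python
import numpy as np

# Hypothetical stand-ins for the fitted attributes (not real RegCluster output).
data = np.arange(12.0).reshape(6, 2)      # six frames, two features
labels = np.array([0, 0, 1, 2, 1, 0])     # cluster number of each frame
center_frames = [0, 2, 3]                 # index of each center in `data`

# Number of frames in each cluster: count how often each label occurs.
cluster_size = np.bincount(labels)

# Coordinates of the centers: look up the center indices in the data array.
centers = data[center_frames]
```

Here `cluster_size[k]` is the number of frames labeled `k`, and `centers[k]` is the point at the center of cluster `k`.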

property clusterSize
property cluster_centers_
fit(data)

performs clustering of data

Parameters
  • data (np.ndarray) – array of data points to cluster

  • merge (int) – minimum number of frames within each cluster. Smaller clusters are merged into the next big one

fit_predict(X, y=None)

Perform clustering on X and return cluster labels.

Parameters
  • X (ndarray, shape (n_samples, n_features)) – Input data.

  • y (Ignored) – Not used, present for API consistency by convention.

Returns

labels – Cluster labels.

Return type

ndarray, shape (n_samples,)
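fit_predict is inherited from scikit-learn's ClusterMixin, which calls fit and then returns the labels_ attribute. The sketch below shows that contract with a toy estimator; `ThresholdCluster` and its `threshold` parameter are hypothetical and unrelated to RegCluster's actual logic.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClusterMixin

class ThresholdCluster(ClusterMixin, BaseEstimator):
    """Hypothetical toy estimator: label 1 if the first feature exceeds a threshold."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold

    def fit(self, X, y=None):
        X = np.asarray(X)
        # ClusterMixin.fit_predict returns whatever fit stores in labels_.
        self.labels_ = (X[:, 0] > self.threshold).astype(int)
        return self

X = np.array([[-1.0, 2.0], [3.0, 0.5], [0.2, -4.0]])
labels = ThresholdCluster().fit_predict(X)
```

Any estimator that sets labels_ in fit gets fit_predict this way without defining it.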

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (numpy array of shape [n_samples, n_features]) – Training set.

  • y (numpy array of shape [n_samples]) – Target values.

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

numpy array of shape [n_samples, n_features_new]

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any
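As a minimal illustration of get_params, the sketch below defines a stand-in class with the same constructor arguments as RegCluster. `ToyCluster` is hypothetical; the behavior shown comes from scikit-learn's BaseEstimator, which reads the parameter names from the `__init__` signature.

```python
from sklearn.base import BaseEstimator

class ToyCluster(BaseEstimator):
    """Hypothetical stand-in with the same constructor arguments as RegCluster."""

    def __init__(self, radius=None, n_clusters=None):
        # BaseEstimator expects each argument to be stored under its own name.
        self.radius = radius
        self.n_clusters = n_clusters

params = ToyCluster(radius=5.1).get_params()
```

The returned mapping has one entry per constructor argument, including those left at their defaults.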

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

object
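The `<component>__<parameter>` form can be illustrated with a scikit-learn Pipeline. The step names `scale` and `cluster` are arbitrary choices for this sketch, and KMeans stands in for any clustering estimator.

```python
from sklearn.cluster import KMeans
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler()), ("cluster", KMeans(n_clusters=3))])

# "<step name>__<parameter>" reaches into the named step of the pipeline.
returned = pipe.set_params(cluster__n_clusters=5, scale__with_mean=False)
```

Because set_params returns the estimator itself, calls can be chained, for example `pipe.set_params(...).fit(X)`.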