MDS (`sa.mds`)

Multidimensional scaling.

class sa.mds.MDS(n_components=2, metric=True, n_init=4, max_iter=300, verbose=0, eps=0.001, n_jobs=1, random_state=None)

Bases: sklearn.base.BaseEstimator

Multidimensional scaling

metric : boolean, optional, default: True

If True, run the metric SMACOF (Scaling by Majorizing a Complicated Function) algorithm; if False, run the nonmetric variant.

n_components : int, optional, default: 2

Number of dimensions in which to embed the similarities. Overridden if an initial array is provided.

n_init : int, optional, default: 4

Number of times the SMACOF algorithm is run with different initialisations. The final result is the output of the run with the lowest stress among the n_init consecutive runs.

max_iter : int, optional, default: 300

Maximum number of iterations of the SMACOF algorithm for a single run

verbose : int, optional, default: 0

level of verbosity

eps : float, optional, default: 1e-3

Relative tolerance with respect to stress at which to declare convergence.

n_jobs : int, optional, default: 1

The number of jobs to use for the computation. This works by breaking down the pairwise matrix into n_jobs even slices and computing them in parallel.

If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) CPUs are used. Thus for n_jobs = -2, all CPUs but one are used.

random_state : integer or numpy.RandomState, optional

The generator used to draw the random initial configuration. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

embedding_ : array-like, shape [n_samples, n_components]: Stores the position of the dataset in the embedding space
stress_ : float: The final value of the stress (sum of squared differences between the disparities and the distances for all constrained points)
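The negative `n_jobs` convention described above can be illustrated with a small helper. This is only a sketch: `resolve_n_jobs` is a hypothetical name, not part of `sa.mds`, and both the clamp to at least one worker and the rejection of `n_jobs = 0` are assumptions rather than documented behaviour.

```python
def resolve_n_jobs(n_jobs, n_cpus):
    """Translate an n_jobs setting into a worker count (hypothetical helper)."""
    if n_jobs == 0:
        # Assumption: zero workers is treated as an error.
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 -> all CPUs, -2 -> all but one, and so on.
        # Assumption: never go below a single worker.
        return max(n_cpus + 1 + n_jobs, 1)
    return n_jobs


for setting in (1, 4, -1, -2):
    print(setting, "->", resolve_n_jobs(setting, n_cpus=8))
```

On a machine with 8 CPUs, `n_jobs = -1` resolves to 8 workers and `n_jobs = -2` to 7, matching the rule (n_cpus + 1 + n_jobs).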

“Modern Multidimensional Scaling - Theory and Applications” Borg, I.; Groenen P. Springer Series in Statistics (1997)

“Nonmetric multidimensional scaling: a numerical method” Kruskal, J. Psychometrika, 29 (1964)

“Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis” Kruskal, J. Psychometrika, 29, (1964)

fit(X, init=None, y=None)

Computes the position of the points in the embedding space

X: array, shape=[n_samples, n_samples], symmetric: Proximity matrix
init: {None or ndarray, shape (n_samples, n_components)}: If None, the initial configuration is chosen randomly; if an ndarray, the SMACOF algorithm is initialized with this array.

fit_transform(X, init=None, y=None)

Fits the data from X and returns the embedded coordinates

X: array, shape=[n_samples, n_samples], symmetric: Proximity matrix
init: {None or ndarray, shape (n_samples, n_components)}: If None, the initial configuration is chosen randomly; if an ndarray, the SMACOF algorithm is initialized with this array.

get_params(deep=True)

Get parameters for the estimator

deep: boolean, optional: If True, will return the parameters for this estimator and contained subobjects that are estimators.

set_params(**params)

Set the parameters of the estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

self
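The `<component>__<parameter>` naming can be illustrated with a toy class. This is a minimal sketch and not the `sklearn.base.BaseEstimator` implementation: `ToyEstimator` is a hypothetical stand-in, and unlike a real estimator it does no validation of parameter names.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator; stores parameters as attributes."""

    def __init__(self, **params):
        self.__dict__.update(params)

    def set_params(self, **params):
        for key, value in params.items():
            # "mds__eps" splits into name="mds" and sub_key="eps";
            # a plain key such as "verbose" has no separator.
            name, sep, sub_key = key.partition("__")
            if sep:
                # Delegate to the nested component.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self  # returning self allows call chaining


inner = ToyEstimator(eps=0.001, max_iter=300)
outer = ToyEstimator(mds=inner, verbose=0)
outer.set_params(verbose=1, mds__eps=1e-6)
print(outer.verbose, inner.eps, inner.max_iter)
```

Here `verbose` is set on the outer object directly, while `mds__eps` is routed to the nested component stored under the attribute `mds`; `max_iter` is left untouched.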

sa.mds.pool_adjacent_violators(distances, similarities, max_iter=300, verbose=0)

Pool adjacent violators

Computes an isotonic regression of distances on similarities.

distances: ndarray, shape (n, 1): array to fit
similarities: ndarray, shape (n, 1): array on which to fit
max_iter: int, optional, default: 300: Maximum number of iterations
verbose: int, optional, default: 0: Level of verbosity

distances: ndarray, shape (n, 1)
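The idea behind pool adjacent violators can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the `sa.mds` implementation: `pava_sketch` is a hypothetical name, it takes flat lists instead of (n, 1) arrays, it assumes the fit should be non-decreasing in the order of `similarities`, and it omits `max_iter` and `verbose`.

```python
def pava_sketch(distances, similarities):
    """Non-decreasing least-squares fit of distances, ordered by similarities."""
    # Visit the points in increasing order of similarities.
    order = sorted(range(len(distances)), key=lambda i: similarities[i])

    # Each block is [sum of values, number of values]; its fit is the mean.
    blocks = []
    for i in order:
        blocks.append([distances[i], 1])
        # Pool backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    # Write each block's mean back to the original positions.
    fitted = [0.0] * len(distances)
    position = 0
    for total, count in blocks:
        for i in order[position:position + count]:
            fitted[i] = total / count
        position += count
    return fitted


print(pava_sketch([1.0, 3.0, 2.0, 4.0], [1, 2, 3, 4]))  # [1.0, 2.5, 2.5, 4.0]
```

The second and third values violate the ordering (3.0 > 2.0), so they are pooled into their mean 2.5; the other values already respect the ordering and are left unchanged.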

“Modern Multidimensional Scaling - Theory and Applications” Borg, I.; Groenen P. Springer Series in Statistics (1997)

sa.mds.smacof(similarities, metric=True, n_components=2, init=None, n_init=8, n_jobs=1, max_iter=300, verbose=0, eps=0.001, random_state=None)

Computes multidimensional scaling using the SMACOF (Scaling by Majorizing a Complicated Function) algorithm

The SMACOF algorithm is a multidimensional scaling algorithm: it minimizes an objective function, the stress, using a majorization technique. Stress majorization, also known as the Guttman transform, guarantees monotone convergence of the stress and is more powerful than traditional techniques such as gradient descent.

The SMACOF algorithm for metric MDS can be summarized by the following steps:

1. Set an initial start configuration, randomly or not.
2. Compute the stress.
3. Compute the Guttman transform.
4. Iterate 2 and 3 until convergence.

The nonmetric algorithm adds a monotonic regression step before computing the stress.
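The metric steps listed above can be sketched with the standard library alone. This is a minimal illustration, not the `sa.mds.smacof` implementation: the names `smacof_sketch`, `guttman_transform`, `stress` and `pairwise_distances` are hypothetical, the input `delta` is assumed to hold dissimilarities (larger means further apart), all pairs get unit weight, only a single run is made (no `n_init`), and the stopping rule (relative decrease in stress below `eps`) is an assumption.

```python
import math
import random


def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q))) for q in X]
            for p in X]


def stress(D, delta):
    """Sum of squared differences between distances and dissimilarities (i < j)."""
    n = len(D)
    return sum((D[i][j] - delta[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def guttman_transform(X, D, delta):
    """One majorization update with unit weights: X <- (1/n) * B(X) X."""
    n, p = len(X), len(X[0])
    new_X = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or D[i][j] == 0.0:
                continue  # coincident points contribute nothing
            ratio = delta[i][j] / D[i][j]
            for k in range(p):
                new_X[i][k] += ratio * (X[i][k] - X[j][k]) / n
    return new_X


def smacof_sketch(delta, n_components=2, max_iter=300, eps=1e-3, seed=0):
    rng = random.Random(seed)
    # Step 1: random start configuration.
    X = [[rng.random() for _ in range(n_components)] for _ in delta]
    D = pairwise_distances(X)
    history = [stress(D, delta)]  # Step 2: stress of the start configuration.
    for _ in range(max_iter):
        X = guttman_transform(X, D, delta)  # Step 3
        D = pairwise_distances(X)
        history.append(stress(D, delta))  # Step 2 for the new configuration
        # Step 4 (assumed stopping rule): relative decrease below eps.
        if history[-2] - history[-1] <= eps * history[-2]:
            break
    return X, history


# Dissimilarities taken from the corners of a unit square.
square = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
delta = pairwise_distances(square)
X, history = smacof_sketch(delta)
print("iterations:", len(history) - 1)
print("stress: %.4f -> %.4f" % (history[0], history[-1]))
```

The sketch returns the whole stress history rather than only the final value, so the monotone decrease promised by the majorization argument can be checked directly. The recovered configuration matches the square only up to rotation, reflection and translation, because the stress depends on pairwise distances alone.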

similarities : symmetric ndarray, shape (n_samples, n_samples): similarities between the points
metric : boolean, optional, default: True: If True, run the metric SMACOF algorithm; if False, run the nonmetric variant.
n_components : int, optional, default: 2: Number of dimensions in which to embed the similarities. Overridden if an initial array is provided.
init : {None or ndarray of shape (n_samples, n_components)}: If None, the initial configuration is chosen randomly; if an ndarray, the SMACOF algorithm is initialized with this array.
n_init : int, optional, default: 8: Number of times the SMACOF algorithm is run with different initialisations. The final result is the output of the run with the lowest stress among the n_init consecutive runs.

n_jobs : int, optional, default: 1

The number of jobs to use for the computation. This works by breaking down the pairwise matrix into n_jobs even slices and computing them in parallel.

If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) CPUs are used. Thus for n_jobs = -2, all CPUs but one are used.

max_iter : int, optional, default: 300: Maximum number of iterations of the SMACOF algorithm for a single run
verbose : int, optional, default: 0: Level of verbosity
eps : float, optional, default: 1e-3: Relative tolerance with respect to stress at which to declare convergence
random_state : integer or numpy.RandomState, optional: The generator used to draw the random initial configuration. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.

X : ndarray (n_samples, n_components): Coordinates of the n_samples points in an n_components-dimensional space
stress : float: The final value of the stress (sum of squared differences between the disparities and the distances for all constrained points)

“Modern Multidimensional Scaling - Theory and Applications” Borg, I.; Groenen P. Springer Series in Statistics (1997)

“Nonmetric multidimensional scaling: a numerical method” Kruskal, J. Psychometrika, 29 (1964)

“Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis” Kruskal, J. Psychometrika, 29, (1964)
