API Reference

Resampling

Resample

class thermal.Resample(amplitude='gmm', *, n_components=3, **kwargs)

Generic resampling algorithm interface.

Parameters
  • amplitude (string, default='gmm') – The amplitude density resampling method.

  • n_components (int, default=3) – The number of mixture components used when amplitude == 'gmm'.

size_

The default number of samples to generate.

Type

int

Example

import numpy as np
import thermal as th

x = np.random.normal(size=10)
s = th.Resample().fit(x).sample()

s
>>> array([ 0.01212549,  0.04772549,  0.08693959, ..., -0.00519905,
   -0.00908192,  0.00756048])
fit(x, **kwargs)

Estimate model parameters of the Gaussian Mixtures Resampler.

Parameters

x (array-like of shape (n_samples)) – List of data points.

Returns

self – The fitted Gaussian Mixtures Resampler.

Return type

object

sample(n_samples=None, **kwargs)

Generate random samples.

Parameters

n_samples (int, default=None) – Number of samples to generate. When omitted the number of samples will be the same as the number of samples used to fit.

Returns

X – Randomly generated sample.

Return type

array, shape (n_samples)

ResampleGmm

class thermal.ResampleGmm(n_components=3, *, prune=True, bayesian=False, **kwargs)

Resample using Gaussian Mixtures.

Parameters

n_components (int, default=3) – The number of mixture components.

size_

The default number of samples to generate.

Type

int

gmm_

The Gaussian Mixture model.

Type

sklearn.mixture.GaussianMixture object.

Example

import numpy as np
import thermal as th

x = np.random.normal(size=10)
s = th.ResampleGmm(3).fit(x).sample()

s
>>> array([ 0.01212549,  0.04772549,  0.08693959, ..., -0.00519905,
   -0.00908192,  0.00756048])
fit(x, **kwargs)

Estimate model parameters of the Gaussian Mixtures Resampler.

Parameters

x (array-like of shape (n_samples)) – List of data points.

Returns

self – The fitted Gaussian Mixtures Resampler.

Return type

object

sample(n_samples=None, **kwargs)

Generate random samples.

Parameters

n_samples (int, default=None) – Number of samples to generate. When omitted the number of samples will be the same as the number of samples used to fit.

Returns

X – Randomly generated sample.

Return type

array, shape (n_samples)
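The sampling step can be illustrated without the library. Drawing from a one-dimensional Gaussian mixture amounts to choosing a component according to its weight and then drawing from that component's normal distribution. The sketch below assumes hand-picked, hypothetical mixture parameters rather than fitted ones; the function name sample_gaussian_mixture is illustrative and is not part of thermal.

```python
import numpy as np


def sample_gaussian_mixture(weights, means, stds, n_samples, rng):
    """Draw n_samples values from a 1-D Gaussian mixture."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # Pick a component index for every sample, proportional to its weight.
    component = rng.choice(len(weights), size=n_samples, p=weights / weights.sum())
    # Draw each sample from the normal distribution of its chosen component.
    return rng.normal(loc=means[component], scale=stds[component])


rng = np.random.default_rng(0)
# Hypothetical parameters for a 3-component mixture (illustration only).
samples = sample_gaussian_mixture(
    weights=[0.5, 0.3, 0.2],
    means=[-2.0, 0.0, 3.0],
    stds=[0.5, 1.0, 0.8],
    n_samples=1000,
    rng=rng,
)
```

In the library itself the mixture parameters would come from the fitted gmm_ attribute; how they are estimated is not shown here.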

ResampleHist

class thermal.ResampleHist(replace=True, *args, **kwargs)

Resample using historical sampling, either with or without replacement.

Parameters

replace (boolean, default=True) – Whether to sample with replacement (True) or without replacement (False).

fit(x, *args, **kwargs)

Provide a set of samples that will be used to resample.

Parameters

x (array-like of shape (n_samples)) – List of data points.

Returns

self – The fitted Historical Resampler.

Return type

object

sample(n_samples=None, *args, **kwargs)

Generate random samples.

Parameters
  • n_samples (int, default=None) – Number of samples to generate. When omitted the number of samples will be the same as the number of samples used to fit.

  • replace (boolean, default=True) – Whether to sample with replacement (True) or without replacement (False).

Returns

X – Randomly drawn sample.

Return type

array, shape (n_samples)
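The difference between the two modes can be sketched with plain NumPy. This is a minimal illustration, assuming historical sampling means drawing values directly from the fitted data; historical_sample is a hypothetical helper, not the library's implementation.

```python
import numpy as np


def historical_sample(x, n_samples=None, replace=True, rng=None):
    """Draw values directly from the observed data points."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    # Default to as many samples as there are data points.
    n = len(x) if n_samples is None else n_samples
    # Without replacement, n cannot exceed the number of data points.
    return rng.choice(x, size=n, replace=replace)


rng = np.random.default_rng(42)
x = rng.normal(size=10)
with_repl = historical_sample(x, replace=True, rng=rng)      # values may repeat
without_repl = historical_sample(x, replace=False, rng=rng)  # a permutation of x
```

With replace=False and the default sample count, the result is a shuffled copy of the input, so requesting more samples than were fitted raises an error in NumPy.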

ResampleKde

class thermal.ResampleKde(cv=None, *args, **kwargs)

Resample using a kernel density estimate.

kernel_width_

The estimated Kernel width.

Type

number

fit(x, **kwargs)

Estimate model parameters of the Kernel Density Resampler.

Parameters

x (array-like of shape (n_samples)) – List of data points.

Returns

self – The fitted Kernel Density Resampler.

Return type

object

sample(n_samples=None, **kwargs)

Generate random samples.

Parameters

n_samples (int, default=None) – Number of samples to generate. When omitted the number of samples will be the same as the number of samples used to fit.

Returns

X – Randomly generated sample.

Return type

array, shape (n_samples)
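As a rough illustration of the idea, sampling from a Gaussian kernel density estimate is equivalent to picking a data point at random and adding Gaussian noise scaled by the kernel width. The sketch below assumes a Gaussian kernel and picks the width with Silverman's rule of thumb; both are assumptions made for this example, and the library may estimate kernel_width_ differently (for instance via the cv argument). kde_sample is a hypothetical helper.

```python
import numpy as np


def kde_sample(x, n_samples=None, rng=None):
    """Sample from a Gaussian KDE of x; returns (samples, kernel width)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = len(x) if n_samples is None else n_samples
    # Silverman's rule of thumb (an assumption, not thermal's estimator).
    iqr = np.subtract(*np.percentile(x, [75, 25]))
    spread = min(x.std(ddof=1), iqr / 1.34)
    width = 0.9 * spread * len(x) ** (-1 / 5)
    # Pick kernel centres at random, then jitter them by the kernel width.
    centers = rng.choice(x, size=n, replace=True)
    return centers + rng.normal(scale=width, size=n), width


rng = np.random.default_rng(7)
x = rng.normal(size=200)
samples, width = kde_sample(x, n_samples=500, rng=rng)
```

Unlike historical sampling, the generated values are not restricted to the observed data points.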

Statistical tests

anderson_test

thermal.anderson_test(original, surrogates, *, num_tests=1000)

Run multiple two-sample Anderson-Darling tests between the original and surrogate samples.

Tests the null hypothesis that multiple surrogate sets and a set of original samples are drawn from the same population, without having to specify the distribution function of that population. Each test returns an approximate significance level at which the null hypothesis for the provided samples can be rejected. Each value is floored at 0.1% and capped at 25%.

Parameters
  • original – Original set of samples.

  • surrogates – Sets of samples, or a fitted resampler object.

  • num_tests (optional) – Number of tests to run.

Return type

Array of significance level values.
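The behaviour described above can be approximated with SciPy. This is a sketch under the assumption that each test is a k-sample Anderson-Darling test with k = 2, as provided by scipy.stats.anderson_ksamp; whether thermal uses that function internally is not confirmed by this page. The helper anderson_levels is hypothetical and accepts only pre-generated surrogate arrays, not a fitted resampler.

```python
import warnings

import numpy as np
from scipy import stats


def anderson_levels(original, surrogates):
    """Significance level of original vs. each surrogate set."""
    levels = []
    with warnings.catch_warnings():
        # SciPy warns when a level hits the 0.1% floor or the 25% cap.
        warnings.simplefilter("ignore")
        for surrogate in surrogates:
            result = stats.anderson_ksamp([original, surrogate])
            levels.append(result.significance_level)
    return np.array(levels)


rng = np.random.default_rng(1)
original = rng.normal(size=200)
surrogates = rng.normal(size=(20, 200))  # 20 surrogate sets, same population
levels = anderson_levels(original, surrogates)
```

Small values suggest that a surrogate set is distinguishable from the original data.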

ks_test

thermal.ks_test(original, surrogates, *, num_tests=1000)

Run multiple two-sample Kolmogorov–Smirnov tests.

These tests compare the distribution of the original data samples against multiple sets of generated surrogate data samples.

Parameters
  • original – Original set of samples.

  • surrogates – Sets of samples, or a fitted resampler object that will be used to generate surrogate samples.

  • num_tests (optional) – Number of tests to run.

Return type

Array of p-values.
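A comparable result can be sketched with scipy.stats.ks_2samp, running one two-sample test per surrogate set. The helper ks_pvalues is hypothetical and only handles pre-generated surrogate arrays; generating surrogates from a fitted resampler is left out of this example.

```python
import numpy as np
from scipy import stats


def ks_pvalues(original, surrogates):
    """p-value of a two-sample KS test of original vs. each surrogate set."""
    return np.array([stats.ks_2samp(original, s).pvalue for s in surrogates])


rng = np.random.default_rng(3)
original = rng.normal(size=200)
# Surrogates from the same population as the original data.
similar = ks_pvalues(original, rng.normal(size=(10, 200)))
# Surrogates from a clearly shifted population.
shifted = ks_pvalues(original, rng.normal(loc=3.0, size=(5, 200)))
```

Surrogates drawn from a clearly different population, such as the shifted sets here, yield p-values close to zero.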