API Reference
Resampling
Resample
- class thermal.Sample(amplitude='gmm', *, n_components=7, **kwargs)
Generic resampling algorithm interface.
- Parameters
amplitude (string, default='gmm') – The amplitude density resampling method.
n_components (int, default=7) – The number of mixture components used when amplitude == 'gmm'.
- size_
The default number of samples to generate.
- Type
int
Example
    import numpy as np
    import thermal as th
    x = np.random.normal(size=10)
    s = th.Resample().fit(x).sample()
    s
    >>> array([ 0.01212549, 0.04772549, 0.08693959, ..., -0.00519905, -0.00908192, 0.00756048])
- fit(x, **kwargs)
Estimate model parameters of the Gaussian Mixtures Resampler.
- Parameters
x (array-like of shape (n_samples)) – List of data points.
- Returns
self – The fitted Gaussian Mixtures Resampler.
- Return type
object
- sample(n_samples=None, **kwargs)
Generate random samples.
- Parameters
n_samples (int, default=None) – Number of samples to generate. When omitted the number of samples will be the same as the number of samples used to fit.
- Returns
X – Randomly generated sample.
- Return type
array, shape (n_samples)
ResampleGmm
- class thermal.SampleGmm(n_components=7, *, prune=True, bayesian=False, **kwargs)
Resample using Gaussian Mixtures.
- Parameters
n_components (int, default=7) – The number of mixture components.
- size_
The default number of samples to generate.
- Type
int
- gmm_
The Gaussian Mixture model.
- Type
sklearn.mixture.GaussianMixture object.
Example
    import numpy as np
    import thermal as th
    x = np.random.normal(size=10)
    s = th.ResampleGmm(3).fit(x).sample()
    s
    >>> array([ 0.01212549, 0.04772549, 0.08693959, ..., -0.00519905, -0.00908192, 0.00756048])
- fit(x, **kwargs)
Estimate model parameters of the Gaussian Mixtures Resampler.
- Parameters
x (array-like of shape (n_samples)) – List of data points.
- Returns
self – The fitted Gaussian Mixtures Resampler.
- Return type
object
- sample(n_samples=None, **kwargs)
Generate random samples.
- Parameters
n_samples (int, default=None) – Number of samples to generate. When omitted the number of samples will be the same as the number of samples used to fit.
- Returns
X – Randomly generated sample.
- Return type
array, shape (n_samples)
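The sampling step of a Gaussian mixture can be illustrated without the library. The following is a minimal sketch, not the thermal implementation: it assumes a one-dimensional mixture whose weights, means and standard deviations are already known (the values below are hand-picked for illustration, whereas the library obtains them by fitting the model stored in gmm_). The helper name sample_gmm is hypothetical.

```python
import random


def sample_gmm(weights, means, stds, n_samples, rng=None):
    """Draw n_samples values from a 1-D Gaussian mixture (illustrative helper)."""
    rng = rng or random.Random()
    # Step 1: pick a component index for every draw, proportional to its weight.
    components = rng.choices(range(len(weights)), weights=weights, k=n_samples)
    # Step 2: draw from the normal distribution of the chosen component.
    return [rng.gauss(means[k], stds[k]) for k in components]


# Hand-picked mixture parameters, for illustration only.
weights = [0.6, 0.4]
means = [-1.0, 2.0]
stds = [0.5, 0.3]

draws = sample_gmm(weights, means, stds, 1000, rng=random.Random(42))
mixture_mean = sum(w * m for w, m in zip(weights, means))
sample_mean = sum(draws) / len(draws)
```

With enough draws, sample_mean should land close to mixture_mean, which is a quick sanity check on any mixture sampler.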
ResampleHist
- class thermal.SampleHist(replace=True, *args, **kwargs)
Resample using historical sampling, either with or without replacement.
- Parameters
replace (boolean, default=True) – Whether to sample with replacement (True) or without replacement (False).
- fit(x, *args, **kwargs)
Provide the set of samples that will be used to resample.
- Parameters
x (array-like of shape (n_samples)) – List of data points.
- Returns
self – The fitted Historical Resampler.
- Return type
object
- sample(n_samples=None, *args, **kwargs)
Generate random samples.
- Parameters
n_samples (int, default=None) – Number of samples to generate. When omitted the number of samples will be the same as the number of samples used to fit.
replace (boolean, default=True) – Whether to sample with replacement (True) or without replacement (False).
- Returns
X – Randomly drawn sample.
- Return type
array, shape (n_samples)
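Historical resampling amounts to drawing from the stored observations. The sketch below mirrors the fit/sample pattern described above using only the standard library. The class name HistSketch and its rng argument are hypothetical and are not part of thermal.

```python
import random


class HistSketch:
    """Illustrative stand-in for a historical resampler (not the thermal class)."""

    def __init__(self, replace=True):
        self.replace = replace

    def fit(self, x):
        # Nothing is estimated: the observations themselves are the model.
        self.x_ = list(x)
        self.size_ = len(self.x_)
        return self

    def sample(self, n_samples=None, replace=None, rng=None):
        rng = rng or random.Random()
        # Default to the number of samples used to fit.
        n = self.size_ if n_samples is None else n_samples
        replace = self.replace if replace is None else replace
        if replace:
            # With replacement: any value may be drawn repeatedly.
            return rng.choices(self.x_, k=n)
        # Without replacement: raises ValueError if n exceeds the fitted size.
        return rng.sample(self.x_, k=n)


data = [0.3, -1.2, 0.8, 2.1, -0.4]
with_repl = HistSketch().fit(data).sample(8, rng=random.Random(0))
without_repl = HistSketch(replace=False).fit(data).sample(rng=random.Random(0))
```

Sampling without replacement at the default size returns a permutation of the fitted data, so the marginal distribution is preserved exactly while the ordering is destroyed.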
ResampleKde
- class thermal.SampleKde(cv=None, *args, **kwargs)
Resample using Kernel Density estimate.
- kernel_width_
The estimated Kernel width.
- Type
number
- fit(x, **kwargs)
Estimate model parameters of the Kernel Density Resampler.
- Parameters
x (array-like of shape (n_samples)) – List of data points.
- Returns
self – The fitted Kernel Density Resampler.
- Return type
object
- sample(n_samples=None, **kwargs)
Generate random samples.
- Parameters
n_samples (int, default=None) – Number of samples to generate. When omitted the number of samples will be the same as the number of samples used to fit.
- Returns
X – Randomly generated sample.
- Return type
array, shape (n_samples)
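Drawing from a kernel density estimate with a Gaussian kernel is equivalent to picking an observed point at random and adding normally distributed noise whose standard deviation equals the kernel width. The sketch below assumes a Gaussian kernel and chooses the width with the normal reference rule of thumb; both are assumptions made for illustration. The cv argument in the class signature suggests the library may select kernel_width_ differently, and the sketch does not attempt to reproduce that. The helper names are hypothetical.

```python
import random
import statistics


def rule_of_thumb_width(x):
    """Normal reference rule: h = 1.06 * sigma * n^(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(x) * len(x) ** (-0.2)


def sample_kde(x, n_samples, width, rng=None):
    """Draw from a Gaussian KDE built on the observations x."""
    rng = rng or random.Random()
    # Pick an observation uniformly, then jitter it with kernel-shaped noise.
    return [rng.gauss(rng.choice(x), width) for _ in range(n_samples)]


data_rng = random.Random(7)
observed = [data_rng.gauss(0.0, 1.0) for _ in range(50)]

width = rule_of_thumb_width(observed)
kde_draws = sample_kde(observed, len(observed), width, rng=random.Random(8))
```

Unlike historical resampling, the draws are generally new values that do not appear in the fitted data.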
Statistical tests
anderson_test
- thermal.anderson_test(original, surrogates, *, num_tests=1000)
Run multiple two-sample Anderson-Darling tests between the original and surrogate samples.
Tests the null hypothesis that each surrogate set and the set of original samples are drawn from the same population, without having to specify the distribution function of that population. For each test it returns an approximate significance level at which the null hypothesis for the provided samples can be rejected. The value is floored at 0.1% and capped at 25%.
- Parameters
original – The original set of samples.
surrogates – Sets of surrogate samples, or a fitted resampler object.
num_tests – (optional) The number of tests to run.
- Return type
Array of significance level values.
ks_test
- thermal.ks_test(original, surrogates, *, num_tests=1000)
Run multiple two-sample Kolmogorov–Smirnov tests.
This test measures the similarity between the distribution of the original data samples and that of multiple sets of generated surrogate data samples.
- Parameters
original – The original set of samples.
surrogates – Sets of surrogate samples, or a fitted resampler object that will be used to generate surrogate samples.
num_tests – (optional) The number of tests to run.
- Return type
Array of p-values.
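To make the repeated-test idea concrete, the following standard-library sketch computes a two-sample Kolmogorov–Smirnov statistic and an asymptotic p-value for each of several surrogate sets. It is not the thermal implementation: the function names are hypothetical, and the p-value uses the common asymptotic series approximation rather than whatever routine the library calls internally.

```python
import bisect
import math
import random


def ks_statistic(a, b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for v in a + b:
        fa = bisect.bisect_right(a, v) / len(a)  # ECDF of a at v
        fb = bisect.bisect_right(b, v) / len(b)  # ECDF of b at v
        d = max(d, abs(fa - fb))
    return d


def ks_pvalue(d, n1, n2, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    ne = n1 * n2 / (n1 + n2)  # effective sample size
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        return 1.0  # the series is numerically 1 here and converges slowly
    s = sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
            for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


rng = random.Random(1)
original = [rng.gauss(0, 1) for _ in range(200)]
# Stand-in surrogate sets; in practice these would come from a fitted resampler.
surrogates = [[rng.gauss(0, 1) for _ in range(200)] for _ in range(20)]

p_values = [ks_pvalue(ks_statistic(original, s), len(original), len(s))
            for s in surrogates]
```

When the surrogates follow the same distribution as the original data, the p-values should be spread over the unit interval rather than concentrated near zero.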