ramsey.experimental

Experimental modules such as Gaussian processes or Bayesian neural networks.

Note

Experimental code is not native Ramsey code, is subject to change, and might even be deleted in the future. Don't build critical code bases around the ramsey.experimental submodule.

Distributions

ramsey.experimental.Autoregressive

Bases: Distribution

An autoregressive model.

Attributes:

Name Type Description
loc Array

the location (mean) parameter of the process

ar_coefficients Array

a vector of autoregression coefficients

scale Array

the scale (standard deviation) of the process noise

length Optional[int]

the length of the sequence

ar_coefficients = ar_coefficients instance-attribute

arg_constraints = {'loc': constraints.real, 'ar_coefficients': constraints.real_vector, 'scale': constraints.positive} class-attribute instance-attribute

length = length instance-attribute

loc = loc instance-attribute

p = len(ar_coefficients) instance-attribute

reparametrized_params = ['loc', 'scale', 'ar_coefficients'] class-attribute instance-attribute

scale = scale instance-attribute

support = constraints.real_vector class-attribute instance-attribute

__init__(loc, ar_coefficients, scale, length=None)

Construct an autoregressive distribution.

log_prob(value: Array)

Compute the log probability of a value.

Parameters:

Name Type Description Default
value Array

one-dimensional array of floats

required

Returns:

Type Description
float

the log probability of the value under the autoregressive model

mean(length: Optional[int] = None, initial_state: Optional[float] = None)

Compute the mean of the autoregressive distribution.

Parameters:

Name Type Description Default
length Optional[int]

"length" of the autoregressive sequence. If None, takes length supplied to constructor during construction of object.

None
initial_state Optional[float]

initial state of the sequence. If None, uses the mean of the distribution

None

Returns:

Type Description
float

returns the mean

sample(rng_key: jr.PRNGKey, length: Optional[int] = None, initial_state: Optional[float] = None, sample_shape=())

Sample from the distribution.

Parameters:

Name Type Description Default
rng_key PRNGKey

a random key for seeding

required
length Optional[int]

length of sequence

None
initial_state Optional[float]

an initial value

None
sample_shape

a tuple describing the shape of the sample

()

Returns:

Type Description
Array

returns an array of values
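As a minimal sketch, assuming only the constructor and method signatures documented above, an AR(2) process could be constructed, sampled, and scored like this:

```python
import jax.numpy as jnp
from jax import random as jr

from ramsey.experimental import Autoregressive

# an AR(2) process: mean 0, coefficients (0.5, -0.25), noise scale 0.1
ar = Autoregressive(
    loc=0.0,
    ar_coefficients=jnp.array([0.5, -0.25]),
    scale=0.1,
    length=100,
)

y = ar.sample(jr.PRNGKey(0))  # draw a sequence of length 100
lp = ar.log_prob(y)           # log probability of the sampled sequence
mu = ar.mean()                # mean, using the length given to the constructor
```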

Models

ramsey.experimental.BNN

Bases: Module

A Bayesian neural network.

The BNN's layers can be a mix of Bayesian layers and conventional layers. The training objective is the ELBO, computed according to [1].

Attributes:

Name Type Description
layers Iterable[Module]

layers of the BNN

family Family

exponential family of the response

References

[1] Blundell C., Cornebise J., Kavukcuoglu K., Wierstra D. "Weight Uncertainty in Neural Networks". ICML, 2015.

family: Family = Gaussian() class-attribute instance-attribute

layers: Iterable[nn.Module] instance-attribute

__call__(x: Array, **kwargs)

Transform the inputs through the Bayesian neural network.

Parameters:

Name Type Description Default
x Array

Input data of dimension (*batch_dims, spatial_dims..., feature_dims)

required
**kwargs

Keyword arguments can include outputs: jax.Array. If 'outputs' is provided, the call computes the loss (negative ELBO) together with a predictive posterior distribution

{}

Returns:

Type Description
Union[distribution, Tuple[distribution, float]]

if 'outputs' is provided as a keyword argument, returns a tuple of the predictive distribution and the negative ELBO, which can be used as a loss for optimization. If 'outputs' is not provided, returns the predictive distribution only.
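A hedged sketch of usage, assuming the standard Flax init/apply pattern; since the Bayesian layers sample their weights, your Ramsey version may additionally require an rng collection to be passed to apply:

```python
import jax.numpy as jnp
from jax import random as jr

from ramsey.experimental import BNN, BayesianLinear

# a small network of two Bayesian layers; conventional flax.linen
# layers could be mixed in, as described above
bnn = BNN(layers=[BayesianLinear(output_size=16), BayesianLinear(output_size=1)])

x = jnp.linspace(0.0, 1.0, 50).reshape(50, 1)
y = jnp.sin(2.0 * jnp.pi * x)

params = bnn.init(jr.PRNGKey(0), x)
# with 'outputs' the call returns the predictive distribution and the
# negative ELBO, which can be minimized during training
posterior, loss = bnn.apply(params, x, outputs=y)
```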

ramsey.experimental.RANP

Bases: ANP

A recurrent attentive neural process.

Implements the core structure of a recurrent attentive neural process with a cross-attention module.

Attributes:

Name Type Description
decoder Sequential

the decoder can be any network, but is typically an MLP. Note that the last layer of the decoder needs twice as many nodes as the response you are modeling

latent_encoder Optional[Tuple[Module, Module]]

a tuple of two nn.Modules. The latent encoder can be any network, but is typically an MLP. The first element of the tuple is a neural network used before the aggregation step, while the second element is a neural network used to compute the mean(s) and standard deviation(s) of the latent Gaussian.

deterministic_encoder Optional[Tuple[Module, Attention]]

a tuple of a nn.Module and an Attention object. The deterministic encoder can be any network, but is typically an MLP

family Family

distributional family of the response variable

decoder: nn.Module instance-attribute

deterministic_encoder: Optional[Tuple[nn.Module, Attention]] = None class-attribute instance-attribute

family: Family = Gaussian() class-attribute instance-attribute

latent_encoder: Optional[Tuple[nn.Module, nn.Module]] = None class-attribute instance-attribute

__call__(x_context: Array, y_context: Array, x_target: Array, **kwargs)

Transform the inputs through the neural process.

Parameters:

Name Type Description Default
x_context Array

Input data of dimension (*batch_dims, spatial_dims..., feature_dims)

required
y_context Array

Input data of dimension (*batch_dims, spatial_dims..., response_dims)

required
x_target Array

Input data of dimension (*batch_dims, spatial_dims..., feature_dims)

required
**kwargs

Keyword arguments can include y_target: jax.Array. If 'y_target' is provided, the call computes the loss (negative ELBO) together with a predictive posterior distribution

{}

Returns:

Type Description
Union[distribution, Tuple[distribution, float]]

If 'y_target' is provided as keyword argument, returns a tuple of the predictive distribution and the negative ELBO which can be used as loss for optimization. If 'y_target' is not provided, returns the predictive distribution only.

setup()

Construct all networks.
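The following sketch shows how the pieces fit together. Only the constructor attributes and __call__ signature documented above are firm ground; the MLP and MultiHeadAttention names (and their arguments) are assumptions standing in for whatever feed-forward and Attention modules your Ramsey version provides.

```python
import jax.numpy as jnp
from jax import random as jr

from ramsey.experimental import RANP
# assumption: substitute the MLP and Attention implementations your version ships
from ramsey.nn import MLP, MultiHeadAttention

ranp = RANP(
    # last decoder layer has twice as many nodes as the response dimension
    decoder=MLP([32, 2]),
    # (pre-aggregation network, network producing latent mean/scale)
    latent_encoder=(MLP([32, 32]), MLP([32, 2])),
    # (encoder network, cross-attention module)
    deterministic_encoder=(MLP([32, 32]), MultiHeadAttention(num_heads=4, head_size=8)),
)

x = jnp.linspace(-1.0, 1.0, 20).reshape(1, 20, 1)  # (*batch, spatial, features)
y = jnp.sin(x)

params = ranp.init(jr.PRNGKey(0), x_context=x, y_context=y, x_target=x)
# with 'y_target' the call returns the predictive distribution and the negative ELBO
posterior, loss = ranp.apply(
    params, x_context=x, y_context=y, x_target=x, y_target=y
)
```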

ramsey.experimental.GP

Bases: Module

A Gaussian process.

Attributes:

Name Type Description
kernel Kernel

a covariance function

sigma_init Optional[Initializer]

an initializer object from Flax

kernel: Kernel instance-attribute

sigma_init: Optional[initializers.Initializer] = None class-attribute instance-attribute

__call__(x: Array, **kwargs)

Evaluate the Gaussian process.

Parameters:

Name Type Description Default
x Array

the training inputs

required
**kwargs

Keyword arguments can include outputs: jax.Array and inputs_star: jax.Array


{}

Returns:

Type Description
distribution

returns a multivariate normal distribution object

References

[1] Rasmussen, Carl E. and Williams, Christopher K. I. "Gaussian Processes for Machine Learning". MIT Press, 2006.
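A minimal sketch of evaluating the GP, assuming the standard Flax init/apply pattern and the keyword arguments documented above (outputs as the observed responses, inputs_star as the test inputs):

```python
import jax.numpy as jnp
from jax import random as jr

from ramsey.experimental import GP, ExponentiatedQuadratic

gp = GP(kernel=ExponentiatedQuadratic())

x = jnp.linspace(0.0, 10.0, 25).reshape(25, 1)
y = jnp.sin(x)
x_star = jnp.linspace(0.0, 10.0, 100).reshape(100, 1)

params = gp.init(jr.PRNGKey(0), x)
# condition on (x, y) and evaluate the posterior predictive at x_star;
# the result is a multivariate normal distribution object
posterior = gp.apply(params, x, outputs=y, inputs_star=x_star)
mean, cov = posterior.mean, posterior.covariance_matrix
```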

ramsey.experimental.SparseGP

Bases: Module

A sparse Gaussian process.

Attributes:

Name Type Description
kernel Kernel

a covariance function

n_inducing int

number of inducing points

jitter float

jitter to add to the covariance matrix diagonal

log_sigma_init Optional[Initializer]

an initializer object from Flax

inducing_init Optional[Initializer]

an initializer object from Flax

References

[1] Titsias, Michalis K. "Variational Learning of Inducing Variables in Sparse Gaussian Processes". AISTATS, 2009.

inducing_init: Optional[initializers.Initializer] = initializers.uniform(1) class-attribute instance-attribute

jitter: Optional[float] = 1e-07 class-attribute instance-attribute

kernel: Kernel instance-attribute

log_sigma_init: Optional[initializers.Initializer] = initializers.constant(jnp.log(1.0)) class-attribute instance-attribute

n_inducing: int instance-attribute

__call__(x: Array, **kwargs)

Call the sparse GP.

Modules

ramsey.experimental.BayesianLinear

Bases: Module

Linear Bayesian layer.

A linear Bayesian layer using distributions over weights and bias. The KL divergences between the variational posteriors and the priors for weights and bias are computed; these KL divergence terms can be used to obtain the ELBO as an objective for training a Bayesian neural network.

Attributes:

Name Type Description
output_size int

number of layer outputs

with_bias bool

control usage of bias term

w_prior Optional[Distribution]

prior distribution for weights

b_prior Optional[Distribution]

prior distribution for bias

name Optional[str]

name of the layer

kwargs keyword arguments

you can supply initializers for the parameters of the priors via keyword arguments. For instance, if your prior on the weights is a dist.Normal(loc, scale), then you can supply Flax initializers.Initializer objects named w_loc_init and w_scale_init as keyword arguments. Likewise, you can supply initializers called b_loc_init and b_scale_init for the prior on the bias. If your prior on the weights is a dist.Uniform(low, high), you will need to supply initializers called w_low_init and w_high_init

References

[1] Blundell C., Cornebise J., Kavukcuoglu K., Wierstra D. "Weight Uncertainty in Neural Networks". ICML, 2015.

b_prior: Optional[dist.Distribution] = dist.Normal(loc=0.0, scale=1.0) class-attribute instance-attribute

mc_sample_size: int = 10 class-attribute instance-attribute

name: Optional[str] = None class-attribute instance-attribute

output_size: int instance-attribute

use_bias: bool = True class-attribute instance-attribute

w_prior: Optional[dist.Distribution] = dist.Normal(loc=0.0, scale=1.0) class-attribute instance-attribute

__call__(x: Array, is_training: bool = False)

Call the Bayesian linear layer.

Parameters:

Name Type Description Default
x Array

layer inputs

required
is_training bool

training mode where KL divergence terms are calculated and returned

False

setup()

Construct a linear Bayesian layer.
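A hedged sketch of using the layer on its own, assuming the documented signature; whether an extra rng collection must be threaded through init/apply depends on how the layer samples its weights internally:

```python
import jax.numpy as jnp
import numpyro.distributions as dist
from jax import random as jr

from ramsey.experimental import BayesianLinear

layer = BayesianLinear(
    output_size=8,
    use_bias=True,
    w_prior=dist.Normal(loc=0.0, scale=1.0),
    b_prior=dist.Normal(loc=0.0, scale=1.0),
)

x = jnp.ones((4, 3))
params = layer.init(jr.PRNGKey(0), x)
# in training mode the layer also returns its KL divergence term,
# which contributes to the ELBO of the enclosing BNN
outputs, kl = layer.apply(params, x, is_training=True)
```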

Covariance functions

ramsey.experimental.ExponentiatedQuadratic

Bases: Kernel, Module

Exponentiated quadratic covariance function.

Attributes:

Name Type Description
active_dims Optional[list]

either None or a list of integers. Specifies the dimensions of the data on which the kernel operates

rho_init Optional[Initializer]

an initializer object from Flax or None

sigma_init Optional[Initializer]

an initializer object from Flax or None

name Optional[str]

name of the layer

active_dims: Optional[list] = None class-attribute instance-attribute

rho_init: Optional[initializers.Initializer] = None class-attribute instance-attribute

sigma_init: Optional[initializers.Initializer] = None class-attribute instance-attribute

__add__(other)

Add two kernels.

__call__(x1: Array, x2: Array = None)

Call the covariance function.

__mul__(other)

Multiply two kernels.

setup()

Construct a stationary covariance.

ramsey.experimental.exponentiated_quadratic(x1: Array, x2: Array, sigma: float, rho: Union[float, jnp.ndarray])

Exponentiated-quadratic covariance function.

Parameters:

Name Type Description Default
x1 Array

(n x p)-dimensional set of data points

required
x2 Array

(m x p)-dimensional set of data points

required
sigma float

the standard deviation of the kernel function

required
rho Union[float, ndarray]

the lengthscale of the kernel function. Can be a float or a p-dimensional vector if ARD behaviour is desired

required

Returns:

Type Description
Array

returns a (n x m)-dimensional kernel matrix
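For example, using the documented signature:

```python
import jax.numpy as jnp
from jax import random as jr

from ramsey.experimental import exponentiated_quadratic

x1 = jr.normal(jr.PRNGKey(0), (10, 2))  # n = 10 points in p = 2 dimensions
x2 = jr.normal(jr.PRNGKey(1), (5, 2))   # m = 5 points

K = exponentiated_quadratic(x1, x2, sigma=1.0, rho=0.5)  # (10, 5) kernel matrix
# ARD behaviour: one lengthscale per input dimension
K_ard = exponentiated_quadratic(x1, x2, sigma=1.0, rho=jnp.array([0.5, 2.0]))
```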

ramsey.experimental.Linear

Bases: Kernel, Module

Linear covariance function.

Parameters:

Name Type Description Default
active_dims

the indexes of the dimensions the kernel acts upon

required
sigma_b_init

an initializer object from Flax or None

required
sigma_v_init

an initializer object from Flax or None

required
offset_init

an initializer object from Flax or None

required

active_dims: Optional[list] = None class-attribute instance-attribute

offset_init: Optional[initializers.Initializer] = initializers.uniform() class-attribute instance-attribute

sigma_b_init: Optional[initializers.Initializer] = initializers.uniform() class-attribute instance-attribute

sigma_v_init: Optional[initializers.Initializer] = initializers.uniform() class-attribute instance-attribute

__add__(other)

Add two kernels.

__call__(x1: Array, x2: Array = None)

Call the covariance function.

__mul__(other)

Multiply two kernels.

setup()

Construct parameters.

ramsey.experimental.linear(x1: Array, x2: Array, sigma_b, sigma_v, offset)

Linear covariance function.

Parameters:

Name Type Description Default
x1 Array

(n x p)-dimensional set of data points

required
x2 Array

(m x p)-dimensional set of data points

required
sigma_b

the standard deviation of the kernel function

required
sigma_v

the standard deviation of the kernel function

required
offset

the offset of the kernel function

required

Returns:

Type Description
Array

returns a (n x m)-dimensional Gram matrix
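For example:

```python
from jax import random as jr

from ramsey.experimental import linear

x1 = jr.normal(jr.PRNGKey(0), (10, 2))  # n = 10 points in p = 2 dimensions
x2 = jr.normal(jr.PRNGKey(1), (5, 2))   # m = 5 points

K = linear(x1, x2, sigma_b=1.0, sigma_v=1.0, offset=0.0)  # (10, 5) Gram matrix
```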

ramsey.experimental.Periodic

Bases: Kernel, Module

Periodic covariance function.

Attributes:

Name Type Description
period float

the period of the periodic kernel

active_dims Optional[list]

either None or a list of integers. Specifies the dimensions of the data on which the kernel operates

rho_init Optional[Initializer]

an initializer object from Flax or None

sigma_init Optional[Initializer]

an initializer object from Flax or None

active_dims: Optional[list] = None class-attribute instance-attribute

period: float instance-attribute

rho_init: Optional[initializers.Initializer] = initializers.uniform() class-attribute instance-attribute

sigma_init: Optional[initializers.Initializer] = initializers.uniform() class-attribute instance-attribute

__add__(other)

Add two kernels.

__call__(x1: Array, x2: Array = None)

Call the covariance function.

__mul__(other)

Multiply two kernels.

setup()

Construct the covariance function.

ramsey.experimental.periodic(x1: Array, x2: Array, period, sigma, rho)

Periodic covariance function.

Parameters:

Name Type Description Default
x1 Array

(n x p)-dimensional set of data points

required
x2 Array

(m x p)-dimensional set of data points

required
period

the period

required
sigma

the standard deviation of the kernel function

required
rho

the lengthscale of the kernel function. Can be a float or a p-dimensional vector if ARD behaviour is desired

required

Returns:

Type Description
Array

returns a (n x m)-dimensional Gram matrix
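Since all kernel modules implement __add__ and __mul__ (see above), they can be composed. A minimal sketch, assuming kernel arithmetic works as those methods suggest:

```python
from ramsey.experimental import ExponentiatedQuadratic, GP, Periodic

# a sum kernel for a signal with both a periodic component
# and a smooth trend component
kernel = Periodic(period=1.0) + ExponentiatedQuadratic()
gp = GP(kernel=kernel)
```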

Train functions

ramsey.experimental.train_gaussian_process(rng_key: jr.PRNGKey, gaussian_process: GP, x: Array, y: Array, optimizer=optax.adam(0.003), n_iter=1000, verbose=False)

Train a Gaussian process.

Parameters:

Name Type Description Default
rng_key PRNGKey

a key for seeding random number generators

required
gaussian_process GP

a GP object

required
x Array

an input array of dimension (n x p)

required
y Array

an array of observed outputs

required
optimizer

an optax optimizer

adam(0.003)
n_iter

number of training iterations

1000
verbose

print training details

False

Returns:

Type Description
Tuple[dict, Array]

a tuple of training parameters and training losses
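Putting the pieces together, using the signature above:

```python
import jax.numpy as jnp
import optax
from jax import random as jr

from ramsey.experimental import ExponentiatedQuadratic, GP, train_gaussian_process

key = jr.PRNGKey(0)
x = jnp.linspace(0.0, 10.0, 50).reshape(50, 1)
y = jnp.sin(x) + 0.1 * jr.normal(key, (50, 1))  # noisy observations

gp = GP(kernel=ExponentiatedQuadratic())
# returns the trained parameters and the per-iteration training losses
params, losses = train_gaussian_process(
    jr.PRNGKey(1), gp, x, y, optimizer=optax.adam(0.01), n_iter=2000
)
```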

ramsey.experimental.train_sparse_gaussian_process(rng_key: jr.PRNGKey, gaussian_process: SparseGP, x: Array, y: Array, optimizer=optax.adam(0.003), n_iter=1000, verbose=False)

Train a sparse Gaussian process.

Parameters:

Name Type Description Default
rng_key PRNGKey

a key for seeding random number generators

required
gaussian_process SparseGP

a SparseGP object

required
x Array

an input array of dimension (n x p)

required
y Array

an array of observed outputs

required
optimizer

an optax optimizer

adam(0.003)
n_iter

number of training iterations

1000
verbose

print training details

False

Returns:

Type Description
Tuple[dict, Array]

a tuple of training parameters and training losses
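Analogously to train_gaussian_process, a sketch using the documented signature and defaults:

```python
import jax.numpy as jnp
from jax import random as jr

from ramsey.experimental import (
    ExponentiatedQuadratic,
    SparseGP,
    train_sparse_gaussian_process,
)

x = jnp.linspace(0.0, 10.0, 200).reshape(200, 1)
y = jnp.sin(x) + 0.1 * jr.normal(jr.PRNGKey(0), (200, 1))

# n_inducing controls the size of the inducing-point approximation
sparse_gp = SparseGP(kernel=ExponentiatedQuadratic(), n_inducing=10)
params, losses = train_sparse_gaussian_process(jr.PRNGKey(1), sparse_gp, x, y)
```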