pycmtensor.optimizers

PyCMTensor optimizers module

Module Contents

class pycmtensor.optimizers.Adam(params: list, b1: float = 0.9, b2: float = 0.999, **kwargs)

Bases: Optimizer

An optimizer that implements the Adam algorithm [1]

Parameters:
  • params (list) – a list of TensorSharedVariable

  • b1 (float, optional) – exponential decay rate for the 1st moment estimates. Defaults to 0.9

  • b2 (float, optional) – exponential decay rate for the 2nd moment estimates. Defaults to 0.999

property t
property m_prev
property v_prev
update(cost, params: list, lr: float = 0.001)

Generate a list of updates

Parameters:
  • cost (TensorVariable) – a scalar expression of the cost function, whose derivatives with respect to params are calculated

  • params (list) – a list of TensorSharedVariable

  • lr (float, optional) – learning rate. Defaults to 0.001

Returns:

a list of tuples of (p, p_t), (m, m_t), (v, v_t), (t, t_new)

Return type:

list
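
As a rough illustration of what the (p, p_t), (m, m_t), (v, v_t), (t, t_new) pairs represent, the sketch below applies the textbook Adam rule to a single scalar in plain Python. It is not PyCMTensor's implementation: the function name adam_step, the eps constant, and the use of floats in place of TensorSharedVariable are assumptions made for illustration.

```python
import math


def adam_step(p, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One textbook Adam step for a scalar parameter p with gradient g.

    Hypothetical helper for illustration only; eps is an assumed constant.
    """
    t_new = t + 1
    m_t = b1 * m + (1 - b1) * g        # 1st moment estimate
    v_t = b2 * v + (1 - b2) * g * g    # 2nd moment estimate
    m_hat = m_t / (1 - b1 ** t_new)    # bias-corrected 1st moment
    v_hat = v_t / (1 - b2 ** t_new)    # bias-corrected 2nd moment
    p_t = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    return p_t, m_t, v_t, t_new


# Minimize f(p) = (p - 3)^2, whose gradient is 2 * (p - 3).
p, m, v, t = 0.0, 0.0, 0.0, 0
for _ in range(5000):
    p, m, v, t = adam_step(p, 2 * (p - 3), m, v, t, lr=0.01)
```

Each returned value corresponds to the "new" half of one update pair; the state (m, v, t) has to be carried from one call to the next.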

class pycmtensor.optimizers.Nadam(params: list, b1: float = 0.99, b2: float = 0.999, **kwargs)

Bases: Adam

An optimizer that implements the Nesterov Adam algorithm [2]

Parameters:
  • params (list) – a list of TensorSharedVariable

  • b1 (float, optional) – exponential decay rate for the 1st moment estimates. Defaults to 0.99

  • b2 (float, optional) – exponential decay rate for the 2nd moment estimates. Defaults to 0.999

update(cost, params: list, lr: float = 0.001)

Generate a list of updates

Parameters:
  • cost (TensorVariable) – a scalar expression of the cost function, whose derivatives with respect to params are calculated

  • params (list) – a list of TensorSharedVariable

  • lr (float, optional) – learning rate. Defaults to 0.001

Returns:

a list of tuples of (p, p_t), (m, m_t), (v, v_t), (t, t_new)

Return type:

list
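
For comparison with Adam, here is a minimal sketch of a commonly quoted simplified Nadam rule, in which the bias-corrected momentum is blended with the current gradient as a Nesterov-style look-ahead. Whether PyCMTensor uses this exact form (or a momentum schedule) is not stated here, so the formula, the name nadam_step, and eps are assumptions.

```python
import math


def nadam_step(p, g, m, v, t, lr=0.001, b1=0.99, b2=0.999, eps=1e-8):
    """One simplified Nadam step for a scalar parameter (illustrative only)."""
    t_new = t + 1
    m_t = b1 * m + (1 - b1) * g
    v_t = b2 * v + (1 - b2) * g * g
    m_hat = m_t / (1 - b1 ** t_new)
    v_hat = v_t / (1 - b2 ** t_new)
    # Nesterov look-ahead: corrected momentum plus the current gradient term
    lookahead = b1 * m_hat + (1 - b1) * g / (1 - b1 ** t_new)
    p_t = p - lr * lookahead / (math.sqrt(v_hat) + eps)
    return p_t, m_t, v_t, t_new


# First step from p = 0 on f(p) = (p - 3)^2, where the gradient is -6.
p_t, m_t, v_t, t_new = nadam_step(p=0.0, g=-6.0, m=0.0, v=0.0, t=0)
```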

class pycmtensor.optimizers.Adamax(params: list, b1: float = 0.9, b2: float = 0.999, **kwargs)

Bases: Adam

An optimizer that implements the Adamax algorithm [3]. It is a variant of the Adam algorithm

Parameters:
  • params (list) – a list of TensorSharedVariable

  • b1 (float, optional) – exponential decay rate for the 1st moment estimates. Defaults to 0.9

  • b2 (float, optional) – exponential decay rate for the 2nd moment estimates. Defaults to 0.999

update(cost, params: list, lr: float = 0.001)

Generate a list of updates

Parameters:
  • cost (TensorVariable) – a scalar expression of the cost function, whose derivatives with respect to params are calculated

  • params (list) – a list of TensorSharedVariable

  • lr (float, optional) – learning rate. Defaults to 0.001

Returns:

a list of tuples of (p, p_t), (m, m_t), (v, v_t), (t, t_new)

Return type:

list
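
The difference from Adam can be sketched as follows: the textbook Adamax rule replaces the squared-gradient average with an exponentially weighted infinity norm. In this sketch the variable u plays the role of v in the update pairs above. The name adamax_step and the eps guard against division by zero are assumptions, not part of the documented API.

```python
def adamax_step(p, g, m, u, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One textbook Adamax step for a scalar parameter (illustrative only)."""
    t_new = t + 1
    m_t = b1 * m + (1 - b1) * g     # 1st moment estimate, as in Adam
    u_t = max(b2 * u, abs(g))       # exponentially weighted infinity norm
    # Only the 1st moment needs bias correction; eps is an assumed safeguard.
    p_t = p - (lr / (1 - b1 ** t_new)) * m_t / (u_t + eps)
    return p_t, m_t, u_t, t_new


# First step from p = 0 on f(p) = (p - 3)^2, where the gradient is -6.
p_t, m_t, u_t, t_new = adamax_step(p=0.0, g=-6.0, m=0.0, u=0.0, t=0)
```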

class pycmtensor.optimizers.Adadelta(params: list, rho: float = 0.95, **kwargs)

Bases: Optimizer

An optimizer that implements the Adadelta algorithm [4]

Adadelta is a stochastic gradient descent method that is based on adaptive learning rate per dimension to address two drawbacks:

  • The continual decay of learning rates throughout training

  • The need for a manually selected global learning rate

Parameters:
  • params (list) – a list of TensorSharedVariable

  • rho (float, optional) – the decay rate for the learning rate. Defaults to 0.95

property accumulator
property delta
update(cost, params: list, lr: float = 1.0)

Generate a list of updates

Parameters:
  • cost (TensorVariable) – a scalar expression of the cost function, whose derivatives with respect to params are calculated

  • params (list) – a list of TensorSharedVariable

  • lr (float, optional) – learning rate. Defaults to 1.0

Returns:

a list of tuples of (param, param_new), (a, a_t), (d, d_t)

Return type:

list

Note

Since the Adadelta algorithm uses an adaptive learning rate, the default learning rate is set to 1.0
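
To show why a learning rate of 1.0 is reasonable, the sketch below applies the textbook Adadelta rule to a scalar: the step size comes from the ratio of two running averages, so lr only rescales it. The accumulator a and delta d mirror the (a, a_t) and (d, d_t) pairs above. The name adadelta_step and the eps value are assumptions, and this is not PyCMTensor's code.

```python
import math


def adadelta_step(p, g, a, d, lr=1.0, rho=0.95, eps=1e-6):
    """One textbook Adadelta step for a scalar parameter (illustrative only)."""
    a_t = rho * a + (1 - rho) * g * g                    # running avg of g^2
    step = math.sqrt(d + eps) / math.sqrt(a_t + eps) * g  # adaptive step
    d_t = rho * d + (1 - rho) * step * step              # running avg of step^2
    p_new = p - lr * step
    return p_new, a_t, d_t


# A few steps on f(p) = (p - 3)^2, whose gradient is 2 * (p - 3).
p, a, d = 0.0, 0.0, 0.0
for _ in range(3):
    p, a, d = adadelta_step(p, 2 * (p - 3), a, d)
```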

class pycmtensor.optimizers.RMSProp(params: list, rho: float = 0.9, **kwargs)

Bases: Optimizer

An optimizer that implements the RMSprop algorithm [5]

Parameters:
  • params (list) – a list of TensorSharedVariable

  • rho (float, optional) – discounting factor for the history/coming gradient. Defaults to 0.9

property accumulator
update(cost, params: list, lr: float = 0.001)

Generate a list of updates

Parameters:
  • cost (TensorVariable) – a scalar expression of the cost function, whose derivatives with respect to params are calculated

  • params (list) – a list of TensorSharedVariable

  • lr (float, optional) – learning rate. Defaults to 0.001

Returns:

a list of tuples of (param, param_new), (a, a_t)

Return type:

list
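
A minimal sketch of the textbook RMSprop rule on a scalar may help in reading the (param, param_new), (a, a_t) pairs: the accumulator a tracks a discounted average of squared gradients and scales each step. The name rmsprop_step and the eps constant are assumptions for illustration, not the library's API.

```python
import math


def rmsprop_step(p, g, a, lr=0.001, rho=0.9, eps=1e-8):
    """One textbook RMSprop step for a scalar parameter (illustrative only)."""
    a_t = rho * a + (1 - rho) * g * g            # discounted avg of g^2
    p_new = p - lr * g / (math.sqrt(a_t) + eps)  # gradient scaled by its RMS
    return p_new, a_t


# First step from p = 0 on f(p) = (p - 3)^2, where the gradient is -6.
p_new, a_t = rmsprop_step(p=0.0, g=-6.0, a=0.0)
```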

class pycmtensor.optimizers.Momentum(params: list, mu: float = 0.9, **kwargs)

Bases: Optimizer

An optimizer that implements the Momentum algorithm [6]

Parameters:
  • params (list) – a list of TensorSharedVariable

  • mu (float, optional) – momentum factor that accelerates updates in the relevant direction and dampens oscillations. Defaults to 0.9

property velocity
update(cost, params: list, lr: float = 0.001)

Generate a list of updates

Parameters:
  • cost (TensorVariable) – a scalar expression of the cost function, whose derivatives with respect to params are calculated

  • params (list) – a list of TensorSharedVariable

  • lr (float, optional) – the learning rate. Defaults to 0.001

Returns:

a list of tuples of (param, param_new), (v, v_t)

Return type:

list
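
The (param, param_new), (v, v_t) pairs can be pictured with the classical momentum rule, sketched here on a scalar. The velocity v accumulates past gradients, scaled by mu. The name momentum_step and this particular sign convention are assumptions; the library may arrange the terms differently.

```python
def momentum_step(p, g, v, lr=0.001, mu=0.9):
    """One classical momentum step for a scalar parameter (illustrative only)."""
    v_t = mu * v - lr * g   # decay the old velocity, add the new gradient step
    p_new = p + v_t         # move along the velocity
    return p_new, v_t


# Two steps on f(p) = (p - 3)^2, whose gradient is 2 * (p - 3).
p, v = 0.0, 0.0
for _ in range(2):
    p, v = momentum_step(p, 2 * (p - 3), v)
```

The second step is larger than the first because the velocity from step one carries over.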

class pycmtensor.optimizers.NAG(params: list, mu: float = 0.99, **kwargs)

Bases: Momentum

An optimizer that implements the Nesterov Accelerated Gradient algorithm [7]

Parameters:
  • params (list) – a list of TensorSharedVariable

  • mu (float, optional) – momentum factor that accelerates updates in the relevant direction and dampens oscillations. Defaults to 0.99

property t
update(cost, params: list, lr: float = 0.001)

Generate a list of updates

Parameters:
  • cost (TensorVariable) – a scalar expression of the cost function, whose derivatives with respect to params are calculated

  • params (list) – a list of TensorSharedVariable

  • lr (float, optional) – the learning rate. Defaults to 0.001

Returns:

a list of tuples of (param, param_new), (v, v_t)

Return type:

list
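
One common way to write Nesterov momentum is sketched below on a scalar: the velocity is updated as in classical momentum, and the parameter then moves by a look-ahead combination of the new velocity and the current gradient. The formula, the name nag_step, and the constant mu are assumptions. The class also exposes a t property whose role is not described here, and the sketch ignores it.

```python
def nag_step(p, g, v, lr=0.001, mu=0.99):
    """One Nesterov-momentum step for a scalar parameter (illustrative only)."""
    v_t = mu * v - lr * g             # same velocity update as classical momentum
    p_new = p + mu * v_t - lr * g     # look-ahead step using the new velocity
    return p_new, v_t


# First step from p = 0 on f(p) = (p - 3)^2, where the gradient is -6.
p_new, v_t = nag_step(p=0.0, g=-6.0, v=0.0)
```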

class pycmtensor.optimizers.SGD(params: list, **kwargs)

Bases: Optimizer

An optimizer that implements the stochastic gradient descent (SGD) algorithm

Parameters:
  • params (list) – a list of TensorSharedVariable

update(cost, params: list, lr: float = 0.001)

Generate a list of updates

Parameters:
  • cost (TensorVariable) – a scalar expression of the cost function, whose derivatives with respect to params are calculated

  • params (list) – a list of TensorSharedVariable

  • lr (float, optional) – the learning rate. Defaults to 0.001

Returns:

a list of (param, param_new) tuples

Return type:

list
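
The shape of the returned list can be sketched in plain Python: one (param, param_new) tuple per parameter, where param_new is the parameter moved against its gradient. Here floats stand in for TensorSharedVariable and the gradients are supplied directly rather than derived from cost. The name sgd_updates is hypothetical.

```python
def sgd_updates(params, grads, lr=0.001):
    """Build a list of (param, param_new) pairs using plain gradient descent.

    Illustrative only: takes precomputed gradients instead of a cost expression.
    """
    return [(p, p - lr * g) for p, g in zip(params, grads)]


# Two parameters of f(p) = sum((p_i - 3)^2); each gradient is 2 * (p_i - 3).
params = [0.0, 1.0]
grads = [2 * (p - 3) for p in params]
updates = sgd_updates(params, grads)
```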