queens.stochastic_optimizers package#
Stochastic optimizers.
Modules containing stochastic optimizers.
Submodules#
queens.stochastic_optimizers.adam module#
Adam optimizer.
- class Adam(learning_rate, optimization_type, rel_l1_change_threshold, rel_l2_change_threshold, clip_by_l2_norm_threshold=inf, clip_by_value_threshold=inf, max_iteration=1000000.0, beta_1=0.9, beta_2=0.999, eps=1e-08, learning_rate_decay=None)[source]#
Bases:
StochasticOptimizer
Adam stochastic optimizer [1].
References
[1] Kingma and Ba. “Adam: A Method for Stochastic Optimization”. ICLR 2015. 2015.
- beta_1#
\(\beta_1\) parameter as described in [1].
- Type:
float
- beta_2#
\(\beta_2\) parameter as described in [1].
- Type:
float
- m#
Exponential average of the gradient.
- Type:
ExponentialAveragingObject
- v#
Exponential average of the squared gradient (second moment).
- Type:
ExponentialAveragingObject
- eps#
Nugget term to avoid a division by values close to zero.
- Type:
float
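The update rule from [1] can be sketched in a few lines of NumPy. This is an illustrative standalone implementation for minimization, not the QUEENS class itself; the function name and defaults are assumptions:

```python
import numpy as np

def adam_step(params, grad, m, v, t, learning_rate=0.001,
              beta_1=0.9, beta_2=0.999, eps=1e-8):
    """One Adam minimization step, following Kingma and Ba [1]."""
    m = beta_1 * m + (1 - beta_1) * grad      # exponential average of gradient
    v = beta_2 * v + (1 - beta_2) * grad**2   # exponential average of squared gradient
    m_hat = m / (1 - beta_1**t)               # bias correction (t starts at 1)
    v_hat = v / (1 - beta_2**t)
    params = params - learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return params, m, v
```

Running this on a toy quadratic objective (gradient `2 * p`) drives the parameter toward zero.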
queens.stochastic_optimizers.adamax module#
Adamax optimizer.
- class Adamax(learning_rate, optimization_type, rel_l1_change_threshold, rel_l2_change_threshold, clip_by_l2_norm_threshold=inf, clip_by_value_threshold=inf, max_iteration=1000000.0, beta_1=0.9, beta_2=0.999, eps=1e-08, learning_rate_decay=None)[source]#
Bases:
StochasticOptimizer
Adamax stochastic optimizer [1]. *eps* is added to the denominator to avoid division by zero.
References
[1] Kingma and Ba. “Adam: A Method for Stochastic Optimization”. ICLR 2015. 2015.
- beta_1#
\(\beta_1\) parameter as described in [1].
- Type:
float
- beta_2#
\(\beta_2\) parameter as described in [1].
- Type:
float
- m#
Exponential average of the gradient.
- Type:
ExponentialAveragingObject
- u#
Exponentially weighted infinity norm of the gradient.
- Type:
np.array
- eps#
Nugget term to avoid a division by values close to zero.
- Type:
float
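Adamax replaces Adam's second-moment average with an exponentially weighted infinity norm, as in [1]. A minimal standalone sketch for minimization (names and defaults are illustrative, not the QUEENS API):

```python
import numpy as np

def adamax_step(params, grad, m, u, t, learning_rate=0.002,
                beta_1=0.9, beta_2=0.999, eps=1e-8):
    """One Adamax minimization step, following Kingma and Ba [1]."""
    m = beta_1 * m + (1 - beta_1) * grad        # exponential average of gradient
    u = np.maximum(beta_2 * u, np.abs(grad))    # weighted infinity norm of gradient
    # eps avoids division by zero when u is (close to) zero
    params = params - learning_rate / (1 - beta_1**t) * m / (u + eps)
    return params, m, u
```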
queens.stochastic_optimizers.learning_rate_decay module#
Learning rate decay for stochastic optimization.
- class DynamicLearningRateDecay(alpha=0.1, rho_min=1.0)[source]#
Bases:
LearningRateDecay
Dynamic learning rate decay.
- alpha#
Decay factor
- Type:
float
- rho_min#
Threshold for signal-to-noise ratio
- Type:
float
- k_min#
Minimum number of iterations before learning rate is decreased
- Type:
int
- k#
Iteration number
- Type:
int
- a#
Sum of parameters
- Type:
np.array
- b#
Sum of squared parameters
- Type:
np.array
- c#
Sum of parameters times iteration number
- Type:
np.array
- class LogLinearLearningRateDecay(slope)[source]#
Bases:
LearningRateDecay
Log linear learning rate decay.
- slope#
Logarithmic slope
- Type:
float
- iteration#
Current iteration
- Type:
int
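A common log-linear (power-law) schedule lets the learning rate fall linearly in log-log space with the iteration number. Whether `LogLinearLearningRateDecay` uses exactly this formula is an assumption; the sketch below only illustrates the general shape:

```python
def log_linear_decay(base_learning_rate, iteration, slope):
    # Assumed form: log(lr) decreases linearly in log(iteration),
    # i.e. a power-law decay with the given logarithmic slope.
    # Not necessarily the exact formula used by LogLinearLearningRateDecay.
    return base_learning_rate * (iteration + 1) ** (-slope)
```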
queens.stochastic_optimizers.rms_prop module#
RMSprop optimizer.
- class RMSprop(learning_rate, optimization_type, rel_l1_change_threshold, rel_l2_change_threshold, clip_by_l2_norm_threshold=inf, clip_by_value_threshold=inf, max_iteration=1000000.0, beta=0.999, eps=1e-08, learning_rate_decay=None)[source]#
Bases:
StochasticOptimizer
RMSprop stochastic optimizer [1].
References
- [1] Tieleman and Hinton. “Lecture 6.5-rmsprop: Divide the gradient by a running average of
its recent magnitude”. Coursera. 2012.
- beta#
\(\beta\) parameter as described in [1].
- Type:
float
- v#
Exponential average of the squared gradient.
- Type:
ExponentialAveragingObject
- eps#
Nugget term to avoid a division by values close to zero.
- Type:
float
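The RMSprop rule from [1] divides the gradient by the root of a running average of its squared magnitude. A standalone sketch for minimization (the function name and defaults are illustrative, not the QUEENS API):

```python
import numpy as np

def rmsprop_step(params, grad, v, learning_rate=0.01, beta=0.999, eps=1e-8):
    """One RMSprop minimization step, following Tieleman and Hinton [1]."""
    v = beta * v + (1 - beta) * grad**2   # running average of squared gradient
    # eps is the nugget term guarding against division by values close to zero
    params = params - learning_rate * grad / (np.sqrt(v) + eps)
    return params, v
```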
queens.stochastic_optimizers.sgd module#
SGD optimizer.
- class SGD(learning_rate, optimization_type, rel_l1_change_threshold, rel_l2_change_threshold, clip_by_l2_norm_threshold=inf, clip_by_value_threshold=inf, max_iteration=1000000.0, learning_rate_decay=None)[source]#
Bases:
StochasticOptimizer
Stochastic gradient descent optimizer.
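Plain SGD applies the base-class iteration rule directly: the parameters move along the gradient scaled by the learning rate, with the precoefficient selecting minimization or maximization. A one-line illustrative sketch (not the QUEENS API):

```python
import numpy as np

def sgd_step(params, grad, learning_rate, precoefficient=-1):
    # precoefficient is -1 for minimization and +1 for maximization,
    # mirroring the base-class rule p <- p + beta * alpha * g.
    return params + precoefficient * learning_rate * grad
```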
queens.stochastic_optimizers.stochastic_optimizer module#
Stochastic optimizer.
- class StochasticOptimizer(learning_rate, optimization_type, rel_l1_change_threshold, rel_l2_change_threshold, clip_by_l2_norm_threshold=inf, clip_by_value_threshold=inf, max_iteration=1000000.0, learning_rate_decay=None)[source]#
Bases:
object
Base class for stochastic optimizers.
The optimizers are implemented as generators. This increases the modularity of this class, since an object can be used in different settings. Some examples:
- Example 1: Simple optimization run (does not strongly benefit from its generator nature):
Define a gradient function gradient()
Create an optimizer object optimizer with the gradient function gradient
Run the optimization via optimizer.run_optimization() in your script
- Example 2: Adding additional functionality during the optimization
Define an optimizer object using a gradient function.
Example code snippet:
for parameters in optimizer:
    rel_l2_change = optimizer.rel_l2_change
    iteration = optimizer.iteration
    # Verbose output
    print(f"Iter {iteration}, parameters {parameters}, rel L2 change "
          f"{rel_l2_change:.2f}")
    # Some additional condition to stop the optimization
    if self.number_of_simulations >= 1000:
        break
- Example 3: Running multiple optimizers sequentially:
Define optimizer1 and optimizer2 with different gradient functions
Example code:
while not done_bool:
    if not optimizer1.done:
        self.parameters1 = next(optimizer1)
    if not optimizer2.done:
        self.parameters2 = next(optimizer2)
    # Example on how to reduce the learning rate for optimizer2
    if optimizer2.iteration % 1000 == 0:
        optimizer2.learning_rate *= 0.5
    done_bool = optimizer1.done and optimizer2.done
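The patterns in Examples 2 and 3 work because an optimizer is iterable: each `next()` call performs one iteration and yields the updated parameters. A minimal self-contained sketch of that generator pattern (illustrative only, not the actual QUEENS implementation; only a few of the base-class attributes are mimicked):

```python
import numpy as np

class ToyGeneratorOptimizer:
    """Minimal generator-style optimizer, mimicking the iteration pattern."""

    def __init__(self, gradient, params, learning_rate, max_iteration):
        self.gradient = gradient
        self.current_variational_parameters = np.asarray(params, dtype=float)
        self.learning_rate = learning_rate
        self.max_iteration = max_iteration
        self.iteration = 0
        self.done = False

    def __iter__(self):
        while self.iteration < self.max_iteration:
            grad = self.gradient(self.current_variational_parameters)
            # Plain gradient descent step (minimization)
            self.current_variational_parameters = (
                self.current_variational_parameters - self.learning_rate * grad
            )
            self.iteration += 1
            yield self.current_variational_parameters
        self.done = True
```

With this pattern, `for parameters in optimizer: ...` and `next(optimizer)` both advance the optimization one step at a time, so extra logic (logging, custom stopping criteria, interleaving optimizers) can be injected between iterations.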
- learning_rate#
Learning rate for the optimizer.
- Type:
float
- clip_by_l2_norm_threshold#
Threshold to clip the gradient by L2-norm.
- Type:
float
- clip_by_value_threshold#
Threshold to clip the gradient components.
- Type:
float
- max_iteration#
Maximum number of iterations.
- Type:
int
- precoefficient#
Is 1 in case of maximization and -1 for minimization.
- Type:
int
- rel_l1_change_threshold#
If the relative L1 change of the parameters falls below this value, this convergence criterion is met.
- Type:
float
- rel_l2_change_threshold#
If the relative L2 change of the parameters falls below this value, this convergence criterion is met.
- Type:
float
- iteration#
Number of iterations done in the optimization so far.
- Type:
int
- done#
True if the optimization is done.
- Type:
bool
- rel_l2_change#
Relative change in L2-norm of variational params w.r.t. the previous iteration.
- Type:
float
- rel_l1_change#
Relative change in L1-norm of variational params w.r.t. the previous iteration.
- Type:
float
- current_variational_parameters#
Variational parameters.
- Type:
np.array
- current_gradient_value#
Current gradient vector w.r.t. the variational parameters.
- Type:
np.array
- gradient#
Function to compute the gradient.
- Type:
function
- learning_rate_decay#
Object to schedule learning rate decay.
- Type:
LearningRateDecay
- clip_gradient(gradient)[source]#
Clip the gradient by value and then by norm.
- Parameters:
gradient (np.array) – Current gradient
- Returns:
gradient (np.array) – The clipped gradient
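The two-stage clipping (first by component value, then by L2-norm) can be sketched as follows; this is an illustrative standalone function, not the class's actual implementation:

```python
import numpy as np

def clip_gradient(gradient, clip_by_value_threshold, clip_by_l2_norm_threshold):
    """Clip the gradient first by component value, then by L2-norm."""
    # Stage 1: clamp each component into [-threshold, threshold]
    gradient = np.clip(gradient, -clip_by_value_threshold, clip_by_value_threshold)
    # Stage 2: rescale the whole vector if its L2-norm exceeds the threshold
    l2_norm = np.linalg.norm(gradient)
    if l2_norm > clip_by_l2_norm_threshold:
        gradient = gradient * clip_by_l2_norm_threshold / l2_norm
    return gradient
```

Passing `np.inf` for a threshold disables the corresponding clipping stage, matching the defaults in the constructor signatures above.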
- do_single_iteration(gradient)[source]#
Single iteration for a given gradient.
- Iteration step for a given gradient \(g\):
\(p^{(i+1)}=p^{(i)}+\beta \alpha g\)
where \(\beta=-1\) for minimization and +1 for maximization and \(\alpha\) is the learning rate.
- Parameters:
gradient (np.array) – Current gradient