Abstract

Evolution strategies constitute a set of domain-general optimization algorithms, which do not require the ability to compute well-behaved gradients. They suffer from the inability to scale to large search spaces, a lack of hyperparameter intuition and an inflexible, often heuristic design. In order to address such limitations, we take inspiration from recent advances in learned optimization and automatically discover new evolution strategies via meta-learning. The proposed search strategy is parametrized by a self-attention-based architecture, which enables flexible interpolation between different search heuristics. The induced search update rule is equivariant to the ordering of the candidate solutions. We show that meta-evolving this system on a set of representative low-dimensional optimization problems discovers new evolution strategies capable of generalization to unseen optimization problems, population sizes and optimization horizons. Our experiments on vision and continuous control tasks demonstrate that the learned evolution strategy is more sample-efficient than established baseline strategies and can effectively scale to large population sizes. As a bonus, we show that it is possible to self-referentially train the evolution strategy starting from a random initialization using a simple selection heuristic. Finally, we study the contributions of the individual neural network components and reverse engineer the learned strategy into a competitive evolution strategy based on the discovered heuristics.
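
To illustrate the equivariance property mentioned above, the following is a minimal sketch of how a self-attention layer could map population fitness scores to recombination weights. It is not the architecture from the paper: the feature choice (z-scored fitness and centered ranks), the single attention head, the parameter names `W_q`, `W_k`, `W_v`, and the helper names `attention_weights` and `es_mean_update` are all assumptions made for illustration, and the parameters here are random rather than meta-learned.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_weights(fitness, params):
    """Map a vector of N fitness scores to N recombination weights (sum to 1).

    Hypothetical sketch: features and layer sizes are assumptions,
    not the published design.
    """
    # Per-candidate features: z-scored fitness and centered ranks in [-0.5, 0.5].
    z = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
    ranks = fitness.argsort().argsort() / (len(fitness) - 1) - 0.5
    feats = np.stack([z, ranks], axis=1)            # shape (N, 2)

    # Single-head self-attention over the population dimension.
    q = feats @ params["W_q"]                       # shape (N, d)
    k = feats @ params["W_k"]                       # shape (N, d)
    v = feats @ params["W_v"]                       # shape (N, 1)
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=1)  # shape (N, N)

    # One scalar score per candidate, normalized across the population.
    scores = (attn @ v)[:, 0]
    return softmax(scores)


def es_mean_update(candidates, fitness, params):
    # New search mean as a weighted recombination of the candidates.
    return attention_weights(fitness, params) @ candidates


def init_params(rng, d=8):
    # Random (not meta-learned) parameters, for illustration only.
    return {
        "W_q": rng.standard_normal((2, d)),
        "W_k": rng.standard_normal((2, d)),
        "W_v": rng.standard_normal((2, 1)),
    }
```

Because every operation acts either per candidate or symmetrically across the population, reordering the fitness vector reorders the output weights in the same way (assuming no tied fitness values, since ties make the rank feature order-dependent). The weighted mean returned by `es_mean_update` is therefore unchanged when candidates and their fitness scores are shuffled together.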