Evolving Hierarchical Structures for Convolutional Neural Networks using Jagged Arrays
Abstract
Traditionally, deep learning practitioners have hand-crafted their neural network architectures, relying on heuristics and past high-performing networks, because architecture search algorithms were too computationally intensive. With the advent of faster GPU processing for neural networks, however, architecture search is now feasible. We propose a search through the architecture space using a genetic algorithm that evolves convolutional neural networks via a novel decomposition scheme: nested hierarchical structures encoded as jagged arrays. We test it on standard image classification benchmarks (MNIST, CIFAR-10) as well as on a modular task (classifying "what" and "where"). The resulting architectures achieve performance comparable to the state of the art. We then examine the types of network architectures that arise from the evolutionary stage, showing that the evolved architectures adapt to the specific dataset, that common heuristics such as module reuse evolve independently in our approach, and that hierarchical structures develop for the modular task.
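To make the jagged-array encoding concrete, the following is a minimal sketch of the idea, not the paper's exact genome format: an architecture is represented as a list of modules of varying length (a jagged array), and a mutation can change one module's shape independently of the others. The layer specification (filter counts) and the duplication mutation are illustrative assumptions.

```python
import random

# Hypothetical genome (illustrative, not the paper's exact encoding): a
# jagged array -- a list of modules, each a variable-length list of layer
# specs (here, conv filter counts). Modules may have different lengths.
genome = [
    [16, 16],        # module 0: two conv layers
    [32, 32, 32],    # module 1: three conv layers
    [64],            # module 2: one conv layer
]

def flatten(genome):
    """Concatenate the modules into a flat layer sequence for network building."""
    return [f for module in genome for f in module]

def mutate(genome, rng):
    """Example mutation: duplicate a random layer within a random module.
    This changes the jagged array's shape locally, leaving other modules intact."""
    child = [list(m) for m in genome]   # copy so the parent genome is unchanged
    m = rng.randrange(len(child))
    i = rng.randrange(len(child[m]))
    child[m].insert(i, child[m][i])
    return child

rng = random.Random(0)
child = mutate(genome, rng)
print(flatten(genome))   # parent layer sequence
print(flatten(child))    # child sequence is one layer longer
```

A genetic algorithm over such genomes can apply shape-changing mutations module by module, which is one way the encoding supports module reuse and hierarchical growth.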