Rainbow Long Exposure Photography

I cannot stop looking at these rainbow long-exposure photographs by Daniel Mercadante. He captured his partner running with a custom-built lighting rig. So much appreciation for this. So much!


Using Evolutionary AutoML to Discover Neural Network Architectures

The brain has evolved over a long time, from very simple worm brains 500 million years ago to a diversity of modern structures today. The human brain, for example, can accomplish a wide variety of activities, many of them effortlessly: telling whether a visual scene contains animals or buildings feels trivial to us. To perform activities like these, artificial neural networks require careful design by experts over years of difficult research, and typically address one specific task, such as identifying what's in a photograph, calling a genetic variant, or helping to diagnose a disease. Ideally, one would want to have an automated method to generate the right architecture for any given task.

One approach to generating these architectures is through the use of evolutionary algorithms. Traditional research into neuro-evolution of topologies (e.g. Stanley and Miikkulainen 2002) has laid the foundations that allow us to apply these algorithms at scale today, and many groups are working on the subject, including OpenAI, Uber Labs, Sentient Labs and DeepMind. Of course, the Google Brain team has been thinking about AutoML too. In addition to learning-based approaches (e.g. reinforcement learning), we wondered if we could use our computational resources to programmatically evolve image classifiers at unprecedented scale. Can we achieve solutions with minimal expert participation? How good can today's artificially-evolved neural networks be? We address these questions through two papers.

In “Large-Scale Evolution of Image Classifiers,” presented at ICML 2017, we set up an evolutionary process with simple building blocks and trivial initial conditions. The idea was to "sit back" and let evolution at scale do the work of constructing the architecture. Starting from very simple networks, the process found classifiers comparable to hand-designed models at the time. This was encouraging because many applications may require little user participation. For example, some users may need a better model but may not have the time to become machine learning experts. A natural question to consider next was whether a combination of hand-design and evolution could do better than either approach alone. Thus, in our more recent paper, “Regularized Evolution for Image Classifier Architecture Search” (2018), we participated in the process by providing sophisticated building blocks and good initial conditions (discussed below). Moreover, we scaled up computation using Google's new TPUv2 chips. This combination of modern hardware, expert knowledge, and evolution worked together to produce state-of-the-art models on CIFAR-10 and ImageNet, two popular benchmarks for image classification.

A Simple Approach
The following is an example of an experiment from our first paper. In the figure below, each dot is a neural network trained on the CIFAR-10 dataset, which is commonly used to train image classifiers. Initially, the population consists of one thousand identical simple seed models (no hidden layers). Starting from simple seed models is important — if we had started from a high-quality model with initial conditions containing expert knowledge, it would have been easier to get a high-quality model in the end. Once seeded with the simple models, the process advances in steps. At each step, a pair of neural networks is chosen at random. The network with higher accuracy is selected as a parent and is copied and mutated to generate a child that is then added to the population, while the other neural network dies out. All other networks remain unchanged during the step. With the application of many such steps in succession, the population evolves.
Progress of an evolution experiment. Each dot represents an individual in the population. The four diagrams show examples of discovered architectures. These correspond to the best individual (rightmost; selected by validation accuracy) and three of its ancestors.
The mutations in our first paper are purposefully simple: remove a convolution at random, add a skip connection between arbitrary layers, or change the learning rate, to name a few. This way, the results show the potential of the evolutionary algorithm, as opposed to the quality of the search space. For example, if we had used a single mutation that transforms one of the seed networks into an Inception-ResNet classifier in one step, we would incorrectly conclude that the algorithm had found a good answer. Yet, in that case, all we would have done is hard-code the final answer into a complex mutation, rigging the outcome. If instead we stick with simple mutations, this cannot happen and evolution is truly doing the job. In the experiment in the figure, simple mutations and the selection process cause the networks to improve over time and reach high test accuracies, even though the test set had never been seen during the process. In this paper, the networks can also inherit their parent's weights. Thus, in addition to evolving the architecture, the population trains its networks while exploring the search space of initial conditions and learning-rate schedules. As a result, the process yields fully trained models with optimized hyperparameters. No expert input is needed after the experiment starts.
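The pick-a-pair, copy-and-mutate, discard-the-loser loop described above is easy to sketch. The following is a toy stand-in, not the paper's code: bit-string "genomes" and a count-the-ones fitness replace real architectures and CIFAR-10 training, so all names and numbers here are illustrative.

```python
import random

def fitness(genome):
    # Toy stand-in for "train the network and measure validation accuracy".
    return sum(genome)

def mutate(genome):
    # One simple random change, analogous to the paper's simple mutations.
    child = genome[:]
    i = random.randrange(len(child))
    child[i] ^= 1
    return child

def evolve(pop_size=100, genome_len=20, steps=5000, seed=0):
    random.seed(seed)
    # Trivial seed models: identical, all-zero genomes (no expert knowledge).
    population = [[0] * genome_len for _ in range(pop_size)]
    for _ in range(steps):
        i, j = random.sample(range(pop_size), 2)   # choose a pair at random
        if fitness(population[i]) < fitness(population[j]):
            i, j = j, i                            # i is now the fitter one
        # The fitter network is copied and mutated; the other dies out.
        population[j] = mutate(population[i])
    return max(fitness(g) for g in population)

print(evolve())  # climbs well above the all-zero starting fitness of 0
```

Note that the best fitness in the population never decreases under this scheme: the fitter member of every pair always survives, so progress comes entirely from lucky mutations being preserved by selection.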

In all the above, even though we were minimizing the researcher's participation by having simple initial architectures and intuitive mutations, a good amount of expert knowledge went into the building blocks those architectures were made of. These included important inventions such as convolutions, ReLUs and batch-normalization layers. We were evolving an architecture made up of these components. The term "architecture" is not accidental: this is analogous to constructing a house with high-quality bricks.

Combining Evolution and Hand Design
After our first paper, we wanted to reduce the search space to something more manageable by giving the algorithm fewer choices to explore. Using our architectural analogy, we removed all the possible ways of making large-scale errors, such as putting the wall above the roof, from the search space. Similarly with neural network architecture searches, by fixing the large-scale structure of the network, we can help the algorithm out. So how do we do this? The inception-like modules introduced in Zoph et al. (2017) for the purpose of architecture search proved very powerful. Their idea is to have a deep stack of repeated modules called cells. The stack is fixed but the architecture of the individual modules can change.
The building blocks introduced in Zoph et al. (2017). The diagram on the left is the outer structure of the full neural network, which parses the input data from bottom to top through a stack of repeated cells. The diagram on the right is the inside structure of a cell. The goal is to find a cell that yields an accurate network.
In our second paper, “Regularized Evolution for Image Classifier Architecture Search” (2018), we presented the results of applying evolutionary algorithms to the search space described above. The mutations modify the cell by randomly reconnecting the inputs (the arrows on the right diagram in the figure) or randomly replacing the operations (for example, they can replace the "max 3x3" in the figure, a max-pool operation, with an arbitrary alternative). These mutations are still relatively simple, but the initial conditions are not: the population is now initialized with models that must conform to the outer stack of cells, which was designed by an expert. Even though the cells in these seed models are random, we are no longer starting from simple models, which makes it easier to get to high-quality models in the end. If the evolutionary algorithm is contributing meaningfully, the final networks should be significantly better than the networks we already know can be constructed within this search space. Our paper shows that evolution can indeed find state-of-the-art models that either match or outperform hand-designs.

A Controlled Comparison
Even though the mutation/selection evolutionary process is not complicated, maybe an even more straightforward approach (like random search) could have done the same. Other alternatives, though not simpler, also exist in the literature (like reinforcement learning). Because of this, the main purpose of our second paper was to provide a controlled comparison between techniques.
Comparison between evolution, reinforcement learning, and random search for the purposes of architecture search. These experiments were done on the CIFAR-10 dataset, under the same conditions as Zoph et al. (2017), where the search space was originally used with reinforcement learning.
The figure above compares evolution, reinforcement learning, and random search. On the left, each curve represents the progress of an experiment, showing that evolution is faster than reinforcement learning in the earlier stages of the search. This is significant because with less compute power available, the experiments may have to stop early. Moreover, evolution is quite robust to changes in the dataset or search space. Overall, the goal of this controlled comparison is to provide the research community with the results of a computationally expensive experiment. In doing so, it is our hope to facilitate architecture searches for everyone by providing a case study of the relationship between the different search algorithms. Note, for example, that the figure above shows that the final models obtained with evolution can reach very high accuracy while using fewer floating-point operations.

One important feature of the evolutionary algorithm we used in our second paper is a form of regularization: instead of letting the worst neural networks die, we remove the oldest ones — regardless of how good they are. This improves robustness to changes in the task being optimized and tends to produce more accurate networks in the end. One reason for this may be that since we didn't allow weight inheritance, all networks must train from scratch. Therefore, this form of regularization selects for networks that remain good when they are re-trained. In other words, because a model can be more accurate just by chance — noise in the training process means even identical architectures may get different accuracy values — only architectures that remain accurate through the generations will survive in the long run, leading to the selection of networks that retrain well. More details of this conjecture can be found in the paper.
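This aging form of regularization is a small change to a tournament loop: the population becomes a queue, parents are the best of a small random sample, and the individual that dies is the oldest rather than the worst. A minimal sketch, again with a toy genome in place of real architectures, and with a deliberately noisy fitness standing in for run-to-run training variance (all names and numbers are illustrative):

```python
import collections
import random

def true_quality(genome):
    return sum(genome)

def measured_fitness(genome, rng):
    # Training is noisy: the same architecture can score differently on
    # different runs. Model that with a little Gaussian noise.
    return true_quality(genome) + rng.gauss(0, 0.5)

def mutate(genome, rng):
    child = genome[:]
    child[rng.randrange(len(child))] ^= 1
    return child

def regularized_evolution(pop_size=100, genome_len=20, cycles=3000,
                          sample_size=10, seed=0):
    rng = random.Random(seed)
    # Each entry is (genome, measured fitness); the deque makes age explicit.
    population = collections.deque()
    for _ in range(pop_size):
        g = [0] * genome_len
        population.append((g, measured_fitness(g, rng)))
    for _ in range(cycles):
        sample = rng.sample(list(population), sample_size)
        parent = max(sample, key=lambda entry: entry[1])  # best of the sample
        child = mutate(parent[0], rng)
        population.append((child, measured_fitness(child, rng)))
        population.popleft()  # the OLDEST dies, no matter how good it was
    return max(true_quality(g) for g, _ in population)
```

Because a lucky noisy score is discarded along with its owner after one lifetime, a lineage only persists by being repeatedly re-selected and re-evaluated, which is exactly the retrain-well selection pressure the paragraph describes.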

The state-of-the-art models we evolved are nicknamed AmoebaNets, and are one of the latest results from our AutoML efforts. All these experiments took a lot of computation — we used hundreds of GPUs/TPUs for days. Much like a single modern computer can outperform thousands of decades-old machines, we hope that in the future these experiments will become commonplace. Here we aimed to provide a glimpse into that future.

We would like to thank Alok Aggarwal, Yanping Huang, Andrew Selle, Sherry Moore, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Alex Kurakin, Quoc Le, Barret Zoph, Jon Shlens, Vijay Vasudevan, Vincent Vanhoucke, Megan Kacholia, Jeff Dean, and the rest of the Google Brain team for the collaborations that made this work possible.

A free plug-in brings extreme PaulStretch stretching to your DAW

You’ve heard Justin Bieber mangled into gorgeous ambient cascades of sound. Now, you can experience the magic of PaulStretch as a free plug-in.

It may give you that “A-ha” moment in ambient music. You know:

The developer has various warnings about using this plug-in, which just make me want to use it even more. (Hey, no latency reporting to the DAW? Something weird in Cubase! No manual? Who cares! Let’s give it a go – first I’m going to run with scissors to grab a beer which I’ll drink at my laptop!)


The plugin is only suitable for radical transformation of sounds. It is not suitable at all for subtle time corrections and such. Ambient music and sound design are probably the most suitable use cases.

You had me at radical / not subtle.

Okay… yeah, this was probably meant for me:

You can use it two ways: either load an audio file, and just run PaulStretch in your DAW, or use it as a live processor on inputs. (That’s weird, given what it does – hey, there was some latency. Like… a whole lot of latency.)

It’s on Mac and Windows but code is available and Linux is “likely.”


If you want the original:



That does other nifty tricks, like binaural beats and additional stretch modes. And it’s more reliable than this plug-in. The one catch: building for Windows and Linux is easy enough, and there are pre-built binaries available, but Mac users are mostly on their own. There was one project a few years back – but it seems it hasn’t been updated. (The blog post explains some of the complexity):


But the plug-in I think just became the easiest way to use it. Now go forth and make long sounds and chill to them.

The post A free plug-in brings extreme PaulStretch stretching to your DAW appeared first on CDM Create Digital Music.


Balls Into Bins In Distributed Systems

Throwing things can be fun.

If you've come across the Balls Into Bins problem, you probably heard about it in the context of hash tables. When you hash things into a hash table (especially with separate chaining) it's really useful to be able to ask "If I throw M balls into N bins, what is the distribution of balls in bins?" You can see how this is fundamental to hash tables: the amortized complexity argument for hash tables depends on there being some load factor (i.e. M/N) for which most bins contain a small number of items. Once this stops being true, lookup and insertion time on hash tables starts to get ugly. So from that perspective it's already clearly an important problem.

Load Balancing and Work Allocation

Hash tables aren't the only place that the Balls Into Bins problem is interesting. It comes up often in distributed systems, too. For one example, think about a load balancer (in this case a distributor of independent requests) sending load to some number of backends. Requests (M) are balls, and the backends are bins (N), and typically there are multiple requests going to each backend (M > N). If we know how to solve for the number of balls in each bin, we can understand the limits of random load balancing, or whether we need a stateful load balancing algorithm like least connections. This is an important question to ask, because sharing consistent state limits scalability, and sharing eventually-consistent state can even make load balancing decisions worse. Load balancing is much easier if it can be done statelessly.
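The core primitive is tiny to code up. A minimal sketch (with illustrative numbers) of throwing M requests at N backends uniformly at random:

```python
import random

def throw(m, n, rng):
    """Throw m balls into n bins uniformly at random; return per-bin counts."""
    counts = [0] * n
    for _ in range(m):
        counts[rng.randrange(n)] += 1
    return counts

counts = throw(1000, 100, random.Random(0))
print("mean load:", sum(counts) / len(counts))  # exactly 10.0 by construction
print("max load:", max(counts))                 # the hottest backend
```

The gap between those two printed numbers is precisely what the rest of this post is about.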

A related problem is push-based work allocation. Here, there is some co-ordinator handing out work items to a fleet of workers, and trying to have those workers do approximately equal amounts of work. One way that systems end up with this pattern is if they are using shuffle sharding or consistent hashing to distribute work items (or records). These hashing-based methods can be great for scaling, and so are widely used across all kinds of large-scale systems. Just as with load balancing, it's interesting to be able to understand how well requests are distributed.

Traditionally, papers about this problem have been most concerned about the expectation of the maximum number of balls in a bin ("how bad can it get?"), but other statistics like the expectation of the mean and expectation of the median can be interesting when planning and designing for load. It's also interesting to understand the variance of the maximum, and the size of the right tail on the distribution of the maximum. If the maximum can get really high, but will do so infrequently, then load testing can be difficult.

Closed-Form Analysis

Gaston Gonnet's Expected Length of the Longest Probe Sequence in Hash Code Searching was one of the first papers to tackle analyzing the problem, in the context of separate-chaining hash tables[2]. Michael Mitzenmacher's PhD thesis (The Power of Two Choices in Randomized Load Balancing) simplifies Gonnet's analysis and finds, for the M = N case, that the maximum number of balls in any bin is[1] Θ(log N / log log N). That's not a curve you'll come across often, so this is what it looks like:
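Since the plot isn't reproduced here, a few computed values give the shape (constants suppressed by the Θ aside):

```python
import math

# Θ(log N / log log N): the shape of the expected maximum load when M = N.
def f(n):
    return math.log(n) / math.log(math.log(n))

for n in (10**2, 10**4, 10**6, 10**8):
    print(f"N = {n:>9}: {f(n):.2f}")
```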

In other words, it grows, but grows pretty slowly. Most real-world cases are probably going to have M > N, and many will have M ≫ N. To understand those cases, we can turn to one of my favorite papers in this area, Balls Into Bins: A Simple and Tight Analysis by Raab and Steger. They provide a great overview of the problem, a useful survey of the literature, and a table of bounds on the maximum number of balls in any bin, depending on the relationship between M and N. The proofs are interesting, and somewhat enlightening, but not necessary to understand to find the paper useful.

While this kind of analysis is very useful, when I've needed to solve these problems in the past, I haven't tended to use the results directly. Instead, I've used them to sanity check the output of simulations. This is where engineering practice diverges a bit from computer science theory (although it does it in a pretty theoretically rigorous way).

Simulating the Problem

There are a couple of limitations on the usefulness of the closed-form analysis of this problem. One problem is that it's fairly difficult to understand clearly (at least for me), and quite complex to communicate. The bigger problem, though, is that it's quite inflexible: extending the analysis to include cases like balls of different sizes (as requests are in the real world) and balls coming out of bins (requests completing) is difficult, and difficult to code review unless you are lucky enough to work with a team that's very mathematically sophisticated. The good news, though, is that this problem is exceptionally easy to simulate.

When I think about doing these kinds of simulations, I don't generally think about using specialized simulation tools or frameworks (although you could certainly do that). Instead, I generally think about just writing a few tens of lines of Python or R which directly try the thing that I want an answer for many times, and then output data in a form that's easy to plot. Computer simulation is a broad and subtle topic, but this kind of thing (throw balls into bins, count, repeat) avoids many of the subtleties because you can avoid floating point (it's just counting) and because you can avoid being too concerned about the exact values.

Knowing the closed-form analysis makes it easy to sanity-check the simulation. According to Gonnet, the M = N case should approach log N / log log N · (1 + o(1)), and we can plot that curve (choosing a value for the o(1) term to minimize the difference) alongside the simulation results to see if the simulation matches the theory. The results look pretty good.

Gonnet's paper also contains a table of example values, which compare very well to our simulated and modelled numbers. That all increases our confidence that the simulation is telling us sensible things.

You can also extend this basic counting to be closer to real-world load-balancing. Follow a Poisson process (a fancy way of saying "use exponentially distributed random numbers to decide how long to wait") to add random balls into bins over time, and follow your completion time distribution (probably exponential too) to pull them out of the bins. Every so often, sample the size of the biggest bin. Next, output those samples for analysis. If you have real arrival-time data, and completion-time distributions, you can use those to avoid making any statistical assumptions. Which is nice.
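A sketch of that extension (the rates and durations here are made-up illustrative numbers, not measurements from any real system):

```python
import heapq
import random

def simulate(n_bins=100, arrival_rate=50.0, mean_service=1.0,
             duration=200.0, sample_every=1.0, seed=0):
    """Requests arrive as a Poisson process, each to a uniformly random bin,
    and depart after an exponentially distributed service time. Periodically
    sample the size of the fullest bin."""
    rng = random.Random(seed)
    bins = [0] * n_bins
    departures = []  # min-heap of (completion_time, bin)
    t, next_sample, samples = 0.0, 0.0, []
    while t < duration:
        t += rng.expovariate(arrival_rate)       # wait for the next arrival
        while departures and departures[0][0] <= t:
            _, b = heapq.heappop(departures)     # completed requests leave
            bins[b] -= 1
        b = rng.randrange(n_bins)                # random load balancing
        bins[b] += 1
        heapq.heappush(departures, (t + rng.expovariate(1.0 / mean_service), b))
        if t >= next_sample:
            samples.append(max(bins))
            next_sample += sample_every
    return samples

samples = simulate()
print("typical max load:", sum(samples) / len(samples))
```

Swapping the two `expovariate` calls for draws from recorded arrival and completion times gives the assumption-free version described above.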

When you've got the basic simulation, it's easy to add in things like different-sized requests, or bursts of traffic, or the effects of scaling up and down the backends.

Some Basic Results

For small M and N, the constant factors are a big problem. With M = N = 100, I get an expected maximum of around 4.2. In other words, we can expect the busiest backend to be over 4x busier than the average. That means that you either need to significantly over-scale all your backends, or put up with the problems that come with hotspotting. This problem also gets worse with scale (although very, very slowly, going back to the closed-form analysis).

In the closer-to-reality case with M = 1000 and N = 100, the gap shrinks. The expected maximum comes to 18.8, compared to a mean (aka M/N) of 10. That still means that the hottest backend gets 80% more traffic, but the gap is closing. By M = 10000 and N = 100, the gap has closed to 25%, which starts to be close to the realm of acceptable. At M = 100,000 the gap closes to 8%. In most distributed systems contexts, 8% is probably within the variation in performance due to other factors.
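These numbers are easy to check by direct simulation (the trial counts here are arbitrary; more trials tighten the estimate):

```python
import random

def expected_max(m, n, trials=500, seed=0):
    """Average, over many trials, of the fullest bin after throwing
    m balls into n bins uniformly at random."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        counts = [0] * n
        for _ in range(m):
            counts[rng.randrange(n)] += 1
        total += max(counts)
    return total / trials

print(expected_max(100, 100))   # around 4.2: the busiest of 100 bins
print(expected_max(1000, 100))  # around 18.8, against a mean of 10
```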

Still, the conclusion of all of this is that random load balancing (and random shuffle-sharding, and random consistent hashing) distributes load rather poorly when M is small. Load-sensitive load balancing, either stateful or stateless with an algorithm like best-of-two, is still very much interesting and relevant. The world would be simpler and more convenient if that wasn't the case, but it is.


  1. That's a Big Theta; if you're not familiar with it, Wikipedia has a good explanation of what it means. If you don't feel like reading that, replacing it with a big O in your head is close enough in this case.
  2. That paper also contains some great analysis of the numerical properties of different hash-table probing strategies versus separate chaining. If you like algorithm analysis, the conclusion section is particularly interesting.




More cat adventures.


Announcing OpenFermion: The Open Source Chemistry Package for Quantum Computers


“The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.”
-Paul Dirac, Quantum Mechanics of Many-Electron Systems (1929)

In this passage, physicist Paul Dirac laments that while quantum mechanics accurately models all of chemistry, exactly simulating the associated equations appears intractably complicated. Not until 1982 would Richard Feynman suggest that instead of surrendering to the complexity of quantum mechanics, we might harness it as a computational resource. Hence, the original motivation for quantum computing: by operating a computer according to the laws of quantum mechanics, one could efficiently unravel exact simulations of nature. Such simulations could lead to breakthroughs in areas such as photovoltaics, batteries, new materials, pharmaceuticals and superconductivity. And while we do not yet have a quantum computer large enough to solve classically intractable problems in these areas, rapid progress is being made. Last year, Google published this paper detailing the first quantum computation of a molecule using a superconducting qubit quantum computer. Building on that work, the quantum computing group at IBM scaled the experiment to larger molecules, which made the cover of Nature last month.

Today, we announce the release of OpenFermion, the first open source platform for translating problems in chemistry and materials science into quantum circuits that can be executed on existing platforms. OpenFermion is a library for simulating the systems of interacting electrons (fermions) which give rise to the properties of matter. Prior to OpenFermion, quantum algorithm developers would need to learn a significant amount of chemistry and write a large amount of code hacking apart other codes to put together even the most basic quantum simulations. While the project began at Google, collaborators at ETH Zurich, Lawrence Berkeley National Labs, University of Michigan, Harvard University, Oxford University, Dartmouth College, Rigetti Computing and NASA all contributed to alpha releases. You can learn more details about this release in our paper, OpenFermion: The Electronic Structure Package for Quantum Computers.

One way to think of OpenFermion is as a tool for generating and compiling physics equations which describe chemical and material systems into representations which can be interpreted by a quantum computer[1]. The most effective quantum algorithms for these problems build upon and extend the power of classical quantum chemistry packages used and developed by research chemists across government, industry and academia. Accordingly, we are also releasing OpenFermion-Psi4 and OpenFermion-PySCF, which are plugins for using OpenFermion in conjunction with the classical electronic structure packages Psi4 and PySCF.

The core OpenFermion library is designed in a quantum programming framework agnostic way to ensure compatibility with various platforms being developed by the community. This allows OpenFermion to support external packages which compile quantum assembly language specifications for diverse hardware platforms. We hope this decision will help establish OpenFermion as a community standard for putting quantum chemistry on quantum computers. To see how OpenFermion is used with diverse quantum programming frameworks, take a look at OpenFermion-ProjectQ and Forest-OpenFermion - plugins which link OpenFermion to the externally developed circuit simulation and compilation platforms known as ProjectQ and Forest.

The following workflow describes how a quantum chemist might use OpenFermion in order to simulate the energy surface of a molecule (for instance, by preparing the sort of quantum computation we described in our past blog post):
  1. The researcher initializes an OpenFermion calculation with specification of:
    • An input file specifying the coordinates of the nuclei in the molecule.
    • The basis set (e.g. cc-pVTZ) that should be used to discretize the molecule.
    • The charge and spin multiplicity (if known) of the system.
  2. The researcher uses the OpenFermion-Psi4 plugin or the OpenFermion-PySCF plugin to perform scalable classical computations which are used to optimally stage the quantum computation. For instance, one might perform a classical Hartree-Fock calculation to choose a good initial state for the quantum simulation.
  3. The researcher then specifies which electrons are most interesting to study on a quantum computer (known as an active space) and asks OpenFermion to map the equations for those electrons to a representation suitable for quantum bits, using one of the available procedures in OpenFermion, e.g. the Bravyi-Kitaev transformation.
  4. The researcher selects a quantum algorithm to solve for the properties of interest and uses a quantum compilation framework such as OpenFermion-ProjectQ to output the quantum circuit in assembly language which can be run on a quantum computer. If the researcher has access to a quantum computer, they then execute the experiment.
A few examples of what one might do with OpenFermion are demonstrated in IPython notebooks here, here and here. While quantum simulation is widely recognized as one of the most important applications of quantum computing in the near term, very few quantum computer scientists know quantum chemistry and even fewer chemists know quantum computing. Our hope is that OpenFermion will help to close the gap between these communities and bring the power of quantum computing to chemists and material scientists. If you’re interested, please check out our GitHub repository - pull requests welcome!

[1] If we may be allowed one sentence for the experts: the primary function of OpenFermion is to encode the electronic structure problem in second quantization defined by various basis sets and active spaces and then to transform those operators into spin Hamiltonians using various isomorphisms between qubit and fermion algebras.
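To make the footnote's "isomorphisms between qubit and fermion algebras" concrete: the simplest such isomorphism is the Jordan-Wigner transformation (the Bravyi-Kitaev transform mentioned above is a more scale-friendly alternative). A small self-contained NumPy check, not using OpenFermion itself, verifies that the transformed qubit operators satisfy the fermionic anticommutation relations {a_i, a_j†} = δ_ij and {a_i, a_j} = 0:

```python
import numpy as np

# Pauli matrices and the 2x2 identity.
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def kron_all(ops):
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out

def lowering(j, n):
    """Jordan-Wigner image of the fermionic annihilation operator a_j on
    n qubits: j leading Z's, then (X + iY)/2, then identities."""
    return kron_all([Z] * j + [(X + 1j * Y) / 2] + [I] * (n - j - 1))

n = 3
a = [lowering(j, n) for j in range(n)]

def anticomm(A, B):
    return A @ B + B @ A

# Check {a_i, a_j†} = δ_ij and {a_i, a_j} = 0 for all pairs.
for i in range(n):
    for j in range(n):
        expected = np.eye(2**n) if i == j else np.zeros((2**n, 2**n))
        assert np.allclose(anticomm(a[i], a[j].conj().T), expected)
        assert np.allclose(anticomm(a[i], a[j]), 0)
print("Jordan-Wigner operators satisfy the fermionic anticommutation relations")
```

The leading string of Z's is what makes the qubit operators anticommute across different sites, exactly the property a faithful encoding of electrons must preserve.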
