François Chollet
+ Your AuthorsArchive @fchollet Deep learning @google. Creator of Keras. Author of 'Deep Learning with Python'. Opinions are my own. Mar. 11, 2019 3 min read

Are you a deep learning researcher? Wondering if all this TensorFlow 2.0 stuff you heard about is relevant to you?

This thread is a crash course on everything you need to know to use TensorFlow 2.0 + Keras for deep learning research. Read on!

1) The first class you need to know is `Layer`. A Layer encapsulates a state (weights) and some computation (defined in the `call` method).

2) The `add_weight` method gives you a shortcut for creating weights.

3) It’s good practice to create weights in a separate `build` method, called lazily with the shape of the first inputs seen by your layer. Here, this pattern prevents us from having to specify `input_dim`:

4) You can automatically retrieve the gradients of the weights of a layer by calling it inside a GradientTape. Using these gradients, you can update the weights of the layer, either manually, or using an optimizer object. Of course, you can modify the gradients before using them.

5) Weights created by layers can be either trainable or non-trainable. They're exposed in the layer properties `trainable_weights` and `non_trainable_weights`. Here's a layer with a non-trainable weight:

6) Layers can be recursively nested to create bigger computation blocks. Each layer will track the weights of its sublayers (both trainable and non-trainable).

7) Layers can create losses during the forward pass. This is especially useful for regularization losses. The losses created by sublayers are recursively tracked by the parent layers.

8) These losses are cleared by the top-level layer at the start of each forward pass -- they don't accumulate. `layer.losses` always contain only the losses created during the *last* forward pass. You would typically use these losses by summing them when writing a training loop.

9) You know that TF 2.0 is eager by default. Running eagerly is great for debugging, but you will get better performance by compiling your computation into static graphs. Static graphs are a researcher's best friends! You can compile any function by wrapping it in a tf.function:

10) Some layers, in particular the `BatchNormalization` layer and the `Dropout` layer, have different behaviors during training and inference. For such layers, it is standard practice to expose a `training` (boolean) argument in the `call` method.

11) You have many built-in layers available, from Dense to Conv2D to LSTM to fancier ones like Conv2DTranspose or ConvLSTM2D. Be smart about reusing built-in functionality.

12) To build deep learning models, you don't have to use object-oriented programming all the time. All layers we've seen so far can also be composed functionally, like this (we call it the "Functional API"):

The Functional API tends to be more concise than subclassing, & provides a few other advantages (generally the same advantages that functional, typed languages provide over untyped OO development).

Learn more about the Functional API: 

However, note that the Functional API can only be used to define DAGs of layers -- recursive networks should be defined as `Layer` subclasses instead.

In your research workflows, you may often find yourself mix-and-matching OO models and Functional models.

That's all you need to get started with reimplementing most deep learning research papers in TensorFlow 2.0 and Keras!

Now let's check out a really quick example: hypernetworks.

A hypernetwork is a deep neural network whose weights are generated by another network (usually smaller).

Let's implement a really trivial hypernetwork: we'll take the `Linear` layer we defined earlier, and we'll use it to generate the weights of... another `Linear` layer.

Another quick example: implementing a VAE in either style, either subclassing (left) or the Functional API (right). I've posted this before. Find what works best for you!

This is the end of this thread. Play with these code examples in this Colab notebook:  🦄🚀

You can follow @fchollet.


Tip: mention @threader_app on a Twitter thread with the keyword “compile” to get a link to it.

Threader is an independent, ad-free project created by two developers. Our iOS Twitter client was featured as an App of the Day by Apple. Sign up today to compile, bookmark and archive your favorite threads.

Follow Threader