François Chollet @fchollet Deep learning @google. Creator of Keras, neural networks library. Author of 'Deep Learning with Python'. Opinions are my own. May. 26, 2019

This is how you implement a network in Chainer. Chainer, the original eager-first deep learning framework, has had this API since launch, in mid-2015.

When PyTorch got started, it followed the Chainer template (in fact, the prototype of PyTorch was literally a fork of Chainer).

Nearly every day, I am getting ignorant messages saying, "PyTorch is an original innovation that TensorFlow/Keras copied". This is incorrect. Subclassing is a fairly obvious way to do things in Python, and Chainer had this API first. Many others followed.
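
For readers unfamiliar with the pattern, here is a minimal pure-Python sketch of what "subclassing" means here: a base class makes instances callable, and a model is a subclass that creates its layers in `__init__` and wires them together in a forward method. This is not the code of Chainer, PyTorch, or Keras; the names `Module`, `Linear`, `MLP`, and `relu` are hypothetical and chosen only for illustration, and there is no autograd or training.

```python
import random


class Module:
    """Hypothetical base class: calling an instance runs its forward()."""

    def __call__(self, *args):
        return self.forward(*args)

    def forward(self, *args):
        raise NotImplementedError


class Linear(Module):
    """Toy dense layer on plain Python lists (illustration only)."""

    def __init__(self, n_in, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per output unit.
        self.weight = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                       for _ in range(n_out)]
        self.bias = [0.0] * n_out

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weight, self.bias)]


def relu(x):
    return [max(0.0, v) for v in x]


class MLP(Module):
    """The subclassing pattern: layers in __init__, wiring in forward()."""

    def __init__(self, n_in, n_hidden, n_out):
        self.l1 = Linear(n_in, n_hidden, seed=1)
        self.l2 = Linear(n_hidden, n_out, seed=2)

    def forward(self, x):
        # Ordinary eager Python: each line runs immediately.
        return self.l2(relu(self.l1(x)))


model = MLP(4, 8, 2)
y = model([1.0, 2.0, 3.0, 4.0])
```

Because the forward pass is plain Python, control flow, debugging, and printing intermediate values work as in any other Python code, which is much of the appeal of the eager, subclass-based style.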

I had been looking at adding a Model subclassing API to Keras as early as late 2015 (before the Functional API even existed, and over a year before I was aware of PyTorch), inspired by Chainer. Our first discussions about adding an eager execution mode also predate PyTorch.

By the time PyTorch came out, I had been looking at its API (which is exactly the Chainer API) for 1.5 years (since the release of Chainer). It wasn't exactly a shock. There was nothing we didn't already know.

To be clear, it's a good thing that API patterns and technical innovations are cross-pollinating among deep learning frameworks. The Keras API itself has had a pretty big influence over libraries that came after. It's completely fine, and it all benefits end users.

But please stop saying, "TensorFlow/Keras copied PyTorch". It's an extremely ignorant take, not only false but also pretty offensive (especially to the Chainer folks).

