<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.7.3">Jekyll</generator><link href="https://pytorch.org/feed.xml" rel="self" type="application/atom+xml" /><link href="https://pytorch.org/" rel="alternate" type="text/html" /><updated>2018-10-30T07:52:48-07:00</updated><id>https://pytorch.org/</id><title type="html">PyTorch Website</title><subtitle>Scientific Computing...</subtitle><author><name>Facebook</name></author><entry><title type="html">The road to 1.0: production ready PyTorch</title><link href="https://pytorch.org/blog/the-road-to-1_0/" rel="alternate" type="text/html" title="The road to 1.0: production ready PyTorch" /><published>2018-05-02T00:00:00-07:00</published><updated>2018-05-02T00:00:00-07:00</updated><id>https://pytorch.org/blog/the-road-to-1_0</id><content type="html" xml:base="https://pytorch.org/blog/the-road-to-1_0/"><p>We would like to give you a preview of the roadmap for PyTorch 1.0, the next release of PyTorch. Over the last year, we’ve had 0.2, 0.3 and 0.4 transform PyTorch from a [Torch+Chainer]-like interface into something cleaner, adding double-backwards, numpy-like functions, advanced indexing and removing Variable boilerplate. At this time, we’re confident that the API is in a reasonable and stable state to release 1.0.</p>
<p>However, 1.0 isn’t just about stability of the interface.</p>
<p>One of PyTorch’s biggest strengths is its first-class Python integration, imperative style, and the simplicity of its API and options. These aspects make PyTorch good for research and hackability.</p>
<p>One of its biggest downsides has been production support. By production support we mean the countless things one has to do to models to run them efficiently at massive scale:</p>
<ul>
<li>exporting to C++-only runtimes for use in larger projects</li>
<li>optimizing for mobile systems on iPhone, Android, Qualcomm and other platforms</li>
<li>using more efficient data layouts and performing kernel fusion to do faster inference (saving 10% of speed or memory at scale is a big win)</li>
<li>quantized inference (such as 8-bit inference)</li>
</ul>
<p>Startups, large companies and anyone who wants to build a product around PyTorch have asked for production support. At Facebook (the largest stakeholder for PyTorch) we have Caffe2, which has been the production-ready platform, running in our datacenters and shipping to more than 1 billion phones spanning eight generations of iPhones and six generations of Android CPU architectures. It has server-optimized inference on Intel / ARM, TensorRT support, and all the necessary bits for production. Considering all this value locked into a platform that the PyTorch team works quite closely with, <strong>we decided to marry PyTorch and Caffe2, which gives production-level readiness to PyTorch</strong>.</p>
<p>Supporting production features without adding usability issues for our researchers and end-users requires creative solutions.</p>
<h2 id="production--pain-for-researchers">Production != Pain for researchers</h2>
<p>Adding production capabilities involves increasing the API complexity and number of configurable options for models. One configures memory layouts (NCHW vs NHWC vs N,C/32,H,W,32, each providing different performance characteristics), quantization (8-bit? 3-bit?), fusion of low-level kernels (you used a Conv + BatchNorm + ReLU, let’s fuse them into a single kernel), separate backend options (MKLDNN backend for a few layers and NNPACK backend for other layers), etc.</p>
<p>PyTorch’s central goal is to provide a great platform for research and hackability. So, while we add all these optimizations, we’ve been working with a hard design constraint to never trade these off against usability.</p>
<p>To pull this off, we are introducing <code class="highlighter-rouge">torch.jit</code>, a just-in-time (JIT) compiler that at runtime takes your PyTorch models and rewrites them to run at production-efficiency. The JIT compiler can also export your model to run in a C++-only runtime based on Caffe2 bits.</p>
<blockquote>
<p>In 1.0, your code continues to work as-is, we’re not making any big changes to the existing API.</p>
</blockquote>
<p>Making your model production-ready is an opt-in annotation, which uses the <code class="highlighter-rouge">torch.jit</code> compiler to export your model to a Python-less environment and improve its performance. Let’s walk through the JIT compiler in detail.</p>
<h2 id="torchjit-a-jit-compiler-for-your-models"><code class="highlighter-rouge">torch.jit</code>: A JIT-compiler for your models</h2>
<p>We strongly believe that it’s hard to match the productivity you get from specifying your models directly as idiomatic Python code. This is what makes PyTorch so flexible, but it also means that PyTorch pretty much never knows the operation you’ll run next. This, however, is a big blocker for export/productionization and heavyweight automatic performance optimizations, because they need full upfront knowledge of how the computation will look before it even gets executed.</p>
<p>We provide two opt-in ways of recovering this information from your code: one based on tracing native Python code, and one based on compiling an annotated subset of the Python language into a Python-free intermediate representation. After thorough discussions we concluded that they’re both going to be useful in different contexts, and as such you will be able to mix and match them freely.</p>
<h2 id="tracing-mode">Tracing Mode</h2>
<p>The PyTorch tracer, <code class="highlighter-rouge">torch.jit.trace</code>, is a function that records all the native PyTorch operations performed in a code region, along with the data dependencies between them. In fact, PyTorch has had a tracer since 0.3, which has been used for exporting models through ONNX. What changes now is that you no longer necessarily need to take the trace and run it elsewhere - PyTorch can re-execute it for you, using a carefully designed high-performance C++ runtime. As we develop PyTorch 1.0 this runtime will integrate all the optimizations and hardware integrations that Caffe2 provides.</p>
<p>The biggest benefit of this approach is that it doesn’t really care how your Python code is structured — you can trace through generators or coroutines, modules or pure functions. Since we only record native PyTorch operators, these details have no effect on the trace recorded. This behavior, however, is a double-edged sword. For example, if you have a loop in your model, it will get unrolled in the trace, inserting a copy of the loop body for as many times as the loop ran. This opens up opportunities for zero-cost abstraction (e.g. you can loop over modules, and the actual trace will be loop-overhead free!), but on the other hand this will also affect data dependent loops (think of e.g. processing sequences of varying lengths), effectively hard-coding a single length into the trace.</p>
<p>For networks that do not contain loops or if statements, tracing is non-invasive and is robust enough to handle a wide variety of coding styles. This code example illustrates what tracing looks like:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># This will run your nn.Module or regular Python function with the example</span>
<span class="c"># input that you provided. The returned callable can be used to re-execute</span>
<span class="c"># all operations that happened during the example run, but it will no longer</span>
<span class="c"># use the Python interpreter.</span>
<span class="kn">from</span> <span class="nn">torch.jit</span> <span class="kn">import</span> <span class="n">trace</span>
<span class="n">traced_model</span> <span class="o">=</span> <span class="n">trace</span><span class="p">(</span><span class="n">model</span><span class="p">,</span> <span class="n">example_input</span><span class="o">=</span><span class="nb">input</span><span class="p">)</span>
<span class="n">traced_fn</span> <span class="o">=</span> <span class="n">trace</span><span class="p">(</span><span class="n">fn</span><span class="p">,</span> <span class="n">example_input</span><span class="o">=</span><span class="nb">input</span><span class="p">)</span>
<span class="c"># The training loop doesn't change. Traced model behaves exactly like an</span>
<span class="c"># nn.Module, except that you can't edit what it does or change its attributes.</span>
<span class="c"># Think of it as a "frozen module".</span>
<span class="k">for</span> <span class="nb">input</span><span class="p">,</span> <span class="n">target</span> <span class="ow">in</span> <span class="n">data_loader</span><span class="p">:</span>
<span class="n">loss</span> <span class="o">=</span> <span class="n">loss_fn</span><span class="p">(</span><span class="n">traced_model</span><span class="p">(</span><span class="nb">input</span><span class="p">),</span> <span class="n">target</span><span class="p">)</span>
</code></pre></div></div>
<h2 id="script-mode">Script Mode</h2>
<p>Tracing mode is a great way to minimize the impact on your code, but we’re also very excited about the models that fundamentally make use of control flow such as RNNs. Our solution to this is a scripting mode.</p>
<p>In this case you write out a regular Python function, except that you can no longer use certain more complicated language features. Once you have isolated the desired functionality, you let us know that you’d like the function to get compiled by decorating it with an <code class="highlighter-rouge">@script</code> decorator. This annotation will transform your Python function directly into our high-performance C++ runtime. This lets us recover all the PyTorch operations along with loops and conditionals. They will be embedded into our internal representation of this function, and will be accounted for every time this function is run.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">torch.jit</span> <span class="kn">import</span> <span class="n">script</span>
<span class="nd">@script</span>
<span class="k">def</span> <span class="nf">rnn_loop</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
<span class="n">hidden</span> <span class="o">=</span> <span class="bp">None</span>
<span class="k">for</span> <span class="n">x_t</span> <span class="ow">in</span> <span class="n">x</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="mi">1</span><span class="p">):</span>
<span class="n">x</span><span class="p">,</span> <span class="n">hidden</span> <span class="o">=</span> <span class="n">model</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">hidden</span><span class="p">)</span>
<span class="k">return</span> <span class="n">x</span>
</code></pre></div></div>
<h2 id="optimization-and-export">Optimization and Export</h2>
<p>Regardless of whether you use tracing or <code class="highlighter-rouge">@script</code>, the result is a python-free representation of your model, which can be used to optimize the model or to export the model from python for use in production environments.</p>
<p>Extracting bigger segments of the model into an intermediate representation makes it possible to do sophisticated whole-program optimizations and to offload computation to specialized AI accelerators which operate on graphs of computation. We have already been developing the beginnings of these optimizations, including passes that fuse GPU operations together to improve the performance of smaller RNN models.</p>
<p>It also allows us to use existing high-performance backends available in Caffe2 today to run the model efficiently. Additionally, @script functions (and modules!) can be fully exported to ONNX in a way that retains their dynamic nature, such that you can easily run them in a Python-free environment using the model executors from Caffe2 or by transferring the model to any other framework supporting ONNX.</p>
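<p>As an illustration, here is a minimal sketch of what such an export could look like, not the definitive API: it assumes the <code class="highlighter-rouge">traced_model</code> and <code class="highlighter-rouge">example_input</code> from the tracing example above, and the file name is arbitrary.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Sketch: serialize the recorded computation to an ONNX file that Caffe2's
# model executors (or any ONNX-compatible runtime) can load without Python.
# `traced_model` and `example_input` come from the tracing example above.
import torch.onnx

torch.onnx.export(traced_model, example_input, "model.onnx")
</code></pre></div></div>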
<h2 id="usability">Usability</h2>
<p><strong>We care deeply about maintaining our current level of usability. We know that executing code outside of Python makes debugging harder, but this is something we think about a lot, and we’re making sure that you don’t get locked into a completely different programming language.</strong></p>
<p>First, we follow the principle of pay for what you use — if you don’t need to optimize or export your model, you do not have to use these new features and won’t see any downsides. Furthermore, use of traced or @script modules/functions can be done incrementally. For instance, all of these behaviors are allowed: You can trace part of your model and use the trace in a larger non-traced model. You can use tracing for 90% of your model, and use @script for the one sub-module that actually has some control flow in it. You can write a function using @script and have it call a native python function. If something appears incorrect in an @script function, you can remove the annotation and the code will execute in native python where it is easy to debug using your favorite tools and methods. Think of tracing and @script like type annotations using MyPy or TypeScript — each additional annotation can be tested incrementally, and none are required until you want to optimize or productionize.</p>
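<p>To make the incremental story concrete, here is a small sketch of mixing the two modes; the <code class="highlighter-rouge">encoder</code> module and the <code class="highlighter-rouge">input</code> and <code class="highlighter-rouge">bias</code> tensors here are hypothetical names, and the <code class="highlighter-rouge">trace</code>/<code class="highlighter-rouge">script</code> APIs are the ones shown above.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from torch.jit import script, trace

@script
def gated_sum(x, y):
    # Real control flow: @script compiles the conditional instead of
    # hard-coding one branch the way a trace would.
    if bool(x.sum() &gt; 0):
        return x + y
    return x - y

# Trace the part of the model that has no control flow...
traced_encoder = trace(encoder, example_input=input)
# ...and call the @script function from ordinary Python code.
out = gated_sum(traced_encoder(input), bias)
</code></pre></div></div>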
<p>Most importantly, these modes will be built into the core of PyTorch so that mixing and matching them with your existing code can happen seamlessly.</p>
<p><em>Note: The name JIT for these components is a bit of a misnomer and comes from historical reasons. The tracing/function execution in PyTorch started out as an optimizing JIT compiler that generated fused CUDA kernels but then grew to encompass optimization, @script, and export. When it is ready for release we will likely rename this functionality to the hybrid frontend, but we wanted to present it here as it is named in the code so that you can follow along as we develop it.</em></p>
<h2 id="other-changes-and-improvements">Other changes and improvements</h2>
<p>Production support is the big feature for 1.0, but we will continue optimizing and fixing other parts of PyTorch as part of the standard release process.</p>
<p>On the backend side of things, PyTorch will see some changes, which might affect user-written C and C++ extensions. We are replacing (or refactoring) the backend ATen library to incorporate features and optimizations from Caffe2.</p>
<h2 id="last-words">Last Words</h2>
<p>We aim to release 1.0 some time during the summer. You can follow along with our progress on the <a href="https://github.com/pytorch/pytorch/pulls">Pull Requests</a> page.</p>
<p>You can read this from the perspective of the Caffe2 project at: <a href="https://caffe2.ai/blog/2018/05/02/Caffe2_PyTorch_1_0.html">https://caffe2.ai/blog/2018/05/02/Caffe2_PyTorch_1_0.html</a></p></content><author><name>The PyTorch Team</name></author><summary type="html">We would like to give you a preview of the roadmap for PyTorch 1.0, the next release of PyTorch. Over the last year, we’ve had 0.2, 0.3 and 0.4 transform PyTorch from a [Torch+Chainer]-like interface into something cleaner, adding double-backwards, numpy-like functions, advanced indexing and removing Variable boilerplate. At this time, we’re confident that the API is in a reasonable and stable state to release 1.0.</summary></entry><entry><title type="html">PyTorch 0.4.0 Migration Guide</title><link href="https://pytorch.org/blog/pytorch-0_4_0-migration-guide/" rel="alternate" type="text/html" title="PyTorch 0.4.0 Migration Guide" /><published>2018-04-22T00:00:00-07:00</published><updated>2018-04-22T00:00:00-07:00</updated><id>https://pytorch.org/blog/pytorch-0_4_0-migration-guide</id><content type="html" xml:base="https://pytorch.org/blog/pytorch-0_4_0-migration-guide/"><p>Welcome to the migration guide for PyTorch 0.4.0. In this release we introduced <a href="https://github.com/pytorch/pytorch/releases/tag/v0.4.0">many exciting new features and critical bug fixes</a>, with the goal of providing users a better and cleaner interface. In this guide, we will cover the most important changes in migrating existing code from previous versions:</p>
<ul>
<li><code class="highlighter-rouge">Tensors</code> and <code class="highlighter-rouge">Variables</code> have merged</li>
<li>Support for 0-dimensional (scalar) <code class="highlighter-rouge">Tensors</code></li>
<li>Deprecation of the <code class="highlighter-rouge">volatile</code> flag</li>
<li><code class="highlighter-rouge">dtypes</code>, <code class="highlighter-rouge">devices</code>, and Numpy-style <code class="highlighter-rouge">Tensor</code> creation functions</li>
<li>Writing device-agnostic code</li>
<li>New edge-case constraints on names of submodules, parameters, and buffers in <code class="highlighter-rouge">nn.Module</code></li>
</ul>
<h2 id="merging-tensor-and-variable-and-classes">Merging <a href="http://pytorch.org/docs/0.4.0/tensors.html"><code class="highlighter-rouge">Tensor</code></a> and <code class="highlighter-rouge">Variable</code> and classes</h2>
<p><a href="http://pytorch.org/docs/0.4.0/tensors.html"><code class="highlighter-rouge">torch.Tensor</code></a> and <code class="highlighter-rouge">torch.autograd.Variable</code> are now the same class. More precisely, <a href="http://pytorch.org/docs/0.4.0/tensors.html"><code class="highlighter-rouge">torch.Tensor</code></a> is capable of tracking history and behaves like the old <code class="highlighter-rouge">Variable</code>; <code class="highlighter-rouge">Variable</code> wrapping continues to work as before but returns an object of type <a href="http://pytorch.org/docs/0.4.0/tensors.html"><code class="highlighter-rouge">torch.Tensor</code></a>. This means that you don’t need the <code class="highlighter-rouge">Variable</code> wrapper everywhere in your code anymore.</p>
<h3 id="the-type-of-a-tensor-has-changed">The <code class="highlighter-rouge">type()</code> of a <a href="http://pytorch.org/docs/0.4.0/tensors.html"><code class="highlighter-rouge">Tensor</code></a> has changed</h3>
<p>Note also that the <code class="highlighter-rouge">type()</code> of a Tensor no longer reflects the data type. Use <code class="highlighter-rouge">isinstance()</code> or <code class="highlighter-rouge">x.type()</code> instead:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">DoubleTensor</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="k">print</span><span class="p">(</span><span class="nb">type</span><span class="p">(</span><span class="n">x</span><span class="p">))</span> <span class="c"># was torch.DoubleTensor</span>
<span class="s">"&lt;class 'torch.Tensor'&gt;"</span>
<span class="o">&gt;&gt;&gt;</span> <span class="k">print</span><span class="p">(</span><span class="n">x</span><span class="o">.</span><span class="nb">type</span><span class="p">())</span> <span class="c"># OK: 'torch.DoubleTensor'</span>
<span class="s">'torch.DoubleTensor'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="k">print</span><span class="p">(</span><span class="nb">isinstance</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">torch</span><span class="o">.</span><span class="n">DoubleTensor</span><span class="p">))</span> <span class="c"># OK: True</span>
<span class="bp">True</span>
</code></pre></div></div>
<h3 id="when-does-autograd-start-tracking-history-now">When does <a href="http://pytorch.org/docs/0.4.0/autograd.html"><code class="highlighter-rouge">autograd</code></a> start tracking history now?</h3>
<p><code class="highlighter-rouge">requires_grad</code>, the central flag for <a href="http://pytorch.org/docs/0.4.0/autograd.html"><code class="highlighter-rouge">autograd</code></a>, is now an attribute on <code class="highlighter-rouge">Tensors</code>. The same rules previously used for <code class="highlighter-rouge">Variables</code> applies to <code class="highlighter-rouge">Tensors</code>; <a href="http://pytorch.org/docs/0.4.0/autograd.html"><code class="highlighter-rouge">autograd</code></a> starts tracking history when any input <code class="highlighter-rouge">Tensor</code> of an operation has <code class="highlighter-rouge">requires_grad=True</code>. For example,</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">ones</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="c"># create a tensor with requires_grad=False (default)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">x</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">False</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">y</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">ones</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="c"># another tensor with requires_grad=False</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">z</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># both inputs have requires_grad=False. so does the output</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">z</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">False</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># then autograd won't track this computation. let's verify!</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">z</span><span class="o">.</span><span class="n">backward</span><span class="p">()</span>
<span class="nb">RuntimeError</span><span class="p">:</span> <span class="n">element</span> <span class="mi">0</span> <span class="n">of</span> <span class="n">tensors</span> <span class="n">does</span> <span class="ow">not</span> <span class="n">require</span> <span class="n">grad</span> <span class="ow">and</span> <span class="n">does</span> <span class="ow">not</span> <span class="n">have</span> <span class="n">a</span> <span class="n">grad_fn</span>
<span class="o">&gt;&gt;&gt;</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># now create a tensor with requires_grad=True</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">w</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">ones</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">requires_grad</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">w</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">True</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># add to the previous result that has require_grad=False</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">total</span> <span class="o">=</span> <span class="n">w</span> <span class="o">+</span> <span class="n">z</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># the total sum now requires grad!</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">total</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">True</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># autograd can compute the gradients as well</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">total</span><span class="o">.</span><span class="n">backward</span><span class="p">()</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">w</span><span class="o">.</span><span class="n">grad</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mf">1.</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c"># and no computation is wasted to compute gradients for x, y and z, which don't require grad</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">z</span><span class="o">.</span><span class="n">grad</span> <span class="o">==</span> <span class="n">x</span><span class="o">.</span><span class="n">grad</span> <span class="o">==</span> <span class="n">y</span><span class="o">.</span><span class="n">grad</span> <span class="o">==</span> <span class="bp">None</span>
<span class="bp">True</span>
</code></pre></div></div>
<h4 id="manipulating-requires_grad-flag">Manipulating <code class="highlighter-rouge">requires_grad</code> flag</h4>
<p>Other than directly setting the attribute, you can change this flag <code class="highlighter-rouge">in-place</code> using <a href="http://pytorch.org/docs/0.4.0/tensors.html#torch.Tensor.requires_grad_"><code class="highlighter-rouge">my_tensor.requires_grad_()</code></a>, or, as in the above example, at creation time by passing it in as an argument (default is <code class="highlighter-rouge">False</code>), e.g.,</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">existing_tensor</span><span class="o">.</span><span class="n">requires_grad_</span><span class="p">()</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">existing_tensor</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">True</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">my_tensor</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="n">requires_grad</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">my_tensor</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">True</span>
</code></pre></div></div>
<h3 id="what-about-data">What about <code class="highlighter-rouge">.data?</code></h3>
<p><code class="highlighter-rouge">.data</code> was the primary way to get the underlying <code class="highlighter-rouge">Tensor</code> from a <code class="highlighter-rouge">Variable</code>. After this merge, calling <code class="highlighter-rouge">y = x.data</code> still has similar semantics. So <code class="highlighter-rouge">y</code> will be a <code class="highlighter-rouge">Tensor</code> that shares the same data with <code class="highlighter-rouge">x</code>, is unrelated with the computation history of <code class="highlighter-rouge">x</code>, and has <code class="highlighter-rouge">requires_grad=False</code>.</p>
<p>However, <code class="highlighter-rouge">.data</code> can be unsafe in some cases. Any changes on <code class="highlighter-rouge">x.data</code> wouldn’t be tracked by <code class="highlighter-rouge">autograd</code>, and the computed gradients would be incorrect if <code class="highlighter-rouge">x</code> is needed in a backward pass. A safer alternative is to use <a href="http://pytorch.org/docs/master/autograd.html#torch.Tensor.detach"><code class="highlighter-rouge">x.detach()</code></a>, which also returns a <code class="highlighter-rouge">Tensor</code> that shares the same data and has <code class="highlighter-rouge">requires_grad=False</code>, but will have its in-place changes reported by <code class="highlighter-rouge">autograd</code> if <code class="highlighter-rouge">x</code> is needed in backward.</p>
<p>Here is an example of the difference between <code class="highlighter-rouge">.data</code> and <code class="highlighter-rouge">x.detach()</code> (and why we recommend using <code class="highlighter-rouge">detach</code> in general).</p>
<p>If you use <code class="highlighter-rouge">Tensor.detach()</code>, the gradient computation is guaranteed to be correct.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">a</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mf">3.</span><span class="p">],</span> <span class="n">requires_grad</span> <span class="o">=</span> <span class="bp">True</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">out</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">sigmoid</span><span class="p">()</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">c</span> <span class="o">=</span> <span class="n">out</span><span class="o">.</span><span class="n">detach</span><span class="p">()</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">c</span><span class="o">.</span><span class="n">zero_</span><span class="p">()</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">out</span> <span class="c"># modified by c.zero_() !!</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">out</span><span class="o">.</span><span class="nb">sum</span><span class="p">()</span><span class="o">.</span><span class="n">backward</span><span class="p">()</span> <span class="c"># Requires the original value of out, but that was overwritten by c.zero_()</span>
<span class="nb">RuntimeError</span><span class="p">:</span> <span class="n">one</span> <span class="n">of</span> <span class="n">the</span> <span class="n">variables</span> <span class="n">needed</span> <span class="k">for</span> <span class="n">gradient</span> <span class="n">computation</span> <span class="n">has</span> <span class="n">been</span> <span class="n">modified</span> <span class="n">by</span> <span class="n">an</span>
</code></pre></div></div>
<p>However, using <code class="highlighter-rouge">Tensor.data</code> can be unsafe and can easily result in incorrect gradients when a tensor is required for gradient computation but modified in-place.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">a</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mf">3.</span><span class="p">],</span> <span class="n">requires_grad</span> <span class="o">=</span> <span class="bp">True</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">out</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">sigmoid</span><span class="p">()</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">c</span> <span class="o">=</span> <span class="n">out</span><span class="o">.</span><span class="n">data</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">c</span><span class="o">.</span><span class="n">zero_</span><span class="p">()</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">out</span> <span class="c"># out was modified by c.zero_()</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">out</span><span class="o">.</span><span class="nb">sum</span><span class="p">()</span><span class="o">.</span><span class="n">backward</span><span class="p">()</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">a</span><span class="o">.</span><span class="n">grad</span> <span class="c"># The result is very, very wrong because `out` changed!</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">])</span>
</code></pre></div></div>
<h2 id="support-for-0-dimensional-scalar-tensors">Support for 0-dimensional (scalar) Tensors</h2>
<p>Previously, indexing into a <code class="highlighter-rouge">Tensor</code> vector (1-dimensional tensor) gave a Python number but indexing into a <code class="highlighter-rouge">Variable</code> vector gave (inconsistently!) a vector of size <code class="highlighter-rouge">(1,)</code>! Similar behavior existed with reduction functions, e.g. <code class="highlighter-rouge">tensor.sum()</code> would return a Python number, but <code class="highlighter-rouge">variable.sum()</code> would return a vector of size <code class="highlighter-rouge">(1,)</code>.</p>
<p>Fortunately, this release introduces proper scalar (0-dimensional tensor) support in PyTorch! Scalars can be created using the new <code class="highlighter-rouge">torch.tensor</code> function (which will be explained in more detail later; for now just think of it as the PyTorch equivalent of <code class="highlighter-rouge">numpy.array</code>). Now you can do things like:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">(</span><span class="mf">3.1416</span><span class="p">)</span> <span class="c"># create a scalar directly</span>
<span class="n">tensor</span><span class="p">(</span><span class="mf">3.1416</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">(</span><span class="mf">3.1416</span><span class="p">)</span><span class="o">.</span><span class="n">size</span><span class="p">()</span> <span class="c"># scalar is 0-dimensional</span>
<span class="n">torch</span><span class="o">.</span><span class="n">Size</span><span class="p">([])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">([</span><span class="mi">3</span><span class="p">])</span><span class="o">.</span><span class="n">size</span><span class="p">()</span> <span class="c"># compare to a vector of size 1</span>
<span class="n">torch</span><span class="o">.</span><span class="n">Size</span><span class="p">([</span><span class="mi">1</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">vector</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">6</span><span class="p">)</span> <span class="c"># this is a vector</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">vector</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mf">2.</span><span class="p">,</span> <span class="mf">3.</span><span class="p">,</span> <span class="mf">4.</span><span class="p">,</span> <span class="mf">5.</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">vector</span><span class="o">.</span><span class="n">size</span><span class="p">()</span>
<span class="n">torch</span><span class="o">.</span><span class="n">Size</span><span class="p">([</span><span class="mi">4</span><span class="p">])</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">vector</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="c"># indexing into a vector gives a scalar</span>
<span class="n">tensor</span><span class="p">(</span><span class="mf">5.</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">vector</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span><span class="o">.</span><span class="n">item</span><span class="p">()</span> <span class="c"># .item() gives the value as a Python number</span>
<span class="mf">5.0</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">mysum</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">([</span><span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">])</span><span class="o">.</span><span class="nb">sum</span><span class="p">()</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">mysum</span>
<span class="n">tensor</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">mysum</span><span class="o">.</span><span class="n">size</span><span class="p">()</span>
<span class="n">torch</span><span class="o">.</span><span class="n">Size</span><span class="p">([])</span>
</code></pre></div></div>
<h3 id="accumulating-losses">Accumulating losses</h3>
<p>Consider the widely used pattern <code class="highlighter-rouge">total_loss += loss.data[0]</code>. Before 0.4.0, <code class="highlighter-rouge">loss</code> was a <code class="highlighter-rouge">Variable</code> wrapping a tensor of size <code class="highlighter-rouge">(1,)</code>, but in 0.4.0 <code class="highlighter-rouge">loss</code> is now a scalar and has <code class="highlighter-rouge">0</code> dimensions. Indexing into a scalar doesn’t make sense (it gives a warning now, but will be a hard error in 0.5.0). Use <code class="highlighter-rouge">loss.item()</code> to get the Python number from a scalar.</p>
<p>Note that if you don’t convert to a Python number when accumulating losses, you may find increased memory usage in your program. This is because the right-hand-side of the above expression used to be a Python float, while it is now a zero-dim Tensor. The total loss is thus accumulating Tensors and their gradient history, which may keep around large autograd graphs for much longer than necessary.</p>
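<p>Putting the two points together, a corrected accumulation loop might look like this minimal sketch (assuming the usual <code class="highlighter-rouge">model</code>, <code class="highlighter-rouge">loss_fn</code> and <code class="highlighter-rouge">data_loader</code> names):</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>total_loss = 0.0
for input, target in data_loader:
    loss = loss_fn(model(input), target)
    loss.backward()
    # loss.item() yields a plain Python float, so total_loss does not
    # retain the autograd graph of every iteration.
    total_loss += loss.item()
</code></pre></div></div>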
<h2 id="deprecation-of-volatile-flag">Deprecation of volatile flag</h2>
<p>The <code class="highlighter-rouge">volatile</code> flag is now deprecated and has no effect. Previously, any computation that involves a <code class="highlighter-rouge">Variable</code> with <code class="highlighter-rouge">volatile=True</code> wouldn’t be tracked by <code class="highlighter-rouge">autograd</code>. This has now been replaced by a <a href="http://pytorch.org/docs/0.4.0/torch.html#locally-disabling-gradient-computation">set of more flexible context managers</a> including <code class="highlighter-rouge">torch.no_grad()</code>, <code class="highlighter-rouge">torch.set_grad_enabled(grad_mode)</code>, and others.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">requires_grad</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">():</span>
<span class="o">...</span> <span class="n">y</span> <span class="o">=</span> <span class="n">x</span> <span class="o">*</span> <span class="mi">2</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">y</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">False</span>
<span class="o">&gt;&gt;&gt;</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">is_train</span> <span class="o">=</span> <span class="bp">False</span>
<span class="o">&gt;&gt;&gt;</span> <span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">set_grad_enabled</span><span class="p">(</span><span class="n">is_train</span><span class="p">):</span>
<span class="o">...</span> <span class="n">y</span> <span class="o">=</span> <span class="n">x</span> <span class="o">*</span> <span class="mi">2</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">y</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">False</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">set_grad_enabled</span><span class="p">(</span><span class="bp">True</span><span class="p">)</span> <span class="c"># this can also be used as a function</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">y</span> <span class="o">=</span> <span class="n">x</span> <span class="o">*</span> <span class="mi">2</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">y</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">True</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">set_grad_enabled</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">y</span> <span class="o">=</span> <span class="n">x</span> <span class="o">*</span> <span class="mi">2</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">y</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">False</span>
</code></pre></div></div>
<h2 id="dtypes-devices-and-numpy-style-creation-functions"><a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.dtype"><code class="highlighter-rouge">dtypes</code></a>, <a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.device"><code class="highlighter-rouge">devices</code></a> and NumPy-style creation functions</h2>
<p>In previous versions of PyTorch, we used to specify data type (e.g. float vs double), device type (cpu vs cuda) and layout (dense vs sparse) together as a “tensor type”. For example, <code class="highlighter-rouge">torch.cuda.sparse.DoubleTensor</code> was the <code class="highlighter-rouge">Tensor</code> type representing the <code class="highlighter-rouge">double</code> data type, living on CUDA devices, and with <a href="https://en.wikipedia.org/wiki/Sparse_matrix#Coordinate_list_(COO)">COO sparse tensor</a> layout.</p>
<p>In this release, we introduce <a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.dtype"><code class="highlighter-rouge">torch.dtype</code></a>, <a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.device"><code class="highlighter-rouge">torch.device</code></a> and <a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.layout"><code class="highlighter-rouge">torch.layout</code></a> classes to allow better management of these properties via NumPy-style creation functions.</p>
<h3 id="torchdtype"><a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.dtype"><code class="highlighter-rouge">torch.dtype</code></a></h3>
<p>Below is a complete list of available <a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.dtype"><code class="highlighter-rouge">torch.dtype</code></a>s (data types) and their corresponding tensor types.</p>
<table>
<thead>
<tr>
<th>Data type</th>
<th><code class="highlighter-rouge">torch.dtype</code></th>
<th>Tensor types</th>
</tr>
</thead>
<tbody>
<tr>
<td>32-bit floating point</td>
<td><code class="highlighter-rouge">torch.float32</code> or <code class="highlighter-rouge">torch.float</code></td>
<td><code class="highlighter-rouge">torch.*.FloatTensor</code></td>
</tr>
<tr>
<td>64-bit floating point</td>
<td><code class="highlighter-rouge">torch.float64</code> or <code class="highlighter-rouge">torch.double</code></td>
<td><code class="highlighter-rouge">torch.*.DoubleTensor</code></td>
</tr>
<tr>
<td>16-bit floating point</td>
<td><code class="highlighter-rouge">torch.float16</code> or <code class="highlighter-rouge">torch.half</code></td>
<td><code class="highlighter-rouge">torch.*.HalfTensor</code></td>
</tr>
<tr>
<td>8-bit integer (unsigned)</td>
<td><code class="highlighter-rouge">torch.uint8</code></td>
<td><code class="highlighter-rouge">torch.*.ByteTensor</code></td>
</tr>
<tr>
<td>8-bit integer (signed)</td>
<td><code class="highlighter-rouge">torch.int8</code></td>
<td><code class="highlighter-rouge">torch.*.CharTensor</code></td>
</tr>
<tr>
<td>16-bit integer (signed)</td>
<td><code class="highlighter-rouge">torch.int16</code> or <code class="highlighter-rouge">torch.short</code></td>
<td><code class="highlighter-rouge">torch.*.ShortTensor</code></td>
</tr>
<tr>
<td>32-bit integer (signed)</td>
<td><code class="highlighter-rouge">torch.int32</code> or <code class="highlighter-rouge">torch.int</code></td>
<td><code class="highlighter-rouge">torch.*.IntTensor</code></td>
</tr>
<tr>
<td>64-bit integer (signed)</td>
<td><code class="highlighter-rouge">torch.int64</code> or <code class="highlighter-rouge">torch.long</code></td>
<td><code class="highlighter-rouge">torch.*.LongTensor</code></td>
</tr>
</tbody>
</table>
<p>The dtype of a tensor can be accessed via its <code class="highlighter-rouge">dtype</code> attribute.</p>
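<p>For example:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; torch.zeros(3).dtype  # the default dtype is torch.float32
torch.float32
</code></pre></div></div>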
<h3 id="torchdevice"><a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.device"><code class="highlighter-rouge">torch.device</code></a></h3>
<p>A <a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.device"><code class="highlighter-rouge">torch.device</code></a> contains a device type (<code class="highlighter-rouge">'cpu'</code> or <code class="highlighter-rouge">'cuda'</code>) and optional device ordinal (id) for the device type. It can be initialized with <code class="highlighter-rouge">torch.device('{device_type}')</code> or <code class="highlighter-rouge">torch.device('{device_type}:{device_ordinal}')</code>.</p>
<p>If the device ordinal is not present, this represents the current device for the device type; e.g., <code class="highlighter-rouge">torch.device('cuda')</code> is equivalent to <code class="highlighter-rouge">torch.device('cuda:X')</code> where <code class="highlighter-rouge">X</code> is the result of <code class="highlighter-rouge">torch.cuda.current_device()</code>.</p>
<p>The device of a tensor can be accessed via its <code class="highlighter-rouge">device</code> attribute.</p>
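<p>For example (this sketch assumes a machine with at least two CUDA devices):</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; cuda1 = torch.device('cuda:1')
&gt;&gt;&gt; x = torch.zeros(3, device=cuda1)
&gt;&gt;&gt; x.device
device(type='cuda', index=1)
&gt;&gt;&gt; torch.zeros(3).device  # CPU is the default
device(type='cpu')
</code></pre></div></div>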
<h3 id="torchlayout"><a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.layout"><code class="highlighter-rouge">torch.layout</code></a></h3>
<p><a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.layout"><code class="highlighter-rouge">torch.layout</code></a> represents the data layout of a <a href="http://pytorch.org/docs/0.4.0/tensors.html"><code class="highlighter-rouge">Tensor</code></a>. Currently <code class="highlighter-rouge">torch.strided</code> (dense tensors, the default) and <code class="highlighter-rouge">torch.sparse_coo</code> (sparse tensors with COO format) are supported.</p>
<p>The layout of a tensor can be accessed via its <code class="highlighter-rouge">layout</code> attribute.</p>
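<p>For example:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; torch.zeros(2, 2).layout  # dense tensors use the strided layout
torch.strided
</code></pre></div></div>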
<h3 id="creating-tensors">Creating Tensors</h3>
<p><a href="http://pytorch.org/docs/0.4.0/torch.html#creation-ops">Methods that create a</a> <a href="http://pytorch.org/docs/0.4.0/tensors.html"><code class="highlighter-rouge">Tensor</code></a> now also take in <code class="highlighter-rouge">dtype</code>, <code class="highlighter-rouge">device</code>, <code class="highlighter-rouge">layout</code>, and <code class="highlighter-rouge">requires_grad</code> options to specify the desired attributes on the returned <code class="highlighter-rouge">Tensor</code>. For example,</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">device</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="s">"cuda:1"</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="n">float64</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="n">device</span><span class="p">)</span>
<span class="n">tensor</span><span class="p">([[</span><span class="o">-</span><span class="mf">0.6344</span><span class="p">,</span> <span class="mf">0.8562</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.2758</span><span class="p">],</span>
<span class="p">[</span> <span class="mf">0.8414</span><span class="p">,</span> <span class="mf">1.7962</span><span class="p">,</span> <span class="mf">1.0589</span><span class="p">],</span>
<span class="p">[</span><span class="o">-</span><span class="mf">0.1369</span><span class="p">,</span> <span class="o">-</span><span class="mf">1.0462</span><span class="p">,</span> <span class="o">-</span><span class="mf">0.4373</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="n">float64</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="s">'cuda:1'</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">x</span><span class="o">.</span><span class="n">requires_grad</span> <span class="c"># default is False</span>
<span class="bp">False</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="n">requires_grad</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">x</span><span class="o">.</span><span class="n">requires_grad</span>
<span class="bp">True</span>
</code></pre></div></div>
<h5 id="torchtensordata-"><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.tensor"><code class="highlighter-rouge">torch.tensor(data, ...)</code></a></h5>
<p><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.tensor"><code class="highlighter-rouge">torch.tensor</code></a> is one of the newly added <a href="http://pytorch.org/docs/0.4.0/torch.html#creation-ops">tensor creation methods</a>. It takes in array-like data of all kinds and copies the contained values into a new <code class="highlighter-rouge">Tensor</code>. As mentioned earlier, <a href="http://pytorch.org/docs/0.4.0/torch.html#torch.tensor"><code class="highlighter-rouge">torch.tensor</code></a> is the PyTorch equivalent of NumPy’s <code class="highlighter-rouge">numpy.array</code>constructor. Unlike the <code class="highlighter-rouge">torch.*Tensor</code> methods, you can also create zero-dimensional <code class="highlighter-rouge">Tensor</code>s (aka scalars) this way (a single python number is treated as a Size in the <code class="highlighter-rouge">torch.*Tensor methods</code>). Moreover, if a <code class="highlighter-rouge">dtype</code> argument isn’t given, it will infer the suitable <code class="highlighter-rouge">dtype</code> given the data. It is the recommended way to create a tensor from existing data like a Python list. For example,</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">cuda</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="s">"cuda"</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">([[</span><span class="mi">1</span><span class="p">],</span> <span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="p">[</span><span class="mi">3</span><span class="p">]],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="n">half</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="n">cuda</span><span class="p">)</span>
<span class="n">tensor</span><span class="p">([[</span> <span class="mi">1</span><span class="p">],</span>
<span class="p">[</span> <span class="mi">2</span><span class="p">],</span>
<span class="p">[</span> <span class="mi">3</span><span class="p">]],</span> <span class="n">device</span><span class="o">=</span><span class="s">'cuda:0'</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="c"># scalar</span>
<span class="n">tensor</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mf">2.3</span><span class="p">])</span><span class="o">.</span><span class="n">dtype</span> <span class="c"># type inferece</span>
<span class="n">torch</span><span class="o">.</span><span class="n">float32</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">])</span><span class="o">.</span><span class="n">dtype</span> <span class="c"># type inferece</span>
<span class="n">torch</span><span class="o">.</span><span class="n">int64</span>
</code></pre></div></div>
<p>We’ve also added more tensor creation methods. Some of them have <code class="highlighter-rouge">torch.*_like</code> and/or <code class="highlighter-rouge">tensor.new_*</code> variants.</p>
<ul>
<li>
<p><code class="highlighter-rouge">torch.*_like</code> takes in an input <code class="highlighter-rouge">Tensor</code> instead of a shape. It returns a <code class="highlighter-rouge">Tensor</code> with same attributes as the input <code class="highlighter-rouge">Tensor</code> by default unless otherwise specified:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="o">&gt;&gt;&gt;</span> <span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="n">float64</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">,</span> <span class="mf">0.</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="n">float64</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="nb">int</span><span class="p">)</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="n">int32</span><span class="p">)</span>
</code></pre></div> </div>
</li>
<li>
<p><code class="highlighter-rouge">tensor.new_*</code> can also create <code class="highlighter-rouge">Tensors</code> with same attributes as <code class="highlighter-rouge">tensor</code>, but it always takes in a shape argument:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="o">&gt;&gt;&gt;</span> <span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="n">float64</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">x</span><span class="o">.</span><span class="n">new_ones</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mf">1.</span><span class="p">,</span> <span class="mf">1.</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="n">float64</span><span class="p">)</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">x</span><span class="o">.</span><span class="n">new_ones</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="nb">int</span><span class="p">)</span>
<span class="n">tensor</span><span class="p">([</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span class="o">.</span><span class="n">int32</span><span class="p">)</span>
</code></pre></div> </div>
</li>
</ul>
<p>To specify the desired shape, you can either use a tuple (e.g., <code class="highlighter-rouge">torch.zeros((2, 3))</code>) or variable arguments (e.g., <code class="highlighter-rouge">torch.zeros(2, 3)</code>) in most cases.</p>
<table>
<thead>
<tr>
<th>Name</th>
<th>Returned <code class="highlighter-rouge">Tensor</code></th>
<th><code class="highlighter-rouge">torch.*_like</code> variant</th>
<th><code class="highlighter-rouge">tensor.new_*</code> variant</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.empty"><code class="highlighter-rouge">torch.empty</code></a></td>
<td>uninitialized memory</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.zeros"><code class="highlighter-rouge">torch.zeros</code></a></td>
<td>all zeros</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.ones"><code class="highlighter-rouge">torch.ones</code></a></td>
<td>all ones</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.full"><code class="highlighter-rouge">torch.full</code></a></td>
<td>filled with a given value</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.rand"><code class="highlighter-rouge">torch.rand</code></a></td>
<td>i.i.d. continuous Uniform[0, 1)</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.randn"><code class="highlighter-rouge">torch.randn</code></a></td>
<td>i.i.d. <code class="highlighter-rouge">Normal(0, 1)</code></td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.randint"><code class="highlighter-rouge">torch.randint</code></a></td>
<td>i.i.d. discrete Uniform in given range</td>
<td>✔</td>
<td> </td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.randperm"><code class="highlighter-rouge">torch.randperm</code></a></td>
<td>random permutation of <code class="highlighter-rouge">{0, 1, ..., n - 1}</code></td>
<td> </td>
<td> </td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.tensor"><code class="highlighter-rouge">torch.tensor</code></a></td>
<td>copied from existing data (list, NumPy ndarray, etc.)</td>
<td> </td>
<td>✔</td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.from_numpy"><code class="highlighter-rouge">torch.from_numpy</code>*</a></td>
<td>from NumPy <code class="highlighter-rouge">ndarray</code> (sharing storage without copying)</td>
<td> </td>
<td> </td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.arange"><code class="highlighter-rouge">torch.arange</code></a>, <a href="http://pytorch.org/docs/0.4.0/torch.html#torch.range"><code class="highlighter-rouge">torch.range</code></a>, and <a href="http://pytorch.org/docs/0.4.0/torch.html#torch.linspace"><code class="highlighter-rouge">torch.linspace</code></a></td>
<td>uniformly spaced values in a given range</td>
<td> </td>
<td> </td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.logspace"><code class="highlighter-rouge">torch.logspace</code></a></td>
<td>logarithmically spaced values in a given range</td>
<td> </td>
<td> </td>
</tr>
<tr>
<td><a href="http://pytorch.org/docs/0.4.0/torch.html#torch.eye"><code class="highlighter-rouge">torch.eye</code></a></td>
<td>identity matrix</td>
<td> </td>
<td> </td>
</tr>
</tbody>
</table>
<p>*: <a href="http://pytorch.org/docs/0.4.0/torch.html#torch.from_numpy"><code class="highlighter-rouge">torch.from_numpy</code></a> only takes in a NumPy <code class="highlighter-rouge">ndarray</code> as its input argument.</p>
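<p>Because <a href="http://pytorch.org/docs/0.4.0/torch.html#torch.from_numpy"><code class="highlighter-rouge">torch.from_numpy</code></a> shares storage with its source array, an in-place write through either object is visible through the other. A minimal sketch:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; import numpy as np
&gt;&gt;&gt; a = np.ones(3)
&gt;&gt;&gt; t = torch.from_numpy(a)  # no copy: t and a share memory
&gt;&gt;&gt; t.mul_(2)                # an in-place multiply on the Tensor ...
tensor([ 2.,  2.,  2.], dtype=torch.float64)
&gt;&gt;&gt; a                        # ... is visible in the ndarray too
array([ 2.,  2.,  2.])
</code></pre></div></div>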
<h2 id="writing-device-agnostic-code">Writing device-agnostic code</h2>
<p>Previous versions of PyTorch made it difficult to write code that was device agnostic (i.e. that could run on both CUDA-enabled and CPU-only machines without modification).</p>
<p>PyTorch 0.4.0 makes this easier in two ways:</p>
<ul>
<li>The <code class="highlighter-rouge">device</code> attribute of a Tensor gives the <a href="http://pytorch.org/docs/0.4.0/tensor_attributes.html#torch.torch.device">torch.device</a> for all Tensors (<code class="highlighter-rouge">get_device</code> only works for CUDA tensors)</li>
<li>The <code class="highlighter-rouge">to</code> method of <code class="highlighter-rouge">Tensors</code> and <code class="highlighter-rouge">Modules</code> can be used to easily move objects to different devices (instead of having to call <code class="highlighter-rouge">cpu()</code> or <code class="highlighter-rouge">cuda()</code> based on the context)</li>
</ul>
<p>We recommend the following pattern:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># at beginning of the script</span>
<span class="n">device</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="s">"cuda:0"</span> <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">()</span> <span class="k">else</span> <span class="s">"cpu"</span><span class="p">)</span>
<span class="o">...</span>
<span class="c"># then whenever you get a new Tensor or Module</span>
<span class="c"># this won't copy if they are already on the desired device</span>
<span class="nb">input</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">)</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">MyModule</span><span class="p">(</span><span class="o">...</span><span class="p">)</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">)</span>
</code></pre></div></div>
<h2 id="new-edge-case-constraints-on-names-of-submodules-parameters-and-buffers-in-nnmodule">New edge-case constraints on names of submodules, parameters, and buffers in <code class="highlighter-rouge">nn.Module</code></h2>
<p><code class="highlighter-rouge">name</code> that is an empty string or contains <code class="highlighter-rouge">"."</code> is no longer permitted in <code class="highlighter-rouge">module.add_module(name, value)</code>, <code class="highlighter-rouge">module.add_parameter(name, value)</code> or <code class="highlighter-rouge">module.add_buffer(name, value)</code> because such names may cause lost data in the <code class="highlighter-rouge">state_dict</code>. If you are loading a checkpoint for modules containing such names, please update the module definition and patch the <code class="highlighter-rouge">state_dict</code> before loading it.</p>
<h2 id="code-samples-putting-it-all-together">Code Samples (Putting it all together)</h2>
<p>To get a flavor of the overall recommended changes in 0.4.0, let’s look at a quick example for a common code pattern in both 0.3.1 and 0.4.0:</p>
<ul>
<li>0.3.1 (old):
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">model</span> <span class="o">=</span> <span class="n">MyRNN</span><span class="p">()</span>
<span class="k">if</span> <span class="n">use_cuda</span><span class="p">:</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">cuda</span><span class="p">()</span>
<span class="c"># train</span>
<span class="n">total_loss</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="nb">input</span><span class="p">,</span> <span class="n">target</span> <span class="ow">in</span> <span class="n">train_loader</span><span class="p">:</span>
<span class="nb">input</span><span class="p">,</span> <span class="n">target</span> <span class="o">=</span> <span class="n">Variable</span><span class="p">(</span><span class="nb">input</span><span class="p">),</span> <span class="n">Variable</span><span class="p">(</span><span class="n">target</span><span class="p">)</span>
<span class="n">hidden</span> <span class="o">=</span> <span class="n">Variable</span><span class="p">(</span><span class="n">torch</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="o">*</span><span class="n">h_shape</span><span class="p">))</span> <span class="c"># init hidden</span>
<span class="k">if</span> <span class="n">use_cuda</span><span class="p">:</span>
<span class="nb">input</span><span class="p">,</span> <span class="n">target</span><span class="p">,</span> <span class="n">hidden</span> <span class="o">=</span> <span class="nb">input</span><span class="o">.</span><span class="n">cuda</span><span class="p">(),</span> <span class="n">target</span><span class="o">.</span><span class="n">cuda</span><span class="p">(),</span> <span class="n">hidden</span><span class="o">.</span><span class="n">cuda</span><span class="p">()</span>
<span class="o">...</span> <span class="c"># get loss and optimize</span>
<span class="n">total_loss</span> <span class="o">+=</span> <span class="n">loss</span><span class="o">.</span><span class="n">data</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="c"># evaluate</span>
<span class="k">for</span> <span class="nb">input</span><span class="p">,</span> <span class="n">target</span> <span class="ow">in</span> <span class="n">test_loader</span><span class="p">:</span>
<span class="nb">input</span> <span class="o">=</span> <span class="n">Variable</span><span class="p">(</span><span class="nb">input</span><span class="p">,</span> <span class="n">volatile</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="k">if</span> <span class="n">use_cuda</span><span class="p">:</span>
<span class="o">...</span>
<span class="o">...</span>
</code></pre></div> </div>
</li>
<li>0.4.0 (new):
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># torch.device object used throughout this script</span>
<span class="n">device</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="s">"cuda"</span> <span class="k">if</span> <span class="n">use_cuda</span> <span class="k">else</span> <span class="s">"cpu"</span><span class="p">)</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">MyRNN</span><span class="p">()</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">)</span>
<span class="c"># train</span>
<span class="n">total_loss</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="nb">input</span><span class="p">,</span> <span class="n">target</span> <span class="ow">in</span> <span class="n">train_loader</span><span class="p">:</span>
<span class="nb">input</span><span class="p">,</span> <span class="n">target</span> <span class="o">=</span> <span class="nb">input</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">),</span> <span class="n">target</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">)</span>
<span class="n">hidden</span> <span class="o">=</span> <span class="nb">input</span><span class="o">.</span><span class="n">new_zeros</span><span class="p">(</span><span class="o">*</span><span class="n">h_shape</span><span class="p">)</span> <span class="c"># has the same device &amp; dtype as `input`</span>
<span class="o">...</span> <span class="c"># get loss and optimize</span>
<span class="n">total_loss</span> <span class="o">+=</span> <span class="n">loss</span><span class="o">.</span><span class="n">item</span><span class="p">()</span> <span class="c"># get Python number from 1-element Tensor</span>
<span class="c"># evaluate</span>
<span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">():</span> <span class="c"># operations inside don't track history</span>
<span class="k">for</span> <span class="nb">input</span><span class="p">,</span> <span class="n">target</span> <span class="ow">in</span> <span class="n">test_loader</span><span class="p">:</span>
<span class="o">...</span>
</code></pre></div> </div>
</li>
</ul>
<p>Thank you for reading! Please refer to our <a href="http://pytorch.org/docs/0.4.0/index.html">documentation</a> and <a href="https://github.com/pytorch/pytorch/releases/tag/v0.4.0">release notes</a> for more details.</p>
<p>Happy PyTorch-ing!</p></content><author><name>Facebook</name></author><summary type="html">Welcome to the migration guide for PyTorch 0.4.0. In this release we introduced many exciting new features and critical bug fixes, with the goal of providing users a better and cleaner interface. In this guide, we will cover the most important changes in migrating existing code from previous versions:</summary></entry><entry><title type="html">Tensor Comprehensions in PyTorch</title><link href="https://pytorch.org/blog/tensor-comprehensions/" rel="alternate" type="text/html" title="Tensor Comprehensions in PyTorch" /><published>2018-03-05T00:00:00-08:00</published><updated>2018-03-05T00:00:00-08:00</updated><id>https://pytorch.org/blog/tensor-comprehensions</id><content type="html" xml:base="https://pytorch.org/blog/tensor-comprehensions/"><p>Tensor Comprehensions (TC) is a tool that lowers the barrier for writing high-performance code. It generates GPU code from a simple high-level language and autotunes the code for specific input sizes.</p>
<p><strong>We highly recommend reading the <a href="https://research.fb.com/announcing-tensor-comprehensions/">Tensor Comprehensions blogpost</a> first.</strong></p>
<p>If you ran into any of the following scenarios, TC is a useful tool for you.</p>
<ul>
<li>
<p>Your PyTorch layer is large and slow, and you contemplated writing dedicated C++ or CUDA code for it. But you don’t know how to program in CUDA or write low-level code.</p>
</li>
<li>
<p>You wrote a CUDA layer, but it took a week to write, debug, and optimize for speed. You wished you could do this in an hour.</p>
</li>
<li>
<p>You want to fuse multiple layers like Conv-ReLU-BatchNorm or Linear-ReLU-Linear-ReLU in your network for speed, but writing the fused kernel by hand was quite difficult.</p>
</li>
<li>
<p>Your research involves weird Tensor shapes that CuDNN and MKL are not optimized for. For example, you do convolutions of 13 x 24 with an input image of 143 x 55. You tried running it with CuDNN and it was slower than you wished.</p>
</li>
<li>
<p>Your code is slowed down by transposing Tensors constantly to fit a particular memory layout. You wish it was easy to write custom code that operates efficiently on your input layout.</p>
</li>
</ul>
<p>Tensor Comprehensions are seamless to use in PyTorch, interoperating with PyTorch Tensors and <code class="highlighter-rouge">nn</code> Variables.</p>
<p>Let us run through using TC with PyTorch.</p>
<h4 id="1-install-the-package">1. Install the package</h4>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>conda install <span class="nt">-c</span> pytorch <span class="nt">-c</span> tensorcomp tensor_comprehensions
</code></pre></div></div>
<p>At this time we only provide Linux-64 binaries, which have been tested on Ubuntu 16.04 and CentOS 7.</p>
<p>TC depends on heavyweight C++ projects such as <a href="http://halide-lang.org/">Halide</a>, <a href="https://github.com/wsmoses/Tapir-LLVM">Tapir-LLVM</a> and <a href="http://isl.gforge.inria.fr/">ISL</a>. Hence, we rely on Anaconda to distribute these dependencies reliably. For the same reason, TC is not available via PyPI.</p>
<h4 id="2-import-the-python-package">2. Import the python package</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">tensor_comprehensions</span> <span class="k">as</span> <span class="n">tc</span>
</code></pre></div></div>
<h4 id="3-define-the-tc-expression-and-create-a-python-function">3. Define the TC expression and create a python function</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lang</span> <span class="o">=</span> <span class="s">"""
def fcrelu(float(B,M) I, float(N,M) W1, float(N) B1) -&gt; (O1) {
O1(b, n) +=! I(b, m) * W1(n, m)
O1(b, n) = O1(b, n) + B1(n)
O1(b, n) = fmax(O1(b, n), 0)
}
"""</span>
<span class="n">fcrelu</span> <span class="o">=</span> <span class="n">tc</span><span class="o">.</span><span class="n">define</span><span class="p">(</span><span class="n">lang</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">"fcrelu"</span><span class="p">)</span>
</code></pre></div></div>
<p>This <code class="highlighter-rouge">fcrelu</code> function takes PyTorch Tensors as input and returns a PyTorch Tensor. It takes input <code class="highlighter-rouge">I</code>, weight <code class="highlighter-rouge">W1</code>, bias <code class="highlighter-rouge">B1</code> and returns output <code class="highlighter-rouge">O1</code>.</p>
<h4 id="4-lets-create-some-dummy-input-tensors">4. Let’s create some dummy input tensors</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">B</span><span class="p">,</span> <span class="n">M</span><span class="p">,</span> <span class="n">N</span> <span class="o">=</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">128</span><span class="p">,</span> <span class="mi">100</span>
<span class="n">I</span><span class="p">,</span> <span class="n">W1</span><span class="p">,</span> <span class="n">B1</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">B</span><span class="p">,</span> <span class="n">M</span><span class="p">)</span><span class="o">.</span><span class="n">cuda</span><span class="p">(),</span> <span class="n">torch</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">N</span><span class="p">,</span> <span class="n">M</span><span class="p">)</span><span class="o">.</span><span class="n">cuda</span><span class="p">(),</span> <span class="n">torch</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">N</span><span class="p">)</span><span class="o">.</span><span class="n">cuda</span><span class="p">()</span>
</code></pre></div></div>
<h4 id="5-now-autotune-the-function-for-your-input-sizes">5. Now autotune the function for your input sizes</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">fcrelu</span><span class="o">.</span><span class="n">autotune</span><span class="p">(</span><span class="n">I</span><span class="p">,</span> <span class="n">W1</span><span class="p">,</span> <span class="n">B1</span><span class="p">,</span> <span class="n">cache</span><span class="o">=</span><span class="s">"fcrelu_100_128_100.tc"</span><span class="p">)</span>
</code></pre></div></div>
<p>The autotuner is your biggest friend. You generally do not want to use a <code class="highlighter-rouge">tc</code> function without autotuning it first.</p>
<p>When the autotuning is running, the current best performance is displayed. If you are satisfied with the current result or you are out of time, stop the tuning procedure by pressing <code class="highlighter-rouge">Ctrl+C</code>.</p>
<p><img src="https://pytorch.org/static/img/tc_autotuner.gif" alt="tc-autotuner" /></p>
<p><code class="highlighter-rouge">cache</code> saves the results of the autotuned kernel search and saves it to the file <code class="highlighter-rouge">fcrelu_100_128_100.tc</code>. The next time you call the same line of code, it loads the results of the autotuning without recomputing it.</p>
<p>The autotuner has a few hyperparameters (just like your ConvNet has learning rate, number of layers, etc.). We pick reasonable defaults, but you can read about using advanced options <a href="https://facebookresearch.github.io/TensorComprehensions/framework/pytorch_integration/writing_layers.html#specifying-mapping-options">here</a>.</p>
<h4 id="6-call-the-function-with-the-inputs-to-get-your-result">6. Call the function with the inputs, to get your result</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">out</span> <span class="o">=</span> <span class="n">fcrelu</span><span class="p">(</span><span class="n">I</span><span class="p">,</span> <span class="n">W1</span><span class="p">,</span> <span class="n">B1</span><span class="p">)</span>
</code></pre></div></div>
<p>Now, let’s look at how to write TC expressions.</p>
<h2 id="a-quick-primer-on-the-tc-language">A quick primer on the TC language</h2>
<p>The TC notation focuses on the mathematical nature of the layer, leaving performance considerations to its backend code, which uses Halide and polyhedral compilation techniques that accumulate decades of cutting-edge Loop Nest Optimization (LNO) research.</p>
<p>TC is close to <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html">np.einsum</a>. We shall quickly learn TC by example.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lang</span> <span class="o">=</span> <span class="s">"""
def matmul(float(M,N) A, float(N,K) B) -&gt; (output) {
output(i, j) +=! A(i, kk) * B(kk, j)
}
"""</span>
</code></pre></div></div>
<p>In this example, we define a function <code class="highlighter-rouge">matmul</code> which takes two inputs, <code class="highlighter-rouge">A</code> and <code class="highlighter-rouge">B</code>, of shapes <code class="highlighter-rouge">M x N</code> and <code class="highlighter-rouge">N x K</code>, and returns a single <code class="highlighter-rouge">output</code>. The shape of <code class="highlighter-rouge">output</code> is automatically inferred by the TC language (discussed below).</p>
<p>Let’s look at this line:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">output</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span> <span class="o">+=</span><span class="err">!</span> <span class="n">A</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">kk</span><span class="p">)</span> <span class="o">*</span> <span class="n">B</span><span class="p">(</span><span class="n">kk</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span>
</code></pre></div></div>
<p>It says:</p>
<ul>
<li><code class="highlighter-rouge">output(i, j)</code> means output is 2D.</li>
<li>for each location <code class="highlighter-rouge">output(i, j)</code>, we add (<code class="highlighter-rouge">+=</code>) <code class="highlighter-rouge">A(i, kk) * B(kk, j)</code>.</li>
<li><code class="highlighter-rouge">i</code> is well-defined as all locations in <code class="highlighter-rouge">A</code> dim=0, i.e. <code class="highlighter-rouge">i in range(0, M)</code></li>
<li><code class="highlighter-rouge">j</code> is well-defined as all locations in <code class="highlighter-rouge">B</code> dim=1, i.e. <code class="highlighter-rouge">j in range(0, K)</code></li>
<li><code class="highlighter-rouge">kk</code> is inferred as all locations from <code class="highlighter-rouge">0</code> to <code class="highlighter-rouge">N</code></li>
</ul>
<p>The shape of output is inferred from the maximum values <code class="highlighter-rouge">i</code> and <code class="highlighter-rouge">j</code> can take, which is <code class="highlighter-rouge">M</code> and <code class="highlighter-rouge">K</code>, so output is of size <code class="highlighter-rouge">M x K</code>.</p>
<p>The <code class="highlighter-rouge">!</code> symbol initializes output with <code class="highlighter-rouge">0.0</code>. It is equivalent to:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">output</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">output</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span> <span class="o">+=</span> <span class="n">A</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">kk</span><span class="p">)</span> <span class="o">*</span> <span class="n">B</span><span class="p">(</span><span class="n">kk</span><span class="p">,</span> <span class="n">j</span><span class="p">)</span>
</code></pre></div></div>
<p><strong>Scalar inputs and range constraints: implement AvgPool2d</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s">"""
def avgpool(float(B, C, H, W) input) -&gt; (output) {{
output(b, c, h, w) += input(b, c, h * {sH} + kh, w * {sW} + kw) where kh in 0:{kH}, kw in 0:{kW}
}}
"""</span>
<span class="n">avgpool</span> <span class="o">=</span> <span class="n">tc</span><span class="o">.</span><span class="n">define</span><span class="p">(</span><span class="n">LANG</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">"avgpool"</span><span class="p">,</span> <span class="n">constants</span><span class="o">=</span><span class="p">{</span><span class="s">"sH"</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span> <span class="s">"sW"</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span> <span class="s">"kH"</span><span class="p">:</span><span class="mi">2</span><span class="p">,</span> <span class="s">"kW"</span><span class="p">:</span><span class="mi">2</span><span class="p">})</span>
</code></pre></div></div>
<p>Here the <code class="highlighter-rouge">where</code> keyword can take ranges of values to operate on. <code class="highlighter-rouge">0:{kH}</code> is equivalent to <code class="highlighter-rouge">range(kH)</code> in Python.</p>
<p>Note: the syntax for passing in scalars is subject to change in the next release.</p>
<h2 id="torchnn-layers">torch.nn layers</h2>
<p>We added some sugar-coating around the basic PyTorch integration of TC to make it easy to integrate TC into larger <code class="highlighter-rouge">torch.nn</code> models by defining the forward and backward TC expressions and taking <code class="highlighter-rouge">Variable</code> inputs / outputs. Here is an <a href="https://github.com/facebookresearch/TensorComprehensions/blob/master/test_python/layers/test_convolution_train.py">example</a> of defining a convolution layer with TC.</p>
<h2 id="some-essentials-that-you-will-miss-were-working-on-them">Some essentials that you will miss (we’re working on them)</h2>
<h3 id="autotuning-for-variable-length-sequences">Autotuning for variable-length sequences</h3>
<p>The TC autotuner requires all input sizes to be specified beforehand. For example, if you have an input <code class="highlighter-rouge">I1</code> which is an image batch, the autotuner wants to know the exact shape of <code class="highlighter-rouge">I1</code> to generate an optimized kernel. You cannot specify: <code class="highlighter-rouge">image with height between 200 and 300</code>. This is especially limiting for sequence data such as NLP, where each sentence can have a different length.</p>
<p>The autotuner is non-parametric because auto-tuning parametric constraints is much harder; this is an area of active research. Hence, for the first release, we made a conscious decision to give you the tool in a form where we know it works well.</p>
<p>As a workaround, if you know that you have a few specific shapes of interest, you can run the autotuner with these multiple shapes.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">relu</span> <span class="o">=</span> <span class="n">tc</span><span class="o">.</span><span class="n">define</span><span class="p">(</span><span class="n">LANG</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">"relu"</span><span class="p">)</span>
<span class="n">batch</span><span class="p">,</span> <span class="n">channels</span> <span class="o">=</span> <span class="mi">16</span><span class="p">,</span> <span class="mi">3</span>
<span class="n">tc</span><span class="o">.</span><span class="n">autotune</span><span class="p">((</span><span class="n">batch</span><span class="p">,</span> <span class="n">channels</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">32</span><span class="p">))</span> <span class="c"># image of size 32 x 32</span>
<span class="n">tc</span><span class="o">.</span><span class="n">autotune</span><span class="p">((</span><span class="n">batch</span><span class="p">,</span> <span class="n">channels</span><span class="p">,</span> <span class="mi">48</span><span class="p">,</span> <span class="mi">48</span><span class="p">))</span> <span class="c"># image of size 48 x 48</span>
<span class="n">tc</span><span class="o">.</span><span class="n">autotune</span><span class="p">((</span><span class="n">batch</span><span class="p">,</span> <span class="n">channels</span><span class="p">,</span> <span class="mi">64</span><span class="p">,</span> <span class="mi">64</span><span class="p">))</span> <span class="c"># image of size 64 x 64</span>
</code></pre></div></div>
<p>Now the autotuner is tuned for these three specific image sizes <code class="highlighter-rouge">32x32</code>, <code class="highlighter-rouge">48x48</code> and <code class="highlighter-rouge">64x64</code>.</p>
<h3 id="lack-of-loops">Lack of loops</h3>
<p>If you want to write an RNN, it’s easy to see it as a <code class="highlighter-rouge">for</code> loop over time. However, the TC language does not have loops yet. If you really want to write RNNs, you can write unrolled loops.</p>
<h3 id="strided-tensors">Strided-Tensors</h3>
<p>The TC backend does not support non-contiguous Tensors yet. If the inputs you give are not contiguous, they are made contiguous before being passed to the TC backend.</p>
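<p>To see what contiguity means here: a transposed Tensor is a view with swapped strides over the same storage, and <code class="highlighter-rouge">contiguous()</code> is what produces the compact copy handed to the backend. A quick check:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; x = torch.randn(4, 6)
&gt;&gt;&gt; y = x.t()                        # a view: same storage, strides swapped
&gt;&gt;&gt; y.is_contiguous()
False
&gt;&gt;&gt; y.contiguous().is_contiguous()   # copies into a compact layout
True
</code></pre></div></div>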
<h3 id="reshaping-tensors-within-a-tc-expression">Reshaping Tensors within a TC expression</h3>
<p>You cannot write this operation in TC: <code class="highlighter-rouge">torch.matmul(...).view(...).mean(...)</code>. Whenever you need a <code class="highlighter-rouge">view</code> to change the shape of an input, you have to get the output and <code class="highlighter-rouge">view</code> it at the PyTorch level.</p>
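<p>Assuming the <code class="highlighter-rouge">matmul</code> function defined earlier, the workaround looks like this sketch:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A, B = torch.randn(100, 400).cuda(), torch.randn(400, 500).cuda()
C = matmul(A, B)            # run the TC kernel first ...
C = C.view(100, 10, 50)     # ... then reshape at the PyTorch level
m = C.mean(dim=2)           # and continue with regular torch ops
</code></pre></div></div>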
<h2 id="getting-started">Getting Started</h2>
<ul>
<li><a href="https://facebookresearch.github.io/TensorComprehensions/framework/pytorch_integration/writing_layers.html">Walk through Tutorial</a> to quickly get started with understanding and using Tensor Comprehensions PyTorch package.</li>
<li><a href="https://github.com/facebookresearch/TensorComprehensions/tree/master/test_python/layers">Over 20 examples</a> of various ML layers with TC, including <code class="highlighter-rouge">avgpool</code>, <code class="highlighter-rouge">maxpool</code>, <code class="highlighter-rouge">matmul</code>, matmul - give output buffers and <code class="highlighter-rouge">batch-matmul</code>, <code class="highlighter-rouge">convolution</code>, <code class="highlighter-rouge">strided-convolution</code>, <code class="highlighter-rouge">batchnorm</code>, <code class="highlighter-rouge">copy</code>, <code class="highlighter-rouge">cosine similarity</code>, <code class="highlighter-rouge">Linear</code>, <code class="highlighter-rouge">Linear + ReLU</code>, <code class="highlighter-rouge">group-convolutions</code>, strided <code class="highlighter-rouge">group-convolutions</code>, <code class="highlighter-rouge">indexing</code>, <code class="highlighter-rouge">Embedding</code> (lookup table), small-mobilenet, <code class="highlighter-rouge">softmax</code>, <code class="highlighter-rouge">tensordot</code>, <code class="highlighter-rouge">transpose</code></li>
<li><a href="https://facebookresearch.github.io/TensorComprehensions/framework/pytorch_integration/getting_started.html">Detailed docs</a> on Tensor Comprehensions and integration with PyTorch.</li>
</ul>
<h2 id="communication">Communication</h2>
<ul>
<li><a href="https://tensorcomprehensions.herokuapp.com/">Slack</a>: For discussion around framework integration, build support, collaboration, etc. join our slack channel.</li>
<li>Email: [email protected]</li>
<li><a href="https://github.com/facebookresearch/TensorComprehensions">GitHub</a>: bug reports, feature requests, install issues, RFCs, thoughts, etc.</li>
</ul>
<h2 id="acknowledgements">Acknowledgements</h2>
<p>We would like to thank Soumith Chintala, <a href="https://github.com/ezyang">Edward Yang</a> and <a href="https://github.com/colesbury">Sam Gross</a> for their immense guidance and help in making the integration API nice and smooth. We would also like to thank the rest of the PyTorch team and our pre-release users for their helpful feedback that guided us in making the integration better.</p></content><author><name>Priya Goyal (FAIR), Nicolas Vasilache (FAIR), Oleksandr Zinenko (Inria & DI ENS), Theodoros Theodoridis (ETH Zürich), Zachary DeVito (FAIR), William S. Moses (MIT CSAIL), Sven Verdoolaege (FAIR), Andrew Adams (FAIR), Albert Cohen (Inria & DI ENS & FAIR)</name></author><summary type="html">Tensor Comprehensions (TC) is a tool that lowers the barrier for writing high-performance code. It generates GPU code from a simple high-level language and autotunes the code for specific input sizes.</summary></entry><entry><title type="html">PyTorch, a year in….</title><link href="https://pytorch.org/blog/a-year-in/" rel="alternate" type="text/html" title="PyTorch, a year in...." /><published>2018-01-19T09:00:00-08:00</published><updated>2018-01-19T09:00:00-08:00</updated><id>https://pytorch.org/blog/a-year-in</id><content type="html" xml:base="https://pytorch.org/blog/a-year-in/"><p>Today marks 1 year since PyTorch was released publicly. It’s been a wild ride — our quest to build a flexible deep learning research platform. Over the last year, we’ve seen an amazing community of people using, contributing to and evangelizing PyTorch — thank you for the love.</p>
<p>Looking back, we wanted to summarize PyTorch over the past year: the progress, the news and highlights from the community.</p>
<h2 id="community">Community</h2>
<p>We’ve been blessed with a strong organic community of researchers and engineers who fell in love with PyTorch. The core team has engineers and researchers from multiple countries, companies and universities, and we couldn’t have made PyTorch what it is without each contribution.</p>
<h3 id="research-papers-packages-and-github">Research papers, packages and Github</h3>
<p>Within days of release, users from the community started to implement their favorite research papers in PyTorch and release the code on Github. Open-source code is a primary and essential tool for researchers today.</p>
<p>Folks came together to create <a href="https://github.com/pytorch/text">torchtext</a>, <a href="https://github.com/pytorch/vision">torchvision</a> and <a href="https://github.com/pytorch/audio">torchaudio</a> packages to help facilitate and democratize research in different domains.</p>
<p>The first community package based on PyTorch came from Brandon Amos, <a href="https://twitter.com/brandondamos/status/828652480573607937">titled Block</a>, and helped with easier manipulation of block matrices. The Locus Lab at <strong>CMU</strong> subsequently went on to <a href="https://github.com/locuslab">publish PyTorch packages</a> and implementations for most of their research. The first research paper code came from Sergey Zagoruyko titled <a href="https://twitter.com/PyTorch/status/822561885744726016">Paying more attention to attention</a>.</p>
<p>Jun-Yan Zhu, Taesung Park, Phillip Isola, Alyosha Efros and team from <strong>U.C. Berkeley</strong> released the hugely popular <a href="https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix">Cycle-GAN and pix2pix</a> which does image-to-image transforms.</p>
<div class="text-center">
<img src="https://pytorch.org/assets/images/horse2zebra.gif" />
</div>
<p>The researchers at <strong>HarvardNLP</strong> and <strong>Systran</strong> started developing and improving <a href="https://github.com/OpenNMT/OpenNMT-py">OpenNMT in PyTorch</a>, seeded by an initial reimplementation of the [Lua]Torch code from Adam Lerer at Facebook.</p>
<p>The MagicPony team at <strong>Twitter</strong> contributed implementations of their <a href="https://twitter.com/Rob_Bishop/status/821793080877588480">Super-resolution work early on into PyTorch’s examples</a>.</p>
<p><strong>Salesforce Research</strong> released several packages, including their highlight release of <a href="https://twitter.com/Smerity/status/917472260851560448">PyTorch-QRNN</a>, a type of RNN that is 2x to 17x faster than standard LSTMs optimized by CuDNN. James Bradbury and team form one of the most active and engaging forces in the PyTorch community.</p>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">We&#39;re releasing <a href="https://twitter.com/PyTorch?ref_src=twsrc%5Etfw">@PyTorch</a>-QRNN, 2-17x faster than NVIDIA&#39;s cuDNN LSTM.<br />Speed thanks to 50 lines of CUDA via CuPy.<a href="https://t.co/KaWhN4yDZd">https://t.co/KaWhN4yDZd</a> <a href="https://t.co/yoLYj3pMI0">pic.twitter.com/yoLYj3pMI0</a></p>&mdash; Smerity (@Smerity) <a href="https://twitter.com/Smerity/status/917472260851560448?ref_src=twsrc%5Etfw">October 9, 2017</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>Researchers from <strong>Uber</strong>, <strong>Northeastern</strong> and <strong>Stanford</strong> came together to form an active probabilistic programming community around their packages <a href="http://pyro.ai/">Pyro</a> and <a href="https://github.com/probtorch/probtorch">ProbTorch</a>. They are actively developing the torch.distributions core package. This community is so active and fast-moving that we had our first pytorch-probabilistic-programming meetup at NIPS 2017 with Fritz Obermeyer, Noah Goodman, Jan-Willem van de Meent, Brooks Paige, Dustin Tran and 22 additional attendees discussing how to make the world Bayesian.</p>
<div class="text-center">
<img src="https://pytorch.org/assets/images/probpackages.png" width="40%" />
</div>
<p><strong>NVIDIA</strong> Researchers released three high-quality repositories that implemented <a href="https://github.com/NVIDIA/pix2pixHD">pix2pix-HD</a>, <a href="https://github.com/NVIDIA/sentiment-discovery">Sentiment Neuron</a> and <a href="https://github.com/NVIDIA/flownet2-pytorch">FlowNet2</a> papers. Their analysis of scalability of different <a href="https://github.com/NVIDIA/sentiment-discovery/blob/master/analysis/scale.md">Data Parallel models in PyTorch</a> was helpful to the community.</p>
<div class="text-center">
<img src="https://pytorch.org/assets/images/sentiment.png" width="40%" />
</div>
<p>The Allen Institute for AI released <a href="http://allennlp.org/">AllenNLP</a> which includes several state-of-the-art models in NLP — reference implementations and easy to use <a href="http://demo.allennlp.org/machine-comprehension">web demos</a> for standard NLP tasks.</p>
<div class="text-center">
<img src="https://pytorch.org/assets/images/allennlp.png" width="40%" />
</div>
<p>We also had our first Kaggle winning team, grt123, in July. They won the Data Science Bowl 2017 on lung cancer detection and <a href="https://twitter.com/PyTorch/status/881573658166267904">subsequently released their PyTorch implementations</a>.</p>
<p>On the visualization front, Tzu-Wei Huang implemented a <a href="https://github.com/lanpa/tensorboard-pytorch">TensorBoard-PyTorch plugin</a> and Facebook AI Research released PyTorch compatibility for their <a href="https://github.com/facebookresearch/visdom">visdom</a> visualization package.</p>
<div class="text-center">
<img src="https://pytorch.org/assets/images/tensorboard_model.png" width="40%" />
<img src="https://pytorch.org/assets/images/visdom.png" width="40%" />
</div>
<p>Lastly, <strong>Facebook AI Research</strong> released several projects such as <a href="https://github.com/facebookresearch/">ParlAI, fairseq-py, VoiceLoop and FaderNetworks</a> that implemented cutting-edge models and interfaced datasets in multiple domains.</p>
<p>There are countless good projects that we haven’t highlighted for lack of space; you can find a curated list <a href="https://github.com/soumith?tab=stars">here</a>.</p>
<p>We would also like to give a huge shout-out to folks who actively help others out on the Forums, especially <a href="https://discuss.pytorch.org/u/ptrblck/summary">ptrblck</a>, <a href="https://discuss.pytorch.org/u/jpeg729/summary">jpeg729</a>, <a href="https://discuss.pytorch.org/u/quantscientist/summary">QuantScientist</a>, <a href="https://discuss.pytorch.org/u/alband/summary">albanD</a>, <a href="https://discuss.pytorch.org/u/tom/summary">Thomas Viehmann</a> and <a href="https://discuss.pytorch.org/u/chenyuntc/summary">chenyuntc</a>. You are providing an invaluable service, thank you so much!</p>
<h2 id="metrics">Metrics</h2>
<p>In terms of sheer numbers,</p>
<ul>
<li>87,769 lines of Python code on GitHub that <a href="https://github.com/search?l=Python&amp;q=import+torch&amp;type=Code">import torch</a></li>
<li><a href="https://github.com/search?q=pytorch&amp;type=Repositories">3,983 repositories on Github that mention PyTorch in their name or description</a></li>
<li>More than half a million downloads of PyTorch binaries. 651,916 to be precise.</li>
<li><strong>5,400 users</strong> wrote <strong>21,500 posts</strong> discussing 5,200 topics on our forums, <a href="https://discuss.pytorch.org/">discuss.pytorch.org</a></li>
<li>131 mentions of PyTorch on Reddit’s /r/machinelearning since the day of release. In the same period, TensorFlow was mentioned 255 times.</li>
</ul>
<h3 id="research-metrics">Research Metrics</h3>
<p>PyTorch is a research-focused framework. So one metric of interest is the usage of PyTorch in machine learning research papers.</p>
<ul>
<li>
<p>In the recent ICLR2018 conference submissions, PyTorch was mentioned in <strong>87 papers</strong>, compared to TensorFlow at 228 papers, Keras at 42 papers, Theano and Matlab at 32 papers.</p>
</li>
<li>
<p><a href="https://twitter.com/fchollet/status/951828914103402497">Monthly arxiv.org mentions for frameworks</a> had PyTorch at 72 mentions, with TensorFlow at 273 mentions, Keras at 100 mentions, Caffe at 94 mentions and Theano at 53 mentions.</p>
</li>
</ul>
<h2 id="courses-tutorials-and-books">Courses, Tutorials and Books</h2>
<p>When we released PyTorch, we had good API documentation, but our tutorials were limited to a few IPython notebooks — helpful, but not good enough.</p>
<p><a href="https://github.com/chsasank">Sasank Chilamkurthy</a> took it upon himself to revamp the tutorials into the <a href="http://pytorch.org/tutorials/">beautiful website</a> that it is today.</p>
<div class="text-center">
<img src="https://pytorch.org/assets/images/blog_combined_tutorials.png" width="40%" />
</div>
<p><a href="https://github.com/spro/practical-pytorch">Sean Robertson</a> and <a href="https://github.com/jcjohnson/pytorch-examples">Justin Johnson</a> wrote great new tutorials — in NLP, and to learn by example. <a href="https://github.com/yunjey/pytorch-tutorial">Yunjey Choi</a> wrote a beautiful tutorial where most models were implemented in 30 lines or less.
Each new tutorial helped users find their way faster, with different approaches to learning.</p>
<p><a href="https://twitter.com/PyTorch/status/888500355943641088">Goku Mohandas and Delip Rao</a> switched the code content of their book-in-progress to use PyTorch.</p>
<p>We’ve seen quite a few university machine learning courses being taught with PyTorch as the primary tool, such as Harvard’s <a href="https://harvard-ml-courses.github.io/cs287-web/">CS287</a>. Taking it one step further and democratizing learning, we had three online courses pop up that teach using PyTorch.</p>
<ul>
<li><strong>Fast.ai’s</strong> “Deep Learning for Coders” is a popular online course. In September, Jeremy and Rachel <a href="http://www.fast.ai/2017/09/08/introducing-pytorch-for-fastai/">announced that the next fast.ai courses will be nearly entirely based on PyTorch</a>.</li>
<li>Ritchie Ng, a researcher with ties to NUS Singapore and Tsinghua, released <a href="https://www.udemy.com/practical-deep-learning-with-pytorch/">a Udemy course</a> titled Practical Deep Learning with PyTorch.</li>
<li>Sung Kim from HKUST released an <a href="https://www.youtube.com/playlist?list=PLlMkM4tgfjnJ3I-dbhO9JTw7gNty6o_2m">online course on Youtube</a> that was aimed towards a general audience, titled: “PyTorch Zero to All”.</li>
</ul>
<h2 id="engineering">Engineering</h2>
<p>Over the last year we implemented multiple features, improved performance across the board and fixed lots of bugs. A full list of the work we’ve done is found in our <a href="https://github.com/pytorch/pytorch/releases">release notes</a>.
Here are highlights from our work over the last year:</p>
<h2 id="higher-order-gradients">Higher-order gradients</h2>
<p>With the release of several papers implementing penalties of gradients, and with ongoing research in second-order gradient methods, this was an essential and sought-after feature. In August, we implemented a generalized interface that can take n-th order derivatives and increased the coverage of functions that support higher-order gradients over time; at the moment of writing, almost all ops support this.</p>
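<p>For example, a gradient-penalty term (in the style of WGAN-GP) can be built with <code class="highlighter-rouge">torch.autograd.grad(..., create_graph=True)</code>, which keeps the graph of the derivative so the penalty itself can be backpropagated. A minimal sketch using the Variable API of this era:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
from torch.autograd import Variable, grad

x = Variable(torch.randn(8, 4), requires_grad=True)
y = (x * x).sum()

# create_graph=True records the backward pass itself, so the
# resulting gradient g is differentiable a second time
g, = grad(y, x, create_graph=True)
penalty = ((g.norm(2, dim=1) - 1) ** 2).mean()
penalty.backward()  # second-order gradients accumulate into x.grad
</code></pre></div></div>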
<h2 id="distributed-pytorch">Distributed PyTorch</h2>
<p>In August, we released a small distributed package that followed the highly popular MPI-collective approach. The package has multiple backends such as TCP, MPI, Gloo and NCCL2 to support various types of CPU/GPU collective operations and use-cases, and integrates distributed technologies such as Infiniband and RoCE. Distributed is hard, and we had bugs in the initial iteration. Over subsequent releases, we made the package more stable and improved performance.</p>
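<p>A minimal all-reduce sketch (the Gloo backend and the <code class="highlighter-rouge">env://</code> init method are just one possible setup; it assumes <code class="highlighter-rouge">RANK</code>, <code class="highlighter-rouge">WORLD_SIZE</code> and the master address are set in the environment, with one process launched per rank):</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.distributed as dist

dist.init_process_group(backend="gloo", init_method="env://")

t = torch.ones(4) * dist.get_rank()
dist.all_reduce(t, op=dist.reduce_op.SUM)  # in-place sum across all ranks
print(t)  # every rank now holds the same summed tensor
</code></pre></div></div>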
<h2 id="closer-to-numpy">Closer to NumPy</h2>
<p>One of the biggest demands from users was the NumPy features they were familiar with. Features such as broadcasting and advanced indexing are convenient and save users a lot of verbosity. We implemented these features and started to align our API to be closer to NumPy. Over time, we expect to get closer and closer to NumPy’s API where appropriate.</p>
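<p>For instance, broadcasting and advanced indexing now behave much as they do in NumPy:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

x = torch.randn(4, 1)
y = torch.randn(1, 5)
z = x + y                        # broadcasts to shape (4, 5), NumPy-style

idx = torch.LongTensor([0, 2])
rows = z[idx]                    # integer-array indexing: rows 0 and 2
pos = z[z &gt; 0]                   # boolean-mask indexing: a 1-D result
</code></pre></div></div>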
<h2 id="sparse-tensors">Sparse Tensors</h2>
<p>In March, we released a small package supporting sparse Tensors, and in May we released CUDA support for the sparse package. The package is small and limited in functionality, and is used for implementing sparse embeddings and commonly used sparse paradigms in deep learning. This package is still small in scope and there’s demand to expand it — if you are interested in working on expanding the sparse package, reach out to us on our <a href="https://discuss.pytorch.org/">Discussion Boards</a>.</p>
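<p>A small sketch of the COO-style construction the package exposes (the indices and values here are made up for illustration):</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

# a 2 x 3 matrix with two non-zero entries, in coordinate (COO) format
i = torch.LongTensor([[0, 1],    # row indices of the non-zeros
                      [2, 0]])   # column indices of the non-zeros
v = torch.FloatTensor([3.0, 4.0])
s = torch.sparse.FloatTensor(i, v, torch.Size([2, 3]))

d = s.to_dense()  # [[0., 0., 3.], [4., 0., 0.]]
</code></pre></div></div>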
<h2 id="performance">Performance</h2>
<p>Performance is always an ongoing battle, especially for PyTorch, which is a dynamic framework that wants to maximize flexibility. Over the last year, we’ve improved performance across the board, from our core Tensor library to the neural network operators, writing faster micro-optimized code throughout.</p>
<ul>
<li>We’ve added specialized AVX and AVX2 intrinsics for Tensor operations</li>
<li>Wrote faster GPU kernels for frequent workloads like concatenation and Softmax (among many other things)</li>
<li>Rewrote the code for several neural network operators (too many to list), most notably nn.Embedding and group convolutions.</li>
</ul>
<p><strong>Reducing framework overhead by 10x across the board</strong></p>
<p>Since PyTorch is a dynamic graph framework, we create a new graph on the fly at every iteration of a training loop. Hence, the framework overhead has to be low, or the workload has to be large enough that the framework overhead is hidden. In August, the authors of DyNet (Graham Neubig and team) showcased that DyNet is much faster than PyTorch on small NLP models. This was an interesting challenge; we didn’t realize that models of those sizes were being trained. In a multi-month (and ongoing) effort, we embarked upon a significant rewrite of PyTorch internals that reduced the framework overhead from more than 10 microseconds per operator execution to as little as 1 microsecond.</p>
<p><strong>ATen</strong></p>
<p>As we embarked upon a redesign of the PyTorch internals, we built the <a href="https://github.com/pytorch/pytorch/tree/master/aten">ATen C++11</a> library that now powers all of the PyTorch backend. ATen has an API that mirrors PyTorch’s Python API, which makes it a convenient C++ library for Tensor computation. ATen can be built and used independently of PyTorch.</p>
<h2 id="exporting-models-to-production--onnx-support-and-the-jit-compiler">Exporting models to production — ONNX Support and the JIT compiler</h2>
<p>One of the common requests we’ve received was to export PyTorch models to other frameworks: users engaged in a rapid research cycle in PyTorch, and when they were done, they wanted to ship the result to larger projects with C++-only requirements.</p>
<p>With this in mind, we built a tracer for PyTorch, which can export PyTorch models into an intermediate representation.
The resulting trace can either be used to run the current PyTorch model more efficiently (by running optimization passes on it), or be converted to the <a href="http://onnx.ai/">ONNX</a> format and shipped to other frameworks such as Caffe2, MXNet and TensorFlow, or handed directly to hardware-accelerated libraries like CoreML or TensorRT. Over the next year, you will hear more about the JIT compiler for performance improvements.</p>
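<p>As a minimal sketch (the model and file name are placeholders, and <code class="highlighter-rouge">torchvision</code> is assumed to be installed), exporting a traced model to ONNX looks roughly like this:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torchvision

# Tracing runs the model once on example input and records the
# executed operators into a graph that the exporter serializes.
model = torchvision.models.alexnet(weights=None)
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, 'alexnet.onnx')
</code></pre></div></div>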
<h2 id="users-being-funny-">Users being funny :)</h2>
<p>Our users expressed their support in funny ways that made us laugh. Thank you for that :)</p>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">I&#39;ve been using PyTorch a few months now and I&#39;ve never felt better. I have more energy. My skin is clearer. My eye sight has improved.</p>&mdash; Andrej Karpathy (@karpathy) <a href="https://twitter.com/karpathy/status/868178954032513024?ref_src=twsrc%5Etfw">May 26, 2017</a></blockquote>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Talk to your doctor to find out if PyTorch is right for you.</p>&mdash; Sean Robertson (@sprobertson) <a href="https://twitter.com/sprobertson/status/868180795000750080?ref_src=twsrc%5Etfw">May 26, 2017</a></blockquote>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">PyTorch gave me so much life that my skin got cleared, my grades are up, my bills are paid and my crops are watered.</p>&mdash; Adam Will (@adam_will_do_it) <a href="https://twitter.com/adam_will_do_it/status/868179679483764736?ref_src=twsrc%5Etfw">May 26, 2017</a></blockquote>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">So have I! But my hair is also shiner and I&#39;ve lost weight. <a href="https://twitter.com/PyTorch?ref_src=twsrc%5Etfw">@PyTorch</a> for the win. <a href="https://t.co/qgU4oIOB4K">https://t.co/qgU4oIOB4K</a></p>&mdash; Mariya (@thinkmariya) <a href="https://twitter.com/thinkmariya/status/868181991212044288?ref_src=twsrc%5Etfw">May 26, 2017</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></content><author><name>The PyTorch Team</name></author><summary type="html">Today marks 1 year since PyTorch was released publicly. It’s been a wild ride — our quest to build a flexible deep learning research platform. Over the last year, we’ve seen an amazing community of people using, contributing to and evangelizing PyTorch — thank you for the love.</summary></entry><entry><title type="html">PyTorch Internals Part II - The Build System</title><link href="https://pytorch.org/blog/a-tour-of-pytorch-internals-2/" rel="alternate" type="text/html" title="PyTorch Internals Part II - The Build System" /><published>2017-06-27T10:00:00-07:00</published><updated>2017-06-27T10:00:00-07:00</updated><id>https://pytorch.org/blog/a-tour-of-pytorch-internals-2</id><content type="html" xml:base="https://pytorch.org/blog/a-tour-of-pytorch-internals-2/"><p>In the first <a href="/blog/a-tour-of-pytorch-internals-1/">post</a> I explained how we generate a <code class="highlighter-rouge">torch.Tensor</code> object that you can use in your Python interpreter. Next, I will explore the build system for PyTorch. The PyTorch codebase has a variety of components:</p>
<ul>
<li>The core Torch libraries: TH, THC, THNN, THCUNN</li>
<li>Vendor libraries: cuDNN, NCCL</li>
<li>Python Extension libraries</li>
<li>Additional third-party libraries: NumPy, MKL, LAPACK</li>
</ul>
<p>How does a simple invocation of <code class="highlighter-rouge">python setup.py install</code> do the work that allows you to call <code class="highlighter-rouge">import torch</code> and use the PyTorch library in your code?</p>
<p>The first part of this document will explain the build process from an end-user point of view: how we take the components above and build the library. The second part is aimed at PyTorch developers: it documents ways to improve your iteration speed by building only the subset of the code that you are working on.</p>
<h3 id="setuptools-and-pytorchs-setup--function">Setuptools and PyTorch’s setup() function</h3>
<p>Python uses <a href="https://setuptools.readthedocs.io/en/latest/index.html">Setuptools</a> to build the library. Setuptools is an extension to the original distutils system from the core Python library. The heart of a Setuptools-based build is the <code class="highlighter-rouge">setup.py</code> file, which contains all the information needed to build the project. The most important piece is the <code class="highlighter-rouge">setup()</code> function, which serves as the main entry point. Let’s take a look at the one in PyTorch:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">setup</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s">"torch"</span><span class="p">,</span> <span class="n">version</span><span class="o">=</span><span class="n">version</span><span class="p">,</span>
<span class="n">description</span><span class="o">=</span><span class="s">"Tensors and Dynamic neural networks in Python with strong GPU acceleration"</span><span class="p">,</span>
<span class="n">ext_modules</span><span class="o">=</span><span class="n">extensions</span><span class="p">,</span>
<span class="n">cmdclass</span><span class="o">=</span><span class="p">{</span>
<span class="s">'build'</span><span class="p">:</span> <span class="n">build</span><span class="p">,</span>
<span class="s">'build_py'</span><span class="p">:</span> <span class="n">build_py</span><span class="p">,</span>
<span class="s">'build_ext'</span><span class="p">:</span> <span class="n">build_ext</span><span class="p">,</span>
<span class="s">'build_deps'</span><span class="p">:</span> <span class="n">build_deps</span><span class="p">,</span>
<span class="s">'build_module'</span><span class="p">:</span> <span class="n">build_module</span><span class="p">,</span>
<span class="s">'develop'</span><span class="p">:</span> <span class="n">develop</span><span class="p">,</span>
<span class="s">'install'</span><span class="p">:</span> <span class="n">install</span><span class="p">,</span>
<span class="s">'clean'</span><span class="p">:</span> <span class="n">clean</span><span class="p">,</span>
<span class="p">},</span>
<span class="n">packages</span><span class="o">=</span><span class="n">packages</span><span class="p">,</span>
<span class="n">package_data</span><span class="o">=</span><span class="p">{</span><span class="s">'torch'</span><span class="p">:</span> <span class="p">[</span>
<span class="s">'lib/*.so*'</span><span class="p">,</span> <span class="s">'lib/*.dylib*'</span><span class="p">,</span>
<span class="s">'lib/torch_shm_manager'</span><span class="p">,</span>
<span class="s">'lib/*.h'</span><span class="p">,</span>
<span class="s">'lib/include/TH/*.h'</span><span class="p">,</span> <span class="s">'lib/include/TH/generic/*.h'</span><span class="p">,</span>
<span class="s">'lib/include/THC/*.h'</span><span class="p">,</span> <span class="s">'lib/include/THC/generic/*.h'</span><span class="p">]},</span>
<span class="n">install_requires</span><span class="o">=</span><span class="p">[</span><span class="s">'pyyaml'</span><span class="p">],</span>
<span class="p">)</span>
</code></pre></div></div>
<p>The function is composed entirely of keyword arguments, which serve two purposes:</p>
<ul>
<li>Metadata (e.g. name, description, version)</li>
<li>The contents of the package</li>
</ul>