
Hacking the codebase

Prasun Anand edited this page Sep 3, 2019 · 22 revisions

Resources

  1. https://pytorch.org/blog/a-tour-of-pytorch-internals-1/
  2. https://pytorch.org/blog/a-tour-of-pytorch-internals-2/
  3. http://blog.ezyang.com/2019/05/pytorch-internals/
  4. https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md
  5. Libtorch => https://github.com/pytorch/pytorch/blob/master/docs/libtorch.rst (Note: Method 2 is more viable here.)

Building Extension

setup.py =>

  1. CMake Link1 Link2
  2. Defines the loading of the torch._C extension Link

Torch C extension (bindings)

Initialization of torch._C: Link. Notice the method list defined here and how the methods are appended to the module: Link.

Other modules/objects from C extension:

  1. torch._C._functions
  2. torch._C._EngineBase
  3. torch._C._FunctionBase
  4. torch._C._LegacyVariableBase
  5. torch._C._CudaEventBase
  6. torch._C._CudaStreamBase
  7. torch._C.Generator
  8. "torch._C." THPStorageBaseStr // Note the "": the name is built by C string-literal concatenation with THPStorageBaseStr
  9. torch._C._PtrWrapper
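To see which of these names an installed build actually exposes, a small probe can help. This is a minimal sketch, not part of the PyTorch codebase: the helper name probe_c_extension is made up for illustration, and the name list is copied from the list above (the storage entry is left out because its name comes from a C macro). Several of these names may be absent in releases newer than the one these notes describe, so a False result is not an error.

```python
import importlib

# Names copied from the list above. Some may have been renamed or removed
# in later PyTorch releases, so "absent" is an expected outcome.
CANDIDATES = [
    "_functions",
    "_EngineBase",
    "_FunctionBase",
    "_LegacyVariableBase",
    "_CudaEventBase",
    "_CudaStreamBase",
    "Generator",
    "_PtrWrapper",
]


def probe_c_extension(names=CANDIDATES):
    """Return {name: bool} for attributes of torch._C, or {} if torch is not installed."""
    try:
        torch_c = importlib.import_module("torch._C")
    except ImportError:
        return {}
    return {name: hasattr(torch_c, name) for name in names}


if __name__ == "__main__":
    for name, present in probe_c_extension().items():
        print(f"torch._C.{name}: {'present' if present else 'absent'}")
```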

Implementation of torch.tensor

Check the implementation of torch.Tensor, i.e. __init__():

  1. Tensor: https://github.com/pytorch/pytorch/blob/e8ad167211e09b1939dcb4f462d3f03aa6a6f08a/torch/tensor.py#L20
  2. _TensorBase: note that this is an object added via PyModule_AddObject: https://github.com/pytorch/pytorch/blob/e8ad167211e09b1939dcb4f462d3f03aa6a6f08a/torch/csrc/autograd/python_variable.cpp#L588

Note: The torch.autograd.Variable class was used before PyTorch v0.4.0. The Variable class is now deprecated; torch.autograd.Variable and torch.Tensor are the same now. https://pytorch.org/blog/pytorch-0_4_0-migration-guide/
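The Variable/Tensor merge can be checked from Python. A minimal sketch, assuming PyTorch v0.4.0 or later is installed; the helper name variable_is_tensor is made up for illustration, and it returns None when torch cannot be imported.

```python
def variable_is_tensor():
    """Check that wrapping a tensor in Variable still yields a torch.Tensor.

    Returns None when PyTorch is not installed.
    """
    try:
        import torch
        from torch.autograd import Variable
    except ImportError:
        return None

    t = torch.ones(2, 2)
    # The legacy constructor is still accepted, but it hands back a Tensor.
    v = Variable(t, requires_grad=True)
    # isinstance works in both directions after the merge.
    return isinstance(v, torch.Tensor) and isinstance(t, Variable)


if __name__ == "__main__":
    print("Variable behaves as Tensor:", variable_is_tensor())
```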

Implementation of torch.tensor operators

See the section on torch._C.VariableFunctions.add and THPVariable_add in Edward's post (Resources, item 3).

In addition, take a look at https://github.com/pytorch/pytorch/tree/master/torch/csrc/autograd. In the torch/csrc/autograd directory, another folder called generated is created; it contains all the Python methods associated with torch.Tensor.

  1. Import TH/TH.h link
  2. Import ATen/ATen.h link
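One way to confirm from Python that the tensor operators come from the C extension rather than from Python source is to look at what kind of callable they are. A minimal sketch, assuming PyTorch is installed; the helper name binding_kinds is made up for illustration. On builds where add is bound in C, both checks would be expected to report True, but that depends on the installed version.

```python
import inspect


def binding_kinds():
    """Report how torch.add and torch.Tensor.add are exposed to Python.

    Returns None when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return None

    return {
        # Module-level function: C-implemented callables show up as builtins.
        "torch.add is builtin": inspect.isbuiltin(torch.add),
        # Method on the tensor type: C-implemented methods are method descriptors.
        "Tensor.add is method descriptor": inspect.ismethoddescriptor(torch.Tensor.add),
    }


if __name__ == "__main__":
    kinds = binding_kinds()
    if kinds is None:
        print("PyTorch is not installed")
    else:
        for label, value in kinds.items():
            print(f"{label}: {value}")
```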

Torch Random Number Generators

  1. https://github.com/pytorch/pytorch/blob/14ecf92d4212996937a9a1ceadd2202bd828636e/torch/csrc/Generator.cpp#L46
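From the Python side, the generator object bound in Generator.cpp is what makes seeded draws reproducible. A minimal sketch, assuming a PyTorch release that exposes torch.Generator; the helper name reproducible_draws is made up for illustration.

```python
def reproducible_draws(seed=0):
    """Draw from two generators seeded identically and compare the results.

    Returns None when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return None

    g1 = torch.Generator()
    g1.manual_seed(seed)
    g2 = torch.Generator()
    g2.manual_seed(seed)

    # Each call consumes state from the generator passed in,
    # not from the global default generator.
    a = torch.rand(3, generator=g1)
    b = torch.rand(3, generator=g2)
    return torch.equal(a, b)


if __name__ == "__main__":
    print("Same seed gives same draws:", reproducible_draws())
```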

Autograd

https://github.com/pytorch/pytorch/blob/master/docs/source/notes/autograd.rst

Module.cpp

THPModule_initNames

THPModule_initExtension => Callback for the Python part. Used for additional initialization of Python classes.

void THPAutograd_initFunctions()

What is there in copy_utils.h? Check THPInsertStorageCopyFunction

Python Types (PyTypeObject)

  1. THPDtypeType
  2. THPDeviceType
  3. THPMemoryFormatType
  4. THPLayoutType
  5. THPGeneratorType
  6. THPWrapperType
  7. THPQSchemeType
  8. THPSizeType
  9. THPFInfoType

Tools Directory

The tools directory is the most important one if you want to hack the PyTorch codebase. A lot of magic happens here, i.e. code generation.
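To get a feel for what code generation means here, the idea can be sketched as a toy generator that turns a short list of operator declarations into C binding stubs and a method table. This is not the actual generator from the tools directory: the declaration format, the templates and the generate function are all hypothetical, and only the THPVariable_ naming follows the THPVariable_add example mentioned earlier.

```python
from string import Template

# Hypothetical, heavily simplified operator declarations.
# The real declaration format used by PyTorch is richer than this.
DECLARATIONS = [
    {"name": "add", "args": ["self", "other"]},
    {"name": "mul", "args": ["self", "other"]},
    {"name": "neg", "args": ["self"]},
]

# Illustrative template for one C binding prototype per operator.
BINDING_TEMPLATE = Template(
    "// args: ${arglist}\n"
    "static PyObject* THPVariable_${name}"
    "(PyObject* self, PyObject* args, PyObject* kwargs);\n"
)

# Illustrative template for one row of the Python method table.
METHOD_ENTRY_TEMPLATE = Template(
    '  {"${name}", (PyCFunction)THPVariable_${name}, '
    "METH_VARARGS | METH_KEYWORDS, NULL},\n"
)


def generate(declarations):
    """Render binding prototypes plus a method table from the declarations."""
    stubs = "".join(
        BINDING_TEMPLATE.substitute(name=d["name"], arglist=", ".join(d["args"]))
        for d in declarations
    )
    entries = "".join(
        METHOD_ENTRY_TEMPLATE.substitute(name=d["name"]) for d in declarations
    )
    table = "static PyMethodDef variable_methods[] = {\n" + entries + "  {NULL}\n};\n"
    return stubs + "\n" + table


if __name__ == "__main__":
    print(generate(DECLARATIONS))
```

Adding an operator then means adding one declaration, and the prototype and the table entry stay in sync automatically, which is the main reason to generate such files rather than write them by hand.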