From a1edf8c0331b1d00071ba2caf96687f99ce7e6c4 Mon Sep 17 00:00:00 2001
From: Kye
Date: Wed, 27 Sep 2023 14:43:45 -0400
Subject: [PATCH] clean up

---
 docs/zeta/nn/embeddings/xpos.md | 183 ++++++++++++++++++++++++++++++++
 mkdocs.yml                      |   1 +
 2 files changed, 184 insertions(+)
 create mode 100644 docs/zeta/nn/embeddings/xpos.md

diff --git a/docs/zeta/nn/embeddings/xpos.md b/docs/zeta/nn/embeddings/xpos.md
new file mode 100644
index 00000000..5cb7bcb7
--- /dev/null
+++ b/docs/zeta/nn/embeddings/xpos.md
@@ -0,0 +1,183 @@
+# `XPOS` Documentation
+
+## Table of Contents
+1. [Introduction](#1-introduction)
+2. [Purpose and Functionality](#2-purpose-and-functionality)
+3. [Class: `XPOS`](#3-class-xpos)
+   - [Initialization](#initialization)
+   - [Parameters](#parameters)
+   - [Forward Method](#forward-method)
+4. [Functions](#4-functions)
+   - [`fixed_pos_embedding`](#fixed_pos_embedding)
+   - [`rotate_every_two`](#rotate_every_two)
+   - [`duplicate_interleave`](#duplicate_interleave)
+   - [`apply_rotary_pos_emb`](#apply_rotary_pos_emb)
+5. [Usage Examples](#5-usage-examples)
+   - [Using the `XPOS` Class](#using-the-xpos-class)
+   - [Using the Functions](#using-the-functions)
+6. [Additional Information](#6-additional-information)
+   - [Positional Embeddings in Transformers](#positional-embeddings-in-transformers)
+7. [References](#7-references)
+
+---
+
+## 1. Introduction
+
+Welcome to the Zeta documentation for the `XPOS` class and related functions! Zeta is a deep learning library for PyTorch, and this page provides a comprehensive reference for the `XPOS` class and its associated helper functions.
+
+---
+
+## 2. Purpose and Functionality
+
+The `XPOS` class and its related functions generate and apply rotary positional embeddings to input tensors. These embeddings are crucial for sequence-to-sequence models, particularly transformer architectures, because they encode where each element sits in the sequence. The sections below describe each component in detail.
+
+---
+
+## 3. Class: `XPOS`
+
+The `XPOS` class applies rotary positional embeddings to input tensors, giving a transformer access to the positional information of the elements in a sequence.
+
+### Initialization
+
+To create an instance of the `XPOS` class, specify the following parameters:
+
+```python
+XPOS(
+    head_dim: int = None,
+    scale_base: int = 512
+)
+```
+
+### Parameters
+
+- `head_dim` (int, optional): The dimensionality of the positional embeddings. Defaults to `None`, in which case the dimension is inferred from the input tensor; setting it explicitly is recommended for consistency.
+
+- `scale_base` (int, optional): The base value used to scale the positional embeddings. Default is `512`.
+
+### Forward Method
+
+The `forward` method of the `XPOS` class applies rotary positional embeddings to the input tensor. It can be called as follows:
+
+```python
+output = xpos(input_tensor, offset=0, downscale=False)
+```
+
+- `input_tensor` (Tensor): The input tensor to which positional embeddings will be applied.
+
+- `offset` (int, optional): An offset applied to the positions. Default is `0`.
+
+- `downscale` (bool, optional): If `True`, the positional embeddings are downscaled. Default is `False`.
+
+---
+
+## 4. Functions
+
+In addition to the `XPOS` class, several standalone functions are provided for working with rotary positional embeddings.
+
+### `fixed_pos_embedding`
+
+This function generates fixed sine and cosine positional embeddings; the sequence length and dimension of the input tensor determine the size of the returned tables.
+
+```python
+sin, cos = fixed_pos_embedding(x)
+```
+
+- `x` (Tensor): Input tensor of shape `(seq_len, dim)`.
+
+### `rotate_every_two`
+
+This function rearranges the elements of the input tensor by rotating every pair of adjacent elements along the last dimension.
+
+```python
+output_tensor = rotate_every_two(input_tensor)
+```
+
+- `input_tensor` (Tensor): Input tensor of shape `(batch_size, seq_len, dim)`.
+
+### `duplicate_interleave`
+
+This function duplicates a matrix while interleaving the copy with the original.
+
+```python
+duplicated_matrix = duplicate_interleave(matrix)
+```
+
+- `matrix` (Tensor): Input matrix.
+
+### `apply_rotary_pos_emb`
+
+This function applies rotary positional embeddings to the input tensor.
+
+```python
+output_tensor = apply_rotary_pos_emb(input_tensor, sin, cos, scale=1)
+```
+
+- `input_tensor` (Tensor): Input tensor of shape `(batch_size, seq_len, dim)`.
+- `sin` (Tensor): Sine positional embeddings of shape `(seq_len, dim)`.
+- `cos` (Tensor): Cosine positional embeddings of shape `(seq_len, dim)`.
+- `scale` (float, optional): Scaling factor for the positional embeddings. Default is `1`.
+
+---
+
+## 5. Usage Examples
+
+The following examples show how to use the `XPOS` class and the helper functions effectively.
+
+### Using the `XPOS` Class
+
+```python
+from zeta import XPOS
+import torch
+
+# Create an XPOS instance
+xpos = XPOS(head_dim=256, scale_base=512)
+
+# Apply positional embeddings to an input tensor
+input_tensor = torch.rand(16, 32, 256)  # Example input tensor
+output = xpos(input_tensor, offset=0, downscale=False)
+```
+
+### Using the Functions
+
+```python
+from zeta import fixed_pos_embedding, rotate_every_two, duplicate_interleave, apply_rotary_pos_emb
+import torch
+
+# Generate fixed positional embeddings
+input_tensor = torch.rand(32, 512)  # Example input tensor
+sin, cos = fixed_pos_embedding(input_tensor)
+
+# Rotate every two elements in a tensor
+input_tensor = torch.rand(16, 64, 256)  # Example input tensor
+output_tensor = rotate_every_two(input_tensor)
+
+# Duplicate and interleave a matrix
+input_matrix = torch.rand(8, 8)  # Example input matrix
+duplicated_matrix = duplicate_interleave(input_matrix)
+
+# Apply rotary positional embeddings
+input_tensor = torch.rand(16, 32, 256)  # Example input tensor
+output_tensor = apply_rotary_pos_emb(input_tensor, sin, cos, scale=1)
+```
+
+When chaining these functions yourself, make sure the `sin` and `cos` tables are generated with a shape that is compatible with the tensor they are applied to; the appendix at the end of this page walks through a shape-consistent sketch.
+
+---
+
+## 6. Additional Information
+
+### Positional Embeddings in Transformers
+
+Positional embeddings play a crucial role in transformers and other sequence-to-sequence models. They let the model recognize the order of elements in a sequence, which is essential for tasks such as natural language processing, machine translation, and text generation.
+
+---
+
+## 7. References
+
+This documentation has covered the `XPOS` class and related functions in the Zeta library, explaining their purpose, functionality, parameters, and usage, so that you can integrate these components into transformer-based architectures for sequence-based tasks.
+
+For further information on the underlying concepts and principles of positional embeddings in transformers, you may refer to the original paper:
+
+- [Attention Is All You Need (Transformer)](https://arxiv.org/abs/1706.03762)
+
+Please consult the official PyTorch documentation for any PyTorch-specific details: [PyTorch Documentation](https://pytorch.org/docs/stable/index.html).
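+
+---
+
+## 8. Appendix: Putting the Functions Together
+
+For readers who want to see how the four helper functions compose into a rotary positional embedding, here is a minimal, self-contained sketch written for this guide. It follows the signatures documented above but is **not** the Zeta library's actual source; the names match, yet the shapes and implementation details of the real functions may differ.
+
+```python
+import torch
+
+
+def fixed_pos_embedding(x):
+    """Build sine/cosine tables from the shape of `x` (seq_len, dim)."""
+    seq_len, dim = x.shape
+    inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, dtype=torch.float32) / dim))
+    angles = torch.einsum("i,j->ij", torch.arange(seq_len, dtype=torch.float32), inv_freq)
+    return torch.sin(angles), torch.cos(angles)
+
+
+def rotate_every_two(x):
+    """Map (x0, x1, x2, x3, ...) to (-x1, x0, -x3, x2, ...) along the last dimension."""
+    x_even = x[..., ::2]
+    x_odd = x[..., 1::2]
+    return torch.stack((-x_odd, x_even), dim=-1).flatten(-2)
+
+
+def duplicate_interleave(m):
+    """Duplicate each column of a 2D matrix and interleave the copies: a b -> a a b b."""
+    rows = m.shape[0]
+    return m.reshape(-1, 1).repeat(1, 2).reshape(rows, -1)
+
+
+def apply_rotary_pos_emb(x, sin, cos, scale=1):
+    """Rotate `x` with (scaled) sine/cosine tables sized to half of x's last dimension."""
+    sin = duplicate_interleave(sin * scale)
+    cos = duplicate_interleave(cos * scale)
+    return (x * cos) + (rotate_every_two(x) * sin)
+
+
+# Toy usage: rotate a (batch_size, seq_len, head_dim) tensor.
+seq_len, head_dim = 32, 256
+x = torch.rand(16, seq_len, head_dim)
+sin, cos = fixed_pos_embedding(torch.zeros(seq_len, head_dim // 2))
+out = apply_rotary_pos_emb(x, sin, cos, scale=1)
+print(out.shape)  # torch.Size([16, 32, 256])
+```
+
+Note that in this sketch the sine/cosine tables carry half the head dimension, because `duplicate_interleave` doubles them before they are combined with the input; this mirrors the element pairing performed by `rotate_every_two`.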
\ No newline at end of file diff --git a/mkdocs.yml b/mkdocs.yml index 6737d46f..c52508f2 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -105,6 +105,7 @@ nav: - RotaryEmbeddings: "zeta/nn/embeddings/rope.md" - TruncatedRotaryEmbedding: "zeta/nn/embeddings/truncated_rope.md" - PositionalEmbedding: "zeta/nn/embeddings/positional_embeddings.md" + - XPOS: "zeta/nn/embeddings/xpos.md" - zeta.nn.modules: - Lora: "zeta/nn/modules/lora.md" - TokenLearner: "zeta/nn/modules/token_learner.md"