Commit
Merge pull request #27 from imohitmayank/feb_2024
Practical part added in Model Compression, and others.
imohitmayank authored Mar 3, 2024
2 parents 272d028 + 72cfb6b commit 63013e0
Showing 7 changed files with 467 additions and 12 deletions.
14 changes: 14 additions & 0 deletions docs/data_science_tools/python_snippets.md
@@ -499,6 +499,20 @@ def send_message_to_slack(message):
send_message_to_slack("test")
```

## Colab Snippets

- [Google Colab](https://colab.research.google.com/) is the go-to place for many data scientists and machine learning engineers who are looking to perform quick analysis or training for free. Below are some snippets that can be useful in Colab.

- If you are getting `NotImplementedError: A UTF-8 locale is required. Got ANSI_X3.4-1968` or a similar error when running `!pip install` or other CLI commands in Google Colab, you can fix it by running the following snippet first. Note that this might break some imports, so make sure to import all the packages you need before running it.

```python linenums="1"
import locale
locale.getpreferredencoding = lambda: "UTF-8"

# CLI commands like the one below should now work
# !pip install ...
```

<!-- ## Python Classmethod vs Staticmethod
https://stackoverflow.com/questions/12179271/meaning-of-classmethod-and-staticmethod-for-beginner -->
Binary file added docs/imgs/ml_modelcompression_quant_awq.png

Binary file added docs/imgs/ml_modelcompression_quant_awq2.png

Binary file added docs/imgs/ml_quantization_thebloke_llama.png
72 changes: 72 additions & 0 deletions docs/machine_learning/ML_snippets.md
@@ -276,6 +276,10 @@ torch.cuda.get_device_name(0)
## Output: 'GeForce MX110'
```

## Monitor GPU usage

- If you want to continuously monitor the GPU usage, you can use the `watch -n 2 nvidia-smi --id=0` command. This will refresh the `nvidia-smi` output every 2 seconds.
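
- For a quick programmatic check from inside a script, PyTorch also exposes memory counters. A minimal sketch, assuming a CUDA-enabled build and device 0:

``` python linenums="1"
import torch

# memory actively allocated by tensors vs. memory reserved by the caching allocator
print(f"{torch.cuda.memory_allocated(0) / 1024**2:.1f} MiB allocated")
print(f"{torch.cuda.memory_reserved(0) / 1024**2:.1f} MiB reserved")
```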

## HuggingFace Tokenizer

- A tokenizer is a pre-processing step that converts text into a sequence of tokens. The [HuggingFace tokenizer](https://huggingface.co/docs/transformers/main_classes/tokenizer) is a wrapper around the [tokenizers library](https://github.com/huggingface/tokenizers), which contains multiple base algorithms for fast tokenization.
@@ -309,6 +313,74 @@ vocabulary = tokenizer.get_vocab()
# vocabulary['hello'] returns 7592
```

## Explore Model

- You can use Keras's `summary` method to check the model's architecture: it shows the layers, their output shapes and the number of parameters in each layer. In PyTorch, printing the model lists its layers and their configuration (see the note after the snippets for a Keras-style summary).

=== "Keras"
``` python linenums="1"
# import
from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPooling2D, Flatten

# create a model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))

# print the model summary
model.summary()
```

=== "PyTorch"
``` python linenums="1"
# import
import torch
import torch.nn as nn
import torch.nn.functional as F

# create a model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.conv3 = nn.Conv2d(64, 64, 3, 1)
        # 64 channels * 3 * 3 spatial = 576 features after the conv stack for a 28x28 input
        self.fc1 = nn.Linear(576, 64)
        self.fc2 = nn.Linear(64, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv3(x))
        x = x.view(-1, 576)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

# create an instance of the model
model = Net()
# print the model summary
print(model)
```
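
- Note that PyTorch's built-in `print` does not show output shapes or per-layer parameter counts. For a Keras-style summary, one option is the third-party [torchinfo](https://github.com/TylerYep/torchinfo) package (assuming it is installed, e.g. via `pip install torchinfo`); a minimal sketch:

``` python linenums="1"
from torchinfo import summary

# batch of 1 grayscale 28x28 image, matching the Keras example above
summary(model, input_size=(1, 1, 28, 28))
```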

- To check the named parameters of the model and their dtypes, you can use the following code,

=== "PyTorch"
``` python linenums="1"
print(f"Total number of names params: {len(list(model.named_parameters()))}")
print("They are - ")
for name, param in model.named_parameters():
print(name, param.dtype)
```
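
- Building on the above, a minimal sketch to also count the total vs. trainable parameters of the model:

=== "PyTorch"
``` python linenums="1"
# count parameters across the whole model
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Total params: {total:,} | Trainable params: {trainable:,}")
```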
<!-- ## Tensor operations
- Tensors are the building blocks of any Deep Learning project. Here, let's go through some common tensor operations,
10 changes: 9 additions & 1 deletion docs/machine_learning/interview_questions.md
@@ -466,4 +466,12 @@

=== "Answer"
XGBoost (Extreme Gradient Boosting) is a specific implementation of the Gradient Boosting method that uses a more efficient tree-based model and a number of techniques to speed up the training process and reduce overfitting. XGBoost is commonly used in machine learning competitions and it's one of the most popular libraries used for gradient boosting. It's used for classification and regression problems.

!!! Question ""
=== "Question"
#### What is `group_size` in the context of quantization?

=== "Answer"
Group size is a parameter used in the quantization process that determines the number of weights or activations *(imagine weights in a row of a matrix)* that are quantized together, sharing one scale (and zero-point). A smaller group size can lead to better quantization accuracy, but it also increases the memory and computational requirements of the model, since more scales have to be stored. Group size is an important hyperparameter that needs to be tuned to achieve the best trade-off between accuracy and efficiency. Note that the default group size for GPTQ is 1024. Refer to [this interesting Reddit discussion](https://www.reddit.com/r/LocalLLaMA/comments/12rtg82/what_is_group_size_128_and_why_do_30b_models_give/?rdt=46348).
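
Below is a minimal sketch of group-wise quantization, assuming a simplified 4-bit asymmetric min-max scheme for illustration (not GPTQ itself; `quantize_per_group` is a hypothetical helper):

``` python linenums="1"
import torch

def quantize_per_group(w_row, group_size=128, bits=4):
    # each group of `group_size` weights shares one scale and zero-point
    qmax = 2 ** bits - 1
    out = torch.empty_like(w_row)
    for i in range(0, w_row.numel(), group_size):
        g = w_row[i:i + group_size]
        scale = ((g.max() - g.min()) / qmax).clamp(min=1e-8)
        zero = g.min()
        q = ((g - zero) / scale).round().clamp(0, qmax)  # integer levels 0..15
        out[i:i + group_size] = q * scale + zero         # dequantized values
    return out

w = torch.randn(4096)  # one row of a weight matrix
print((w - quantize_per_group(w)).abs().max())  # max quantization error
```

A smaller `group_size` means each scale fits its group more tightly (lower error) at the cost of storing more scales.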
