Files used in PyDeepFlow
This file defines common activation functions and their derivatives for neural networks. It uses a device abstraction to support operations on both CPU (`numpy`) and GPU (`cupy`).
- `activation(x, func, device)`
  - Applies the specified activation function to the input `x`.
  - Supported activations:
    - ReLU (`relu`)
    - Leaky ReLU (`leaky_relu`)
    - Sigmoid (`sigmoid`)
    - Tanh (`tanh`)
    - Softmax (`softmax`)
  - Parameters:
    - `x`: Input data.
    - `func`: The activation function name.
    - `device`: The device (CPU/GPU) where calculations are performed.
  - Returns: The activated output.
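To make the supported names concrete, here is a minimal `numpy` sketch of what a few of these activations compute. These are reference formulas only; the library routes the same operations through its `Device` abstraction.

```python
import numpy as np

def relu_ref(x):
    # relu: max(0, x), elementwise
    return np.maximum(0, x)

def sigmoid_ref(x):
    # sigmoid: 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def softmax_ref(x):
    # softmax over the last axis, shifted by the max for numerical stability
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

print(relu_ref(np.array([-1.0, 2.0])))      # [0. 2.]
print(sigmoid_ref(np.array([0.0])))         # [0.5]
print(softmax_ref(np.array([1.0, 1.0])))    # [0.5 0.5]
```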
- `activation_derivative(x, func, device)`
  - Computes the derivative of the specified activation function.
  - Parameters:
    - `x`: Input data (the activated output from the layer).
    - `func`: The activation function name.
    - `device`: The device (CPU/GPU) where calculations are performed.
  - Returns: The derivative of the activation.
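A usage sketch of the two functions together. The import paths below are assumptions about the package layout, not taken from this page.

```python
import numpy as np
from pydeepflow.device import Device                                   # assumed import path
from pydeepflow.activations import activation, activation_derivative   # assumed import path

device = Device(use_gpu=False)                    # numpy-backed device
z = device.array(np.array([[0.3, -1.2, 2.0]]))

a = activation(z, 'sigmoid', device)              # forward pass: sigmoid(z)
da = activation_derivative(a, 'sigmoid', device)  # derivative taken on the activated output
```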
This file contains a `Device` class that abstracts the operations needed to support both CPU (`numpy`) and GPU (`cupy`) computations.
- `Device`
  - A device abstraction for handling CPU/GPU operations seamlessly.
  - Methods:
    - `__init__(use_gpu=False)`: Initializes the device based on GPU availability.
    - `array(data)`: Converts input data to a numpy or cupy array.
    - `zeros(shape)`: Creates an array of zeros on the selected device.
    - `random()`: Returns the random module for the selected device.
    - `exp(x)`, `dot(a, b)`, `maximum(a, b)`, `tanh(x)`: Perform operations on the selected device.
    - `where(condition, x, y)`, `sum(x, axis=None, keepdims=False)`, `log(x)`, `max(x, axis=None, keepdims=False)`: Common matrix operations.
    - `asnumpy(x)`: Converts a cupy array back to a numpy array (used for GPU operations).
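The following is a minimal sketch of how such a CPU/GPU backend switcher can be structured. It mirrors a few of the methods listed above but is not the library's exact implementation.

```python
import numpy as np

class DeviceSketch:
    """Minimal CPU/GPU array-backend switcher (illustrative only)."""

    def __init__(self, use_gpu=False):
        if use_gpu:
            import cupy as cp   # imported only when GPU mode is requested
            self.xp = cp
        else:
            self.xp = np

    def array(self, data):
        return self.xp.asarray(data)

    def zeros(self, shape):
        return self.xp.zeros(shape)

    def dot(self, a, b):
        return self.xp.dot(a, b)

    def asnumpy(self, x):
        # cupy arrays provide .get() to copy data back to host memory
        return x.get() if self.xp is not np else x
```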
This file defines a learning rate scheduler for adjusting the learning rate during training.
- `LearningRateScheduler`
  - Implements learning rate scheduling strategies.
  - Constructor: `__init__(initial_lr, strategy="decay", decay_rate=0.1, cycle_length=10, min_lr=1e-6)`
  - Parameters:
    - `initial_lr`: Initial learning rate.
    - `strategy`: Learning rate strategy (`decay` or `cyclic`).
    - `decay_rate`: Exponential decay rate for the learning rate.
    - `cycle_length`: For the cyclic learning rate, defines the number of epochs per cycle.
    - `min_lr`: Minimum learning rate.
  - Methods:
    - `get_lr(epoch)`: Returns the learning rate for the current epoch based on the selected strategy.
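As a rough illustration of the two strategies, here is one plausible implementation of `get_lr(epoch)`. The exact decay and cyclic formulas used by the library are assumptions here.

```python
import numpy as np

def get_lr_sketch(epoch, initial_lr=0.01, strategy="decay",
                  decay_rate=0.1, cycle_length=10, min_lr=1e-6):
    if strategy == "decay":
        # exponential decay, never dropping below min_lr
        lr = initial_lr * np.exp(-decay_rate * epoch)
    else:
        # "cyclic": cosine oscillation between min_lr and initial_lr over each cycle
        pos = (epoch % cycle_length) / cycle_length
        lr = min_lr + (initial_lr - min_lr) * 0.5 * (1.0 + np.cos(2.0 * np.pi * pos))
    return max(lr, min_lr)

print([round(get_lr_sketch(e), 5) for e in range(3)])
```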
This file contains various loss functions and their derivatives, implemented to support both CPU and GPU computations using the `Device` abstraction.

- `binary_crossentropy(y_true, y_pred, device)`: Computes binary crossentropy loss.
- `binary_crossentropy_derivative(y_true, y_pred, device)`: Computes the derivative of binary crossentropy.
- `mse(y_true, y_pred, device)`: Computes Mean Squared Error (MSE) loss.
- `mse_derivative(y_true, y_pred, device)`: Computes the derivative of MSE loss.
- `categorical_crossentropy(y_true, y_pred, device)`: Computes categorical crossentropy loss for multi-class classification.
- `categorical_crossentropy_derivative(y_true, y_pred, device)`: Computes the derivative of categorical crossentropy.
- `hinge_loss(y_true, y_pred, device)`: Computes hinge loss (used in SVMs).
- `hinge_loss_derivative(y_true, y_pred, device)`: Computes the derivative of hinge loss.
- `huber_loss(y_true, y_pred, device, delta=1.0)`: Computes Huber loss, a loss function that is robust to outliers.
- `huber_loss_derivative(y_true, y_pred, device, delta=1.0)`: Computes the derivative of Huber loss.
- `get_loss_function(loss_name)`: Returns the loss function by name.
- `get_loss_derivative(loss_name)`: Returns the derivative of the loss function by name.
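For reference, the binary crossentropy formula in plain `numpy`. The library versions perform the same arithmetic through the `Device` abstraction and may handle clipping differently.

```python
import numpy as np

def binary_crossentropy_ref(y_true, y_pred, eps=1e-8):
    # clip predictions away from 0 and 1 so log() stays finite
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

print(binary_crossentropy_ref(np.array([1.0, 0.0, 1.0]),
                              np.array([0.9, 0.1, 0.8])))  # ~0.145
```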
This file defines a multi-layer artificial neural network (ANN) with support for binary and multi-class classification, GPU/CPU computation, and features like L2 regularization and dropout.

- `Multi_Layer_ANN`
  - Implements a feed-forward artificial neural network.
  - Constructor: `__init__(X_train, Y_train, hidden_layers, activations, loss='categorical_crossentropy', use_gpu=False, l2_lambda=0.0, dropout_rate=0.0)`
    - Initializes the network architecture, device, loss functions, and regularization.
  - Attributes:
    - `layers`: The architecture of the network.
    - `activations`: List of activation functions used in hidden layers.
    - `weights`: List of weight matrices for each layer.
    - `biases`: List of bias vectors for each layer.
    - `loss_func`: The callable loss function.
    - `loss_derivative`: The derivative of the loss function.
    - `X_train`, `y_train`: Training data (moved to GPU/CPU).
  - Methods:
    - `forward_propagation(X)`: Performs forward propagation.
    - `backpropagation(X, y, activations, Z_values, learning_rate)`: Performs backpropagation for weight updates.
    - `fit(epochs, learning_rate=0.01, lr_scheduler=None)`: Trains the model for a specified number of epochs.
    - `predict(X)`: Makes predictions based on input data.
    - `predict_prob(X)`: Predicts class probabilities for the input data.
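A hypothetical end-to-end usage sketch based on the constructor and methods listed above; the import path and the toy data shapes are assumptions.

```python
import numpy as np
from pydeepflow.model import Multi_Layer_ANN   # assumed import path

# toy multi-class data: 200 samples, 4 features, 3 one-hot encoded classes
X_train = np.random.rand(200, 4)
Y_train = np.eye(3)[np.random.randint(0, 3, size=200)]

model = Multi_Layer_ANN(X_train, Y_train,
                        hidden_layers=[16, 8],
                        activations=['relu', 'relu'],
                        loss='categorical_crossentropy',
                        l2_lambda=0.01,
                        dropout_rate=0.2)

model.fit(epochs=100, learning_rate=0.01)
probs = model.predict_prob(X_train)   # class probabilities
preds = model.predict(X_train)        # hard predictions
```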
This module handles regularization techniques for the artificial neural network (ANN) implementation. Regularization helps prevent overfitting and improves the model's generalization performance by applying penalties on model complexity and reducing the risk of over-relying on specific weights.
This class provides methods for applying L2 regularization and Dropout, two common techniques for regularizing a neural network.
The constructor for the `Regularization` class.
Parameters:
- `l2_lambda` (float): Coefficient for L2 regularization. Default is `0.0`, meaning no L2 regularization is applied if not specified.
- `dropout_rate` (float): The probability of dropping out units during training to prevent overfitting. Default is `0.0`, meaning no dropout is applied if not specified.
`apply_l2_regularization(weights, learning_rate, batch_size)`

Applies L2 regularization to the model's weights during training. L2 regularization penalizes large weights by adding the sum of their squared values to the cost function.
Parameters:
- `weights` (np.ndarray or cp.ndarray): The weight matrix to which L2 regularization should be applied.
- `learning_rate` (float): The learning rate for training. It determines the strength of the update step.
- `batch_size` (int): The size of the mini-batch in training, which is used to normalize the regularization penalty.

Returns:
- `np.ndarray` or `cp.ndarray`: The adjusted weight matrix after applying L2 regularization.
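One plausible reading of this update is mini-batch-normalized weight decay; the exact scaling used by the library is an assumption.

```python
import numpy as np

def apply_l2_sketch(weights, learning_rate, batch_size, l2_lambda=0.01):
    # shrink each weight in proportion to its magnitude,
    # with the penalty normalized by the mini-batch size
    return weights - learning_rate * (l2_lambda / batch_size) * weights

W = np.random.randn(4, 3)
W = apply_l2_sketch(W, learning_rate=0.01, batch_size=32)
```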
`apply_dropout(layer_output, training)`

Applies Dropout regularization to a given layer during training. Dropout is a technique where a fraction of the neurons in the layer are randomly dropped (set to zero) to prevent overfitting.
Parameters:
- `layer_output` (np.ndarray or cp.ndarray): The output of a layer to which dropout should be applied.
- `training` (bool): A flag indicating whether the model is in training mode. Dropout is only applied during training and is bypassed during inference (prediction).

Returns:
- `np.ndarray` or `cp.ndarray`: The layer output after applying dropout. During training, a fraction of the outputs will be zeroed; if not in training mode, the original output is returned unchanged.
Example usage:

```python
from .regularization import Regularization

# Initialize regularization with L2 lambda of 0.01 and a dropout rate of 0.2
regularizer = Regularization(l2_lambda=0.01, dropout_rate=0.2)

# Apply L2 regularization to weights
updated_weights = regularizer.apply_l2_regularization(weights, learning_rate=0.01, batch_size=32)

# Apply dropout during training to layer output
dropout_output = regularizer.apply_dropout(layer_output, training=True)
```
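For context, here is a sketch of what dropout does internally; whether the library rescales the surviving activations by `1 / (1 - dropout_rate)` (inverted dropout) is an assumption.

```python
import numpy as np

def apply_dropout_sketch(layer_output, training, dropout_rate=0.2):
    if not training or dropout_rate == 0.0:
        return layer_output                      # bypassed at inference time
    # random binary mask: each unit survives with probability (1 - dropout_rate)
    mask = np.random.rand(*layer_output.shape) > dropout_rate
    return layer_output * mask / (1.0 - dropout_rate)
```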