Hyperparameters
Ravin D edited this page Oct 25, 2024 · 2 revisions
Configuring hyperparameters allows you to fine-tune the model’s performance. Below is a list of the important hyperparameters and their significance.
- Description: Specifies the number of neurons in each hidden layer.
- Example: hidden_layers=[64, 32]  # Two hidden layers with 64 and 32 neurons, respectively.
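To make the effect of the layer list concrete, the sketch below derives the weight-matrix shape of each layer from the layer sizes. It assumes plain fully connected layers; `layer_shapes` is a hypothetical helper written for illustration and is not part of pydeepflow.

```python
def layer_shapes(n_inputs, hidden_layers, n_outputs):
    """Return the (fan_in, fan_out) weight shape of every dense layer.

    Hypothetical helper for illustration only; assumes fully connected layers.
    """
    sizes = [n_inputs, *hidden_layers, n_outputs]
    # Pair each layer size with the size of the layer that follows it.
    return list(zip(sizes[:-1], sizes[1:]))


# 20 input features, two hidden layers, one output unit.
shapes = layer_shapes(n_inputs=20, hidden_layers=[64, 32], n_outputs=1)
print(shapes)  # [(20, 64), (64, 32), (32, 1)]
```

Each extra entry in the list adds one more weight matrix, so wider or deeper settings increase both capacity and training cost.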
- Supported Activations: relu, sigmoid, tanh, leaky_relu, softmax.
- Example: activations=['relu', 'sigmoid']
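For reference, the listed activations can be written in a few lines of NumPy. This is a sketch of the standard textbook definitions, not pydeepflow's own code; in particular the `alpha=0.01` slope for leaky ReLU is an assumed default that the library may not share.

```python
import numpy as np


def relu(x):
    # Zero out negative inputs, pass positive inputs through.
    return np.maximum(0.0, x)


def leaky_relu(x, alpha=0.01):
    # alpha is an assumed slope for negative inputs.
    return np.where(x > 0, x, alpha * x)


def sigmoid(x):
    # Squash inputs into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    # Subtract the row maximum first for numerical stability.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)


# tanh is available directly as np.tanh.
```

Softmax turns each row into a probability distribution, which is why it is normally reserved for the output layer of a multi-class classifier.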
- Supported Losses: binary_crossentropy, categorical_crossentropy, mse, hinge, huber.
- Example: loss='categorical_crossentropy'
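As a rough guide to what three of these losses compute, here is a NumPy sketch of the usual definitions. The function bodies and the `eps` clipping constant are assumptions for illustration; pydeepflow's implementations may differ in detail.

```python
import numpy as np


def mse(y_true, y_pred):
    # Mean of squared differences; typical for regression.
    return float(np.mean((y_true - y_pred) ** 2))


def binary_crossentropy(y_true, y_pred, eps=1e-12):
    # Clip predictions so that log(0) never occurs (eps is an assumed value).
    p = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))


def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    # y_true is one-hot encoded; y_pred holds per-class probabilities.
    p = np.clip(y_pred, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(p), axis=-1)))
```

A common pairing is binary_crossentropy with a sigmoid output for two-class problems, categorical_crossentropy with a softmax output for multi-class problems, and mse for regression.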
- Description: Controls the step size of each weight update during training.
- Example: learning_rate=0.001
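The role of the step size is easiest to see on a one-dimensional problem. The sketch below runs plain gradient descent on f(w) = (w - 3)^2; `gradient_descent` and the toy objective are illustrative assumptions and say nothing about the optimizer pydeepflow uses internally.

```python
def gradient_descent(grad_fn, w0, learning_rate, steps):
    """Apply the update w <- w - learning_rate * grad(w) a fixed number of times."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad_fn(w)
    return w


def grad(w):
    # Gradient of f(w) = (w - 3)^2, whose minimum is at w = 3.
    return 2.0 * (w - 3.0)


w_fast = gradient_descent(grad, w0=0.0, learning_rate=0.1, steps=100)
w_slow = gradient_descent(grad, w0=0.0, learning_rate=0.001, steps=100)
print(w_fast, w_slow)  # w_fast ends very close to 3; w_slow is still far from it
```

With the same number of steps, the smaller learning rate makes much less progress, while a rate that is too large (above 1.0 for this objective) would diverge.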
- Description: Number of complete passes over the training dataset.
- Example: epochs=50
- Description: Adjusts the learning rate during training according to a schedule, such as step decay or exponential decay.
- Example: scheduler = LearningRateScheduler(lr_type='exponential_decay', initial_lr=0.001, decay_rate=0.9)
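The two schedules named above are commonly defined as shown in this sketch. The formulas are the usual textbook forms and are an assumption here: the exact rule behind `LearningRateScheduler` may differ, and `drop_factor` and `epochs_per_drop` are hypothetical parameter names.

```python
def exponential_decay(initial_lr, decay_rate, epoch):
    # Multiply the learning rate by decay_rate once per epoch.
    return initial_lr * decay_rate ** epoch


def step_decay(initial_lr, drop_factor, epochs_per_drop, epoch):
    # Hold the rate constant, then scale it by drop_factor every epochs_per_drop epochs.
    return initial_lr * drop_factor ** (epoch // epochs_per_drop)


for epoch in (0, 1, 2, 10):
    print(epoch, exponential_decay(initial_lr=0.001, decay_rate=0.9, epoch=epoch))
```

Under this form, `decay_rate=0.9` shrinks the rate by 10% each epoch, so after 10 epochs it is roughly a third of its initial value.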
- Description: Adds a penalty proportional to the sum of the squared weights to the loss, which discourages large weights and helps prevent overfitting.
- Example: l2_lambda=0.01  # Strength of the regularization.
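A minimal sketch of the L2 penalty and its gradient follows. It assumes the convention penalty = l2_lambda * sum(W^2); some libraries use l2_lambda / 2 instead, and it is not confirmed here which one pydeepflow uses. Both function names are hypothetical.

```python
import numpy as np


def l2_penalty(weights, l2_lambda):
    # Sum of squared entries over every weight matrix, scaled by l2_lambda.
    return l2_lambda * sum(float(np.sum(W ** 2)) for W in weights)


def l2_gradient(W, l2_lambda):
    # Derivative of l2_lambda * sum(W ** 2) with respect to W.
    return 2.0 * l2_lambda * W


weights = [np.array([[1.0, 2.0]]), np.array([[3.0]])]
print(l2_penalty(weights, l2_lambda=0.01))  # about 0.14
```

Because the gradient is proportional to the weight itself, each update pulls every weight slightly toward zero, which is why this technique is also known as weight decay.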
- Description: Randomly sets a fraction of neuron activations to zero during training to prevent overfitting.
- Example: dropout_rate=0.5  # Dropout rate of 50%.
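The following sketch shows "inverted" dropout, a widely used variant in which surviving activations are rescaled so that their expected value is unchanged. Whether pydeepflow rescales in this way is an assumption; the `dropout` function is illustrative only.

```python
import numpy as np


def dropout(activations, dropout_rate, rng, training=True):
    """Inverted dropout: zero a random fraction of units and rescale the rest."""
    if not training or dropout_rate == 0.0:
        # Dropout is disabled at inference time.
        return activations
    keep_prob = 1.0 - dropout_rate
    # Each unit is kept independently with probability keep_prob.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
hidden = np.ones((4, 6))
print(dropout(hidden, dropout_rate=0.5, rng=rng))
```

With `dropout_rate=0.5`, roughly half of the entries become zero and the rest are doubled, so the layer's average output stays about the same between training and inference.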
- Description: Normalizes layer inputs to stabilize training and allow higher learning rates.
- Example: use_batch_norm=True  # Enables batch normalization for all hidden layers.
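The normalization step can be sketched as below for the training-time forward pass. This is the standard formulation with a learnable scale `gamma` and shift `beta`; the names, the `eps` value, and the omission of running statistics for inference are simplifying assumptions rather than a description of pydeepflow's internals.

```python
import numpy as np


def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then apply scale and shift."""
    mean = x.mean(axis=0)  # Per-feature mean over the batch
    var = x.var(axis=0)    # Per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta


rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))
out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0), out.std(axis=0))  # means near 0, standard deviations near 1
```

Keeping each layer's inputs on a similar scale is what makes training less sensitive to the learning rate and to weight initialization.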
- Description: Clips gradients during backpropagation to prevent exploding gradients.
- Example: clipping_threshold=1.0  # Maximum gradient norm.
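Clipping by norm, which the "maximum gradient norm" comment suggests, can be sketched as follows. Whether pydeepflow clips each gradient array separately, clips a global norm, or clips element-wise is not confirmed here; `clip_by_norm` is a hypothetical helper showing the per-array variant.

```python
import numpy as np


def clip_by_norm(grad, clipping_threshold):
    """Rescale grad so that its L2 norm does not exceed clipping_threshold."""
    norm = float(np.linalg.norm(grad))
    if norm > clipping_threshold:
        # Shrink the gradient but keep its direction.
        return grad * (clipping_threshold / norm)
    return grad


g = np.array([3.0, 4.0])  # L2 norm is 5
print(clip_by_norm(g, clipping_threshold=1.0))  # about [0.6, 0.8]
```

Because only the magnitude changes, the update still points in the same direction; it just cannot take an arbitrarily large step.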
To train a model with these configurations, here’s a quick example setup:
import numpy as np

from pydeepflow.model import Multi_Layer_ANN
from pydeepflow.schedulers import LearningRateScheduler

# Sample data
X_train = np.random.rand(100, 20)  # 100 samples with 20 features each
Y_train = np.random.randint(0, 2, size=(100, 1))  # Binary labels (0 or 1)

# Initialize the model
model = Multi_Layer_ANN(
    X_train=X_train,
    Y_train=Y_train,
    hidden_layers=[64, 32],
    activations=['relu', 'sigmoid'],
    loss='binary_crossentropy',
    use_gpu=False,
    l2_lambda=0.01,
    dropout_rate=0.5,
    use_batch_norm=True
)

# Learning rate scheduler
scheduler = LearningRateScheduler(lr_type='exponential_decay', initial_lr=0.001, decay_rate=0.9)

# Train the model
model.fit(
    epochs=50,
    learning_rate=0.001,
    lr_scheduler=scheduler,
    clipping_threshold=1.0,
    verbose=True
)