A custom genetic algorithm is used to optimize the architecture of a neural network such that it can accurately calculate an arbitrary function for you.
This started as a submission for the very first /r/ProgrammerHumor hackathon, Over Engineered. The premise: in a future where humankind has forgotten the fundamentals of mathematics, artificial intelligence will be used to calculate it for you. Unfortunately, I forgot to submit it before going to Burning Man and missed the deadline.
pip install -r requirements.txt
- Numpy
- Matplotlib
- TensorFlow (version > 2.0)
- open ipython
run genetic_network.py
usage: genetic_network.py [-h] [-e EPOCHS] [-p POPULATION] [-g GENERATIONS]

optional arguments:
  -h, --help            show this help message and exit
  -e EPOCHS, --epochs EPOCHS
                        Number of training epochs
  -p POPULATION         Initial population size
  -g GENERATIONS        Number of generations
... let it train and breed; the best network will be saved to an HDF5 file (via h5py)
run evaluate.py --angle 40
Simulated data is used to train the neural network. Custom data can be created using the function below:
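import numpy as np

def create_data(func, NUM=10000):
    # Draw NUM distinct points from an evenly spaced grid on [0, 2*pi]
    X = np.random.choice(np.linspace(0, 2*np.pi, NUM+1000), NUM, replace=False)
    y = func(X)
    # Return inputs and targets as column vectors
    return X.reshape(-1, 1), y.reshape(-1, 1)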
Call the function like so: create_data(np.cos, 10000)
If you want to use a custom function, feel free, but remember to change the range over which the function is evaluated. In the example above the data is evaluated between 0 and 2 pi, at 10000 randomly chosen points in between.
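One way to avoid editing the hard-coded range is to pass it in. The sketch below is a hypothetical variant: the name `create_data_range` and the `lo`, `hi` and `num` parameters are assumptions for illustration and are not part of the original code. It samples points the same way `create_data` does.

```python
import numpy as np

def create_data_range(func, lo=0.0, hi=2 * np.pi, num=10000):
    """Hypothetical variant of create_data with an adjustable range [lo, hi]."""
    # Build a slightly larger grid and draw `num` distinct points from it.
    grid = np.linspace(lo, hi, num + 1000)
    X = np.random.choice(grid, num, replace=False)
    y = func(X)
    # Column vectors: one feature in, one target out.
    return X.reshape(-1, 1), y.reshape(-1, 1)

# Example: sample exp(-x**2) on [-3, 3] instead of a trig function on [0, 2*pi].
X, y = create_data_range(lambda x: np.exp(-x**2), lo=-3.0, hi=3.0, num=2000)
```

Keeping the defaults at 0 and 2 pi means the call `create_data_range(np.cos)` behaves like the original example.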
TensorFlow is used to create a deep neural network that is eventually trained to compute a trig function. The class individual has properties that pertain to building a parameterized machine learning model, creating random architectures, and breeding/swapping traits between models for the genetic algorithm optimization. The neural network architecture is parameterized using a cosine function:
layer_func = lambda x, A,w: (A*np.cos(w*np.linspace(0,np.pi/2,x))).astype(int)
where x is the number of layers, A is the number of neurons in the first layer (all subsequent layers are smaller), and w controls how quickly the number of neurons shrinks from layer to layer.
[Figure: 50 random samples from an initial network population]
Each network has an input size of 1 corresponding to an angle and an output size of 1 corresponding to the respective trig function evaluation for that angle.
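As a rough worked example of how the cosine parameterization above maps to an architecture, the sketch below evaluates it for one illustrative choice of (x, A, w) and counts the weights and biases of the resulting fully connected stack with one input and one output. The helper names `layer_sizes` and `dense_param_count` and the value w = 0.7 are assumptions made for this example.

```python
import numpy as np

# Same parameterization as layer_func, written as a named function.
def layer_sizes(x, A, w):
    return (A * np.cos(w * np.linspace(0, np.pi / 2, x))).astype(int)

def dense_param_count(sizes, n_in=1, n_out=1):
    """Weights plus biases for a fully connected stack n_in -> sizes -> n_out."""
    widths = [n_in, *sizes, n_out]
    return sum(a * b + b for a, b in zip(widths[:-1], widths[1:]))

sizes = layer_sizes(3, 50, 0.7)            # illustrative values
print(sizes)                               # [50 42 22]
print(dense_param_count(sizes.tolist()))   # 3211
```

Dropout layers add no parameters, so the count depends only on the layer widths.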
A genetic algorithm is used to explore the parameter space from which the neural networks are formed. Only a few traits are parameterized for each neural network:
- layer_sizes - number of neurons per layer
- batch_size - number of samples per training update
- learning_rate - learning rate of SGD
- momentum - hyperparameter that accelerates SGD in the relevant direction and dampens oscillations
- decay - time inverse decay of learning rate
- dropout - rate of dropout after first layer
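To make the optimizer traits more concrete, here is a minimal pure-Python sketch of SGD with momentum and inverse-time learning-rate decay on a toy one-dimensional problem. The update rule follows the classic Keras SGD formulation, lr_t = lr / (1 + decay * t). Whether the project configures its optimizer exactly this way is an assumption, and the function name and toy objective are made up for illustration. The trait values are the ones printed for the example individual in this README.

```python
def sgd_momentum_decay(grad, w0, learning_rate, momentum, decay, steps=100):
    """Minimize a 1-D function given its gradient; illustrative only."""
    w, velocity = w0, 0.0
    for t in range(steps):
        # Inverse-time decay: the step size shrinks as training proceeds.
        lr_t = learning_rate / (1.0 + decay * t)
        # Momentum carries over part of the previous step.
        velocity = momentum * velocity - lr_t * grad(w)
        w += velocity
    return w

# Toy objective f(w) = (w - 3)**2 with gradient 2 * (w - 3).
w_opt = sgd_momentum_decay(lambda w: 2.0 * (w - 3.0), w0=0.0,
                           learning_rate=0.2094, momentum=0.1181, decay=0.0033)
print(round(w_opt, 4))  # 3.0
```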
An example generating two random models and then breeding them to produce offspring:
from genetic_network import individual
parent1 = individual.randomize()
parent2 = individual.randomize()
baby1, baby2 = individual.breed(parent1, parent2)
The genetic algorithm applies random crossover when swapping traits between generations. There is also a 1% mutation rate during breeding, which causes the offspring to have between 1 and 3 traits randomized. For more information just read the source code. The neural network associated with each individual can be viewed like so:
In [10]: parent1.model.summary()
Model: "model_8"
Layer (type)                 Output Shape              Param #
input (InputLayer)           [(None, 1)]               0
dense_22 (Dense)             (None, 50)                100
dense_23 (Dense)             (None, 42)                2142
dropout_2 (Dropout)          (None, 42)                0
dense_24 (Dense)             (None, 22)                946
dense_25 (Dense)             (None, 1)                 23
Total params: 3,211
Trainable params: 3,211
Non-trainable params: 0
In [11]: parent1.traits
{'layer_sizes': array([ 50, 42, 22]),
'batch_size': 54,
'learning_rate': 0.2094,
'momentum': 0.1181,
'decay': 0.0033,
'dropout': 0.1723}
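The crossover and mutation steps described above can be sketched on plain trait dictionaries like the one just printed. This is not the code in genetic_network.py: the function names, the 50/50 swap probability and the sampling ranges in `random_traits` are all assumptions chosen for illustration. Only the 1% mutation rate and the 1 to 3 randomized traits come from the description above.

```python
import random

def random_traits(rng=random):
    """Hypothetical trait sampler; every range here is an assumption."""
    return {
        "batch_size": rng.randint(16, 128),
        "learning_rate": rng.uniform(0.001, 0.5),
        "momentum": rng.uniform(0.0, 0.9),
        "decay": rng.uniform(0.0, 0.01),
        "dropout": rng.uniform(0.0, 0.5),
    }

def crossover(traits_a, traits_b, rng=random):
    """Random crossover: swap each trait between parents with probability 0.5."""
    child_a, child_b = dict(traits_a), dict(traits_b)
    for key in traits_a:
        if rng.random() < 0.5:
            child_a[key], child_b[key] = traits_b[key], traits_a[key]
    return child_a, child_b

def mutate(traits, sampler, rate=0.01, rng=random):
    """With probability `rate`, re-randomize between 1 and 3 traits."""
    mutated = dict(traits)
    if rng.random() < rate:
        fresh = sampler()
        for key in rng.sample(sorted(mutated), rng.randint(1, 3)):
            mutated[key] = fresh[key]
    return mutated

parent_a, parent_b = random_traits(), random_traits()
baby_a, baby_b = crossover(parent_a, parent_b)
baby_a = mutate(baby_a, random_traits)
```

Working on dictionaries keeps the genetic operators independent of TensorFlow, so they can be tested without building or training any model.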
While this program was made as a joke, optimization with a genetic algorithm is something that can be used in modern-day research, particularly if you can train a neural network in a reasonable amount of time. You can then leverage this ensemble sampling technique to find the best architecture.
I also wonder whether a binary neural network could be optimized to ultimately replace the calculations a computer makes to compute trig functions at the lowest level, like a neural network that just does bit operations.