Pickling state / pickling whole class issue #6

Open
devop01 opened this issue Nov 20, 2020 · 10 comments

Comments

@devop01

devop01 commented Nov 20, 2020

As far as I understand, there are problems with saving the state of a trained TM.
This is important for research and development, because it would make checkpoints easy to create.

I tried to pickle MultiClassTsetlinMachine (from pyTsetlinMachineParallel.tm) as well as the pyCUDA version, and both failed.

Then I tried to pickle just the TM state (based on the MNIST example), e.g.:
tm = MultiClassTsetlinMachine(2000, 50, 10.0)
f = open..
pickle.dump(tm.get_state(),f)

#new session
f = open...
state = pickle.load(f)

tm2 = MultiClassTsetlinMachine(2000, 50, 10.0)
tm2.set_state(state)
Result:
Traceback (most recent call last):
File "<pyshell#15>", line 1, in
tm.set_state(state)
File "/.../TsetlinMachine/pyTsetlinMachineParallel/tm.py", line 314, in set_state
for i in range(self.number_of_classes):
AttributeError: 'MultiClassTsetlinMachine' object has no attribute 'number_of_classes'

If I set the number of classes manually:
tm2.number_of_classes = 10
tm2.set_state(state)
the program hangs, and after some time I get:

tm2.set_state(state)

=============================== RESTART: Shell ===============================
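
For reference, the save/load round trip sketched in the post above can be written with context managers so that the file handles are always closed. This is a minimal sketch using only the standard library. The helper names `save_state` and `load_state` and the file name are illustrative and not part of pyTsetlinMachine, and a plain nested list stands in for whatever `tm.get_state()` returns.

```python
import os
import pickle
import tempfile


def save_state(state, path):
    """Write any picklable object (e.g. the value of tm.get_state()) to disk."""
    with open(path, "wb") as handle:
        pickle.dump(state, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_state(path):
    """Read back an object written by save_state."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in for a real state; the actual structure returned by get_state()
# is not shown in this thread, so a nested list is used for illustration.
example_state = [[1, 2, 3], [4, 5, 6]]

checkpoint_path = os.path.join(tempfile.mkdtemp(), "tm_state.pkl")
save_state(example_state, checkpoint_path)
restored_state = load_state(checkpoint_path)
```

The round trip itself is not the failing step in this report; the error comes from calling set_state on a machine whose dimensions have not been set yet.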

@andife

andife commented Feb 7, 2021

Hello,
the question sounds very interesting and is also very relevant for me.

I would also like to run the training (fit) separately from the prediction (predict), partly because training takes some time. Based on the previous post, I have adapted an example to show my approach. I would be grateful for any hints on how to do this correctly.

from pyTsetlinMachine.tm import MultiClassTsetlinMachine
from pyTsetlinMachine.tools import Binarizer
import numpy as np

from sklearn import datasets
from sklearn.model_selection import train_test_split

import pickle
import pickletools 

breast_cancer = datasets.load_breast_cancer()
X = breast_cancer.data
Y = breast_cancer.target

b = Binarizer(max_bits_per_feature = 10)
b.fit(X)
X_transformed = b.transform(X)

tm = MultiClassTsetlinMachine(800, 40, 5.0)

print("\nMean accuracy over 10 runs:\n")
tm_results = np.empty(0)
for i in range(5):
    X_train, X_test, Y_train, Y_test = train_test_split(X_transformed, Y, test_size=0.2)

    tm.fit(X_train, Y_train, epochs=25)
    with open('dat_' + str(i) + '.pickle', 'wb') as handle:
        p = pickle.dump(tm.get_state(), handle, protocol=pickle.HIGHEST_PROTOCOL)
        #pickletools.dis(p)


    tm_results = np.append(tm_results, np.array(100*(tm.predict(X_test) == Y_test).mean()))
    print("#%d Average Accuracy: %.2f%% +/- %.2f" % (i+1, tm_results.mean(), 1.96*tm_results.std()/np.sqrt(i+1)))


for i in range(5):
	with open('dat_' + str(i) + '.pickle','rb') as pickle_file:
		state= pickle.load(pickle_file)

	tm2 = MultiClassTsetlinMachine(800, 40, 5.0)             
	tm2.set_state(state)

	tm2_results = np.append(tm_results, np.array(100*(tm2.predict(X_test) == Y_test).mean()))
	print("#%d Average Accuracy: %.2f%% +/- %.2f" % (i+1, tm2_results.mean(), 1.96*tm2_results.std()/np.sqrt(i+1)))

My error message:

Mean accuracy over 10 runs:

 File "/mnt/c/ProjectsGit/tm-segment/BreastCancerDemo_pkl.py", line 42, in <module>
    tm2.set_state(state)
  File "/home/unix/miniconda3/lib/python3.8/site-packages/pyTsetlinMachine/tm.py", line 351, in set_state
    for i in range(self.number_of_classes):
AttributeError: 'MultiClassTsetlinMachine' object has no attribute 'number_of_classes'

@bmreiniger

Just to link things together: the question in that last post has now also been asked on Stack Overflow:
https://stackoverflow.com/q/66099424/10495893

@olegranmo
Member

olegranmo commented Feb 8, 2021

Hi @bmreiniger and @andife! Thanks for bringing up this issue.

The get_state and set_state methods only operate on the state of the Tsetlin Automata held in the C part of the code.
It is the fit function on the Python side that sets the other crucial parameters and creates the actual Tsetlin Machine structure in C, which it does when self.mc_tm == None:

if self.mc_tm == None:
    self.number_of_classes = int(np.max(Y) + 1)

    if self.append_negated:
        self.number_of_features = X.shape[1]*2
    else:
        self.number_of_features = X.shape[1]

    self.number_of_patches = 1
    self.number_of_ta_chunks = int((self.number_of_features-1)/32 + 1)
    self.mc_tm = _lib.CreateMultiClassTsetlinMachine(self.number_of_classes, self.number_of_clauses, self.number_of_features, 1, self.number_of_ta_chunks, self.number_of_state_bits, self.T, self.s, self.s_range, self.boost_true_positive_feedback, self.weighted_clauses)

As you can see from the above code, the number of features and the number of classes are obtained automatically from the input data X and Y. So, one possible trick is to call fit on the training data first, with epochs=0. Then fit will set the number of classes and features and create the TM in C, without running any epochs over the data. After calling fit, you can call set_state to initialize the Tsetlin Automata from the saved state.

I am planning to add methods for this. Let me know if it works out!
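
To make the arithmetic in the quoted snippet concrete, the sketch below recomputes the derived dimensions in plain Python. The function name `derived_dimensions` is hypothetical and only mirrors the three expressions quoted from fit. `append_negated` is passed explicitly because its default value is not shown in this thread.

```python
def derived_dimensions(num_input_features, max_label, append_negated):
    """Mirror the expressions quoted from fit(); not the library's own API."""
    number_of_classes = int(max_label + 1)
    if append_negated:
        number_of_features = num_input_features * 2
    else:
        number_of_features = num_input_features
    # Same expression as in the quoted snippet; for positive integers it is
    # a ceiling division of number_of_features by 32.
    number_of_ta_chunks = int((number_of_features - 1) / 32 + 1)
    return number_of_classes, number_of_features, number_of_ta_chunks


# MNIST-sized input: 28*28 binary pixels, labels 0..9.
with_negation = derived_dimensions(28 * 28, 9, append_negated=True)
without_negation = derived_dimensions(28 * 28, 9, append_negated=False)
print(with_negation)     # (10, 1568, 49)
print(without_negation)  # (10, 784, 25)
```

Since every one of these values comes from the shape of X and the largest label in Y, a freshly constructed machine has none of them until fit has been called once.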

@andife

andife commented Feb 9, 2021

Hello @olegranmo, @bmreiniger;

Thanks for the solution for this issue. It works! (I used the code below for testing)

In general, I'm wondering whether I really need to load the original dataset, or whether a synthetic one with the same number of features and classes would be enough. This would also correspond to the approach of https://github.com/cair/pyTsetlinMachine/pull/4/files.

from pyTsetlinMachine.tm import MultiClassTsetlinMachine
from pyTsetlinMachine.tools import Binarizer
import numpy as np

from sklearn import datasets
from sklearn.model_selection import train_test_split

import pickle
import pickletools 

breast_cancer = datasets.load_breast_cancer()
X = breast_cancer.data
Y = breast_cancer.target

b = Binarizer(max_bits_per_feature = 10)
b.fit(X)
X_transformed = b.transform(X)

tm = MultiClassTsetlinMachine(800, 40, 5.0)

print("\nMean accuracy over 10 runs:\n")
tm_results = np.empty(0)
tm2_results = np.empty(0)
for i in range(5):
    X_train, X_test, Y_train, Y_test = train_test_split(X_transformed, Y, test_size=0.2, random_state=i)

    tm.fit(X_train, Y_train, epochs=25)
    with open('dat_' + str(i) + '.pickle', 'wb') as handle:
        p = pickle.dump(tm.get_state(), handle, protocol=pickle.HIGHEST_PROTOCOL)
        #pickletools.dis(p)


    tm_results = np.append(tm_results, np.array(100*(tm.predict(X_test) == Y_test).mean()))
    print("#%d Average Accuracy: %.2f%% +/- %.2f" % (i+1, tm_results.mean(), 1.96*tm_results.std()/np.sqrt(i+1)))

del tm

for i in range(5):
	with open('dat_' + str(i) + '.pickle','rb') as pickle_file:
		state = pickle.load(pickle_file)

	X2_train, X2_test, Y2_train, Y2_test = train_test_split(X_transformed, Y, test_size=0.2,random_state=i)
	#X2_train = X2_train + 100
	tm2 = MultiClassTsetlinMachine(800, 40, 5.0)   
	tm2.fit(X_train, Y_train, epochs=0) # X_train does not change, only for setting the correct dimensions

	tm2.set_state(state)

	tm2_results = np.append(tm2_results, np.array(100*(tm2.predict(X2_test) == Y2_test).mean()))
	print("sec2 #%d Average Accuracy: %.2f%% +/- %.2f" % (i+1, tm2_results.mean(), 1.96*tm2_results.std()/np.sqrt(i+1)))

@olegranmo
Member

Great that it works, @andife! Yes, I guess you could for instance just use one example X with the correct number of features and the largest y-value (number of classes - 1).
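
The idea of initializing with a single dummy example can be illustrated without the library. The class below, `ToyLazyMachine`, is a hypothetical stand-in written for this sketch. It only imitates the behaviour described in this thread (dimensions are set on the first fit call, and set_state relies on them) and shares no code with pyTsetlinMachine.

```python
class ToyLazyMachine:
    """Hypothetical stand-in: structure is created lazily on the first fit()."""

    def __init__(self):
        self.state = None  # plays the role of self.mc_tm in the quoted snippet

    def fit(self, X, Y, epochs=1):
        if self.state is None:
            # Dimensions are inferred from the data, as in the quoted snippet.
            self.number_of_classes = int(max(Y) + 1)
            self.number_of_features = len(X[0])
            self.state = [[0] * self.number_of_features
                          for _ in range(self.number_of_classes)]
        for _ in range(epochs):
            # Toy "training": count how often each feature is set per class.
            for row, label in zip(X, Y):
                for j, bit in enumerate(row):
                    self.state[label][j] += bit

    def get_state(self):
        return [list(per_class) for per_class in self.state]

    def set_state(self, state):
        # Fails with AttributeError if fit() has never been called.
        for i in range(self.number_of_classes):
            self.state[i] = list(state[i])


# Train a toy model and capture its state.
trained = ToyLazyMachine()
trained.fit([[1, 0, 1], [0, 1, 1]], [0, 2], epochs=3)
saved = trained.get_state()

# Restoring into a fresh object fails, as in the tracebacks in this thread.
fresh = ToyLazyMachine()
try:
    fresh.set_state(saved)
    failed_before_fit = False
except AttributeError:
    failed_before_fit = True

# One dummy row with the right width and the largest label fixes the dimensions.
dummy_X = [[0, 0, 0]]
dummy_Y = [2]
fresh.fit(dummy_X, dummy_Y, epochs=0)
fresh.set_state(saved)
```

With the real library, the dummy row would need the same number of binarized features as the training data, and the dummy label would need to be the largest class index, as suggested above.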

@devop01
Author

devop01 commented May 10, 2021

@olegranmo It seems this approach does not work for the CUDA version.
With the following script (Parallel version):

from pyTsetlinMachineParallel.tm import MultiClassTsetlinMachine
import numpy as np
from time import time

from keras.datasets import mnist
import pickle    

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

X_train = np.where(X_train.reshape((X_train.shape[0], 28*28)) > 75, 1, 0) 
X_test = np.where(X_test.reshape((X_test.shape[0], 28*28)) > 75, 1, 0) 

tm = MultiClassTsetlinMachine(2000, 50, 10.0)

print("\nAccuracy over 10 epochs:\n")
for i in range(10):
    start_training = time()
    tm.fit(X_train, Y_train, epochs=1, incremental=True)
    stop_training = time()

    start_testing = time()
    result = 100*(tm.predict(X_test) == Y_test).mean()
    stop_testing = time()

    print("#%d Accuracy: %.2f%% Training: %.2fs Testing: %.2fs" % (i+1, result, stop_training-start_training, stop_testing-start_testing))

print("Saving TM state...")
    
with open('tm001.pkl', 'wb') as handle:
    pickle.dump(tm.get_state(), handle, protocol=pickle.HIGHEST_PROTOCOL)    

print("Loading TM state...")
tm2 = MultiClassTsetlinMachine(2000, 50, 10.0)
tm2.fit(X_train, Y_train, epochs=0)

with open('tm001.pkl','rb') as pickle_file:
    state = pickle.load(pickle_file)
tm2.set_state(state)

print("Continue training")
for i in range(5):
    start_training = time()
    tm2.fit(X_train, Y_train, epochs=1, incremental=True)
    stop_training = time()

    start_testing = time()
    result = 100*(tm2.predict(X_test) == Y_test).mean()
    stop_testing = time()

    print("#%d Accuracy: %.2f%% Training: %.2fs Testing: %.2fs" % (i+1, result, stop_training-start_training, stop_testing-start_testing))

the reloaded TM continues from the accuracy reached at the end of training:
Accuracy over 10 epochs:

#1 Accuracy: 94.27% Training: 35.89s Testing: 21.07s
#2 Accuracy: 95.53% Training: 26.94s Testing: 21.09s
#3 Accuracy: 95.97% Training: 25.63s Testing: 21.27s
#4 Accuracy: 96.56% Training: 24.98s Testing: 21.72s
#5 Accuracy: 96.72% Training: 24.10s Testing: 21.48s
#6 Accuracy: 96.77% Training: 23.22s Testing: 21.47s
#7 Accuracy: 96.88% Training: 23.08s Testing: 21.86s
#8 Accuracy: 96.87% Training: 22.60s Testing: 21.47s
#9 Accuracy: 97.10% Training: 22.38s Testing: 21.59s
#10 Accuracy: 97.16% Training: 22.14s Testing: 21.63s
Saving TM state...
Loading TM state...
Continue training
#1 Accuracy: 97.18% Training: 21.44s Testing: 21.19s
#2 Accuracy: 97.25% Training: 21.31s Testing: 21.22s
#3 Accuracy: 97.22% Training: 20.92s Testing: 20.78s
#4 Accuracy: 97.10% Training: 21.36s Testing: 21.17s
#5 Accuracy: 97.49% Training: 20.65s Testing: 21.22s

But for the CUDA version:

from PyTsetlinMachineCUDA.tm import MultiClassTsetlinMachine
import numpy as np
from time import time

from keras.datasets import mnist
import pickle

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

X_train = np.where(X_train.reshape((X_train.shape[0], 28*28)) > 75, 1, 0) 
X_test = np.where(X_test.reshape((X_test.shape[0], 28*28)) > 75, 1, 0) 

tm = MultiClassTsetlinMachine(2000, 50*16, 10.0, max_weight=16)

print("\nAccuracy over 10 epochs:\n")
for i in range(10):
    start_training = time()
    tm.fit(X_train, Y_train, epochs=1, incremental=True)
    stop_training = time()

    start_testing = time()
    result = 100*(tm.predict(X_test) == Y_test).mean()
    stop_testing = time()

    print("#%d Accuracy: %.2f%% Training: %.2fs Testing: %.2fs" % (i+1, result, stop_training-start_training, stop_testing-start_testing))

print("Saving TM state...")
    
with open('tm002.pkl', 'wb') as handle:
    pickle.dump(tm.get_state(), handle, protocol=pickle.HIGHEST_PROTOCOL)    

print("Loading TM state...")
tm2 = MultiClassTsetlinMachine(2000, 50*16, 10.0, max_weight=16)
tm2.fit(X_train, Y_train, epochs=0)

with open('tm002.pkl','rb') as pickle_file:
    state = pickle.load(pickle_file)
tm2.set_state(state)

print("Continue training")
for i in range(5):
    start_training = time()
    tm2.fit(X_train, Y_train, epochs=1, incremental=True)
    stop_training = time()

    start_testing = time()
    result = 100*(tm2.predict(X_test) == Y_test).mean()
    stop_testing = time()

    print("#%d Accuracy: %.2f%% Training: %.2fs Testing: %.2fs" % (i+1, result, stop_training-start_training, stop_testing-start_testing))

the reload has no effect; it is as if the script starts from scratch:
Accuracy over 10 epochs:

#1 Accuracy: 92.80% Training: 9.59s Testing: 1.08s
#2 Accuracy: 94.21% Training: 8.10s Testing: 1.15s
#3 Accuracy: 95.56% Training: 7.85s Testing: 1.07s
#4 Accuracy: 96.01% Training: 7.44s Testing: 1.07s
#5 Accuracy: 96.38% Training: 7.38s Testing: 1.07s
#6 Accuracy: 96.61% Training: 7.63s Testing: 1.12s
#7 Accuracy: 96.69% Training: 7.79s Testing: 1.13s
#8 Accuracy: 96.94% Training: 7.74s Testing: 1.14s
#9 Accuracy: 97.06% Training: 7.50s Testing: 1.14s
#10 Accuracy: 97.09% Training: 7.80s Testing: 1.06s
Saving TM state...
Loading TM state...
Continue training
#1 Accuracy: 92.97% Training: 9.77s Testing: 1.02s
#2 Accuracy: 94.23% Training: 8.06s Testing: 1.15s
#3 Accuracy: 95.66% Training: 8.17s Testing: 1.14s
#4 Accuracy: 96.08% Training: 7.42s Testing: 1.06s
#5 Accuracy: 96.56% Training: 7.88s Testing: 1.06s

@olegranmo
Member

Hi @devop01 - thanks for reporting! I have started adding pickle support and have just completed it for PyTsetlinMachine.

@olegranmo
Member

olegranmo commented Jun 19, 2021

Hi again @devop01, just added pickle support for PyTsetlinMachineCUDA!

@devop01
Author

devop01 commented Jun 22, 2021

Hi @olegranmo

Thank you for the new version :)
I reinstalled it, but unfortunately the results of training after loading a pickle didn't improve.
I'm not sure whether this is an issue with my CUDA setup or whether something else is missing.
Comparing the TM before and after pickling would help the investigation, so I think overriding the __eq__ operator could be helpful.

Accuracy over 25 epochs:

#1 Accuracy: 92.89% Training: 9.78s Testing: 1.06s
#2 Accuracy: 94.30% Training: 7.76s Testing: 1.11s
#3 Accuracy: 95.64% Training: 7.80s Testing: 1.10s
#4 Accuracy: 96.06% Training: 7.67s Testing: 1.10s
#5 Accuracy: 96.37% Training: 7.62s Testing: 1.11s
#6 Accuracy: 96.67% Training: 7.89s Testing: 1.15s
#7 Accuracy: 96.77% Training: 7.79s Testing: 1.10s
#8 Accuracy: 96.95% Training: 7.57s Testing: 1.11s
#9 Accuracy: 97.06% Training: 7.54s Testing: 1.09s
#10 Accuracy: 97.23% Training: 7.50s Testing: 1.09s
#11 Accuracy: 97.23% Training: 7.47s Testing: 1.09s
#12 Accuracy: 97.32% Training: 7.46s Testing: 1.14s
#13 Accuracy: 97.36% Training: 7.70s Testing: 1.14s
#14 Accuracy: 97.48% Training: 7.62s Testing: 1.09s
#15 Accuracy: 97.54% Training: 7.40s Testing: 1.08s
#16 Accuracy: 97.60% Training: 7.40s Testing: 1.09s
#17 Accuracy: 97.54% Training: 7.39s Testing: 1.09s
#18 Accuracy: 97.53% Training: 7.38s Testing: 1.09s
#19 Accuracy: 97.60% Training: 7.38s Testing: 1.09s
#20 Accuracy: 97.50% Training: 7.37s Testing: 1.09s
#21 Accuracy: 97.63% Training: 7.37s Testing: 1.09s
#22 Accuracy: 97.75% Training: 7.37s Testing: 1.09s
#23 Accuracy: 97.70% Training: 7.36s Testing: 1.09s
#24 Accuracy: 97.67% Training: 7.36s Testing: 1.09s
#25 Accuracy: 97.70% Training: 7.37s Testing: 1.09s
Saving TM state...
Loading TM state...
Comparing tm1 and tm2
False
Continue training
#1 Accuracy: 92.95% Training: 9.86s Testing: 1.05s
#2 Accuracy: 94.16% Training: 7.77s Testing: 1.11s
#3 Accuracy: 95.64% Training: 7.83s Testing: 1.10s
#4 Accuracy: 96.21% Training: 7.70s Testing: 1.10s
#5 Accuracy: 96.43% Training: 7.63s Testing: 1.10s

Below is the full code I used:

from PyTsetlinMachineCUDA.tm import MultiClassTsetlinMachine
import numpy as np
from time import time

from keras.datasets import mnist
import pickle

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

X_train = np.where(X_train.reshape((X_train.shape[0], 28*28)) > 75, 1, 0) 
X_test = np.where(X_test.reshape((X_test.shape[0], 28*28)) > 75, 1, 0) 

tm = MultiClassTsetlinMachine(2000, 50*16, 10.0, max_weight=16)

print("\nAccuracy over 25 epochs:\n")
for i in range(25):
    start_training = time()
    tm.fit(X_train, Y_train, epochs=1, incremental=True)
    stop_training = time()

    start_testing = time()
    result = 100*(tm.predict(X_test) == Y_test).mean()
    stop_testing = time()

    print("#%d Accuracy: %.2f%% Training: %.2fs Testing: %.2fs" % (i+1, result, stop_training-start_training, stop_testing-start_testing))

print("Saving TM state...")
    
with open('tm005.pkl', 'wb') as handle:
    pickle.dump(tm, handle, protocol=pickle.HIGHEST_PROTOCOL)

print("Loading TM state...")

with open('tm005.pkl','rb') as pickle_file:
    tm2 = pickle.load(pickle_file)

print("Comparing tm1 and tm2")
print(tm == tm2)

print("Continue training")
for i in range(5):
    start_training = time()
    tm2.fit(X_train, Y_train, epochs=1, incremental=True)
    stop_training = time()

    start_testing = time()
    result = 100*(tm2.predict(X_test) == Y_test).mean()
    stop_testing = time()

    print("#%d Accuracy: %.2f%% Training: %.2fs Testing: %.2fs" % (i+1, result, stop_training-start_training, stop_testing-start_testing))
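
On the `tm == tm2` check in the script above: a Python class that does not define `__eq__` compares by identity, so an unpickled copy is never equal to the original, regardless of its contents. Comparing the values returned by get_state() is therefore more informative. The helper below is a hedged sketch. `states_equal` is a hypothetical name, and because the exact structure returned by get_state() is not shown in this thread, it assumes nested lists or tuples whose leaves are either plain values or objects exposing `tolist()` (as NumPy arrays do).

```python
def states_equal(a, b):
    """Recursively compare two nested state structures element by element."""
    # Convert array-like leaves (anything with tolist(), e.g. NumPy arrays).
    if hasattr(a, "tolist"):
        a = a.tolist()
    if hasattr(b, "tolist"):
        b = b.tolist()
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(
            states_equal(x, y) for x, y in zip(a, b)
        )
    return a == b


class NoEq:
    """Toy class without __eq__, to show the default identity comparison."""

    def __init__(self, values):
        self.values = values


# Two distinct objects holding the same contents, as after a pickle round trip.
original = NoEq([(1, [2, 3]), (4, [5, 6])])
clone = NoEq([(1, [2, 3]), (4, [5, 6])])

print(original == clone)                            # False: identity comparison
print(states_equal(original.values, clone.values))  # True: same contents
```

Under these assumptions, the comparison in the script could be replaced by something like `states_equal(tm.get_state(), tm2.get_state())`, which checks the stored state rather than object identity.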

@olegranmo
Member

Hi @devop01, this happens because the local voting tallies used for asynchronous parallel learning are not stored as part of the state. Everything is reinitialized when you start training again. Will fix this in the next update!


4 participants