Tensorflow 2.14 on Apple Silicon CPU (not GPU) - for performance comparison #36

obriensystems · 2024-12-03T19:35:45Z

GPU parallelism works out of the box (see #32).
CPU parallelism appears to default to about 33% of the cores.
Attempting to optimize for M1 Max (10 cores, 8p/2e) and M4 Max (16 cores, 12p/4e).

Set the thread count to 4 x the total number of cores:

M1 Max = 10 cores (8p/2e) = 40 threads
M4 Max = 16 cores (12p/4e) = 64 threads
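
The 4 x cores arithmetic above can be sketched in a few lines of Python. This is only an illustration: `target_threads` and `THREADS_PER_CORE` are hypothetical names, not taken from tflow.py, and the multiplier of 4 is the value being tried in this issue, not a known optimum.

```python
import os

# Multiplier under test in this issue (an experiment, not a recommendation).
THREADS_PER_CORE = 4


def target_threads(total_cores, multiplier=THREADS_PER_CORE):
    """Return the thread count to try for a machine with total_cores cores."""
    return total_cores * multiplier


if __name__ == "__main__":
    # Core counts for the two machines named in this issue.
    for name, cores in (("M1 Max", 10), ("M4 Max", 16)):
        print(f"{name}: {cores} cores -> {target_threads(cores)} threads")

    # os.cpu_count() can return None, so guard before multiplying.
    detected = os.cpu_count()
    if detected is not None:
        print(f"this machine: {detected} cores -> {target_threads(detected)} threads")
```

For the two machines above this gives 40 and 64 threads, matching the hand-computed values.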

M1 Max
default parallelism for
os.environ["OMP_NUM_THREADS"] = "10"
tf.config.threading.set_inter_op_parallelism_threads(0)
tf.config.threading.set_intra_op_parallelism_threads(0)
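
A minimal sketch of how the three settings above might be wrapped so different thread counts can be compared between runs. The helper name `configure_cpu_threads` and its parameters are hypothetical and not from tflow.py; the only TensorFlow calls used are the two `tf.config.threading` setters already shown, where 0 lets TensorFlow pick the value. Whether `OMP_NUM_THREADS` has any effect on a given macOS build of TensorFlow is not established here.

```python
import os


def configure_cpu_threads(omp_threads, inter_op=0, intra_op=0):
    """Apply CPU threading settings; return True if TensorFlow was configured.

    Call this before TensorFlow runs any op: the setters raise RuntimeError
    once the runtime has been initialized.
    """
    # Set the environment variable before TensorFlow is imported.
    os.environ["OMP_NUM_THREADS"] = str(omp_threads)
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed; only the environment variable was set.
        return False
    # 0 means "let TensorFlow choose", as in the default configuration above.
    tf.config.threading.set_inter_op_parallelism_threads(inter_op)
    tf.config.threading.set_intra_op_parallelism_threads(intra_op)
    return True
```

`configure_cpu_threads(10)` reproduces the default M1 Max configuration listed above. One way to try the 4 x setting would be `configure_cpu_threads(40, intra_op=40)`, though which of the three knobs should carry the larger value is an open question in this issue.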

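Activity Monitor row for the Python process (column names assumed from Activity Monitor's default CPU tab; the remaining columns were empty or zero):
Process: Python, % CPU: 359.3, CPU Time: 1:17.32, Threads: 60, Idle Wake Ups: 7, Kind: Apple, % GPU: 0.0, GPU Time: 0.01, PID: 61795, User: michaelobrien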


obriensystems self-assigned this Dec 3, 2024

obriensystems added a commit that referenced this issue Dec 3, 2024

#36 - Update tflow.py for ARM64 parallelism

3fb7172

obriensystems added a commit that referenced this issue Dec 3, 2024

#36 - Update tflow.py for ARM64 parallelism

3f64906
