Optimizing the Model¶

List of optimization parameters we could try¶

  • Optimizer: the optimization algorithm could be fine-tuned to produce the best fit.
  • Loss function: could be replaced with a more suitable one, because the conductivity values diverge strongly.
  • Other tunable parameters: learning_rate, decay, epochs, batch_size, number of neuron layers, and activation functions.
  • Scalers: both X and Y values vary strongly due to the various types of fillers (and polymers).
  • Hardware acceleration: by default, TensorFlow utilizes whatever devices are available on the system. In my case, it automatically runs on an NVIDIA GeForce RTX 2060 GPU.
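Because the conductivity spans many orders of magnitude, one common approach before scaling is to put the target on a log scale, so each order of magnitude contributes more evenly to the loss. A minimal sketch of this idea (the sample values are hypothetical, not from the dataset):

```python
import math

# Hypothetical conductivity values spanning several orders of magnitude (S/cm)
conductivity = [1e-12, 3e-8, 5e-4, 2e-1]

# A log10 transform compresses the range so a scaler (or the loss)
# treats each order of magnitude more evenly
log_y = [math.log10(y) for y in conductivity]

print(log_y)  # roughly [-12.0, -7.5, -3.3, -0.7]
```

A standard scaler could then be fitted on `log_y` instead of the raw values; predictions are mapped back with `10 ** y_pred`.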

epochs¶

As the dataset spans multiple orders of magnitude, training requires many iterations; depending on the configuration of the optimization parameters, the number of epochs can go up to thousands.

Although more iterations can produce a better fit, too many will lead to overfitting. We need to watch how the losses on the training and testing data behave.
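One way to automate this watching is early stopping: halt training once the validation loss has stopped improving for some number of epochs (the patience). A minimal pure-Python sketch of the logic (the same idea Keras provides as the EarlyStopping callback; the loss history below is hypothetical):

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch at which training would stop: the first epoch
    after the best validation loss has not improved for `patience`
    consecutive epochs (or the last epoch otherwise)."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1

# Hypothetical validation-loss curve: improves, then plateaus
history = [1.0, 0.6, 0.4, 0.35, 0.36, 0.37, 0.38, 0.39]
print(early_stop_epoch(history))  # stops at epoch 6
```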

batch_size¶

Too large a batch size will lead to poor generalization.

Smaller batch sizes have been shown to converge faster to “good” solutions.
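Batch size also determines how many weight updates happen per epoch, since each epoch runs ceil(n_samples / batch_size) gradient steps. A quick sketch with hypothetical numbers:

```python
import math

n_samples = 1000  # hypothetical dataset size

for batch_size in (16, 64, 256):
    steps = math.ceil(n_samples / batch_size)
    print(f"batch_size={batch_size}: {steps} updates per epoch")
# batch_size=16 gives 63 updates, 64 gives 16, 256 gives only 4
```

So with a small batch size the model sees many more (noisier) updates per epoch, which is part of why it can reach good solutions faster.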

In [ ]:
import tensorflow as tf

GPU performance¶

Use tf.config.list_physical_devices('GPU') to confirm that TensorFlow is using the GPU.

Use less memory¶

Use tf.config.set_logical_device_configuration to restrict TensorFlow to allocating only 1 GB of memory on the first GPU:

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)

Reference sources¶

  • Use a GPU
  • tf.config.experimental
  • TensorFlow Profiler
In [ ]:
tf.config.experimental.list_physical_devices(device_type=None)
Out[ ]:
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
In [ ]:
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Num GPUs Available:  1
In [ ]:
gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
1 Physical GPUs, 1 Logical GPUs

Issue with Windows 10¶

There is an issue where a GPU is selected but nothing is actually computed on it. You might need to turn on Hardware-Accelerated GPU Scheduling: go to Settings > Graphics and turn it on. More detail can be found in the link here (demo image).

Turn off GPU¶

In case you want to turn off GPU acceleration, here are two options.

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

Or

try:
    # Disable all GPUs
    tf.config.set_visible_devices([], 'GPU')
    visible_devices = tf.config.get_visible_devices()
    for device in visible_devices:
        assert device.device_type != 'GPU'
except (ValueError, RuntimeError):
    # Invalid device, or cannot modify virtual devices once initialized.
    pass

Source
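Note that the CUDA_VISIBLE_DEVICES approach only takes effect if the variable is set before TensorFlow is imported for the first time; after that, the device list is already fixed. A minimal sketch (the TensorFlow import is commented out to keep it self-contained):

```python
import os

# Hide all CUDA devices; this must happen BEFORE the first
# `import tensorflow` in the process
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# import tensorflow as tf  # would now see no GPUs

print(os.environ["CUDA_VISIBLE_DEVICES"])  # -1
```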


Parallelism¶

We can get/set the number of threads used for parallelism between independent operations with tf.config.threading.get_inter_op_parallelism_threads / set_inter_op_parallelism_threads, or the number of threads used for parallelism within an individual op with get_intra_op_parallelism_threads / set_intra_op_parallelism_threads.

A value of 0 means the system picks an appropriate number.

tf.config.threading.set_inter_op_parallelism_threads(4)
tf.config.threading.set_intra_op_parallelism_threads(16)
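The right values depend on the machine. One hypothetical starting point (an assumption, not a TensorFlow recommendation) is to derive them from the available core count:

```python
import os

cores = os.cpu_count() or 1

# Hypothetical heuristic: a few inter-op threads for scheduling
# independent ops, and all cores available for intra-op work
inter_op = max(1, cores // 4)
intra_op = cores

print(inter_op, intra_op)
# These values would be passed to
# tf.config.threading.set_inter_op_parallelism_threads(inter_op) and
# tf.config.threading.set_intra_op_parallelism_threads(intra_op)
# before TensorFlow initializes.
```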
In [ ]:
tf.config.threading.get_inter_op_parallelism_threads()
Out[ ]:
0
In [ ]:
tf.config.optimizer.get_experimental_options()
Out[ ]:
{'disable_model_pruning': False, 'disable_meta_optimizer': False}