tutorial validation (#185)

Author: Nicola Demo
Date: 2023-10-17 10:54:31 +02:00
Co-authored-by: Ben Volokh <89551265+benv123@users.noreply.github.com>
parent 2e2fe93458
commit 32ff5de1f4
38 changed files with 1072 additions and 1006 deletions


@@ -21,6 +21,13 @@ First of all, we import the modules needed for the tutorial. Importing
from pina.problem import AbstractProblem
import matplotlib.pyplot as plt
.. parsed-literal::
/opt/sissa/apps/intelpython/2022.0.2/intelpython/latest/lib/python3.9/site-packages/scipy/__init__.py:138: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.26.0)
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion} is required for this version of "
Data Generation
---------------
@@ -52,7 +59,7 @@ taken from the author's original reference.
x = torch.tensor(data['x'], dtype=torch.float)[0]
y = torch.tensor(data['y'], dtype=torch.float)[0]
Lets visualize some data
Let's visualize some data
.. code:: ipython3
@@ -66,7 +73,7 @@ Lets visualize some data
.. image:: tutorial_files/tutorial_6_0.png
.. image:: output_6_0.png
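The plotting code itself lies outside this hunk. A hypothetical sketch of what might produce an image like the one above, assuming ``x`` and ``y`` hold one 2D field per sample, is:

.. code:: ipython3

    # Hypothetical sketch: the actual plotting code is not shown in this
    # diff. Assumes x and y each hold 2D fields, one per sample.
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    axes[0].imshow(x[0])
    axes[0].set_title('input')
    axes[1].imshow(y[0])
    axes[1].set_title('output')
    plt.show()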
We now create the neural operator class. It is a very simple class,
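The class body is truncated by this hunk. As an illustration of such a "very simple" operator, here is a hedged sketch in plain ``torch.nn``; it is a hypothetical baseline, not the tutorial's exact code:

.. code:: ipython3

    import torch
    import torch.nn as nn

    class SimpleOperator(nn.Module):
        """Hypothetical baseline, not the tutorial's exact class:
        a pointwise MLP applied independently to every grid value."""

        def __init__(self, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(1, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
                nn.Linear(hidden, 1),
            )

        def forward(self, a):
            # a: (batch, s, s) input field -> (batch, s, s) prediction
            return self.net(a.unsqueeze(-1)).squeeze(-1)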
@@ -107,10 +114,19 @@ training using supervised learning.
.. parsed-literal::
GPU available: False, used: False
/u/n/ndemo/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Missing logger folder: /u/n/ndemo/PINA/tutorials/tutorial5/lightning_logs
2023-10-17 10:41:03.316644: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-10-17 10:41:03.333768: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-10-17 10:41:03.383188: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-17 10:41:07.712785: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
----------------------------------------
@@ -123,21 +139,20 @@ training using supervised learning.
0.002 Total estimated model params size (MB)
.. parsed-literal::
Epoch 99: : 1it [00:00, 15.95it/s, v_num=85, mean_loss=0.105]
.. parsed-literal::
Training: 0it [00:00, ?it/s]
.. parsed-literal::
/u/n/ndemo/.local/lib/python3.9/site-packages/torch/_tensor.py:1386: UserWarning: The use of `x.T` on tensors of dimension other than 2 to reverse their shape is deprecated and it will throw an error in a future release. Consider `x.mT` to transpose batches of matrices or `x.permute(*torch.arange(x.ndim - 1, -1, -1))` to reverse the dimensions of a tensor. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3614.)
ret = func(*args, **kwargs)
`Trainer.fit` stopped: `max_epochs=100` reached.
.. parsed-literal::
Epoch 99: : 1it [00:00, 15.53it/s, v_num=85, mean_loss=0.105]
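The training call itself sits outside these hunks; the logs above come from PINA's Lightning-based trainer. As a rough plain-PyTorch equivalent of the supervised training shown here (a sketch under assumptions, not the tutorial's actual call):

.. code:: ipython3

    # Plain-PyTorch sketch of the supervised training behind the logs
    # above; the tutorial itself uses PINA's Lightning-based Trainer,
    # whose exact invocation is not shown in this diff.
    model = SimpleOperator()          # hypothetical baseline from above
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(100):          # matches max_epochs=100 in the log
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()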
The final loss is pretty high… We can calculate the error by importing
The final loss is pretty high... We can calculate the error by importing
``LpLoss``.
.. code:: ipython3
@@ -157,8 +172,8 @@ The final loss is pretty high… We can calculate the error by importing
.. parsed-literal::
Final error training 56.06%
Final error testing 55.95%
Final error training 56.86%
Final error testing 56.82%
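The ``LpLoss`` computation is elided by the hunk; the percentages above are relative errors, which can be sketched in plain PyTorch (illustrative only, not PINA's ``LpLoss`` signature):

.. code:: ipython3

    def relative_l2_error(pred, true):
        # Relative L2 error in percent, in the spirit of the printed
        # "Final error" values (illustrative; not PINA's LpLoss API).
        num = torch.linalg.vector_norm(pred - true)
        den = torch.linalg.vector_norm(true)
        return 100.0 * num / den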
Solving the problem with a Fourier Neural Operator (FNO)
@@ -191,10 +206,11 @@ operator this approach is better suited, as we shall see.
.. parsed-literal::
GPU available: False, used: False
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
----------------------------------------
@@ -207,22 +223,19 @@ operator this approach is better suited, as we shall see.
2.364 Total estimated model params size (MB)
.. parsed-literal::
Epoch 19: : 1it [00:02, 2.65s/it, v_num=84, mean_loss=0.0294]
Training: 0it [00:00, ?it/s]
.. parsed-literal::
`Trainer.fit` stopped: `max_epochs=20` reached.
.. parsed-literal::
Epoch 19: : 1it [00:02, 2.67s/it, v_num=84, mean_loss=0.0294]
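For context, the core of an FNO layer is a spectral convolution: transform the field to Fourier space, apply learned weights to a truncated set of modes, and transform back. A self-contained illustration of the idea follows; this is a simplified sketch, not PINA's implementation:

.. code:: ipython3

    import torch
    import torch.nn as nn

    class SpectralConv2d(nn.Module):
        """Illustrative FNO building block (not PINA's implementation):
        FFT -> learned weights on the lowest modes -> inverse FFT."""

        def __init__(self, channels, modes):
            super().__init__()
            self.modes = modes
            scale = 1.0 / (channels * channels)
            self.weights = nn.Parameter(
                scale * torch.randn(channels, channels, modes, modes,
                                    dtype=torch.cfloat)
            )

        def forward(self, u):
            # u: (batch, channels, s, s) real-valued field
            u_hat = torch.fft.rfft2(u)                    # to Fourier space
            out = torch.zeros_like(u_hat)
            m = self.modes
            # keep only the lowest m x m modes and mix channels there
            out[:, :, :m, :m] = torch.einsum(
                "bixy,ioxy->boxy", u_hat[:, :, :m, :m], self.weights
            )
            return torch.fft.irfft2(out, s=u.shape[-2:])  # back to the grid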
We can clearly see that with only a fifth of the epochs (20 vs. 100) the loss is lower.
Lets see in testing.. Notice that the number of parameters is way
Let's see how it performs in testing. Notice that the number of parameters is
much higher than in a ``FeedForward`` network; we suggest using a GPU or TPU for
a speed up in training.
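The parameter-count gap is easy to check with standard PyTorch, for example (model variable names here are illustrative, not from the tutorial):

.. code:: ipython3

    # Standard PyTorch trainable-parameter count; ff_model and fno_model
    # are hypothetical names for the two models compared above.
    def n_params(model):
        return sum(p.numel() for p in model.parameters() if p.requires_grad)

    print(f"FeedForward params: {n_params(ff_model):,}")
    print(f"FNO params:         {n_params(fno_model):,}")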
@@ -237,13 +250,13 @@ a speed up in training.
.. parsed-literal::
Final error training 26.05%
Final error testing 25.58%
Final error training 26.19%
Final error testing 25.89%
As we can see, the loss is much lower!
Whats next?
What's next?
------------
We have shown a very simple example of how to use the ``FNO`` for

Binary image files changed (not shown): new image 14 KiB, old image 15 KiB.