{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "48dd2795",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Tutorial: Unstructured convolutional autoencoder via continuous convolution\n",
|
|
"\n",
"[Open in Colab](https://colab.research.google.com/github/mathLab/PINA/blob/master/tutorials/tutorial4/tutorial.ipynb)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "25770254",
|
|
"metadata": {},
|
|
"source": [
|
|
"In this tutorial, we will show how to use the Continuous Convolutional Filter, and how to build common Deep Learning architectures with it. The implementation of the filter follows the original work [*A Continuous Convolutional Trainable Filter for Modelling Unstructured Data*](https://arxiv.org/abs/2210.13416)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "80e8bfac",
|
|
"metadata": {},
|
|
"source": [
|
|
"First of all we import the modules needed for the tutorial:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"id": "5ae7c0e8",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"/home/matte_b/PINA/pina/solvers/__init__.py: DeprecationWarning: 'pina.solvers' is deprecated and will be removed in future versions. Please use 'pina.solver' instead.\n",
|
|
"/home/matte_b/PINA/pina/model/layers/__init__.py: DeprecationWarning: 'pina.model.layers' is deprecated and will be removed in future versions. Please use 'pina.model.block' instead.\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"## routine needed to run the notebook on Google Colab\n",
"try:\n",
"    import google.colab\n",
"    IN_COLAB = True\n",
"except ImportError:\n",
"    IN_COLAB = False\n",
"if IN_COLAB:\n",
"    !pip install \"pina-mathlab\"\n",
"\n",
"import torch\n",
"import matplotlib.pyplot as plt\n",
"plt.style.use('tableau-colorblind10')\n",
"from pina.problem import AbstractProblem\n",
"# 'pina.solvers' is deprecated in favour of 'pina.solver'\n",
"from pina.solver import SupervisedSolver\n",
"from pina.trainer import Trainer\n",
"from pina import Condition, LabelTensor\n",
"# 'pina.model.layers' is deprecated in favour of 'pina.model.block'\n",
"from pina.model.block import ContinuousConvBlock\n",
"import torchvision  # for MNIST dataset\n",
"from pina.model import FeedForward  # for building AE and MNIST classification"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "4094758f",
|
|
"metadata": {},
|
|
"source": [
"The tutorial is structured as follows:\n",
|
|
"* [Continuous filter background](#continuous-filter-background): understand how the convolutional filter works and how to use it.\n",
|
|
"* [Building a MNIST Classifier](#building-a-mnist-classifier): show how to build a simple classifier using the MNIST dataset and how to combine a continuous convolutional layer with a feedforward neural network. \n",
|
|
"* [Building a Continuous Convolutional Autoencoder](#building-a-continuous-convolutional-autoencoder): show how to use the continuous filter to work with unstructured data for autoencoding and up-sampling."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "87327478",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Continuous filter background"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "7f1aa4ef",
|
|
"metadata": {},
|
|
"source": [
|
|
"As reported by the authors in the original paper: in contrast to discrete convolution, continuous convolution is mathematically defined as:\n",
|
|
"\n",
|
|
"$$\n",
|
|
" \\mathcal{I}_{\\rm{out}}(\\mathbf{x}) = \\int_{\\mathcal{X}} \\mathcal{I}(\\mathbf{x} + \\mathbf{\\tau}) \\cdot \\mathcal{K}(\\mathbf{\\tau}) d\\mathbf{\\tau},\n",
|
|
"$$\n",
"where $\\mathcal{K} : \\mathcal{X} \\rightarrow \\mathbb{R}$ is the *continuous filter* function, and $\\mathcal{I} : \\Omega \\subset \\mathbb{R}^N \\rightarrow \\mathbb{R}$ is the input function. The continuous filter function is approximated by a feedforward neural network, so it is learned during training. The integral can be approximated in different ways; **PINA** currently approximates it with a simple sum, as suggested by the authors. Thus, given the points $\\{\\mathbf{x}_i\\}_{i=1}^{n}$ in $\\mathbb{R}^N$ of the input function mapped onto the filter domain $\\mathcal{X}$, we approximate the above equation as:\n",
|
|
"$$\n",
|
|
" \\mathcal{I}_{\\rm{out}}(\\mathbf{\\tilde{x}}_i) = \\sum_{{\\mathbf{x}_i}\\in\\mathcal{X}} \\mathcal{I}(\\mathbf{x}_i + \\mathbf{\\tau}) \\cdot \\mathcal{K}(\\mathbf{x}_i),\n",
|
|
"$$\n",
|
|
"where $\\mathbf{\\tau} \\in \\mathcal{S}$, with $\\mathcal{S}$ the set of available strides, corresponds to the current stride position of the filter, and $\\mathbf{\\tilde{x}}_i$ points are obtained by taking the centroid of the filter position mapped on the $\\Omega$ domain. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "a2ea9c78",
|
|
"metadata": {},
|
|
"source": [
"Let's now see in practice how to work with the filter. From the above definition, we need:\n",
"1. A domain and a function defined on that domain (the input)\n",
"2. A stride, corresponding to the positions where the filter needs to be $\\rightarrow$ `stride` variable in `ContinuousConvBlock`\n",
"3. The filter rectangular domain $\\rightarrow$ `filter_dim` variable in `ContinuousConvBlock`"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ac896875",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Input function\n",
|
|
"\n",
|
|
"The input function for the continuous filter is defined as a tensor of shape: $$[B \\times N_{in} \\times N \\times D]$$ where $B$ is the batch_size, $N_{in}$ is the number of input fields, $N$ the number of points in the mesh, $D$ the dimension of the problem. In particular:\n",
|
|
"* $D$ is the number of spatial variables + 1. The last column must contain the field value. For example for 2D problems $D=3$ and the tensor will be something like `[first coordinate, second coordinate, field value]`\n",
"* $N_{in}$ is the number of components of the vector-valued input function. For example, a function $f = [f_1, f_2]$ has $N_{in}=2$\n",
"\n",
"Let's look at an example to make this concrete. We will be verbose in order to explain the input format in detail. We wish to create the function:\n",
|
|
"$$\n",
|
|
"f(x, y) = [\\sin(\\pi x) \\sin(\\pi y), -\\sin(\\pi x) \\sin(\\pi y)] \\quad (x,y)\\in[0,1]\\times[0,1]\n",
|
|
"$$\n",
|
|
"\n",
|
|
"using a batch size equal to 1."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"id": "447bb133",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Domain has shape: torch.Size([1, 2, 200, 2])\n",
|
|
"Filter input data has shape: torch.Size([1, 2, 200, 3])\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# batch size fixed to 1\n",
|
|
"batch_size = 1\n",
|
|
"\n",
|
|
"# points in the mesh fixed to 200\n",
|
|
"N = 200\n",
|
|
"\n",
|
|
"# vectorial 2 dimensional function, number_input_fields=2\n",
|
|
"number_input_fields = 2\n",
|
|
"\n",
|
|
"# 2 dimensional spatial variables, D = 2 + 1 = 3\n",
|
|
"D = 3\n",
|
|
"\n",
|
|
"# create the function f domain as random 2d points in [0, 1]\n",
|
|
"domain = torch.rand(size=(batch_size, number_input_fields, N, D-1))\n",
|
|
"print(f\"Domain has shape: {domain.shape}\")\n",
|
|
"\n",
|
|
"# create the functions\n",
|
|
"pi = torch.acos(torch.tensor([-1.])) # pi value\n",
|
|
"f1 = torch.sin(pi * domain[:, 0, :, 0]) * torch.sin(pi * domain[:, 0, :, 1])\n",
|
|
"f2 = - torch.sin(pi * domain[:, 1, :, 0]) * torch.sin(pi * domain[:, 1, :, 1])\n",
|
|
"\n",
|
|
"# stacking the input domain and field values\n",
|
|
"data = torch.empty(size=(batch_size, number_input_fields, N, D))\n",
|
|
"data[..., :-1] = domain # copy the domain\n",
|
|
"data[:, 0, :, -1] = f1 # copy first field value\n",
"data[:, 1, :, -1] = f2 # copy second field value\n",
|
|
"print(f\"Filter input data has shape: {data.shape}\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "e93d6afd",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Stride\n",
|
|
"\n",
|
|
"The stride is passed as a dictionary `stride` which tells the filter where to go. Here is an example for the $[0,1]\\times[0,5]$ domain:\n",
|
|
"\n",
|
|
"```python\n",
|
|
"# stride definition\n",
|
|
"stride = {\"domain\": [1, 5],\n",
"          \"start\": [0, 0],\n",
"          \"jump\": [0.1, 0.3],\n",
"          \"direction\": [1, 1],\n",
"          }\n",
|
|
"```\n",
|
|
"This tells the filter:\n",
"1. `domain`: rectangular domain (the only kind implemented) $[0,1]\\times[0,5]$. The minimum value is always zero, while the maximum is specified by the user\n",
|
|
"2. `start`: start position of the filter, coordinate $(0, 0)$\n",
|
|
"3. `jump`: the jumps of the centroid of the filter to the next position $(0.1, 0.3)$\n",
|
|
"4. `direction`: the directions of the jump, with `1 = right`, `0 = no jump`, `-1 = left` with respect to the current position\n",
|
|
"\n",
|
|
"**Note**\n",
|
|
"\n",
"We plan to add the possibility of passing a list of strides directly!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "71c13ef2",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Filter definition\n",
|
|
"\n",
|
|
"Having defined all the previous blocks, we are now able to construct the continuous filter.\n",
|
|
"\n",
|
|
"Suppose we would like to get an output with only one field, and let us fix the filter dimension to be $[0.1, 0.1]$."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"id": "b78c08b8",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"/home/matte_b/.local/lib/python3.12/site-packages/torch/functional.py: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3595.)\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"# filter dim\n",
"filter_dim = [0.1, 0.1]\n",
"\n",
"# stride\n",
"stride = {\"domain\": [1, 1],\n",
"          \"start\": [0, 0],\n",
"          \"jump\": [0.08, 0.08],\n",
"          \"direction\": [1, 1],\n",
"          }\n",
"\n",
"# creating the filter\n",
"cConv = ContinuousConvBlock(input_numb_field=number_input_fields,\n",
"                            output_numb_field=1,\n",
"                            filter_dim=filter_dim,\n",
"                            stride=stride)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "49ccc992",
|
|
"metadata": {},
|
|
"source": [
"That's it! In just one line of code we have created the continuous convolutional filter. By default, a `pina.model.FeedForward` neural network is initialised as the filter; see the [documentation](https://mathlab.github.io/PINA/_rst/fnn.html) for more details. If the mesh does not change during training, we can set the `optimize` flag to `True` to exploit optimizations for finding the points to convolve."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"id": "0fbe67dc",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"# creating the filter + optimization\n",
"cConv = ContinuousConvBlock(input_numb_field=number_input_fields,\n",
"                            output_numb_field=1,\n",
"                            filter_dim=filter_dim,\n",
"                            stride=stride,\n",
"                            optimize=True)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "f99c290e",
|
|
"metadata": {},
|
|
"source": [
|
|
"Let's try to do a forward pass:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"id": "07580a3c",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Filter input data has shape: torch.Size([1, 2, 200, 3])\n",
|
|
"Filter output data has shape: torch.Size([1, 1, 169, 3])\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"print(f\"Filter input data has shape: {data.shape}\")\n",
|
|
"\n",
|
|
"#input to the filter\n",
|
|
"output = cConv(data)\n",
|
|
"\n",
|
|
"print(f\"Filter output data has shape: {output.shape}\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "886cf50f",
|
|
"metadata": {},
|
|
"source": [
"If we don't want to use the default `FeedForward` neural network, we can pass a custom torch model through the `model` keyword as follows:\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"id": "0e234c69",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"class SimpleKernel(torch.nn.Module):\n",
"    def __init__(self) -> None:\n",
"        super().__init__()\n",
"        self.model = torch.nn.Sequential(\n",
"            torch.nn.Linear(2, 20),\n",
"            torch.nn.ReLU(),\n",
"            torch.nn.Linear(20, 20),\n",
"            torch.nn.ReLU(),\n",
"            torch.nn.Linear(20, 1))\n",
"\n",
"    def forward(self, x):\n",
"        return self.model(x)\n",
"\n",
"\n",
"cConv = ContinuousConvBlock(input_numb_field=number_input_fields,\n",
"                            output_numb_field=1,\n",
"                            filter_dim=filter_dim,\n",
"                            stride=stride,\n",
"                            optimize=True,\n",
"                            model=SimpleKernel)\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "2d4318ab",
|
|
"metadata": {},
|
|
"source": [
|
|
"Notice that we pass the class and not an already built object!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "254e8c8d",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Building a MNIST Classifier\n",
|
|
"\n",
"Let's see how we can build a MNIST classifier using a continuous convolutional filter. We will use the MNIST dataset from PyTorch. To keep training times short, we use only 6000 samples for training and 1000 samples for testing."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"id": "6d816e7a",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"from torch.utils.data import DataLoader, SubsetRandomSampler\n",
"\n",
"numb_training = 6000  # get just 6000 images for training\n",
"numb_testing = 1000  # get just 1000 images for testing\n",
"seed = 111  # for reproducibility\n",
"batch_size = 8  # setting batch size\n",
"\n",
"# setting the seed\n",
"torch.manual_seed(seed)\n",
"\n",
"# downloading the dataset\n",
"train_data = torchvision.datasets.MNIST('./data/', train=True, download=True,\n",
"                                        transform=torchvision.transforms.Compose([\n",
"                                            torchvision.transforms.ToTensor(),\n",
"                                            torchvision.transforms.Normalize(\n",
"                                                (0.1307,), (0.3081,))\n",
"                                        ]))\n",
"subsample_train_indices = torch.randperm(len(train_data))[:numb_training]\n",
"train_loader = DataLoader(train_data, batch_size=batch_size,\n",
"                          sampler=SubsetRandomSampler(subsample_train_indices))\n",
"\n",
"test_data = torchvision.datasets.MNIST('./data/', train=False, download=True,\n",
"                                       transform=torchvision.transforms.Compose([\n",
"                                           torchvision.transforms.ToTensor(),\n",
"                                           torchvision.transforms.Normalize(\n",
"                                               (0.1307,), (0.3081,))\n",
"                                       ]))\n",
"# sample the test subset from the test split (not from the training data)\n",
"subsample_test_indices = torch.randperm(len(test_data))[:numb_testing]\n",
"test_loader = DataLoader(test_data, batch_size=batch_size,\n",
"                         sampler=SubsetRandomSampler(subsample_test_indices))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "7f076010",
|
|
"metadata": {},
|
|
"source": [
"Let's now build a simple classifier. The MNIST dataset is composed of tensors of shape `[batch, 1, 28, 28]`, but we can view them as single-field functions where the pixel indices $ij$ are the coordinates $x=i, y=j$ in a $[0, 27]\\times[0,27]$ domain, and the pixel values are the field values. We just need a function that transforms the regular tensor into a tensor compatible with the continuous filter:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"id": "a872fb2d",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Original MNIST image shape: torch.Size([8, 1, 28, 28])\n",
|
|
"Transformed MNIST image shape: torch.Size([8, 1, 784, 3])\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"def transform_input(x):\n",
"    batch_size = x.shape[0]\n",
"    dim_grid = tuple(x.shape[:-3:-1])\n",
"\n",
"    # creating the n dimensional mesh grid for a single channel image\n",
"    values_mesh = [torch.arange(0, dim).float() for dim in dim_grid]\n",
"    # explicit indexing keeps the default behaviour and avoids the torch warning\n",
"    mesh = torch.meshgrid(*values_mesh, indexing=\"ij\")\n",
"    coordinates_mesh = [m.reshape(-1, 1) for m in mesh]\n",
"    coordinates = torch.cat(coordinates_mesh, dim=1).unsqueeze(\n",
"        0).repeat((batch_size, 1, 1)).unsqueeze(1)\n",
"\n",
"    return torch.cat((coordinates, x.flatten(2).unsqueeze(-1)), dim=-1)\n",
"\n",
"\n",
"# let's try it out\n",
"image, s = next(iter(train_loader))\n",
"print(f\"Original MNIST image shape: {image.shape}\")\n",
"\n",
"image_transformed = transform_input(image)\n",
"print(f\"Transformed MNIST image shape: {image_transformed.shape}\")\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "850b45c4",
|
|
"metadata": {},
|
|
"source": [
"We can now build a simple classifier! We will use just one convolutional filter followed by a feedforward neural network."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"id": "889c1592",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"# setting the seed\n",
"torch.manual_seed(seed)\n",
"\n",
"class ContinuousClassifier(torch.nn.Module):\n",
"    def __init__(self):\n",
"        super().__init__()\n",
"\n",
"        # number of classes for classification\n",
"        numb_class = 10\n",
"\n",
"        # convolutional block\n",
"        self.convolution = ContinuousConvBlock(input_numb_field=1,\n",
"                                               output_numb_field=4,\n",
"                                               stride={\"domain\": [27, 27],\n",
"                                                       \"start\": [0, 0],\n",
"                                                       \"jumps\": [4, 4],\n",
"                                                       \"direction\": [1, 1.],\n",
"                                                       },\n",
"                                               filter_dim=[4, 4],\n",
"                                               optimize=True)\n",
"        # feedforward net\n",
"        self.nn = FeedForward(input_dimensions=196,\n",
"                              output_dimensions=numb_class,\n",
"                              layers=[120, 64],\n",
"                              func=torch.nn.ReLU)\n",
"\n",
"    def forward(self, x):\n",
"        # transform input + convolution\n",
"        x = transform_input(x)\n",
"        x = self.convolution(x)\n",
"        # feed forward classification\n",
"        return self.nn(x[..., -1].flatten(1))\n",
"\n",
"\n",
"net = ContinuousClassifier()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "4374c15c",
|
|
"metadata": {},
|
|
"source": [
"Let's try to train it using a simple PyTorch training loop. We train for just 1 epoch using the Adam optimizer with a learning rate of $0.001$."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 10,
|
|
"id": "0446afe0",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"batch [50/750] loss[0.161]\n",
|
|
"batch [100/750] loss[0.073]\n",
|
|
"batch [150/750] loss[0.063]\n",
|
|
"batch [200/750] loss[0.051]\n",
|
|
"batch [250/750] loss[0.044]\n",
|
|
"batch [300/750] loss[0.050]\n",
|
|
"batch [350/750] loss[0.053]\n",
|
|
"batch [400/750] loss[0.049]\n",
|
|
"batch [450/750] loss[0.046]\n",
|
|
"batch [500/750] loss[0.034]\n",
|
|
"batch [550/750] loss[0.036]\n",
|
|
"batch [600/750] loss[0.040]\n",
|
|
"batch [650/750] loss[0.028]\n",
|
|
"batch [700/750] loss[0.040]\n",
|
|
"batch [750/750] loss[0.040]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"# setting the seed\n",
"torch.manual_seed(seed)\n",
"\n",
"# optimizer and loss function\n",
"optimizer = torch.optim.Adam(net.parameters(), lr=0.001)\n",
"criterion = torch.nn.CrossEntropyLoss()\n",
"\n",
"for epoch in range(1):  # a single pass over the training subset\n",
"\n",
"    running_loss = 0.0\n",
"    for i, data in enumerate(train_loader, 0):\n",
"        # get the inputs; data is a list of [inputs, labels]\n",
"        inputs, labels = data\n",
"\n",
"        # zero the parameter gradients\n",
"        optimizer.zero_grad()\n",
"\n",
"        # forward + backward + optimize\n",
"        outputs = net(inputs)\n",
"        loss = criterion(outputs, labels)\n",
"        loss.backward()\n",
"        optimizer.step()\n",
"\n",
"        # print statistics\n",
"        running_loss += loss.item()\n",
"        if i % 50 == 49:\n",
"            # report the mean loss over the last 50 mini-batches\n",
"            print(\n",
"                f'batch [{i + 1}/{numb_training//batch_size}] loss[{running_loss / 50:.3f}]')\n",
"            running_loss = 0.0\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "47fa3d0e",
|
|
"metadata": {},
|
|
"source": [
|
|
"Let's see the performance on the test set!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"id": "b54638c1",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Accuracy of the network on the 1000 test images: 92.733%\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"correct = 0\n",
"total = 0\n",
"with torch.no_grad():\n",
"    for data in test_loader:\n",
"        images, labels = data\n",
"        # calculate outputs by running images through the network\n",
"        outputs = net(images)\n",
"        # the class with the highest energy is what we choose as prediction\n",
"        _, predicted = torch.max(outputs.data, 1)\n",
"        total += labels.size(0)\n",
"        correct += (predicted == labels).sum().item()\n",
"\n",
"print(\n",
"    f'Accuracy of the network on the 1000 test images: {(correct / total):.3%}')\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "25cf2878",
|
|
"metadata": {},
|
|
"source": [
|
|
"As we can see we have very good performance for having trained only for 1 epoch! Nevertheless, we are still using structured data... Let's see how we can build an autoencoder for unstructured data now."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "3ce758e9",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Building a Continuous Convolutional Autoencoder\n",
|
|
"\n",
"As a toy problem, we will now build an autoencoder for the function $f(x,y)=\\sin(\\pi x)\\sin(\\pi y)$ on a circular domain of radius $0.5$ centered at $(0.5, 0.5)$. We will also see how the trained model can up-sample the results without retraining. Let's first create the input and visualize it, starting with a mesh of $500$ points."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"id": "6ca0e929",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"image/png": "",
|
|
"text/plain": [
|
|
"<Figure size 640x480 with 2 Axes>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
}
|
|
],
|
|
"source": [
"# create inputs\n",
"def circle_grid(N=100):\n",
"    \"\"\"Generate points within a 2D circle of radius 0.5 centered in (0.5, 0.5)\n",
"\n",
"    :param N: number of points\n",
"    :type N: int\n",
"    :return: [x, y] array of points\n",
"    :rtype: torch.Tensor\n",
"    \"\"\"\n",
"\n",
"    PI = torch.acos(torch.zeros(1)).item() * 2\n",
"    R = 0.5\n",
"    centerX = 0.5\n",
"    centerY = 0.5\n",
"\n",
"    # sqrt of a uniform sample gives points uniformly distributed over the disk\n",
"    r = R * torch.sqrt(torch.rand(N))\n",
"    theta = torch.rand(N) * 2 * PI\n",
"\n",
"    x = centerX + r * torch.cos(theta)\n",
"    y = centerY + r * torch.sin(theta)\n",
"\n",
"    return torch.stack([x, y]).T\n",
"\n",
"# create the grid\n",
"grid = circle_grid(500)\n",
"\n",
"# create input\n",
"input_data = torch.empty(size=(1, 1, grid.shape[0], 3))\n",
"input_data[0, 0, :, :-1] = grid\n",
"input_data[0, 0, :, -1] = torch.sin(pi * grid[:, 0]) * torch.sin(pi * grid[:, 1])\n",
"\n",
"# visualize data\n",
"plt.title(\"Training sample with 500 points\")\n",
"plt.scatter(grid[:, 0], grid[:, 1], c=input_data[0, 0, :, -1])\n",
"plt.colorbar()\n",
"plt.show()\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ab6f5987",
|
|
"metadata": {},
|
|
"source": [
"Let's now build a simple autoencoder using the continuous convolutional filter. The data is clearly unstructured, and a standard discrete convolutional filter might not work without projecting or interpolating first. Let's first build an `Encoder` and a `Decoder` class, and then an `Autoencoder` class that contains both."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 13,
|
|
"id": "13e8ae0e",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"class Encoder(torch.nn.Module):\n",
"    def __init__(self, hidden_dimension):\n",
"        super().__init__()\n",
"\n",
"        # convolutional block\n",
"        self.convolution = ContinuousConvBlock(input_numb_field=1,\n",
"                                               output_numb_field=2,\n",
"                                               stride={\"domain\": [1, 1],\n",
"                                                       \"start\": [0, 0],\n",
"                                                       \"jumps\": [0.05, 0.05],\n",
"                                                       \"direction\": [1, 1.],\n",
"                                                       },\n",
"                                               filter_dim=[0.15, 0.15],\n",
"                                               optimize=True)\n",
"        # feedforward net\n",
"        self.nn = FeedForward(input_dimensions=400,\n",
"                              output_dimensions=hidden_dimension,\n",
"                              layers=[240, 120])\n",
"\n",
"    def forward(self, x):\n",
"        # convolution\n",
"        x = self.convolution(x)\n",
"        # feed forward pass\n",
"        return self.nn(x[..., -1])\n",
"\n",
"\n",
"class Decoder(torch.nn.Module):\n",
"    def __init__(self, hidden_dimension):\n",
"        super().__init__()\n",
"\n",
"        # convolutional block\n",
"        self.convolution = ContinuousConvBlock(input_numb_field=2,\n",
"                                               output_numb_field=1,\n",
"                                               stride={\"domain\": [1, 1],\n",
"                                                       \"start\": [0, 0],\n",
"                                                       \"jumps\": [0.05, 0.05],\n",
"                                                       \"direction\": [1, 1.],\n",
"                                                       },\n",
"                                               filter_dim=[0.15, 0.15],\n",
"                                               optimize=True)\n",
"        # feedforward net\n",
"        self.nn = FeedForward(input_dimensions=hidden_dimension,\n",
"                              output_dimensions=400,\n",
"                              layers=[120, 240])\n",
"\n",
"    def forward(self, weights, grid):\n",
"        # feed forward pass\n",
"        x = self.nn(weights)\n",
"        # transpose convolution\n",
"        return torch.sigmoid(self.convolution.transpose(x, grid))\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "eb097e34",
|
|
"metadata": {},
|
|
"source": [
"Very good! Notice that in the `forward` pass of the `Decoder` class we have used the `.transpose()` method of the `ContinuousConvBlock` class. This method accepts the `weights` for upsampling and the `grid` on which to upsample. We apply the sigmoid to the output since the field values lie in $[0, 1]$. Let's now build the autoencoder! We set the hidden dimension through the `hidden_dimension` variable."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"id": "a4db89a7",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"class Autoencoder(torch.nn.Module):\n",
"    def __init__(self, hidden_dimension=10):\n",
"        super().__init__()\n",
"\n",
"        self.encoder = Encoder(hidden_dimension)\n",
"        self.decoder = Decoder(hidden_dimension)\n",
"\n",
"    def forward(self, x):\n",
"        # saving grid for later upsampling\n",
"        grid = x.clone().detach()\n",
"        # encoder\n",
"        weights = self.encoder(x)\n",
"        # decoder\n",
"        out = self.decoder(weights, grid)\n",
"        return out\n",
"\n",
"net = Autoencoder()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "2df482a7",
|
|
"metadata": {},
|
|
"source": [
"Let's now train the autoencoder, minimizing the mean squared error loss and optimizing with Adam. We use the `SupervisedSolver` as the solver, and the problem is a simple one created by inheriting from `AbstractProblem`. Training takes approximately two minutes on CPU."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 15,
|
|
"id": "700a7cf3",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"GPU available: False, used: False\n",
|
|
"TPU available: False, using: 0 TPU cores\n",
|
|
"HPU available: False, using: 0 HPUs\n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Epoch 149: 100%|██████████| 1/1 [00:01<00:00, 0.75it/s, v_num=16, data_loss=0.0341, val_loss=0.0341, train_loss=0.0341]"
|
|
]
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"`Trainer.fit` stopped: `max_epochs=150` reached.\n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Epoch 149: 100%|██████████| 1/1 [00:01<00:00, 0.75it/s, v_num=16, data_loss=0.0341, val_loss=0.0341, train_loss=0.0341]\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
"# define the problem\n",
"class CircleProblem(AbstractProblem):\n",
"    input_variables = ['x', 'y', 'f']\n",
"    output_variables = input_variables\n",
"    # the autoencoder is trained to reproduce its own input\n",
"    conditions = {'data': Condition(input_points=LabelTensor(input_data, input_variables),\n",
"                                    output_points=LabelTensor(input_data, output_variables))}\n",
"\n",
"# define the solver\n",
"solver = SupervisedSolver(problem=CircleProblem(), model=net, loss=torch.nn.MSELoss(), use_lt=True)\n",
"\n",
"# train on CPU and skip the model summary at the beginning of training (optional)\n",
"trainer = Trainer(solver, max_epochs=150, accelerator='cpu', enable_model_summary=False)\n",
"trainer.train()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "a98ffb20",
|
|
"metadata": {},
|
|
"source": [
|
|
"Let's visualize the two solutions side by side!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 16,
|
|
"id": "0269fedf",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
"net.eval()\n",
"\n",
"# get output and detach from computational graph for plotting\n",
"output = net(input_data).detach()\n",
"\n",
"# visualize data\n",
"fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))\n",
"pic1 = axes[0].scatter(grid[:, 0], grid[:, 1], c=input_data[0, 0, :, -1])\n",
"axes[0].set_title(\"Real\")\n",
"fig.colorbar(pic1, ax=axes[0])\n",
"pic2 = axes[1].scatter(grid[:, 0], grid[:, 1], c=output[0, 0, :, -1])\n",
"axes[1].set_title(\"Autoencoder\")\n",
"fig.colorbar(pic2, ax=axes[1])\n",
"plt.tight_layout()\n",
"plt.show()\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "206141f9",
|
|
"metadata": {},
|
|
"source": [
|
|
"As we can see, the two solutions are really similar! We can compute the $l_2$ error quite easily as well:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 17,
|
|
"id": "ded8f91b",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"l2 error: 4.25%\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"def l2_error(input_, target):\n",
"    return torch.linalg.norm(input_-target, ord=2)/torch.linalg.norm(input_, ord=2)\n",
|
|
"\n",
|
|
"\n",
|
|
"print(f'l2 error: {l2_error(input_data[0, 0, :, -1], output[0, 0, :, -1]):.2%}')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "c30996c4",
|
|
"metadata": {},
|
|
"source": [
"About $4\\%$ in $l_2$ error, which is really low considering that we use just **one** convolutional layer and a simple feedforward network to reduce the dimension. Let's now look at some peculiarities of the filter."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "f76db3b5",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Filter for upsampling\n",
|
|
"\n",
"Suppose we already have the hidden representation and we want to upsample on a different grid with more points. Let's see how to do it:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 18,
|
|
"id": "fcbbaec6",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# setting the seed\n",
|
|
"torch.manual_seed(seed)\n",
|
|
"\n",
|
|
"grid2 = circle_grid(1500) # triple number of points\n",
|
|
"input_data2 = torch.zeros(size=(1, 1, grid2.shape[0], 3))\n",
|
|
"input_data2[0, 0, :, :-1] = grid2\n",
|
|
"input_data2[0, 0, :, -1] = torch.sin(pi *\n",
|
|
" grid2[:, 0]) * torch.sin(pi * grid2[:, 1])\n",
|
|
"\n",
|
|
"# get the hidden representation from original input\n",
|
|
"latent = net.encoder(input_data)\n",
|
|
"\n",
|
|
"# upsample on the second input_data2\n",
|
|
"output = net.decoder(latent, input_data2).detach()\n",
|
|
"\n",
|
|
"# show the picture\n",
|
|
"#fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))\n",
|
|
"#pic1 = axes[0].scatter(grid2[:, 0], grid2[:, 1], c=input_data2[0, 0, :, -1])\n",
|
|
"#axes[0].set_title(\"Real\")\n",
|
|
"#fig.colorbar(pic1)\n",
|
|
"#plt.subplot(1, 2, 2)\n",
|
|
"#pic2 = axes[1].scatter(grid2[:, 0], grid2[:, 1], c=output[0, 0, :, -1])\n",
|
|
"# axes[1].set_title(\"Up-sampling\")\n",
|
|
"#fig.colorbar(pic2)\n",
|
|
"#plt.tight_layout()\n",
|
|
"#plt.show()\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "2cbf14b5",
|
|
"metadata": {},
|
|
"source": [
|
|
"As we can see we have a very good approximation of the original function, even thought some noise is present. Let's calculate the error now:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 19,
|
|
"id": "ab505b75",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"l2 error: 8.38%\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"print(f'l2 error: {l2_error(input_data2[0, 0, :, -1], output[0, 0, :, -1]):.2%}')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "465cbd16",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Autoencoding at different resolutions\n",
|
|
"In the previous example we already had the hidden representation (of the original input) and we used it to upsample. Sometimes however we could have a finer mesh solution and we would simply want to encode it. This can be done without retraining! This procedure can be useful in case we have many points in the mesh and just a smaller part of them are needed for training. Let's see the results of this:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 20,
|
|
"id": "75ed28f5",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"l2 error: 8.50%\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"# setting the seed\n",
|
|
"torch.manual_seed(seed)\n",
|
|
"\n",
|
|
"grid2 = circle_grid(3500) # very fine mesh\n",
|
|
"input_data2 = torch.zeros(size=(1, 1, grid2.shape[0], 3))\n",
|
|
"input_data2[0, 0, :, :-1] = grid2\n",
|
|
"input_data2[0, 0, :, -1] = torch.sin(pi *\n",
|
|
" grid2[:, 0]) * torch.sin(pi * grid2[:, 1])\n",
|
|
"\n",
|
|
"# get the hidden representation from finer mesh input\n",
|
|
"latent = net.encoder(input_data2)\n",
|
|
"\n",
|
|
"# upsample on the second input_data2\n",
|
|
"output = net.decoder(latent, input_data2).detach()\n",
|
|
"\n",
|
|
"# show the picture\n",
|
|
"#fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))\n",
|
|
"#pic1 = axes[0].scatter(grid2[:, 0], grid2[:, 1], c=input_data2[0, 0, :, -1])\n",
|
|
"#axes[0].set_title(\"Real\")\n",
|
|
"#fig.colorbar(pic1)\n",
|
|
"#plt.subplot(1, 2, 2)\n",
|
|
"#pic2 = axes[1].scatter(grid2[:, 0], grid2[:, 1], c=output[0, 0, :, -1])\n",
|
|
"#axes[1].set_title(\"Autoencoder not re-trained\")\n",
|
|
"#fig.colorbar(pic2)\n",
|
|
"#plt.tight_layout()\n",
|
|
"#plt.show()\n",
|
|
"\n",
|
|
"# calculate l2 error\n",
|
|
"print(f'l2 error: {l2_error(input_data2[0, 0, :, -1], output[0, 0, :, -1]):.2%}')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "8e720e55",
|
|
"metadata": {},
|
|
"source": [
|
|
"## What's next?\n",
|
|
"\n",
|
|
"We have shown the basic usage of a convolutional filter. There are additional extensions possible:\n",
|
|
"\n",
|
|
"1. Train using Physics Informed strategies\n",
|
|
"\n",
|
|
"2. Use the filter to build an unstructured convolutional autoencoder for reduced order modelling\n",
|
|
"\n",
|
|
"3. Many more..."
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.12.3"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|