export tutorials changed in 1d7f9ed (#679)

Co-authored-by: dario-coscia <dario-coscia@users.noreply.github.com>
This commit is contained in:
github-actions[bot]
2025-10-21 11:07:29 +02:00
committed by GitHub
parent 1d7f9ed903
commit 42e993317f
3 changed files with 184 additions and 202 deletions


@@ -2,15 +2,15 @@
# coding: utf-8
# # Tutorial: Data-driven System Identification with SINDy
#
# [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mathLab/PINA/blob/master/tutorials/tutorial23/tutorial.ipynb)
#
#
# > ##### ⚠️ ***Before starting:***
# > We assume you are already familiar with the concepts covered in the [Getting started with PINA](https://mathlab.github.io/PINA/_tutorial.html#getting-started-with-pina) tutorial. If not, we strongly recommend reviewing them before exploring this advanced topic.
#
# In this tutorial, we will demonstrate a typical use case of **PINA** for data-driven system identification using SINDy. The tutorial is largely inspired by the paper [Discovering governing equations from data by sparse identification of nonlinear dynamical systems](https://doi.org/10.1073/pnas.1517384113).
#
# Let's start by importing the useful modules:
# In[ ]:
@@ -48,16 +48,16 @@ from pina.model import SINDy
# $$\dot{\boldsymbol{x}}(t)=\boldsymbol{f}(\boldsymbol{x}(t)),$$
# along with suitable initial conditions.
# For simplicity, we'll omit the argument of $\boldsymbol{x}$ from this point onward.
#
# Since $\boldsymbol{f}$ is unknown, we want to model it.
# While neural networks could be used to find an expression for $\boldsymbol{f}$, in certain contexts - for instance, to perform long-horizon forecasting - it might be useful to have an **explicit** set of equations describing it, which would also allow for a better degree of **interpretability** of our model.
#
# As a result, we use SINDy (introduced in [this paper](https://www.pnas.org/doi/full/10.1073/pnas.1517384113)), which we'll describe later on.
# Now, instead, we describe the system that is going to be considered in this tutorial: the **Lorenz** system.
#
# The Lorenz system is a set of three ordinary differential equations and is a simplified model of atmospheric convection.
# It is well-known because it can exhibit chaotic behavior, _i.e._, for given values of the parameters, solutions are highly sensitive to small perturbations in the initial conditions, making forecasting extremely challenging.
#
# Mathematically speaking, we can write the Lorenz equations as
# $$
# \begin{cases}
@@ -67,9 +67,9 @@ from pina.model import SINDy
# \end{cases}
# $$
# With $\sigma = 10,\, \rho = 28$, and $\beta=8/3$, the solutions trace out the famous butterfly-shaped Lorenz attractor.
#
# With the following lines of code, we just generate the dataset for SINDy and plot some trajectories.
#
# **Disclaimer**: of course, here we use the equations defining the Lorenz system just to generate the data.
# If we had access to the dynamical term $\boldsymbol{f}$, there would be no need to use SINDy.
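# As a concrete sketch of how such data could be generated (not the tutorial's exact code; all names and values here are illustrative), one can integrate the Lorenz equations with `scipy.integrate.solve_ivp`:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Lorenz parameters from the tutorial
sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0

def lorenz(t, state):
    # Right-hand side f(x) of the Lorenz system
    x, y, z = state
    return [sigma * (y - x), x * (rho - y) - z, x * y - beta * z]

# Integrate one trajectory from an arbitrary initial condition
t_eval = np.linspace(0.0, 10.0, 1001)
sol = solve_ivp(lorenz, (0.0, 10.0), [1.0, 1.0, 1.0],
                t_eval=t_eval, rtol=1e-9, atol=1e-9)
X = sol.y.T  # snapshot matrix, shape (n_times, n_states)
```

# Stacking several such trajectories (one per initial condition) yields the snapshot matrix used below.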
@@ -126,24 +126,24 @@ plot_n_conditions(X, n_ic_s)
# \dot{x}_i = f_i(\boldsymbol{x}) = \sum_{k}\Theta(\boldsymbol{x})_{k}\xi_{k,i},
# $$
# with $\boldsymbol{\xi}_i\in\mathbb{R}^r$ a vector of **coefficients** telling us which terms are active in the expression of $f_i$.
#
# Since we are in a supervised setting, we assume that we have at our disposal the snapshot matrix $\boldsymbol{X}$ and a matrix $\dot{\boldsymbol{X}}$ containing time **derivatives** at the corresponding time instances.
# Then, we can just impose that the previous relation holds on the data at our disposal.
# That is, our optimization problem will read as follows:
# $$
# \min_{\boldsymbol{\Xi}}\|\dot{\boldsymbol{X}}-\Theta(\boldsymbol{X})\boldsymbol{\Xi}\|_2^2.
# $$
#
# Notice, however, that the solution to the previous equation might not be **sparse**, as there might be many non-zero terms in it.
# In practice, many physical systems are described by a parsimonious and **interpretable** set of equations.
# Thus, we also impose an $L^1$ **penalization** on the model weights, encouraging them to be small in magnitude and helping to enforce sparsity.
# The final loss is then expressed as
#
# $$
# \min_{\boldsymbol{\Xi}}\bigl(\|\dot{\boldsymbol{X}}-\Theta(\boldsymbol{X})\boldsymbol{\Xi}\|_2^2 + \lambda\|\boldsymbol{\Xi}\|_1\bigr),
# $$
# with $\lambda\in\mathbb{R}^+$ a hyperparameter.
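# As a side note, the original SINDy paper tackles sparsity with *sequentially thresholded least squares* rather than an explicit $L^1$ term. A minimal NumPy sketch of that algorithm (function name and defaults are illustrative, not PINA's API):

```python
import numpy as np

def stlsq(Theta, dXdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: solve Theta @ Xi ~ dXdt,
    zeroing out coefficients smaller than `threshold` and refitting."""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        # Refit each state dimension using only its active library terms
        for j in range(dXdt.shape[1]):
            big = ~small[:, j]
            if big.any():
                Xi[big, j] = np.linalg.lstsq(Theta[:, big], dXdt[:, j],
                                             rcond=None)[0]
    return Xi
```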
#
# Let us begin by computing the time derivatives of the data.
# Of course, usually we do not have access to the exact time derivatives of the system, meaning that $\dot{\boldsymbol{X}}$ needs to be **approximated**.
# Here we do it using a simple Finite Difference (FD) scheme, but [more sophisticated ideas](https://arxiv.org/abs/2505.16058) could be considered.
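# For illustration, such an FD approximation can be obtained with `np.gradient` (the trajectory below is a toy stand-in, not the Lorenz data):

```python
import numpy as np

# Toy trajectory X with shape (n_times, n_states), sampled at uniform step dt
dt = 0.01
t = np.arange(0.0, 1.0, dt)
X = np.stack([np.sin(t), np.cos(t)], axis=1)

# Second-order central differences in the interior, one-sided at the ends
dXdt = np.gradient(X, dt, axis=0)
```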
@@ -195,9 +195,9 @@ function_library = [
# ## Training with PINA
# We are now ready to train our model! We can use **PINA** to train the model, following the workflow from previous tutorials.
# First, we need to define the problem. In this case, we will use the [`SupervisedProblem`](https://mathlab.github.io/PINA/_rst/problem/zoo/supervised_problem.html#module-pina.problem.zoo.supervised_problem), which expects:
#
# - **Input**: the state variables tensor $\boldsymbol{X}$ containing all the collected snapshots.
# - **Output**: the corresponding time derivatives $\dot{\boldsymbol{X}}$.
# In[ ]:
@@ -210,7 +210,7 @@ problem = SupervisedProblem(X_torch, dXdt_torch)
# Finally, we will use the `SupervisedSolver` to perform the training as we're dealing with a supervised problem.
#
# Recall that we should use $L^1$-regularization on the model's weights to ensure sparsity. For ease of implementation, we adopt $L^2$ regularization, which is less common in the SINDy literature but will suffice in our case.
# Additionally, more refined strategies could be used, for instance pruning coefficients below a certain **threshold** at every fixed number of epochs, but here we avoid further complications.
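# In plain PyTorch, this $L^2$ penalty is what optimizers expose as `weight_decay`; a minimal sketch on synthetic data (all names and values illustrative, not PINA's exact training loop):

```python
import torch

torch.manual_seed(0)

# Synthetic regression target: dXdt = Theta @ Xi_true with a sparse Xi_true
Theta = torch.randn(100, 5)
Xi_true = torch.tensor([[2.0], [0.0], [0.0], [-1.5], [0.0]])
dXdt = Theta @ Xi_true

# weight_decay adds an L2 penalty on the coefficients to the update
Xi = torch.zeros(5, 1, requires_grad=True)
opt = torch.optim.Adam([Xi], lr=1e-2, weight_decay=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = torch.mean((Theta @ Xi - dXdt) ** 2)
    loss.backward()
    opt.step()
```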
@@ -246,11 +246,11 @@ trainer.train()
# Now we'll print the identified equations and compare them with the original ones.
#
# Before going on, we underline that after training there might be many coefficients that are small, yet still non-zero.
# It is common for SINDy practitioners to interpret these coefficients as noise in the model and prune them.
# This is typically done by fixing a threshold $\tau\in\mathbb{R}^+$ and setting to $0$ all those $\xi_{i,j}$ such that $|\xi_{i,j}|<\tau$.
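# A minimal sketch of such thresholding (the tensor values are illustrative):

```python
import torch

tau = 0.05  # pruning threshold, a hyperparameter
Xi = torch.tensor([[1.98, 0.003],
                   [-0.01, -2.66]])
with torch.no_grad():
    # Zero out all coefficients with magnitude below tau
    Xi[Xi.abs() < tau] = 0.0
```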
#
# In the following cell, we also define a function to print the identified model.
# In[ ]:
@@ -299,9 +299,9 @@ with torch.no_grad(): # prune coefficients
# \dot{z}=-\frac{8}{3} z+xy.
# \end{cases}
# $$
#
# That's a good result, especially considering that we did not perform tuning on the weight decay hyperparameter $\lambda$ and did not really care much about other optimization parameters.
#
# Let's plot a few trajectories!
# In[ ]:
@@ -328,14 +328,14 @@ plot_n_conditions(X_sim, n_ic_s_test)
# Great! We can see that the qualitative behavior of the system is really close to the real one.
#
# ## What's next?
# Congratulations on completing the introductory tutorial on **Data-driven System Identification with SINDy**! Now that you have a solid foundation, here are a few directions to explore:
#
# 1. **Experiment with Dimensionality Reduction techniques** — Try combining SINDy with different reduction techniques such as POD or autoencoders - or both, as done [here](https://www.sciencedirect.com/science/article/abs/pii/S0045793025003019).
#
# 2. **Study Parameterized Systems** — Write your own SINDy model for parameterized problems.
#
# 3. **...and many more!** — The possibilities are vast! Continue experimenting with advanced configurations, solvers, and features in PINA.
#
# For more resources and tutorials, check out the [PINA Documentation](https://mathlab.github.io/PINA/).