
Physics-Informed Neural Networks for Inverse PDE Problems

A Physics-Informed Neural Network (PINN) feels a lot like giving a regular neural network a cheat sheet. Without the cheat sheet, we could estimate solutions to a physical system using a plain neural network: position (x) and time (t) as inputs, temperature (u) as output. With sufficient data, this works. However, it ignores the physics we already know about the system. We expect temperatures to follow the dynamics of the heat equation, and we would like to build that knowledge into the neural network.

PINNs offer a way to combine known physics about a system with neural network estimation. This is achieved by pairing automatic differentiation with a physics-based loss function. As a result, we can obtain better results with less data.

Agenda

  • Provide an interpretation of the heat equation
  • Simulate temperature data using the heat equation
  • Code a solution for thermal diffusivity κ and heat source q(x,t) using DeepXDE
  • Explain the difference between forward and inverse problems in PDE theory

This is the data we will be working with. Let’s pretend we used sensors to collect temperatures of a 1-meter rod over 5 seconds.

Graph of Position (x) and Time (t) with respect to Temperature (u).
Illustration by Author
The Heat Equation
Illustration by Author

In a nutshell, PINNs provide a new way to approximate solutions to physics equations (ODEs, PDEs, SDEs) by combining data from the underlying system with the governing physics equation.

Interpreting the Heat Equation

Inverse problem terms in the heat equation.
Illustration by Author
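
Written out in one spatial dimension, the equation reads:

\[
\frac{\partial u}{\partial t} = \kappa \frac{\partial^2 u}{\partial x^2} + q(x, t)
\]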

The partial derivative on the left represents how temperature changes with time; it is a function of position (x) and time (t). On the right, q(x, t) represents heat entering the system, our Bunsen burner heating the rod. The middle term describes how the heat at a point changes depending on the surrounding points: the second spatial derivative (∂²u/∂x² in 1D, or ∇²u in higher dimensions) captures diffusion, the natural tendency for heat to flow from hot regions to cold regions until the temperatures equalize.

This term is multiplied by the thermal diffusivity (κ), which depends on the material’s properties. We would expect something conductive, like metals, to heat up faster. When ∇²u is positive, the temperature at that point is lower than the average of its neighbours, so heat tends to flow into the point. When ∇²u is negative, the point is hotter than its surroundings, and heat tends to flow out. When ∇²u is zero, the point is in thermal equilibrium with its immediate neighbourhood.

In the image below, the top of our function could represent a very hot point. Notice how the Laplacian is negative, indicating that heat will flow out of this hot point to cooler surrounding points. The Laplacian is a measure of curvature around a point. In the heat equation, that is the curvature of the temperature profile.

A Gaussian Bump, and the Laplacian of a Gaussian Bump.
Illustration by Author
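
To make the sign argument concrete, here is a quick numerical check: sample a Gaussian bump on a grid, apply the same central-difference stencil used in the simulation below, and the Laplacian comes out negative at the hot peak and positive in the cooler tails.

import numpy as np

# Gaussian bump sampled on a 1D grid
x = np.linspace(-3, 3, 301)
dx = x[1] - x[0]
u = np.exp(-x**2)

# Central-difference approximation of the second derivative (the 1D Laplacian)
lap = np.full_like(u, np.nan)
lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2

peak = np.argmax(u)  # index of the hottest point (x = 0)
print(f"Laplacian at the peak (x = {x[peak]:.1f}): {lap[peak]:.3f}")  # about -2: heat flows out
print(f"Laplacian in the tail (x = {x[75]:.1f}): {lap[75]:.3f}")      # positive: heat flows in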

Generating the data

I must admit, I didn’t actually burn a rod and measure its temperature changes over time. I simulated the data with an explicit finite-difference solution of the heat equation. This is the code used to simulate the data; all of it can be found on my GitHub.

# --- Generating Data ---
import numpy as np
L = 1.0  # Rod Length (m)
Nx = 51  # Number of spatial points
dx = L / (Nx - 1)  # Spatial step
T_total = 5.0  # Total time (s)
Nt = 5000  # Number of time steps
dt = T_total / Nt  # Time step
kappa = 0.01  # Thermal diffusivity (m^2/s)
q = 1.0  # Constant heat source term (C/s)

u = np.zeros(Nx)
x_coords = np.linspace(0, L, Nx)
temperature_data_raw = []

header = ["Time (s)"] + [f"x={x:.2f}m" for x in x_coords]
temperature_data_raw.append(header)
temperature_data_raw.append([0.0] + u.tolist())

for n in range(1, Nt + 1):
    u_new = np.copy(u)
    # Explicit (forward-in-time, centred-in-space) update of the interior points
    for i in range(1, Nx - 1):
        u_new[i] = u[i] + dt * (kappa * (u[i+1] - 2*u[i] + u[i-1]) / (dx**2) + q)
    # Fixed (Dirichlet) boundary temperatures at both ends of the rod
    u_new[0] = 0.0
    u_new[Nx-1] = 0.0
    u = u_new
    # Record every 50th step (plus the final step) to keep the dataset small
    if n % 50 == 0 or n == Nt:
        temperature_data_raw.append([n * dt] + u.tolist())
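
One detail worth flagging about this explicit scheme: it is numerically stable only when κΔt/Δx² ≤ 1/2. With the values above,

\[
\frac{\kappa \, \Delta t}{\Delta x^{2}} = \frac{0.01 \times 0.001}{0.02^{2}} = 0.025 \le \frac{1}{2},
\]

so the simulation stays well-behaved.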

To generate this data, we used κ = 0.01 and q = 1, but only x, t, and u will be used to estimate κ and q. In other words, we pretend not to know κ and q and try to recover them from x, t, and u alone. The surface below is 3-dimensional, but it represents the temperature of a 1-dimensional rod over time.

Temperature surface of the rod over position and time. Illustration by Author
Dataframe of the simulated data. Illustration by Author
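
The next section assumes the simulated rows have been collected into a pandas DataFrame called df. Here is a minimal sketch of that step (the CSV file name is illustrative; the exact loading code is in the GitHub repo):

import pandas as pd

# temperature_data_raw is the list built in the simulation loop above:
# first row is the header, remaining rows are [time, u(x_0), ..., u(x_{Nx-1})]
df = pd.DataFrame(temperature_data_raw[1:], columns=temperature_data_raw[0])

# Optionally round-trip through CSV, as you would with real sensor data
df.to_csv("heat_data.csv", index=False)
df = pd.read_csv("heat_data.csv")
print(df.head())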

Arranging and splitting data

Here, we rearrange the data into columns for position (x), time (t), and temperature (u_val), separate them into X and Y, and then split them into training and testing sets.

# --- Prepare (x, t, u) triplet data ---
# df is the DataFrame of simulated temperatures loaded above
data_triplets = []
for _, row in df.iterrows():
    t = row["Time (s)"]
    for col in df.columns[1:]:
        x = float(col.split('=')[1][:-1])  # parse the position from headers like "x=0.02m"
        u_val = row[col]
        data_triplets.append([x, t, u_val])
data_array = np.array(data_triplets)

X_data = data_array[:, 0:2]  # inputs: position (x), time (t)
y_data = data_array[:, 2:3]  # target: temperature (u)
X data: position (x) and time (t); Y data: temperature (u).
Illustration by Author

We hold out 20% of the data as a test set.

# --- Train/test split ---
from sklearn.model_selection import train_test_split
x_train, x_test, u_train, u_test = train_test_split(X_data, y_data, test_size=0.2, random_state=42)

Because our PINN receives position (x) and time (t) as inputs, it can use automatic differentiation and the chain rule to compute the following partial derivatives.

\[
\frac{\partial u}{\partial t}, \quad
\frac{\partial u}{\partial x}, \quad
\frac{\partial^2 u}{\partial x^2}, \quad
\nabla^2 u
\]

Finding the constants then becomes a matter of adjusting κ and q(x, t) to minimize the PDE residual in the loss function.
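
To see how those derivatives come out of automatic differentiation, here is a minimal standalone sketch with plain TensorFlow (an illustration of the mechanism, not the code DeepXDE runs internally): a tiny network takes (x, t) and nested tf.GradientTape calls give ∂u/∂t and ∂²u/∂x².

import tensorflow as tf

# A tiny stand-in for the PINN: u(x, t), with inputs stacked as columns [x, t]
net = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(16, activation="tanh"),
    tf.keras.layers.Dense(1),
])

xt = tf.constant([[0.3, 1.0], [0.7, 2.5]])  # two sample (x, t) points

with tf.GradientTape() as tape2:
    tape2.watch(xt)
    with tf.GradientTape() as tape1:
        tape1.watch(xt)
        u = net(xt)                  # u(x, t)
    du = tape1.gradient(u, xt)       # columns: [du/dx, du/dt]
    du_x = du[:, 0:1]
du_t = du[:, 1:2]
d2u = tape2.gradient(du_x, xt)       # columns: [d2u/dx2, d2u/dx dt]
du_xx = d2u[:, 0:1]

print("du/dt:  ", du_t.numpy().ravel())
print("d2u/dx2:", du_xx.numpy().ravel())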

Packages, seeds, and backends

Don’t forget to install DeepXDE if you haven’t yet. 

!pip install --upgrade deepxde

These are all of the libraries we will be using. For this to work, make sure you are using TensorFlow 2 as the backend for DeepXDE.

# --- Imports and Connecting Backends ---

import os
os.environ["DDE_BACKEND"] = "tensorflow"  # Must be set before importing DeepXDE: selects the TF2 backend

import deepxde as dde
print("Backend:", dde.backend.__name__)  # Should now say: deepxde.backend.tensorflow
print("DeepXDE version:", dde.__version__)

import tensorflow as tf
print("TensorFlow version:", tf.__version__)
print("Is eager execution enabled?", tf.executing_eagerly())

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import random
import torch  # only used for seeding below; not required by the TensorFlow backend

from deepxde.backend import tf  # re-bind tf to the backend's TensorFlow module, as in DeepXDE examples

For reproducibility, we set the seed to 42. The code below sets it across all of the libraries we use.

# --- Setting Seeds ---
SEED = 42

random.seed(SEED)
np.random.seed(SEED)
os.environ['PYTHONHASHSEED'] = str(SEED)
tf.random.set_seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

Coding the PINN

Environment

Because the heat equation models temperature over both space and time, we need to consider the spatial domain and the temporal domain.

  • Space: the interval (0, 1) for a 1-meter rod
  • Time: the interval (0, 5) for 5 seconds of observation
  • geomtime combines the two into a single space-time domain

# --- Geometry and domain ---
geom = dde.geometry.Interval(0, 1)
timedomain = dde.geometry.TimeDomain(0, 5.0)
geomtime = dde.geometry.GeometryXTime(geom, timedomain)

Select the values in the PDE that you would like to infer from the data and declare them as trainable variables. Here we pick the thermal diffusivity kappa (κ) and the heat source q.

# --- Trainable variables ---
raw_kappa = tf.Variable(0.0)  # raw (unconstrained) parameter for the thermal diffusivity
raw_q = tf.Variable(0.0)      # heat source term, learned directly
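
Both start at 0.0 and are left unconstrained; in the residual below, raw_kappa is passed through a softplus, which keeps the effective diffusivity strictly positive:

\[
\kappa = \operatorname{softplus}(z) = \log\left(1 + e^{z}\right) > 0, \quad z = \texttt{raw\_kappa}
\]

At the initial value of 0.0 this gives κ = ln 2 ≈ 0.69, which training then pulls down toward the true value.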

The Physics Loss

Our physics loss is simple: move all of the terms of the heat equation to one side. When that expression is zero, the equation holds. The physics loss is minimized so that our estimates of κ and q best fit the physics. If we had a physics equation A = B, we would move everything to one side and define the residual as A - B = 0; the closer A - B is to zero, the better our PINN captures the dynamics of A = B.

def pde(x, u):
    # x has columns [position, time]; u is the network output
    du_t = dde.grad.jacobian(u, x, j=1)       # du/dt   (j=1 selects the time column)
    du_xx = dde.grad.hessian(u, x, i=0, j=0)  # d2u/dx2 (i=j=0 selects the spatial column)
    kappa = tf.nn.softplus(raw_kappa)         # enforce kappa > 0
    q = raw_q
    return du_t - kappa * du_xx - q           # residual: zero when the heat equation holds

\[
\text{Residual}(x, t) = \frac{\partial u}{\partial t} - \kappa \frac{\partial^2 u}{\partial x^2} - q
\]

PINNs

The derivatives in the residual are computed by applying the chain rule through the computational graph, the same machinery used in backpropagation. These derivatives allow the PINN to evaluate the residual of the PDE.

Optionally, we can also add a data loss, the familiar loss from a standard neural network, which penalizes the difference between the network’s predictions and the observed values.

# --- Adding Data Loss ---
def custom_loss(y_true, y_pred):
    base_loss = tf.reduce_mean(tf.square(y_true - y_pred))  # mean squared error on each loss term
    # Regularization nudging the trainable parameters toward reference values
    reg = 10.0 * (tf.square(tf.nn.softplus(raw_kappa) - 0.01) + tf.square(raw_q - 1.0))
    return base_loss + reg

Below, we create a TimePDE data object, which is a type of dataset in DeepXDE for solving time-dependent PDEs. It prepares the geometry, physics, boundary conditions, and initial conditions for training a PINN.

# --- DeepXDE Data object ---
data = dde.data.TimePDE(
    geomtime,
    pde,                                 # PDE residual (physics loss)
    [dde.PointSetBC(x_train, u_train)],  # observed (x, t, u) values as a point-set condition
    num_domain=10000,                    # collocation points where the residual is enforced
    num_boundary=0,
    num_initial=0,
    anchors=x_test,                      # extra points at which the residual is also evaluated
)

The architecture [2] + [64]*3 + [1] is used: two inputs (x, t), three hidden layers of 64 neurons each, and one output (u).

[2] + [64]*3 + [1] = [2, 64, 64, 64, 1]

A hyperbolic tangent activation function is used to capture both linear and non-linear behaviour in the PDE solution. The weight initializer “Glorot normal” is used to prevent vanishing or exploding gradients in training. 

# --- Neural Network ---
net = dde.maps.FNN([2] + [64]*3 + [1], "tanh", "Glorot normal")
model = dde.Model(data, net)

We can use different optimizers. For me, training with Adam first and then fine-tuning with L-BFGS-B worked best.

# --- Train with Adam ---
model.compile("adam", lr=1e-4, loss=custom_loss,
              external_trainable_variables=[raw_kappa, raw_q])
losshistory, train_state = model.train(iterations=100000)

# --- Optional L-BFGS-B fine-tuning ---
model.compile("L-BFGS-B", loss=custom_loss,
              external_trainable_variables=[raw_kappa, raw_q])
losshistory, train_state = model.train()  # L-BFGS-B runs until its convergence criteria are met

Training could take a while…

Training PINN
Illustration by Author

Model loss

Monitoring the model loss over time is a good way to watch for overfitting. Because we only used the physics loss, we don’t see Component 2, which would otherwise be the data loss. Since all the code is up on my GitHub, feel free to run it and see how changing the learning rate will change the variance in the model loss.

# --- Plot loss ---
dde.utils.plot_loss_history(losshistory)
plt.yscale("log")
plt.title("Training Loss (log scale)")
plt.xlabel("Iteration")
plt.ylabel("Loss")
plt.grid(True)
plt.show()

# --- Detailed loss plotting ---
losses = np.array(losshistory.loss_train)  # shape: (num_records, num_components)
iterations = np.array(losshistory.steps)   # iteration at which each loss value was recorded

plt.figure(figsize=(10, 6))
plt.plot(iterations, losses[:, 0], label="Train Total Loss")

# If there are multiple components (e.g., PDE + BC + data), plot them
if losses.shape[1] > 1:
    for i in range(1, losses.shape[1]):
        plt.plot(iterations, losses[:, i], label=f"Train Loss Component {i}")

# Plot validation loss if available
if losshistory.loss_test:
    val_losses = np.array(losshistory.loss_test)
    plt.plot(iterations, val_losses[:, 0], '--', label="Validation Loss", color="black")

    # Optionally: plot validation loss components
    if val_losses.shape[1] > 1:
        for i in range(1, val_losses.shape[1]):
            plt.plot(iterations, val_losses[:, i], '--', label=f"Validation Loss Component {i}", alpha=0.6)

plt.xlabel("Iteration")
plt.ylabel("Loss")
plt.yscale("log")
plt.title("Training and Validation Loss Over Time")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
Training and validation loss curves.
Illustration by Author

Results

We can infer these constants with great accuracy. Part of the success is due to focusing only on the physics loss and not incorporating our data loss. This is an option in PINNs. The accuracy here is also attributed to the absence of noise in the data generation process.

# --- Results ---
learned_kappa = tf.nn.softplus(raw_kappa).numpy()
learned_q = raw_q.numpy()
print("n--- Results ---")
print(f"True kappa: 0.01, Learned kappa: {learned_kappa:.6f}")
print(f"True q: 1.0, Learned q: {learned_q:.6f}")

Forward and Inverse Problems

In this article, we solved the inverse problem for the PDE: recovering the two constants shown in red in the figure below.

Forward and Inverse problems in the heat equation.
Illustration by Author

The forward problem is characterized as follows: given the PDE, its underlying parameters, boundary conditions, and forcing terms, compute the state of the system, in this case the temperature u. This is the prediction problem. Forward problems are generally well-posed: a solution exists, it is unique, and it depends continuously on the inputs.

The inverse problem is characterized as follows: given the state of the system (the temperature), infer the underlying parameters, boundary conditions, or forcing terms that best explain the observed data. Here, we estimate unknown parameters. Inverse problems are often ill-posed, lacking uniqueness or stability.

Forward: predict the outcome when you know the causes.

Inverse: figure out the causes (or best inputs) from the observed outcome. 

Perhaps unintuitively, the inverse problem is often tackled first. Knowing the parameters greatly helps with the forward problem: once we know kappa (κ) and q(x, t), solving for the temperature u(x, t) becomes much easier.
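
As a minimal sketch of that point, the learned parameters can be plugged straight back into the explicit finite-difference scheme from the data-generation section to solve the forward problem:

import numpy as np

# Forward problem: given kappa and q, march the heat equation forward in time
kappa_hat, q_hat = float(learned_kappa), float(learned_q)  # values recovered by the PINN
L, Nx, T_total, Nt = 1.0, 51, 5.0, 5000
dx, dt = L / (Nx - 1), T_total / Nt

u = np.zeros(Nx)  # rod starts at temperature 0
for n in range(Nt):
    # Compute the update from the current state before modifying it
    interior = kappa_hat * (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2 + q_hat
    u[1:-1] += dt * interior
    u[0] = u[-1] = 0.0  # fixed boundary temperatures

print("Predicted temperature profile at t = 5 s:", np.round(u[::10], 3))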

Conclusion

PINNs provide a novel approach to solving both inverse and forward problems for physics equations. Their advantage over standard neural networks is that they can solve these problems with less data, because they incorporate existing knowledge of the physics into the network. This also has the added benefit of improved generalization. PINNs are particularly good at solving inverse problems.

References

  • Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707. https://doi.org/10.1016/j.jcp.2018.10.045
  • Raissi, M. (2018). Deep hidden physics models: Deep learning of nonlinear partial differential equations. Journal of Machine Learning Research, 19(25), 1–24. https://arxiv.org/abs/1801.06637
  • Lu, L., Meng, X., Mao, Z., & Karniadakis, G. E. (2021). DeepXDE: A deep learning library for solving differential equations. SIAM Review, 63(1), 208–228. https://doi.org/10.1137/19M1274067
  • DeepXDE Developers. (n.d.). DeepXDE: A deep learning library for solving differential equations [Computer software documentation]. Retrieved July 25, 2025, from https://deepxde.readthedocs.io/en/latest/
  • Ren, Z., Zhou, S., Liu, D., & Liu, Q. (2025). Physics-informed neural networks: A review of methodological evolution, theoretical foundations, and interdisciplinary frontiers toward next-generation scientific computing. Applied Sciences, 15(14), Article 8092. https://doi.org/10.3390/app15148092
  • Torres, E., Schiefer, J., & Niepert, M. (2025). Adaptive physics-informed neural networks: A survey. arXiv. https://arxiv.org/abs/2503.18181
