How-to Guides
How to construct a "predictive" engineering model (PEM)
- First, break the system into individual components, for example the cathode, the discharge channel, the far-field plume, etc. Define all the inputs and outputs of each component and how they connect to each other. Variables that get mapped from the output of one component into the input of another component are termed "coupling" variables, while all other inputs are called system "exogenous" variables.
- Write a Python wrapper function for each component model, following the amisc package guidelines. See examples for the cathode, thruster, and plume models. By convention, these functions are placed in the hallmd.models subdirectory.
- Construct an amisc.SystemSurrogate object that links all the component models in a multidisciplinary system. See the example for pem_v0. Variables are specified in and loaded from a .json configuration file in the hallmd.models.config directory. Other model configurations can also go here.
- You can test your system using the SystemSurrogate object.
Example
from hallmd.models.pem import your_pem
pem_obj = your_pem()
inputs = pem_obj.sample_inputs(100) # (100, input_dim)
outputs = pem_obj.predict(inputs, use_model='best') # (100, output_dim)
The next guide explains how to train the pem_obj surrogate to approximate the true system model.
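The split between "coupling" and "exogenous" variables described in the first step can be checked mechanically once each component's inputs and outputs are listed. The following is a minimal sketch in plain Python, not part of hallmd or amisc; the component layout and all variable names (P_B, V_cc, and so on) are hypothetical placeholders chosen for illustration.

```python
def classify_inputs(components):
    """Split all component inputs into coupling and exogenous variables.

    `components` maps a component name to a dict with "inputs" and
    "outputs" lists of variable names (a hypothetical layout).
    """
    all_inputs = {v for c in components.values() for v in c["inputs"]}
    all_outputs = {v for c in components.values() for v in c["outputs"]}
    # Coupling variables are produced by one component and consumed by another.
    coupling = sorted(all_inputs & all_outputs)
    # Exogenous variables are inputs that no component produces.
    exogenous = sorted(all_inputs - all_outputs)
    return coupling, exogenous


# Hypothetical three-component system; the variable names are illustrative only.
example_system = {
    "Cathode": {"inputs": ["P_B", "T_e"], "outputs": ["V_cc"]},
    "Thruster": {"inputs": ["V_cc", "V_a", "mdot_a"], "outputs": ["I_B0", "T"]},
    "Plume": {"inputs": ["I_B0", "P_B"], "outputs": ["j_ion"]},
}

coupling_vars, exogenous_vars = classify_inputs(example_system)
print("coupling:", coupling_vars)
print("exogenous:", exogenous_vars)
```

Listing the variables this way before writing the configuration file makes it easier to spot an input that was meant to be coupled but is misspelled in one component.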
How to train a surrogate for the PEM
This guide assumes you have access to a Linux HPC system. It uses specific examples for the Great Lakes system at the University of Michigan, which uses the SLURM workload manager and the Lmod environment module system. Regardless, this guide can be adapted to your specific system as needed. We assume you have a terminal connection open on the system you are using.
- Clone this repository and change into the root project directory:
- Source the setup script in the shell:
This will make sure you have the pdm tool installed, a proper version of Python loaded, the mpi4py library installed, and all required SLURM environment variables defined. You will need to edit this script for your specific usage, for example by adding your SLURM account info or commenting out pdm add mpi4py if you do not have an MPI-enabled system. The main idea here is to set up a working Python virtual environment with all resources defined and loaded.
- Create a new directory in the scripts folder with the name of your new "PEM" system. The scripts/pem_v0 directory contains everything used to build the original 3-component PEM; it may be easiest to simply copy this directory as a template and name it pem_vi for \(i > 0\).
- Look at the [tool.pdm.scripts] section of pyproject.toml. It provides three convenience scripts that can be run with the command pdm run script_name: gen_data, fit, and train.
- The gen_data script will call sbatch scripts/your_pem/gen_data.sh, which then calls gen_data.py. This is responsible for generating all the data needed by the models and surrogate before training the surrogate. For example, pem_v0 relies on gen_data to make a test set and some compression-related data that get copied over to the hallmd.models.config directory. At the very least, you will likely need this to make a test set for evaluating the performance of the surrogate during training.
- The fit script will call sbatch scripts/your_pem/fit_surr.sh, which then calls fit_surr.py. This is responsible for actually loading your SystemSurrogate object and training the surrogate via amisc.SystemSurrogate.fit.
- The train script is a shortcut for calling gen_data and fit in sequence, with fit running only after gen_data has completed successfully.
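The dependency between gen_data and fit can be expressed with SLURM's job dependencies. The sketch below is an assumption about how such a chain could be built, not a copy of the repository's train script: it uses the standard sbatch options --parsable (print only the job ID) and --dependency=afterok:<id> (start only if that job succeeded). The helper names and the scripts/pem_v0 path in the usage line are illustrative.

```python
import subprocess


def gen_data_command(pem_dir):
    # --parsable makes sbatch print just the job ID (optionally "id;cluster").
    return ["sbatch", "--parsable", f"{pem_dir}/gen_data.sh"]


def fit_command(pem_dir, gen_data_job_id):
    # afterok: the fit job starts only if the gen_data job exits successfully.
    return [
        "sbatch",
        f"--dependency=afterok:{gen_data_job_id}",
        f"{pem_dir}/fit_surr.sh",
    ]


def submit_chain(pem_dir):
    """Submit gen_data, then fit with a dependency on it (requires SLURM)."""
    result = subprocess.run(
        gen_data_command(pem_dir), check=True, capture_output=True, text=True
    )
    job_id = result.stdout.strip().split(";")[0]
    subprocess.run(fit_command(pem_dir, job_id), check=True)
    return job_id


# Building the commands does not need SLURM, so they can be inspected first.
print(fit_command("scripts/pem_v0", "12345"))
```

Calling submit_chain("scripts/pem_v0") on a login node would queue both jobs; if gen_data fails, SLURM leaves the fit job pending with an unsatisfied dependency, or cancels it, depending on the cluster configuration.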
TLDR; Complete working example
How to use the surrogate after training
Surrogate training is performed with the amisc.SystemSurrogate.fit function. You should specify a save directory for the SystemSurrogate object, which will create a folder with the hierarchy:
amisc_2024_timestamp # Root surrogate directory
|- components # Model output files may optionally be saved here
| |- Cathode
| |- Thruster
| |- etc.
|- sys # Surrogate save files
| |- sys_init.pkl
| |- etc.
| |- sys_final.pkl
|- 2024_timestamp.log # Training log (useful for debugging)
You can take any of the .pkl save files and reload the surrogate using the load_from_file() function:
from amisc.system import SystemSurrogate
file = 'sys_final.pkl'
surr = SystemSurrogate.load_from_file(file)
Note
It is better to distribute the whole amisc_timestamp directory and load the save file from within it (amisc_timestamp/sys/sys_final.pkl), since the directory structure will be recreated from a standalone file regardless.
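When the whole save directory is distributed, the final save file can be located from the directory root. The helper below is a small sketch using only pathlib; the function name find_final_save is hypothetical, and the sys/sys_final.pkl layout is taken from the hierarchy shown above.

```python
from pathlib import Path


def find_final_save(root_dir):
    """Return the path to sys/sys_final.pkl inside a surrogate save directory."""
    save_file = Path(root_dir) / "sys" / "sys_final.pkl"
    if not save_file.is_file():
        raise FileNotFoundError(f"No final save file at {save_file}")
    return save_file
```

The returned path can then be handed to the loader, for example SystemSurrogate.load_from_file(str(find_final_save('amisc_2024_timestamp'))); converting to str here is a precaution in case the loader expects a string.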
How to use the surrogate for uncertainty quantification
There are four more scripts provided in scripts/pem_v0 that were used to run all UQ analyses for the original 3-component PEM:
- plot_slice.py -- Loads the surrogate from a training save .pkl file, plots several "1d slices" of inputs and outputs, and compares them to the true model output. This is useful for gauging how good the surrogate approximation is.
- mcmc.py -- Contains several functions for maximum-likelihood estimation, obtaining a Laplace estimate of the posterior, and Markov-Chain Monte Carlo (MCMC) sampling using the uqtils package. Use this for calibration of the PEM model parameters using the surrogate.
- monte_carlo.py -- Samples the uncertain inputs and propagates them through the surrogate to get output uncertainty. Has several plotting functions for comparing model predictions to experimental data.
- sobol.py -- Performs a Sobol' sensitivity analysis using the surrogate and the uqtils package. Has several plotting functions for showing Sobol' indices for each component model.
Note
It is advisable to copy all these files from pem_v0 and adapt them for your new pem_vXX. They are written quite specifically for the original use case, but can serve as a good starting point for your own scripts.
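To illustrate the idea behind Monte Carlo propagation in general terms, here is a minimal, self-contained sketch. It is not the code in monte_carlo.py: the toy_predict function is a stand-in for a trained surrogate, the standard-normal input distribution is an assumption, and only the Python standard library is used.

```python
import random
import statistics


def monte_carlo(predict, sample_input, n_samples=1000, seed=0):
    """Propagate random input samples through `predict` and summarize the output."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    outputs = [predict(sample_input(rng)) for _ in range(n_samples)]
    # quantiles(n=20) returns 19 cut points: the first is the 5th percentile
    # and the last is the 95th percentile.
    cuts = statistics.quantiles(outputs, n=20)
    return {
        "mean": statistics.fmean(outputs),
        "std": statistics.stdev(outputs),
        "p5": cuts[0],
        "p95": cuts[-1],
    }


def toy_predict(x):
    # Stand-in for a surrogate prediction; a real surrogate would go here.
    return 2.0 * x + 1.0


def sample_standard_normal(rng):
    # Assumed input uncertainty: a single standard-normal input.
    return rng.gauss(0.0, 1.0)


summary = monte_carlo(toy_predict, sample_standard_normal, n_samples=5000)
print(summary)
```

For this linear toy model the output mean and standard deviation should come out near 1 and 2; with a real surrogate, the same summary statistics would be compared against experimental data.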