Which modeling methods can be used to create a digital twin?
When creating a digital twin (see the previous blog post, “How do you build a digital twin?”), a mathematical model is developed that can be used to simulate the behavior and performance of the physical asset. These mathematical models fall roughly into three categories: white box, black box, and gray box. The main difference between the categories is how much they rely on data: white box models do not rely on data, black box models rely only on data, and gray box models are somewhere in between.
White box models
White box models are based on an asset’s underlying physics. They are typically derived from available scientific or engineering material. For example, formulas for estimating the hydrodynamic resistance of a ship can be used to predict the resistance the ship will actually encounter once it has been built. White box models have the advantage of working with little or no data, providing a robust model that can be ready before any data has been collected, or even before the asset has been built.
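To make the idea of a formula-based model concrete, here is a minimal sketch of one white box building block: the frictional part of a ship's resistance, estimated with the ITTC-1957 friction line. It covers only friction, not total resistance (wave-making and other components are left out), and the seawater properties, the function name, and the example ship dimensions are illustrative assumptions rather than values from any real vessel.

```python
import math

# Assumed seawater properties (roughly 15 °C); adjust for actual conditions.
RHO_SEAWATER = 1025.0   # density in kg/m^3
NU_SEAWATER = 1.19e-6   # kinematic viscosity in m^2/s


def frictional_resistance(speed_ms, waterline_length_m, wetted_surface_m2):
    """Estimate frictional resistance in newtons from physics alone.

    Uses the ITTC-1957 friction line:
        C_F = 0.075 / (log10(Re) - 2)^2
    No measured data from the ship is needed.
    """
    if speed_ms <= 0.0:
        return 0.0
    reynolds = speed_ms * waterline_length_m / NU_SEAWATER
    c_f = 0.075 / (math.log10(reynolds) - 2.0) ** 2
    return 0.5 * RHO_SEAWATER * wetted_surface_m2 * speed_ms ** 2 * c_f


# Hypothetical ship: 150 m waterline length, 4000 m^2 wetted surface, 7 m/s.
resistance_n = frictional_resistance(7.0, 150.0, 4000.0)
print(f"Estimated frictional resistance: {resistance_n / 1000:.0f} kN")
```

Everything in this calculation is known at the design stage, which is why a model like this can exist before the ship does.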
Currently, white box models are the most common approach in the marine industry. However, because they are simplified representations of reality, it can be extremely challenging for the formulas to model the asset with sufficient accuracy. Using data to correct and fine-tune white box models improves their accuracy, which leads to gray box models (discussed below).
Black box models
Black box models take an approach where no prior knowledge of the physics or inner workings of the asset is needed; instead, data collected from the asset is used to build the model. Artificial intelligence (AI) methods like neural networks and deep learning are examples of black box models and have recently become very popular. The advantage of the black box approach is that, assuming the data is of a high enough quality, it learns to model the data very accurately.
In practice, however, this dependence on data is a problem in a shipping context, as there is very rarely adequate data available to generate a reliable black box model. The lack of data makes the accuracy very fragile: the model will often give erroneous predictions when it encounters unfamiliar operating conditions. What’s more, black box models need extended data-collection periods that often span from several months to over a year, significantly delaying the return on investment of the digital twin.
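The fragility outside familiar operating conditions can be illustrated with a small, fully synthetic sketch. A straight-line least-squares fit stands in here for any purely data-driven model (it is not a neural network, but the failure mode is the same). The "true" speed-power relationship, the noise level, and all names are assumptions made up for the illustration.

```python
import random


def true_power_kw(speed_kn):
    # Synthetic ground truth (hypothetical): power grows with speed cubed.
    return 2.0 * speed_kn ** 3


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Training data covers only the speeds the ship usually sails: 10 to 14 knots.
rng = random.Random(0)
train_speeds = [10.0 + 4.0 * i / 199 for i in range(200)]
train_powers = [true_power_kw(v) + rng.gauss(0.0, 10.0) for v in train_speeds]

intercept, slope = fit_line(train_speeds, train_powers)


def predict_kw(speed_kn):
    return intercept + slope * speed_kn


# Mean absolute error inside the training range versus error at 20 knots,
# a condition the model has never seen.
in_range_error = sum(
    abs(predict_kw(v) - true_power_kw(v)) for v in train_speeds
) / len(train_speeds)
extrapolation_error = abs(predict_kw(20.0) - true_power_kw(20.0))

print(f"Mean error within 10-14 kn: {in_range_error:.0f} kW")
print(f"Error at 20 kn:             {extrapolation_error:.0f} kW")
```

Within the training range the fit looks acceptable, but at 20 knots the error is many times larger, because nothing in the data tells the model how power behaves there.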
Black box models can be developed by mathematicians with no deep understanding of the asset in question. Sometimes this approach is chosen by companies developing an analytics platform, simply because they do not possess the required competence to build white or gray box models.
Gray box models
Gray box models combine the best aspects of the previous two approaches. They use physics to build a model based on robust prior knowledge of the asset, and then allow the data to fine-tune the model over time to characterize the specific asset in question even more accurately. Some advanced gray box modeling methodologies utilize Bayesian statistical models that learn to rely more on the data when good data is available and fall back to a physics-based approach when the data is unreliable. This provides a good combination of robustness and accuracy.
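As a rough illustration of this Bayesian balancing, the sketch below updates a single correction factor applied to a physics-based prediction. The physics supplies the prior (a factor of 1.0, meaning "trust the formula"), and observed ratios of measured to predicted values pull the factor away from 1.0 only as far as their quantity and quality justify. This is a textbook normal-normal update under assumed variances, not the method of any particular product; the function name and all numbers are hypothetical.

```python
def posterior_correction(ratios, noise_var, prior_mean=1.0, prior_var=0.05 ** 2):
    """Bayesian update of a correction factor for a physics-based prediction.

    ratios     : observed (measured / physics-predicted) values
    noise_var  : assumed variance of the measurement noise on each ratio
    prior_mean : correction implied by physics alone (1.0 = no correction)
    prior_var  : assumed uncertainty of the physics model

    Returns the posterior mean and variance of the correction factor.
    """
    n = len(ratios)
    if n == 0:
        # No data yet: fall back entirely on the physics-based prior.
        return prior_mean, prior_var
    prior_precision = 1.0 / prior_var
    data_precision = n / noise_var
    post_var = 1.0 / (prior_precision + data_precision)
    sample_mean = sum(ratios) / n
    post_mean = post_var * (
        prior_precision * prior_mean + data_precision * sample_mean
    )
    return post_mean, post_var


# Plenty of clean data: the estimate moves close to the observed ratio.
good_mean, _ = posterior_correction([1.10] * 100, noise_var=0.02 ** 2)

# Two very noisy observations: the estimate stays close to the physics prior.
poor_mean, _ = posterior_correction([1.10, 1.10], noise_var=0.5 ** 2)

print(f"Correction with good data: {good_mean:.3f}")
print(f"Correction with poor data: {poor_mean:.3f}")
```

The weighting is done by precision (the inverse of variance), so no manual switch between physics and data is needed: unreliable or scarce data simply carries little weight.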
Sophisticated gray box models can be commissioned in a very short timeframe, comparable to the commissioning time of a white box model. The disadvantage of gray box models is that creating them requires developers with sophisticated, postgraduate-level competence in mathematics. These mathematicians also need to collaborate with domain experts to embed the underlying physics of the asset.