# Abstraction Levels
When we design and build digital systems, what we are doing is reordering a pile of matter in a very specific way, so that the new arrangement provides a function the unordered pile could not. A Boeing 747 or a microchip: each is a finite number of atoms stuck together, arranged in a specific way to perform a specific task. Same for a smartphone, a submarine, a bicycle, or a data center.
Now, the macro function of these piles of atoms emerges as the process of arranging them reaches certain stages. A microprocessor becomes able to act like one—executing instructions defined by a programmer—once the different building blocks (ALU, data path, memory, registers) form, connect to each other, and interact. But as we group matter into different sets of discrete "chunks", the emergent behavior of the collective starts to cloud the intrinsic agency: the awareness that objects are composed of smaller constituent parts gently fades away. It is by taking on the burden of creating order out of disorder when we design that we can blissfully forget that the objects we devise are made of a zillion tiny particles, and focus only on the macro function the thing provides.
In reality, we cannot completely forget about that (and we never do); we are frequently reminded about it.
How is any of this related to digital systems simulation? A simulation is a virtual representation of a physical system—and its dynamics—expressed computationally.
Modeling allows us to gain understanding, and in some cases, simulations help us predict future behavior. While modeling, we go from coarse to fine, and we accept that we are focusing on certain macro-level functionalities and purposely abstracting ourselves from the reasons why this macro function emerges: we just accept the behavior of a block that magically appears.
For instance, when simulating a microprocessor reading or writing from memory, we do not need to simulate the electromagnetic wavefronts propagating through copper interconnects for a receiver to resolve if the data is a 1 or a 0, nor do we need to simulate charges moving across a [[Semiconductors#MOSFETs|MOSFET]] channel to replicate how a [[Semiconductors#Flip-Flops|flip-flop]] works. We also do not need to simulate address and data buses, or the chip selects. When the level of abstraction is high enough, a memory read is just a value put in a position in memory for the CPU to grab it, and we can move on. Essentially, communication is reduced to function calls between models.
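To make this concrete, here is a minimal sketch of what a memory model looks like at this level of abstraction. The class and method names are illustrative, not taken from any particular simulator: the point is only that a "read" is nothing more than a lookup behind a function call.

```python
# A minimal, high-abstraction memory model: no buses, no chip selects,
# no signal propagation. A "read" is just a function call returning a value.
# Class and method names are illustrative, not from any particular simulator.

class MemoryModel:
    def __init__(self, size_words: int):
        self._mem = {}          # sparse storage: address -> value
        self._size = size_words

    def write(self, addr: int, value: int) -> None:
        if not 0 <= addr < self._size:
            raise ValueError(f"address {addr:#x} out of range")
        self._mem[addr] = value

    def read(self, addr: int) -> int:
        if not 0 <= addr < self._size:
            raise ValueError(f"address {addr:#x} out of range")
        return self._mem.get(addr, 0)   # uninitialized memory reads as 0


class CpuModel:
    def __init__(self, memory: MemoryModel):
        self.memory = memory

    def load(self, addr: int) -> int:
        # At this abstraction level, the whole "bus transaction" is a method call.
        return self.memory.read(addr)


mem = MemoryModel(size_words=1024)
cpu = CpuModel(mem)
mem.write(0x10, 42)
print(cpu.load(0x10))   # -> 42, instantly, with no notion of access latency
```

Notice there is no notion of latency anywhere in this sketch; that simplification, and its price, is exactly what the rest of this section is about.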
Why go so coarse? The answer is optimization: more agency requires more resources allocated to compute each agent's evolution. The more details we add to our models, the longer it takes to create them and the more complex they get, which defeats the purpose of modeling in the first place. Past a certain point, the usefulness of a model decreases as its fidelity increases, because it becomes as difficult to understand and operate as the actual system.
> [!note] Bonini's Paradox
> _As a model of a complex system becomes more complete, it becomes less understandable_
If the object we are trying to model has a hierarchical composition of components that are themselves composed of other components, then the physics and behavior of those collectives can be simpler to handle if we conceive and treat them as wholes: they move together, rotate together, and interact with the environment together. The constituent atoms and particles of those groups are interdependent enough to follow each other everywhere they go, which means we need only care about the group as a thing of its own and allocate computational resources for that. Sometimes what we model does not show such unity. For example, things get more complicated when we have to model and simulate fluids or gases; we can't really model a fluid or a gas as one monolithic entity.
What price do we pay for neglecting the almost infinitely compound nature of objects when we model? We pay a penalty in behavioral fidelity and accuracy. Our models will not behave like the real thing, nor provide the same results as the real thing: that's a fact. We might still be okay with this, but only to some extent, and under certain conditions. Modeling a memory read as a magical number appearing in memory overlooks the fact that memory reads take time, and that the time they take is not deterministic. In most cases, this is a reasonable price to pay.
## "Digital Twins" and the Price of Agency
Let's consider a basic electric circuit: a battery, a resistor, and a switch, which we happen to have as a real circuit on top of our lab table.
But we also want to model the circuit in a simulation. How do we proceed? Well, we can just do this in a spreadsheet and implement Ohm's law. Is that a model? It is in a way, since it provides some insight into how different parameters of the circuit interact. It is important to say that in this spreadsheet model, of course, there is no real current, real resistance, or real voltages. There are just variables (numbers) that represent such electrical properties, connected by means of Ohm's equation. Can we call that a simulation or even a digital twin? Not so much.
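As a sketch, the entire "spreadsheet model" amounts to a handful of numbers tied together by Ohm's law (the values below are arbitrary):

```python
# The "spreadsheet" model: a few numbers and Ohm's law. Values are arbitrary.
V_battery = 9.0      # volts
R = 1000.0           # ohms
switch_closed = True

I = V_battery / R if switch_closed else 0.0   # Ohm's law: I = V / R
V_resistor = I * R                            # voltage drop across the resistor

print(f"I = {I*1000:.2f} mA, V_R = {V_resistor:.2f} V")
```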
Say we now want to analyze how the circuit behaves if the ambient temperature drops by half. Our spreadsheet model cannot answer that, at least not yet. We must add to the spreadsheet the equation that relates resistance variation to temperature. After adding that, we want to benchmark how the real circuit in the lab and our model behave as we arbitrarily vary the temperature. We quickly see a problem: the spreadsheet reacts instantaneously to the temperature variation (which is now an input), whereas the real circuit does not. Why? Well, we have not added any concept of "time" to the model so far; it just instantaneously calculates resistance, voltages, and currents as we change the inputs. If we want to add time, we will surely have to drop the spreadsheet and write code so that the variables of interest can evolve with time. This leads us to a set of ordinary differential equations, which also means we will have to find a way of integrating those equations to obtain the output variables.

Say we do that; we now have a model that is aware of time. Note that the notion of "time" in a simulation model is purely a construction; more specifically, a discretized construction. Whereas the real circuit on the table can only evolve in real time (meaning, it cannot evolve faster than what the environment and its own time constants dictate), a simulated model can evolve as fast as the computing resources allow. In the real circuit on the table, time elapses continuously, but in our computer simulation model, it elapses in discrete steps. During a simulated timestep, the computer model must calculate the new values of all relevant variables and guarantee the results are ready before the next step comes. What happens between two subsequent steps in the simulation? Nothing; all simulation variables remain frozen at their last computed values until the next tick comes. Simulated time can also run backward.
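A minimal sketch of such a time-aware model is below. It assumes a first-order thermal lag for the resistor body and a linear temperature coefficient; those assumptions, the specific numbers, and the forward-Euler integrator are illustrative choices, not the only way to do this.

```python
# A time-aware sketch of the same circuit. Simulated time advances in discrete
# steps; between steps, nothing happens. Assumptions (illustrative): the resistor
# body follows ambient temperature with a first-order lag, and resistance varies
# linearly with temperature.

V_batt = 9.0           # volts
R0, T0 = 1000.0, 25.0  # nominal resistance (ohms) at reference temperature (degC)
alpha = 0.0039         # temperature coefficient of resistance (1/degC)
tau = 5.0              # thermal time constant of the resistor body (s)

dt = 0.01              # timestep (s): the discretized "tick" of simulated time
T_resistor = 25.0      # state variable: resistor body temperature
T_ambient = 12.5       # the ambient temperature just dropped by half

t = 0.0
while t < 30.0:
    # Forward-Euler integration of dT/dt = (T_ambient - T_resistor) / tau
    T_resistor += dt * (T_ambient - T_resistor) / tau

    R = R0 * (1.0 + alpha * (T_resistor - T0))   # temperature-dependent resistance
    I = V_batt / R                               # Ohm's law at this instant

    t += dt

print(f"after {t:.1f} s: R = {R:.1f} ohm, I = {I*1000:.3f} mA")
```

Unlike the spreadsheet, this model no longer reacts instantaneously: the current only settles to its new value after a few simulated thermal time constants, just as the real circuit would.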
Now it turns out we want to measure how fast the voltage drop across the resistor settles to its final value as we turn the switch on and off. We measure the real circuit, then we check our model, and we see a difference. Why? Well, our model has so far assumed that the wires connecting the different discrete elements are ideal (no resistance, no inductance, no capacitance). We purposely decided that wires just "do not exist" in our model, and things were linked sort of magically; now we are paying the price for that simplification. It turns out wires do exist, and they have physical properties that affect the behavior of the circuit. The only reason we have been able to ignore this so far is that our focus was somewhere else. All of a sudden, our model's coarseness becomes very visible.
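Here is a sketch of what adding a single parasitic does to the model: a lumped series wire inductance (value picked arbitrarily) means the current now takes a few time constants to settle after the switch closes, instead of jumping to its final value.

```python
# Extending the model with one parasitic: a lumped series inductance for the
# wires (value arbitrary). Closing the switch now produces a transient governed
# by L * di/dt = V - i * R, so the current settles with time constant L / R.

V_batt = 9.0
R = 1000.0          # ohms (resistor only; wires still assumed resistance-free)
L_wire = 1e-6       # henries, a made-up lumped wire inductance

dt = 1e-10          # the timestep must now resolve a ~1 ns time constant
i = 0.0             # current through the loop, switch just closed at t = 0

t = 0.0
while t < 10e-9:
    di_dt = (V_batt - i * R) / L_wire   # Kirchhoff's voltage law around the loop
    i += dt * di_dt                     # forward Euler again
    t += dt

print(f"after {t*1e9:.1f} ns: i = {i*1000:.3f} mA (final value {V_batt/R*1000:.3f} mA)")
```

Note the side effect: one extra parasitic forced the timestep down by several orders of magnitude, which is the computational price of the added fidelity.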
System analysis and modeling rely largely on this opportunistic thinking: if conditions allow, we ignore lower-order effects until we cannot. It is a modeling challenge to keep a model extensible as the need for fidelity evolves. But it is also a challenge (and perhaps the most difficult one) to understand where the line lies between adding useful and pointless realism to our models. From a computing perspective, unnecessary features of a model are just overhead.
Finally, let's say we want to create a simulation of our electric circuit. We want to make sure the model behaves, in every single aspect, the same as the circuit on the lab table. This means we would have to simulate the battery chemistry, the wires' distributed inductance and capacitance per unit length, the switch's mechanics, mass, and stiffness, and the resistor's internal composition, and then solve all the partial differential equations involved. This could snowball into a feature-adding spree until we find ourselves modeling down to the quantum mechanics of the most elementary particles known. This leads to some thoughts:
- As we increase agency in our models, we are required to understand beforehand how the new agents will behave. In the circuit example, to model the behavior of the circuit more realistically, we need to know which effects are missing before adding them. For example, we must know that Joule's law exists before we can add resistive self-heating to the circuit model.
- As we increase agency in our models and simulations, every new agent has its own state vector: a set of numerical variables that describe its past, present, and future as simulated time advances, and which a computing resource must keep evolving accordingly. The past might be kept in non-volatile storage for analysis and replay, either forward or backward. Then, a "global supervisor" of sorts must ensure cohesion between all the agents; i.e., enforce the rules on how the agents can or cannot interact (a minimal sketch of this structure is given after this list). Even if we abandon the "global supervisor" idea and ideally assume fully parallel computing, assigning each agent its own dedicated computing resource, we still need something else to decide how those agents are supposed to behave with each other. In our physical reality, physical laws are pervasive, and they dictate how all particles in the known Universe behave when they mingle. In simulated realities, those "laws" must be constructed and coded as well. Nobody presently knows how to reproduce General Relativity and particle physics from an algorithm running on some sort of machine; we can only *approximate* the laws that we *know* with a computer simulation. Attempts to algorithmically reproduce natural laws quickly face incompatibilities related to the symmetries of special and general relativity, as well as "spooky action at a distance".
- Will we ever manage to model things exactly as they are and create *true* digital twins that fully mimic a physical system? A perfect model should generate the exact same information as the real system, given the same stimuli. But as we said before, even two copies of a physical system will not generate the exact same information given the same stimuli, nor will they show the exact same behavior, due to the compound nature of such systems. Every copy of a physical system is, at some statistical level, different. As functional complexity grows, the dispersion between copies of a physical system also grows. This means that an eventual perfect digital twin would need to either be perfectly tuned to each specific instance of a physical system or accept computing information with a tolerable error.
- Could an operator eventually be unable to tell whether she is operating a digital twin or the real thing? Since the digital twin would be running in a computational environment, detached from physical reality, even if perfectly simulated at the quantum level, all the simulated constituent parts and laws would evolve in a purely computational domain. For example, if we were theoretically able to model our electric circuit at the quantum level, we would still not be able to hook a multimeter or an oscilloscope to it: hence, we would immediately notice it is not the real thing. Then, the role of Human-Machine Interfaces (or Human-Model Interfaces in this case) becomes highly relevant; it is not just about the accuracy with which the model computes its numerical outputs, but also about how it couples those outputs with our perception of how the (original) system is supposed to work. Even if the model is perfect (from an information and dynamics perspective), it must still present to the operator the exact same "experience" as the real thing to avoid being spotted as a phony.
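As referenced in the second bullet above, here is a minimal sketch of the "agents plus global supervisor" structure. Everything in it (the toy agent dynamics, the interaction rule, the names) is illustrative; the only point is the shape: per-agent state vectors, a recorded past for replay, a shared clock, and a supervisor that enforces how agents may interact.

```python
# Sketch of agents with state vectors plus a global supervisor enforcing the
# interaction "laws". All dynamics and names here are illustrative.

from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    state: list[float]                                         # the agent's state vector
    history: list[list[float]] = field(default_factory=list)   # past states, for replay

    def step(self, dt: float, influence: float) -> None:
        self.history.append(list(self.state))
        # Toy dynamics: each state variable relaxes toward the external influence.
        self.state = [x + dt * (influence - x) for x in self.state]


class Supervisor:
    """Advances simulated time and enforces the interaction rules between agents."""

    def __init__(self, agents: list[Agent]):
        self.agents = agents
        self.t = 0.0

    def step(self, dt: float) -> None:
        # The "law" here: every agent is pulled toward the mean of all first
        # state components. Some interaction rule must be coded somewhere.
        mean = sum(a.state[0] for a in self.agents) / len(self.agents)
        for a in self.agents:
            a.step(dt, influence=mean)
        self.t += dt


sim = Supervisor([Agent("a", [0.0]), Agent("b", [10.0]), Agent("c", [4.0])])
for _ in range(100):
    sim.step(dt=0.1)
print(sim.t, [round(a.state[0], 3) for a in sim.agents])
```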