Most software is judged by whether it works. Regulated software is judged by something stricter and stranger: whether you can prove, later, to a person who was not in the room, that it worked for the right reason. The decision is only half of it. The other half is the account of the decision, and in this kind of work the account is not overhead. It is the thing you are actually selling.
This is the part that teams building compliance systems tend to discover late, usually in the first real audit. The system makes good decisions. Then someone asks why it made a particular one, eighteen months ago, and wants to see the rule that was applied, the regulation it came from, the version of that regulation in force on the day, and the facts that triggered it. If the answer is "the model looked at the document and decided," the system has failed, even if the decision was correct. Being right without being able to show why is, in regulated work, a kind of being wrong.
This paper is about provenance: what it actually is, why it cannot be added after the fact, and why the audit trail is a feature rather than a cost.
The question that comes after the decision
Picture the decision already made and correct. A transaction was flagged, a filing was marked late, a transfer was blocked. Now the questions start, and they are not the questions a consumer product ever has to answer.
Which rule produced this? Not a general policy, the specific rule. Where in the regulation does that rule come from, down to the article? Was that the version of the rule in force on the date of the decision, or has the regulation changed since? What facts did the rule read, and where did those facts come from? Who decided this rule was a correct encoding of the law, and when? And can you show all of this to a regulator, an auditor, or a court, in a form they can check without taking your word for any of it?
A system that cannot answer these has not automated compliance. It has automated the easy half and left the hard half, the defensible half, undone. And the hard half is the half that was the point, because the reason a bank cannot simply use a clever model is not that the model decides badly. It is that the model cannot answer these questions about its own decisions, and the bank will be asked.
What provenance actually is
Provenance is not a log file. A log records that something happened. Provenance records why it was correct, in a form that survives scrutiny. It has parts, and they are worth pulling apart, because conflating them is how systems end up with logs they cannot defend.
The first part is the logic. The actual conditions that produced the verdict, explicit and readable, not buried in code and not inferred. The second is the source. The publication, the article, the paragraph the rule encodes, so the decision traces to law and not to someone's interpretation of it. The third is the version. Which edition of the rule was in force when the decision was made, because regulation changes and a decision has to be judged against the law as it stood, not the law as it is now. The fourth is the trust signal. Whether a qualified person reviewed this rule and confirmed it captures the regulation, and what evidence stands behind that.
A decision carries real provenance only when all four are present and bound together: this verdict, from these conditions, encoding this article, in this version, reviewed by this person. Drop any one and the account develops a hole exactly where an auditor will push.
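One way to keep the four parts bound together is to make them fields of a single record, so that a hole in the account is detectable before an auditor finds it. What follows is a minimal sketch in Python, not a reference design. Every name in it (`Source`, `Review`, `Provenance`, `missing_parts`) and the sample regulation are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Source:
    """Where in the law the rule comes from (hypothetical fields)."""
    publication: str
    article: str
    paragraph: str


@dataclass(frozen=True)
class Review:
    """The trust signal: who confirmed the rule, when, on what evidence."""
    reviewer: str
    reviewed_on: date
    evidence: str


@dataclass(frozen=True)
class Provenance:
    verdict: str
    conditions: tuple[str, ...]      # the logic, explicit and readable
    source: Optional[Source]         # the source
    rule_version: Optional[str]      # the version in force
    review: Optional[Review]         # the trust signal

    def missing_parts(self) -> list[str]:
        """Name every part whose absence would leave a hole in the account."""
        missing = []
        if not self.conditions:
            missing.append("logic")
        if self.source is None:
            missing.append("source")
        if not self.rule_version:
            missing.append("version")
        if self.review is None:
            missing.append("trust signal")
        return missing


# Illustrative values only; the regulation and reviewer are made up.
complete = Provenance(
    verdict="blocked",
    conditions=("amount > 10000", "destination lacks adequacy decision"),
    source=Source("Example Regulation", "Art. 12", "para. 3"),
    rule_version="v2",
    review=Review("J. Doe", date(2023, 6, 15), "sign-off record (hypothetical)"),
)
unreviewed = replace(complete, review=None)

print(complete.missing_parts())    # []
print(unreviewed.missing_parts())  # ['trust signal']
```

The record is frozen on purpose: an account that can be edited after the decision is a weaker account.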
Point-in-time correctness
The version part deserves its own attention, because it is the part teams forget, and it is the part that turns a tidy system into an indefensible one.
Regulation is not static. Thresholds move. Deadlines shorten. Adequacy decisions are struck down. A disclosure window that was ten days becomes five. When that happens, a system that holds only the current rule has quietly lost the ability to explain its past decisions, because it is now judging yesterday's decision against today's rule. The auditor does not want to know whether the filing would be late under the current window. They want to know whether it was late under the window that was in force on the day it was due. Those can be different questions with different answers, and only one of them is the correct one to ask.
So provenance has to be point-in-time. The system must be able to say not just "here is the rule" but "here is the rule as it stood on the date this decision was made," with the effective date and the amendment history attached. This is why the regulation lives as versioned data and the logic references it: when the law changes, you add a new version with its effective date, and every past decision can still be explained against the version that governed it. A decision is forever tied to the law it was actually made under. That is not a nicety. It is the difference between an audit trail and a liability.
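The point-in-time lookup can be sketched in a few lines. This is an illustration under stated assumptions, not a prescribed implementation: the names (`RuleVersion`, `version_in_force`, `was_late`), the dates, and the ten-day and five-day windows are all hypothetical, and the sketch assumes the governing date is the date of the triggering event. Which date actually governs is a legal question the code cannot settle.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RuleVersion:
    version: str
    effective_from: date
    window_days: int


# Kept sorted by effective_from. A change in the law appends a version;
# earlier versions are never edited or removed.
DISCLOSURE_WINDOW = [
    RuleVersion("v1", date(2020, 1, 1), window_days=10),
    RuleVersion("v2", date(2023, 7, 1), window_days=5),
]


def version_in_force(versions, on):
    """Return the latest version whose effective date is on or before `on`."""
    effective_dates = [v.effective_from for v in versions]
    index = bisect_right(effective_dates, on)
    if index == 0:
        raise LookupError(f"no version in force on {on.isoformat()}")
    return versions[index - 1]


def was_late(trigger, filed_on, versions=DISCLOSURE_WINDOW):
    """Judge a filing against the window in force on the trigger date.

    Returns the verdict together with the version that governed it.
    """
    rule = version_in_force(versions, trigger)
    due = trigger + timedelta(days=rule.window_days)
    return filed_on > due, rule


# A filing made eight days after an event in March 2023.
late, rule = was_late(trigger=date(2023, 3, 1), filed_on=date(2023, 3, 9))
print(late, rule.version)  # False v1

# The same filing judged against today's rule gives the wrong answer.
current = DISCLOSURE_WINDOW[-1]
late_under_current = date(2023, 3, 9) > date(2023, 3, 1) + timedelta(days=current.window_days)
print(late_under_current)  # True
```

The two printed answers differ, which is the whole problem in miniature: a system that kept only `v2` would report a compliant filing as late.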
The model has no provenance
Now set the model's decision next to this, and the gap is total. When a model produces a verdict, what is the account? You can show the input you gave it and the output it returned. The step in between is a forward pass through a vast network of weights, and there is no article in there, no version, no condition you can point to. "The model assigned high probability to this outcome" is not a reason a regulation was satisfied. It is a description of an inference.
You can ask the model to explain itself, and it will, fluently, and the explanation will be a plausible story generated after the fact, not a trace of the actual computation. It can be wrong while the answer is right. It cannot be checked against the law, because it was not derived from the law. A model can be accurate. It cannot be provenanced, and in regulated work the second property is the one that closes the sale.
Provenance is architecture, not a log
Here is the claim that matters most for anyone building this: you cannot add provenance afterward. It is not a logging layer you bolt onto a decision system once it works. It is a property of how the decision is made.
If the decision comes from a model, there is no provenance to capture, because the decision was never made by reference to anything you can cite. You can log the inputs and outputs all you like; the trail will have a hole in the middle that no amount of logging fills. If the decision comes from a rule, the provenance is already there, because the rule is the provenance. The conditions that fired are the logic. The article the rule carries is the source. The version it was drawn from is the point-in-time record. The reviewer who signed it is the trust signal. Capturing the audit trail is not extra work layered on the decision. It is a readout of how the decision was structured in the first place.
This is why provenance and determinism are the same design choice seen from two angles. A deterministic rule decides in a way that can be explained, and explained the same way every time, traced to a source, pinned to a version. A probabilistic model decides in a way that cannot be, however much tooling you wrap around it. You do not choose to have an audit trail. You choose an architecture, and the audit trail either exists as a consequence of it or cannot be had at any price.
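To make "the rule is the provenance" concrete, here is a small sketch in which the audit record is assembled from the rule itself rather than logged alongside it. It is an assumption-laden illustration: the `Condition`, `Rule`, `Decision`, and `evaluate` names, the rule identifier, the article, and the threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Condition:
    text: str                                      # readable form of the logic
    test: Callable[[Mapping[str, object]], bool]   # executable form of the logic


@dataclass(frozen=True)
class Rule:
    rule_id: str
    article: str        # the source
    version: str        # the point-in-time record
    reviewed_by: str    # the trust signal
    verdict_if_met: str
    conditions: tuple[Condition, ...]


@dataclass(frozen=True)
class Decision:
    verdict: str
    rule_id: str
    article: str
    version: str
    reviewed_by: str
    fired: tuple[str, ...]
    facts: tuple[tuple[str, object], ...]


def evaluate(rule, facts):
    """Apply the rule; the returned Decision is read off the rule's own fields."""
    fired = tuple(c.text for c in rule.conditions if c.test(facts))
    met = len(fired) == len(rule.conditions)
    return Decision(
        verdict=rule.verdict_if_met if met else "no action",
        rule_id=rule.rule_id,
        article=rule.article,
        version=rule.version,
        reviewed_by=rule.reviewed_by,
        fired=fired,
        facts=tuple(sorted(facts.items())),
    )


# Hypothetical rule; the article and threshold are invented for the example.
TRANSFER_RULE = Rule(
    rule_id="transfer-block-01",
    article="Example Regulation, Art. 12(3)",
    version="v2",
    reviewed_by="J. Doe",
    verdict_if_met="blocked",
    conditions=(
        Condition("amount > 10000", lambda f: f["amount"] > 10_000),
        Condition("destination lacks adequacy decision",
                  lambda f: not f["destination_adequate"]),
    ),
)

facts = {"amount": 25_000, "destination_adequate": False}
first = evaluate(TRANSFER_RULE, facts)
second = evaluate(TRANSFER_RULE, facts)

print(first.verdict, "|", first.article)  # blocked | Example Regulation, Art. 12(3)
print(first == second)                    # True
```

Nothing in `evaluate` writes to a log. The trail is the return value, and the same facts under the same rule version produce the same record every time.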
The audit trail as a product
It is tempting to treat all of this as compliance theater, a tax you pay to sell into regulated buyers. That framing is backwards. The audit trail is not the tax. It is the product.
A bank is not buying a system that makes compliance decisions. It can already make those itself. It is buying a system that makes compliance decisions it can defend, to its regulator, on a bad day, two years later, without scrambling. The defensibility is the value. A decision endpoint that returns a verdict is a commodity. A decision endpoint that returns a verdict, the rule that produced it, the article it traces to, the version in force, and the reviewer who signed it, is infrastructure a regulated institution can actually run, because it answers the question that always comes after the decision before the question is even asked.
Build it that way and the audit is not an event you dread. It is a query. The trail was there the whole time, because it was never separate from the decision. It was the decision, written down properly.
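"The audit is a query" can be taken fairly literally. The sketch below stores each decision with its provenance in one row and answers an auditor's question with one lookup. It uses Python's standard `sqlite3` module with an in-memory database; the table layout, the `audit` helper, and the sample row are hypothetical and stand in for whatever store a real system would use.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows behave like mappings keyed by column name

# NOT NULL on every column: a verdict cannot be written without its account.
conn.execute(
    """
    CREATE TABLE decisions (
        decision_id  TEXT PRIMARY KEY,
        decided_on   TEXT NOT NULL,
        verdict      TEXT NOT NULL,
        rule_id      TEXT NOT NULL,
        article      TEXT NOT NULL,
        rule_version TEXT NOT NULL,
        reviewed_by  TEXT NOT NULL,
        facts_json   TEXT NOT NULL
    )
    """
)

# Illustrative row only; identifiers and values are invented.
conn.execute(
    "INSERT INTO decisions VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    (
        "D-0001",
        "2023-03-09",
        "blocked",
        "transfer-block-01",
        "Example Regulation, Art. 12(3)",
        "v1",
        "J. Doe",
        json.dumps({"amount": 25000, "destination_adequate": False}),
    ),
)


def audit(conn, decision_id):
    """Return the full account of one decision, or None if it is unknown."""
    row = conn.execute(
        "SELECT * FROM decisions WHERE decision_id = ?", (decision_id,)
    ).fetchone()
    if row is None:
        return None
    record = dict(row)
    record["facts"] = json.loads(record.pop("facts_json"))
    return record


record = audit(conn, "D-0001")
print(record["verdict"], record["article"], record["rule_version"])
```

The schema does part of the work: because every provenance column is required, an attempt to record a bare verdict fails at write time rather than at audit time.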
The point
In consumer software, the why is optional. The recommendation was good or it was not, and no one convenes a hearing. In regulated work, the why is the whole second half of the job, and it is the half that does not yield to a better model, because a model's decision has no why that survives being checked. Provenance is not something you attach to a decision. It is something a decision either has by construction or never has at all. Choose the architecture that has it, and the audit trail stops being the cost of doing business in regulated markets and becomes the reason you can do business there in the first place.