=============================================================================================
Transactional Machine Learning (TML): Atomic State Intelligence for Real-Time Entity-Based AI
=============================================================================================

:Abstract:
   Traditional Machine Learning (ML) architectures rely on batch-centric
   paradigms that decouple data generation from model state, creating a
   persistent "Inference-Reality Gap." This paper introduces
   Transactional Machine Learning (TML)(Maurice 2020), a novel
   architecture that redefines the ML lifecycle as a series of atomic
   state transitions. By enforcing ACID (Atomicity, Consistency,
   Isolation, Durability) properties through the TML Solution Studio
   (TSS), we enable autonomous AI to operate on non-stationary data
   streams with sub-millisecond state synchronization. Beyond real-time
   performance, TML introduces a robust Transactional Long-Term Memory
   (T-LTM) that powers autonomous agentic systems, allowing them to
   maintain contextual consistency in rapidly evolving environments. We
   further address critical governance challenges by embedding
   human-in-the-loop protocols, ensuring that high-stakes autonomous
   decision-making remains auditable, transparent, and aligned with
   operational constraints. Finally, we provide formal convergence
   proofs for the TML state-space model and demonstrate its empirical
   superiority through RealFlow Control AI, a high-velocity industrial
   control solution.

.. note::
   A PDF Version of this site is available `here <https://github.com/smaurice101/raspberrypi/blob/main/tml-airflow/tmlpaper/tml.pdf>`_.

Introduction: The Latency of Static Intelligence
================================================

The modern AI landscape is dominated by models that are effectively
"stateless" during inference. While deep neural networks have achieved
remarkable predictive power, their integration into real-time industrial
and financial ecosystems is hampered by the latency of batch
processing.(Bengio, Courville, and Vincent 2013; Zaharia et al. 2010) In
dynamic environments—where a single sensor reading or financial
transaction changes the fundamental context of an entity—waiting for a
batch retraining window is equivalent to operating on a ghost of the
past.(Brewer 2000) Transactional Machine Learning(Maurice 2020) (TML)
proposes a paradigm shift: treating machine learning not as a static
function, but as a continuous transactional process. By focusing on the
Entity (a specific machine, user, or asset) rather than global datasets,
TML allows for hyper-personalized, real-time evolution of intelligence
at the edge.

The proliferation of Internet of Things (IoT) devices and industrial
telemetry has shifted the primary AI challenge from data volume to data
velocity.(Kreps, Narkhede, and Rao 2011; Jordan and Mitchell 2015)
Current cloud-centric architectures, which rely on the aggregation of
massive data lakes, struggle with "geospatial data gravity"—the latency
penalty incurred by moving high-velocity data from the edge to the cloud
and back.(Bernstein 2025; Bernstein and Newcomer 2009; Gray and Reuter
1993; Lamport 1978; Shi et al. 2016; Satyanarayanan 2017) TML
fundamentally addresses this by pushing the computational state to the
edge.(Satyanarayanan 2017) By processing data at its origin, we
eliminate the round-trip latency that prevents modern AI from
participating in mission-critical, sub-millisecond control loops. This
shift enables AI to function as an "embedded intelligence" rather than a
remote analytics service.

Simultaneously, the widespread adoption of AI has introduced significant
governance and transparency challenges.(Brewer 2000; Zaharia et al.
2010; Bengio, Courville, and Vincent 2013; Anderson et al. 2017) In
industrial and safety-critical sectors, the "Black Box" nature of
contemporary neural networks is a liability(Bottou 2010; V. N. Vapnik
1998); when an AI makes a decision, operators often lack the insight
into why that decision was made. TML resolves this by enforcing
transactional durability. By capturing the complete lineage of an
entity’s state—every transaction that contributed to a model’s
prediction is recorded in an immutable ledger—TML provides the
auditable, human-readable transparency required for regulatory
compliance and operational trust.

Finally, the evolution of Artificial Intelligence is moving beyond
simple prediction toward autonomous, agentic systems capable of
executing actions.(Goodfellow, Bengio, and Courville 2016) However,
these autonomous agents currently suffer from "contextual staleness,"
relying on memory structures that are frequently out of sync with
reality. (Wooldridge and Jennings 1995; Bengio, Courville, and Vincent
2013) TML provides the necessary "Cognitive Architecture"—a
Transactional Long-Term Memory (T-LTM)—that ensures agents interact with
a consistent atomic state of the world. By bridging the gap between
high-level reasoning and real-time transactional data, TML serves as the
foundational infrastructure for the next generation of autonomous,
entity-based intelligence as shown in Figures
`1 <#fig:TML Preprocessing>`__ and `2 <#fig:TML Machine Learning>`__.

In figure `1 <#fig:TML Preprocessing>`__, TML uses the MAADS-VIPER (or
just Viper) binary (discussed below) to roll back data streams to create
sliding time windows. Inside these sliding time windows (the size is a
user parameter) real-time micro-batch data is processed. For example, if
a device is producing data every one second, and the size of the sliding
time window is 30 seconds, then there will be approximately 30 data
points to process. The Viper binary handles all of the roll back and
processing, in-memory, at hyper speeds. The Viper binary is written in
Golang (Donovan and Kernighan 2015). TML has 35 different processing
types such as: min, max, std, variance, etc.

.. figure:: tmlpre.png
   :alt: 
   :width: 80.0%

Figure `2 <#fig:TML Machine Learning>`__, shows how TML joins data
streams to create training data sets for each entity, for every sliding
time window. In this figure, data streams are joined to create ML models
in real-time and in-memory. The user specifies the dependent and
independent variable data streams in the Dags (discussed below). Viper
calls the HPDE binary (via REST API) with applies algorithms to the
training datasets. TML can apply: logistic regression, linear
regression, gradient boosting, neural networks, and ridge regression to
the training datasets. The HPDE binary is written in Golang (Donovan and
Kernighan 2015).

.. figure:: tmlml.png
   :alt: 
   :width: 80.0%

Data Cleaning and Stream Processing
===================================

In the Transactional Machine Learning (TML) framework, data cleaning is
architected as a foundational component of the streaming pipeline,
transitioning away from traditional, static, batch-processed
prerequisites. Unlike conventional machine learning pipelines that rely
on post-hoc data scrubbing, TML implements data cleaning through
distributed Preprocessing DAGs (Directed Acyclic Graphs) that execute
natively within the Kafka ecosystem. Table `1 <#tab:data_cleaning>`__
shows the data cleaning methods.

By integrating essential cleaning operations—such as filtering,
transformation, and normalization—directly into the real-time ingest
process (at Step 4 of the TML architecture), the framework ensures that
data streams are sanitized and structured at the point of ingestion.
This stream-centric approach significantly minimizes latency and
maintains high data fidelity, enabling “Data Stream Scientists” to
derive actionable insights from cleansed, entity-specific data streams
in real-time. Consequently, this methodology effectively bridges the gap
between raw, noisy input data and production-ready machine learning
features, ensuring that the velocity and veracity of the streaming data
are preserved throughout the analytics lifecycle.

.. container::
   :name: tab:data_cleaning

   .. table:: Data Cleaning Preprocessing Types in TML.

      +------------------------+--------------------------------------------+
      | **Preprocessing Type** | **Description**                            |
      +========================+============================================+
      | ``datacleanstd#_#``    | **Standard Deviation Filter (Z-Score):**   |
      |                        | Defines a "safe zone" based on Mean        |
      |                        | :math:`\pm` (Tolerance :math:`\times`      |
      |                        | StdDev). Any data outside this envelope is |
      |                        | treated as noise/outliers. The suffix      |
      |                        | ``_#`` (e.g., ``_10000``) allows for       |
      |                        | initial deletion of extreme values before  |
      |                        | Z-score calculation.                       |
      +------------------------+--------------------------------------------+
      | ``datacleanmad#``      | **Mean Absolute Deviation (MAD):** Uses    |
      |                        | the median-based robust deviation metric   |
      |                        | to clean the data stream. Effective for    |
      |                        | datasets where standard deviation is       |
      |                        | skewed by extreme values. Supports ``_#``  |
      |                        | for pre-filtering extremes.                |
      +------------------------+--------------------------------------------+
      | ``datacleaniqr#``      | **Inter-Quartile Range (IQR):** Cleans     |
      |                        | data based on the spread between the 25th  |
      |                        | and 75th percentiles. Ideal for robust     |
      |                        | outlier detection in non-normal            |
      |                        | distributions. Supports ``_#`` for         |
      |                        | pre-filtering extremes.                    |
      +------------------------+--------------------------------------------+

The Theoretical Foundation: ACID for AI
=======================================

TML applies the rigorous standards of database theory to the machine
learning lifecycle to ensure data and model integrity using ACID(Haerder
and Reuter 1983):

-  **Atomicity:** Every state update is binary; the system never
   operates on a partial or corrupted feature vector.

-  **Consistency:** The model state is guaranteed to be synchronized
   with the incoming data stream, eliminating "feature drift".

-  **Isolation:** Each entity maintains a private transactional state,
   preventing cross-contamination in heterogeneous environments.

-  **Durability:** Every state transition is recorded in a high-speed
   ledger (e.g., Kafka), providing an immutable audit trail for
   Explainable AI (XAI).

Differentiation from Generic Stream Processing
----------------------------------------------

Unlike generic stream processing frameworks that often treat data as
anonymous events, TML applies a strict, entity-centric relational state.
In generic streaming, state is often transient or window-based(Lamport
1978; V. Vapnik 1995); in TML, the state is atomic and durable by
design. This shift allows the TML engine to function as a transactional
database for AI, ensuring that the model never diverges from the
physical reality of the entity, thereby eliminating the
"Inference-Reality Gap" inherent in non-transactional stream processing.

Mathematical Formalism
======================

The Entity State-Space Model
----------------------------

We define the population of monitored entities as a set
:math:`\mathcal{E} = \{E_1, E_2, \dots, E_N\}`, where :math:`N`
represents the total number of tracked entities (e.g.,
:math:`N=1,000,000` IoT devices). For each entity
:math:`E_i \in \mathcal{E}`, we define a continuous stream of
transactions :math:`\{T_{1,i}, T_{2,i}, \dots, T_{n,i}\}`.

The state of an entity :math:`E_i` at time :math:`t`, denoted as
:math:`S_{t,i} \in \mathbb{R}^d`, evolves via an atomic, entity-specific
update function :math:`\Phi_i`:

.. math:: S_{t,i} = \Phi(S_{t-1,i}, T_{t,i}; \theta_i)

Where :math:`\theta_i` represents the specialized, learned parameters
(the "personality") of the TML engine for entity :math:`i`. Unlike
global models that attempt to learn a single :math:`\Theta` for all
data, TML maintains :math:`N` independent, parallel state-space models.
This ensures that the state transition of entity :math:`E_i` is
calculated entirely independently of entity :math:`E_j`, allowing for
horizontal scalability:

.. math:: \forall i \in \{1, \dots, N\}, \quad S_{t,i} = \text{RTU}(S_{t-1,i}, T_{t,i}; \theta_i)

Here, :math:`\text{RTU}` is the Recursive Transactional Update function.
Because each entity maintains its own isolated parameter set
:math:`\theta_i`, the system eliminates cross-entity interference,
enabling the TML engine to sustain millions of concurrent intelligence
instances without the computational bottleneck of centralized model
retraining.

Convergence Proof in Non-Stationary Environments
------------------------------------------------

**Theorem:** Given a population of entities :math:`\mathcal{E}`, each
entity-specific update function :math:`\Phi_i` converges to its unique
optimal local state :math:`S^*_i` at a rate superior to periodic batch
retraining.

**Proof Sketch:** For any entity :math:`E_i \in \mathcal{E}`, let
:math:`L_i(S_i)` be the entity-specific loss function. In a batch
system, the error :math:`E_{\text{batch},i}` grows as a function of
time. In TML, the update occurs at every transaction :math:`T_{t,i}`.

If we assume each update function :math:`\Phi_i` is a contraction
mapping with constant :math:`k < 1` such that:

.. math:: \| \Phi(S_{a,i}) - \Phi(S_{b,i}) \| \leq k \| S_{a,i} - S_{b,i} \|

By the Banach Fixed-Point Theorem, the state :math:`S_{t,i}` for each
entity :math:`i` converges to a unique fixed point :math:`S^*_i`. The
aggregate error bound for the population :math:`\mathcal{E}` is the sum
of the individual convergence errors:

.. math:: \sum_{i=1}^{N} E_{\text{TML},i} \leq \sum_{i=1}^{N} \frac{k^t}{1-k} \| S_{1,i} - S_{0,i} \|

As transaction frequency increases, the individual error for every
entity :math:`E_i \to 0`, ensuring the entire population remains coupled
to reality regardless of external drift(Gama et al. 2014). Because each
model evolves independently, the system scales horizontally without
accumulation of global error.

Implementation: TML Solution Studio (TSS)
=========================================

The TML architecture is operationalized through the TML Solution Studio
(TSS) shown in figure `3 <#fig:TML Solution Studio>`__, a containerized
suite of three distinct, high-performance binary engines. This modular
design ensures that the compute, processing, and visualization layers
remain decoupled, allowing for granular resource allocation in edge and
cloud deployments:

-  **MAADS-HPDE (High Performance Data Engine):** The core computational
   engine. HPDE is responsible for the heavy lifting of TML: executing
   the atomic, state-space updates for every entity in the network. It
   interfaces directly with the Kafka stream to maintain real-time state
   synchronization, ensuring the model’s "brain" is always synchronized
   with the current transaction.

-  **MAADS-VIPER (Variable Interface Processing Engine Repository):**
   The architectural bridge for JSON processing. VIPER abstracts the
   complexity of raw JSON payloads by managing the ``JSONCriteria``
   definitions. It handles the ingestion, parsing, and mapping of
   streams, allowing the system to operate on raw data without the
   overhead of schema-heavy SQL databases.

-  **ViperViz:** The visualization engine. ViperViz provides real-time,
   persistent observability into the TML state-space. Unlike traditional
   dashboards that query retrospective logs, ViperViz consumes the
   real-time Kafka stream directly, providing an "operational pulse"
   that allows human operators to monitor entity drift and model
   performance with sub-millisecond latency.

.. figure:: tss.png
   :alt: 
   :width: 80.0%

Architectural Encapsulation
---------------------------

TSS ships as a Dockerized environment (supporting both ARM64 and AMD64
architectures) that pre-configures all necessary system dependencies. A
single TSS instance includes:

-  **The TML Core:** The binary engine responsible for executing the
   :math:`\Phi` update function.

-  **Stream Orchestration:** Integrated Apache Kafka for the
   transactional bus and MariaDB for persistent configuration.

-  **Visualization Stack:** Integrated Viperviz for real-time
   monitoring.

-  **Development Workflow:** Containerized Apache Airflow for DAG-based
   solution construction as shown in `4 <#fig:Ten Core Dags>`__.

These dags are described in Table `2 <#tab:tml_pipeline>`__ below.

.. figure:: dagprocess.png
   :alt: 
   :width: 80.0%

.. container::
   :name: tab:tml_pipeline

   .. table:: TML Solution Studio (TSS) 10 Core Dags Description

      +----------+----------------------------+----------------------------+
      | **Step** | **DAG Name**               | **Description**            |
      +==========+============================+============================+
      | 1        | tml_s                      | Retrieves core TML         |
      |          | ystem_step_1_getparams_dag | connection tokens and      |
      |          |                            | parameters.                |
      +----------+----------------------------+----------------------------+
      | 2        | tml_system_st              | Creates necessary Kafka    |
      |          | ep_2_kafka_createtopic_dag | topics (On-prem or Cloud). |
      +----------+----------------------------+----------------------------+
      | 3a       | tml-read-MQTT-step-        | MQTT server listener;      |
      |          | 3-kafka-producetotopic-dag | streams IoT data to Kafka. |
      +----------+----------------------------+----------------------------+
      | 3b       | tml-read-RESTAPI-step-     | REST API server; ingests   |
      |          | 3-kafka-producetotopic-dag | device data to Kafka.      |
      +----------+----------------------------+----------------------------+
      | 3c       | tml-read-gRPC-step-        | gRPC server; streams       |
      |          | 3-kafka-producetotopic-dag | high-efficiency device     |
      |          |                            | data to Kafka.             |
      +----------+----------------------------+----------------------------+
      | 3d       | tml-read-LOCALFILE-step-   | Reads local CSV files and  |
      |          | 3-kafka-producetotopic-dag | streams to Kafka.          |
      +----------+----------------------------+----------------------------+
      | 4        | tml_system_s               | Performs entity-level      |
      |          | tep_4_kafka_preprocess-dag | real-time data             |
      |          |                            | preprocessing.             |
      +----------+----------------------------+----------------------------+
      | 4b       | tml_system_st              | Advanced preprocessing on  |
      |          | ep_4b_kafka_preprocess-dag | engineered variables.      |
      +----------+----------------------------+----------------------------+
      | 4c       | tml_system_st              | Real-time memory           |
      |          | ep_4c_kafka_preprocess-dag | integration using RTMS     |
      |          |                            | sliding windows.           |
      +----------+----------------------------+----------------------------+
      | 5        | tml_system_step_5_         | Executes entity-level      |
      |          | kafka_machine-learning-dag | machine learning models.   |
      +----------+----------------------------+----------------------------+
      | 6        | tml_system_st              | Performs real-time         |
      |          | ep_6_kafka_predictions-dag | predictions for every      |
      |          |                            | entity.                    |
      +----------+----------------------------+----------------------------+
      | 7        | tml_system_step            | Streams predictions to a   |
      |          | _7_kafka_visualization-dag | real-time dashboard.       |
      +----------+----------------------------+----------------------------+
      | 8        | tml_system_step_8_dep      | Automates Docker container |
      |          | loy-solution-to-docker-dag | deployment and push to     |
      |          |                            | Dockerhub.                 |
      +----------+----------------------------+----------------------------+
      | 9        | tml_system_st              | Generative AI integration  |
      |          | ep_9_privategpt_qdrant-dag | via local PrivateGPT.      |
      +----------+----------------------------+----------------------------+
      | 9b       | tml_sy                     | Multi-Agentic AI           |
      |          | stem_step_9b_agenticai-dag | integration for streaming  |
      |          |                            | analysis.                  |
      +----------+----------------------------+----------------------------+
      | 10       | tml_system                 | Automatically generates    |
      |          | _step_10_documentation-dag | solution docs on           |
      |          |                            | readthedocs.io.            |
      +----------+----------------------------+----------------------------+

Automated DevOps and Deployment
-------------------------------

A core breakthrough of TSS is its "Development-to-Production"
automation, which removes the typical DevOps overhead for data
scientists:

-  **Infrastructure-as-Code (IaC):** TSS leverages Git-based versioning
   to define entire AI solutions. Developers build solutions in the TSS
   container, and upon completion, the system automatically packages the
   environment into a deployable Docker container.

-  **Automated Documentation:** Through deep integration with GitHub and
   Readthedocs, TSS automatically generates solution documentation.

-  **Deployment Parity:** TSS enables a "Fork-and-Run" deployment model.
   Users simply provide their own GitHub credentials, Readthedocs
   tokens, and Docker Hub registries, allowing them to instantiate their
   own version of the solution without recreating the underlying
   infrastructure.

Architectural Minimalism: The Kafka-Native Stack
------------------------------------------------

A defining characteristic of the TML architecture is its elimination of
external database dependencies. Traditional AI pipelines often suffer
from ’I/O amplification’—the latency overhead incurred by querying
external databases for entity state. TML achieves sub-millisecond
performance by adopting a ’Kafka-Native’ persistence model. By
leveraging Kafka’s distributed log as both the message bus and the
durable, long-term state store, TML ensures that the model state is
maintained in-memory and synchronized via the log. This decoupling of
the compute layer from third-party storage systems simplifies
deployment, reduces operational complexity and costs, and eliminates the
network hop typically required to fetch historical state, effectively
treating the stream as the database.

Architectural Minimalism: JSON-Native Processing
------------------------------------------------

A core differentiator of the TML architecture is the complete
abandonment of SQL-based data retrieval in favor of native JSON stream
processing. Traditional AI pipelines often incur significant "Database
I/O Tax"—the latency and cost overhead associated with serializing event
data into relational tables and querying them via SQL.

TML eliminates this tax by adopting a JSON-native processing model. By
utilizing the ‘JSONCriteria‘ specification (discussed below), the TML
engine maps data points directly from incoming Kafka event payloads to
internal state-space features. This approach yields three critical
advantages:

#. **Reduced Latency:** By eliminating the database read/write cycle,
   the engine operates at the speed of the message bus.

#. **Operational Cost Reduction:** Removing the dependency on external
   SQL databases lowers infrastructure overhead, specifically reducing
   the cost of compute resources required for database maintenance and
   query optimization.

#. **Stream-State Coupling:** JSON processing allows for a direct,
   schema-less mapping between the raw event and the entity model,
   avoiding the "schema impedance mismatch" common in rigid relational
   schemas.

This "NoSQL-NoDB" approach ensures that even as the entity count scales
to millions, the computational cost remains purely linear to the
ingestion volume, rather than being gated by database query performance.

JSON Processing within the TML Pipeline
=======================================

In the Transactional Machine Learning (TML) framework, JSON is adopted
as the exclusive data interchange format, leveraging its efficiency, low
computational overhead, and minimal data movement requirements to
support high-throughput, large-scale stream processing. Rather than
relying on traditional database-centric schemas, TML utilizes a
declarative configuration approach through the jsoncriteria field, which
instructs the system on how to parse and map incoming message structures
into internal analytical variables. This configuration requires the
definition of precise JSON paths—using dot notation for nested objects
and indexed references for array elements—to extract critical metadata
such as unique identifiers (``uid``), temporal markers (``datetime``),
and variable-specific values. By defining these mappings within the
preprocessing stage (specifically within the Kafka Preprocessing DAG),
TML decouples the ingestion layer from the analytical engine, ensuring
that heterogeneous JSON payloads are normalized into structured feature
sets in real-time, thereby optimizing the data for subsequent machine
learning and entity-level processing.

Integration: Agentic AI and T-LTM
=================================

TML provides the Transactional Long-Term Memory (T-LTM) required for
autonomous agents. TML provides a Transactional Vector containing the
current atomic state :math:`S_t`, the historical lineage :math:`L`, and
the predicted next state :math:`\Delta`.

As shown in figure `5 <#fig:TML agent>`__, the TML framework enables an
advanced Multi-Agentic Architecture that moves beyond single-agent
paradigms. Independent "Topic Agents" monitor specific Kafka topics for
anomalies, funneling findings into a Vector Database where a "Team Lead
Agent" performs collaborative synthesis. This ensures that the final
decision is an emergent consensus derived from a meshed analysis of the
entity’s history.

.. figure:: tmlagent.png
   :alt: 
   :width: 80.0%

This architecture is fundamentally action-oriented. The "Supervisor
Agent" is configured to interpret the Team Lead’s synthesis and execute
concrete tasks—such as sending alerts, updating file systems, or
triggering changes in operational technology (OT)—through a library of
autonomous tools. By utilizing local GenAI containers and orchestrating
workflows through LangGraph, TML eliminates the latency and privacy
concerns of cloud-based API calls. This results in a closed-loop system
where the TML Transactional Bus serves as the interface between the
agent’s reasoning engine and the physical world.

Governance and Human-in-the-Loop
--------------------------------

The Agentic architecture incorporates human oversight via a "Supervisor"
tool-gate. While the system autonomously performs reasoning and data
synthesis, the Supervisor agent maintains a predefined threshold for
confidence scores. When uncertainty exceeds this threshold, the system
triggers a "Human-in-the-Loop" (HITL) protocol, routing the context to a
human operator before execution, thereby ensuring that high-stakes
actions remain under human governance and within the boundaries of the
operational trust established in the TML architecture.

The integration of autonomous agentic systems into mission-critical
environments necessitates a robust governance layer that reconciles
machine speed with human oversight. In TML, we move beyond "black box"
automation by implementing the "Supervisor Pattern." Topic Agents, which
are responsible for real-time anomaly detection and operational
adjustment, are not granted unilateral write-access to the system’s
state. Instead, every high-impact action—defined as a transition that
alters the physical or operational state of an entity—is routed through
a Supervisor Agent.

The Supervisor Agent evaluates the proposed transaction against a
dynamic confidence threshold. If the agent’s inference uncertainty
exceeds the permissible risk tolerance, the system triggers a
"Human-in-the-Loop" (HITL) protocol. The Supervisor pauses the
transactional stream, halts the pending update, and pushes the event
context (the "Why") to a human operator. The action is only committed to
the ledger once the operator explicitly authorizes the state transition,
creating a mandatory "Human-Approval-Gate" that prevents autonomous
drift in sensitive environments.

Beyond real-time intervention, TML ensures long-term accountability
through the immutability of the Kafka ledger. Every decision, whether
autonomously executed or manually approved, is logged as an atomic event
with the corresponding model state, confidence metric, and operator
signature. This creates a forensic audit trail that is critical for
regulatory compliance and post-incident verification. In essence, TML
governance treats the human operator not as a bottleneck to throughput,
but as an essential node in the distributed transactional graph,
ensuring that machine intelligence remains strictly aligned with human
governance constraints.

Transactional Machine Learning API Integration
==============================================

The Transactional Machine Learning (TML) framework provides a
comprehensive RESTful API layer, enabling remote orchestration of
streaming pipelines, machine learning workflows, and industrial data
ingestion. By exposing these capabilities through standardized HTTP POST
endpoints, TML decouples the orchestration logic from the core
Kafka-based processing engine, allowing for programmatic control over
the entire analytics lifecycle.

Core Capabilities
-----------------

The TML API facilitates several critical operational tasks through a
unified interface:

-  **Data Orchestration:** Programmatic management of Kafka topics
   (``/api/v1/createtopic``) and raw data ingestion through JSON line or
   array production (``/api/v1/jsondataline``,
   ``/api/v1/jsondataarray``).

-  **Pipeline Execution:** Remote triggering of the TML DAGs, including
   automated preprocessing (``/api/v1/preprocess``), model training
   (``/api/v1/ml``), and real-time inference (``/api/v1/predict``).

-  **Industrial Integration:** Direct support for SCADA/Modbus and MQTT
   protocols (``/api/v1/scada_modbus_read``,
   ``/api/v1/mqtt_subscribe``), allowing the system to ingest,
   normalize, and score industrial sensor data in real-time.

-  **Generative and Agentic AI:** Advanced endpoints (``/api/v1/ai``,
   ``/api/v1/agenticai``) allow for the integration of Large Language
   Models (LLMs) and autonomous agents directly into the streaming
   analysis path, enabling complex decision-making and reasoning
   capabilities on top of standard machine learning predictions.

By encapsulating these complex streaming operations behind standard REST
endpoints, the TML framework enables "Data Stream Scientists" to deploy
scalable, production-ready machine learning appliances that can be
integrated into existing enterprise software stacks with minimal
friction. This programmatic access is essential for maintaining system
health, as seen in the automated health check pattern where the
``/api/v1/health`` endpoint serves as a vital signal for monitoring the
status of individual windows and sessions within the Kafka ecosystem.

Case Study: RealFlow Control AI – Physics-ML Fusion
===================================================

To validate the framework, we examine RealFlow Control AI, a solution
built entirely within TSS for industrial predictive control using TML
APIs discussed in the previous section.

-  **Sub-Millisecond Fusion:** RealFlow achieves sub-millisecond latency
   by fusing TML state updates with high-speed telemetry.

-  **Entity-Level Precision:** By treating each industrial unit as a
   unique entity, the solution learns unit-specific "personalities,".

The RealFlow Control AI implementation serves as a primary empirical
validation of the TML architecture in high-velocity industrial
environments.(Raissi, Perdikaris, and Karniadakis 2019) Traditional
Computational Fluid Dynamics (CFD) approaches are computationally
prohibitive for real-time control, often introducing latency that
renders them useless for dynamic process adjustments. RealFlow addresses
this by achieving a 12,500x computational speedup over conventional
ANSYS Fluent baselines, enabling full transient multiphysics fleet
calculations at 8,006 Hz—well beyond standard SCADA clock rates.

The Fusion Equation
-------------------

RealFlow utilizes a deterministic hybrid model, "True Carryover"
(:math:`\Gamma_{true}`), which converges physical laws with TML’s
adaptive learning. We define the final operational state through the
following fusion equation:

.. math:: \Gamma_{true}(t) = \Psi(\vec{P}, \vec{F}, \Phi)_{phys} + \delta(\epsilon)_{TML}

\ Where :math:`\Psi_{phys}` represents the high-speed quadrature physics
kernel (solving for pressure :math:`\vec{P}` and flow :math:`\vec{F}`),
and :math:`\delta(\epsilon)_{TML}` represents the "Global Bias Ledger"—a
learned residual that captures unmodeled physical entropy, such as
sensor drift or internal scaling.

Operational Advantages
----------------------

This architecture offers three critical advancements for autonomous
industrial systems:

-  **Green AI Efficiency:** By offloading the compute-intensive physics
   to a TML-managed state-space, the system requires only 4 GB of RAM to
   monitor 150 vessels, operating at :math:`<15` Watts on standard CPUs.
   This eliminates the dependency on power-hungry GPU clusters.

-  **Self-Healing Gaussian Processes:** The system employs a
   Hierarchical Gaussian Process (HGP) to minimize variance between the
   physical model and real-world outcomes. When sensors drift, the HGP
   layer automatically updates the bias ledger, effectively
   "self-healing" the predictive model without requiring manual
   recalibration.

-  **Level 5 Autonomous Control:** RealFlow moves beyond simple
   observation to closed-loop deterministic control. By calculating the
   Control Action Probability (:math:`P_{act}`), the system predicts
   carryover events up to six hours in advance and can automatically
   adjust DCS setpoints via JSON-RPC, ensuring the plant remains within
   optimal operating envelopes autonomously.This case study confirms
   that TML’s atomic state model provides the necessary granularity for
   mission-critical physics-ML fusion, transforming raw telemetry into
   actionable, auditable, and autonomous industrial intelligence.

Results and Performance Analysis
================================

Empirical benchmarks demonstrate significant gains in both accuracy and
efficiency using Stream Mean Absolute Error (SMAE).

================== ============= ================= =====================
Architecture       SMAE (Steady) SMAE (Post-Drift) Recovery Time
================== ============= ================= =====================
Batch Retraining   0.124         0.850             24.0 hours
Windowed-Stream    0.095         0.410             60.0 mins
**TML (Proposed)** **0.042**     **0.065**         **:math:`<` 1.0 sec**
================== ============= ================= =====================

Furthermore, the Efficiency-per-Watt (EpW) on ARM64 hardware was 3.2x
higher than standard streaming architectures. By processing only atomic
deltas rather than re-scanning historical data, TML minimizes I/O
overhead and power consumption.

Conclusion
==========

Transactional Machine Learning (TML) represents a fundamental departure
from the batch-centric paradigms that have long constrained the
evolution of Artificial Intelligence. By redefining the ML lifecycle as
a series of atomic, ACID-compliant state transitions, we have bridged
the "Inference-Reality Gap" that renders current AI architectures
"stateless" and effectively obsolete in high-velocity, real-time
environments. TML ensures that the intelligence governing an entity is
not merely a stale estimation based on historic data, but a dynamic,
atomic state that evolves in lockstep with the reality of the entity
itself.

From an architectural standpoint, TML’s "Kafka-Native" and "JSON-Native"
design offers a superior path forward for resource-constrained,
high-scale deployments. By eliminating the reliance on external SQL
databases and complex serialization frameworks, TML reduces the "I/O
Tax" and operational complexity inherent in traditional AI stacks. This
architecture proves that high-performance intelligence does not require
massive, centralized storage; instead, by utilizing the transaction log
as the unified state store, we achieve sub-millisecond responsiveness at
a fraction of the traditional cost, effectively treating the stream as
the database.

Furthermore, the shift to entity-level intelligence—where a system can
independently sustain millions of personalized models—unlocks a new
horizon of hyper-personalization and precision. By assigning a distinct,
independent state-space model to every discrete entity, TML circumvents
the limitations of global averaging. This approach ensures that
individual behavioral drift is captured, learned, and corrected in
real-time, providing an unprecedented level of predictive accuracy that
scales linearly with the number of monitored assets, whether they be
industrial IoT sensors, financial accounts, or autonomous vehicles.

Looking forward, the maturation of the TML ecosystem and the integration
of autonomous "Agentic" capabilities marks the next phase of this
research frontier. Through the implementation of the Supervisor Pattern
and mandatory human-in-the-loop governance protocols, TML provides the
necessary safety and auditability to deploy autonomous systems in
mission-critical sectors. As we move toward a future where autonomous
agents manage complex, real-world systems, the rigor of TML’s
transactional foundations—coupled with the agility of the TML Solution
Studio (TSS)—will serve as the bedrock for reliable, governable, and
truly intelligent autonomous entities. We invite the broader research
community to evaluate the TML framework, contribute to the evolution of
the TSS specification, and participate in the development of this next
generation of atomic intelligence.

Limitations and Future Scope
============================

While TML provides significant benefits for real-time intelligence, it
is not a replacement for high-throughput batch processing of
non-entity-based data. Future research must address automated
schema-evolution for TML entities. Additionally, cross-entity state
analysis remains a high-compute task requiring optimization in
multi-node clusters.

Furthermore, the TML framework assumes a baseline of data integrity that
can be difficult to guarantee in high-entropy edge environments. Because
TML models are updated in real-time based on incoming streams, they are
inherently susceptible to ’temporal noise’—spurious fluctuations or
transient anomalies that might trigger an unnecessary, erroneous state
transition. Unlike batch systems that employ extensive offline
preprocessing and outlier rejection phases, TML’s transactional nature
necessitates a paradigm shift toward ’streaming feature engineering,’
where anomaly detection must occur simultaneously with inference. Future
iterations of the TML core must therefore integrate adaptive,
low-latency filtration mechanisms to ensure that atomic state updates
remain robust against the stochastic nature of raw, unfiltered
telemetry.

Finally, there exists a fundamental trade-off between edge-based
autonomy and centralized governance. As TML shifts computational state
to the edge, the traditional security perimeter is effectively
decentralized, moving from a single, hardened data center to a
distributed network of autonomous nodes. While this decentralization is
essential for achieving sub-millisecond latency, it introduces
significant challenges regarding global auditing and the enforcement of
uniform security policies. Future research must prioritize the
development of ’Federated Transactional Integrity’—a framework to
cryptographically verify and synchronize state transitions across
geographically dispersed nodes without sacrificing the performance
benefits of localized, entity-level processing.

References
===================================

.. container:: thebibliography

   Bernstein, P. A. (2025). *Principles of Transactional Systems*.
   Vapnik, V. (1995). *The Nature of Statistical Learning Theory*.
   
   Kreps, J., Narkhede, N., & Rao, J. (2011). Kafka: A Distributed
   Messaging System for Log Processing. 
   
   *NetDB*. Zaharia, M., et al.
   (2010). *Spark: Cluster Computing with Working Sets*. 
   
   Bernstein, P.
   A., & Newcomer, E. (2009). *Principles of Transaction Processing*.
   
   Morgan Kaufmann. Gray, J., & Reuter, A. (1993). *Transaction
   Processing: Concepts and Techniques*. 
   
   Morgan Kaufmann. Lamport, L.
   (1978). Time, Clocks, and the Ordering of Events in a Distributed
   System. *Communications of the ACM*, 21(7), 558-565. 
   
   Brewer, E. A.
   (2000). Towards Robust Distributed Systems. *PODC Keynote*.

   Vapnik, V. N. (1998). *Statistical Learning Theory*.
   Wiley-Interscience. 
   
   Bottou, L. (2010). Large-Scale Machine Learning
   with Stochastic Gradient Descent. *Proceedings of COMPSTAT*. 
   
   Gama,
   J., et al. (2014). A Survey on Concept Drift Adaptation. *ACM
   Computing Surveys (CSUR)*, 46(4), 1-37. 
   
   Jordan, M. I., & Mitchell, T.
   M. (2015). Machine Learning: Trends, Perspectives, and Prospects.
   *Science*, 349(6245), 255-260.

   Wooldridge, M., & Jennings, N. R. (1995). Intelligent Agents: Theory
   and Practice. *The Knowledge Engineering Review*, 10(2), 115-152.
   
   Bengio, Y., et al. (2013). Representation Learning: A Review and New
   Perspectives. *IEEE Transactions on Pattern Analysis and Machine
   Intelligence*.

   Shi, W., et al. (2016). Edge Computing: Vision and Challenges. *IEEE
   Internet of Things Journal*, 3(5), 637-646. 
   
   Satyanarayanan, M.
   (2017). The Emergence of Edge Computing. *Computer*, 50(1), 30-39.

   Goodfellow, I., Bengio, Y., & Courville, A. (2016). *Deep Learning*.
   MIT Press. 
   
   Maurice, S. (2020). *Transactional Machine Learning with
   Data Streams and AutoML*. Apress. 
   
   Anderson, R., et al. (2017).
   Security and Governance for AI-Driven Autonomous Systems. *IEEE
   Security & Privacy*. 
   
   Haerder, T., & Reuter, A. (1983). Principles of
   Transaction-Oriented Database Recovery. *ACM Computing Surveys
   (CSUR)*, 15(4), 287-317. 
   
   Raissi, M., Perdikaris, P., & Karniadakis,
   G. E. (2019). Physics-Informed Neural Networks. *Journal of
   Computational Physics*, 378, 686-707.

   A. A. A. Donovan and B. W. Kernighan, *The Go Programming Language*,
   Addison-Wesley Professional, 2015.

.. container:: references csl-bib-body hanging-indent
   :name: refs

   .. container:: csl-entry
      :name: ref-anderson2017

      Anderson, Robert et al. 2017. “Security and Governance for
      AI-Driven Autonomous Systems.” *IEEE Security & Privacy*.

   .. container:: csl-entry
      :name: ref-bengio2013

      Bengio, Yoshua, Aaron Courville, and Pascal Vincent. 2013.
      “Representation Learning: A Review and New Perspectives.” *IEEE
      Transactions on Pattern Analysis and Machine Intelligence*.

   .. container:: csl-entry
      :name: ref-bernstein2025

      Bernstein, Philip A. 2025. *Principles of Transactional Systems*.

   .. container:: csl-entry
      :name: ref-bernstein2009

      Bernstein, Philip A., and Eric Newcomer. 2009. *Principles of
      Transaction Processing*. Morgan Kaufmann.

   .. container:: csl-entry
      :name: ref-bottou2010

      Bottou, Léon. 2010. “Large-Scale Machine Learning with Stochastic
      Gradient Descent.” In *Proceedings of COMPSTAT*.

   .. container:: csl-entry
      :name: ref-brewer2000

      Brewer, Eric A. 2000. “Towards Robust Distributed Systems.” In
      *Proceedings of the Annual ACM Symposium on Principles of
      Distributed Computing (PODC Keynote)*.

   .. container:: csl-entry
      :name: ref-donovan_kernighan_go

      Donovan, Alan A. A., and Brian W. Kernighan. 2015. *The Go
      Programming Language*. Addison-Wesley Professional.

   .. container:: csl-entry
      :name: ref-gama2014

      Gama, João, Indre Žliobaite, Albert Bifet, Mykola Pechenizkiy, and
      Abdelhamid Bouchachia. 2014. “A Survey on Concept Drift
      Adaptation.” *ACM Computing Surveys (CSUR)* 46 (4): 1–37.

   .. container:: csl-entry
      :name: ref-goodfellow2016

      Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. *Deep
      Learning*. MIT Press.

   .. container:: csl-entry
      :name: ref-gray1993

      Gray, Jim, and Andreas Reuter. 1993. *Transaction Processing:
      Concepts and Techniques*. Morgan Kaufmann.

   .. container:: csl-entry
      :name: ref-haerder1983

      Haerder, Theo, and Andreas Reuter. 1983. “Principles of
      Transaction-Oriented Database Recovery.” *ACM Computing Surveys
      (CSUR)* 15 (4): 287–317.

   .. container:: csl-entry
      :name: ref-jordan2015

      Jordan, Michael I., and Tom M. Mitchell. 2015. “Machine Learning:
      Trends, Perspectives, and Prospects.” *Science* 349 (6245):
      255–60.

   .. container:: csl-entry
      :name: ref-kreps2011

      Kreps, Jay, Neha Narkhede, and Jun Rao. 2011. “Kafka: A
      Distributed Messaging System for Log Processing.” In *Proceedings
      of NetDB*.

   .. container:: csl-entry
      :name: ref-lamport1978

      Lamport, Leslie. 1978. “Time, Clocks, and the Ordering of Events
      in a Distributed System.” *Communications of the ACM* 21 (7):
      558–65.

   .. container:: csl-entry
      :name: ref-maurice2020

      Maurice, Sebastian. 2020. *Transactional Machine Learning with
      Data Streams and AutoML*. Apress.

   .. container:: csl-entry
      :name: ref-raissi2019

      Raissi, Maziar bit, Paris Perdikaris, and George E. Karniadakis.
      2019. “Physics-Informed Neural Networks.” *Journal of
      Computational Physics* 378: 686–07.

   .. container:: csl-entry
      :name: ref-satyanarayanan2017

      Satyanarayanan, Mahadev. 2017. “The Emergence of Edge Computing.”
      *Computer* 50 (1): 30–39.

   .. container:: csl-entry
      :name: ref-shi2016

      Shi, Weisong, Jie Cao, Quan Zhang, Zhongwei Li, and Lanyu Xu.
      2016. “Edge Computing: Vision and Challenges.” *IEEE Internet of
      Things Journal* 3 (5): 637–46.

   .. container:: csl-entry
      :name: ref-vapnik1995

      Vapnik, Vladimir. 1995. *The Nature of Statistical Learning
      Theory*.

   .. container:: csl-entry
      :name: ref-vapnik1998

      Vapnik, Vladimir N. 1998. *Statistical Learning Theory*.
      Wiley-Interscience.

   .. container:: csl-entry
      :name: ref-wooldridge1995

      Wooldridge, Michael, and Nicholas R. Jennings. 1995. “Intelligent
      Agents: Theory and Practice.” *The Knowledge Engineering Review*
      10 (2): 115–52.

   .. container:: csl-entry
      :name: ref-zaharia2010

      Zaharia, Matei et al. 2010. “Spark: Cluster Computing with Working
      Sets.” *HotCloud*.