Traditionally, the dynamic modeling of chemical processes has relied on first-principles models grounded in fundamental laws of physics and chemistry. These models, typically formulated as differential equations with constant parameters, enable the calculation of control actions that optimize process operation while accounting for both process and actuator limitations. However, the evolving and nonlinear nature of chemical processes often calls for models with time-varying parameters. Motivated by these challenges, we have developed hybrid models that integrate system-agnostic first-principles dynamics with system-specific, data-driven, time-varying parameters. Our hybrid modeling framework incorporates a recent innovation: attention-based time-series transformers (TSTs) coupled with positional encoding. This work is among the first to apply the transformer architecture, the algorithm at the core of ChatGPT's success, to nonlinear, time-varying processes. By analyzing data across both the current and preceding time steps, the TST captures immediate as well as historical changes in the process states, providing contextual insight into the process dynamics, much as ChatGPT interprets text in context. The TST-based hybrid model identifies correlations between process parameters and state variables. The framework is versatile: it accommodates models ranging from density functional theory to computational fluid dynamics, and it scales from laboratory setups to large industrial environments. We will present applications of this hybrid modeling and control architecture, from laboratory to industrial processes, made possible through partnerships with leading chemical process enterprises.
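As an illustration of how such a TST might supply time-varying parameters to a first-principles model, the sketch below pairs a small transformer encoder with sinusoidal positional encoding and a one-step explicit-Euler integration of a generic ODE. The class names, network sizes, window length, and the two predicted parameters are hypothetical placeholders assumed for illustration only; this is a minimal sketch of the general idea, not the authors' implementation.

```python
# Minimal sketch: a time-series transformer (TST) with positional encoding that
# reads a window of past process measurements and predicts time-varying
# parameters, which then close a first-principles ODE. All names and sizes here
# are illustrative assumptions, not the authors' released code.
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Standard sinusoidal positional encoding added to the input embeddings."""
    def __init__(self, d_model: int, max_len: int = 500):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        return x + self.pe[: x.size(1)]

class TSTParameterModel(nn.Module):
    """Transformer encoder mapping a window of past states/inputs to the
    time-varying parameters of a first-principles model."""
    def __init__(self, n_features: int, n_params: int, d_model: int = 64,
                 n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        self.pos_enc = PositionalEncoding(d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_params)

    def forward(self, window):  # window: (batch, seq_len, n_features)
        h = self.encoder(self.pos_enc(self.embed(window)))
        return self.head(h[:, -1])  # parameter estimate at the current time step

def hybrid_step(x, u, params, dt=1.0):
    """One explicit-Euler step of a generic first-principles ODE whose
    parameters (e.g., a rate constant k and a transfer coefficient h)
    are supplied by the data-driven TST."""
    k, h = params[:, 0:1], params[:, 1:2]   # data-driven, time-varying
    dxdt = -k * x + h * u                   # first-principles structure
    return x + dt * dxdt

# Usage: 8 trajectories, each a window of 20 past time steps with 3 measurements.
model = TSTParameterModel(n_features=3, n_params=2)
window = torch.randn(8, 20, 3)
params = model(window)                                        # (8, 2)
x_next = hybrid_step(window[:, -1, :1], window[:, -1, 1:2], params)
```

In this hypothetical setup, the transformer plays only the role the abstract assigns to it, estimating the time-varying parameters from recent history, while the process dynamics themselves remain encoded in the first-principles equation.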