
Gentian — Distributed AI Inference

[Image: Gentian tile. SRAM-dominated surface with a logic strip along one edge, symmetric gold bond pads on all four sides, and cyan mesh-connection traces extending at the edges.]

Gentian is a mesh-able custom silicon architecture for transformer inference.

The memory-wall context

Transformer inference is memory-bound during decode. A 70-billion-parameter model stored at 16-bit precision holds about 140 GB of weights, and with a long-context KV cache the decode step reads hundreds of gigabytes from off-chip memory for every generated token. On GPU-class hardware, most of the energy is spent moving state, not computing on it. High-bandwidth memory (HBM) and advanced packaging (2.5D interposers, CoWoS) mitigate the bandwidth cost; they do not remove the underlying fact that compute and memory are separate silicon, connected by a link that costs an order of magnitude more per byte than an on-die access.
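The memory-bound argument can be checked with back-of-envelope arithmetic. The sketch below is illustrative only: it assumes a dense model whose weights are each read once per generated token, 16-bit weights, and batch size 1. The function names and the 3 TB/s bandwidth figure are hypothetical placeholders, not Gentian or vendor specifications.

```python
GB = 10**9

def decode_bytes_per_token(n_params, bytes_per_param, kv_cache_bytes=0):
    """Bytes read from memory to generate one token.

    Assumes a dense model: every weight is read once per token,
    plus the whole KV cache for attention.
    """
    return n_params * bytes_per_param + kv_cache_bytes

def max_tokens_per_second(bytes_per_token, bandwidth_bytes_per_s):
    """Upper bound on decode rate when memory bandwidth is the only limit."""
    return bandwidth_bytes_per_s / bytes_per_token

# 70 billion parameters at 2 bytes each (16-bit weights), no KV cache.
weights_only = decode_bytes_per_token(70 * 10**9, 2)
print(weights_only / GB)  # 140.0

# Hypothetical off-chip bandwidth of 3 TB/s (an assumed round number).
ceiling = max_tokens_per_second(weights_only, 3_000 * GB)
print(f"bandwidth-bound ceiling: {ceiling:.1f} tokens/s")
```

Adding a KV cache term only increases the bytes per token, so the ceiling drops further as context grows; compute throughput does not appear in the bound at all.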

A distributed inference architecture that keeps parameters and activations where the computation happens — rather than shuttling them across a memory boundary — trades an external bandwidth problem for an internal coordination problem.

The architecture

Gentian is a tile-mesh where each tile holds its share of the model state in on-die SRAM and tiles compose via a regular mesh across die boundaries on an ordinary PCB. The design commits to four engineering invariants:

  • Regular mesh topology. Nearest-neighbour links only; no crossbar, no global bus.
  • Master-free peer-to-peer coordination. No central scheduler or arbiter.
  • Stationary data. Weights and activations stay on their tile; only small traversal objects move.
  • Clock-velocity interconnect. Inter-die hops at the CPU clock rate, not via packetised SerDes.
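The first three invariants can be illustrated with a toy model. The sketch below is not Gentian's routing scheme: it assumes a 2D grid without wraparound and uses dimension-ordered routing as a stand-in. `Tile`, `neighbours`, `next_hop`, and `route` are hypothetical names. Each hop is chosen from the current tile's position and the destination alone, so no central scheduler is involved, and only the small traversal object moves.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tile:
    row: int
    col: int

def neighbours(tile, rows, cols):
    """Tiles reachable over one nearest-neighbour link (no wraparound)."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [Tile(tile.row + dr, tile.col + dc)
            for dr, dc in steps
            if 0 <= tile.row + dr < rows and 0 <= tile.col + dc < cols]

def next_hop(here, dest):
    """Local decision made on the current tile: fix the column, then the row."""
    if here.col != dest.col:
        return Tile(here.row, here.col + (1 if dest.col > here.col else -1))
    if here.row != dest.row:
        return Tile(here.row + (1 if dest.row > here.row else -1), here.col)
    return here

def route(src, dest, rows, cols):
    """Path of a traversal object from src to dest, one link per hop."""
    path = [src]
    while path[-1] != dest:
        hop = next_hop(path[-1], dest)
        # Every hop must use an existing nearest-neighbour link.
        assert hop in neighbours(path[-1], rows, cols)
        path.append(hop)
    return path

path = route(Tile(0, 0), Tile(2, 3), rows=4, cols=4)
print(len(path) - 1)  # 5 hops, the Manhattan distance between the tiles
```

Under these assumptions the hop count equals the Manhattan distance between source and destination, which is the shortest path a nearest-neighbour mesh allows.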

Together, the four commitments put wall-clock runtime on the causal lower bound of parallel computation and make the architecture node-agnostic and portable across commercial advanced-node processes.

Project state

Milestone                  State
End-to-end execution       A real trained transformer runs on the mesh
RTL                        Validated against C++ reference
Physical design            Place-and-route sign-off reached
Architecture paper         Preprint in preparation
Simulation archive         Available to partners under NDA
Commercial-node tape-out   Foundry partner selection in progress

Engagement

A technical briefing, the architecture manuscript, place-and-route (PnR) reports, and the simulation archive are provided under mutual NDA. A partnership conversation moves directly to engineering review. Broader research context is available at cybiont Research.

Request a technical briefing →