Gentian is an SRAM-first inference substrate for long-context transformer serving, with internal calibrated projections targeting one to two orders of magnitude lower energy per token in regimes where HBM capacity and scale-out coordination dominate serving cost.
Gentian is named after Gentiana, a genus of wildflowers found across Europe, characteristic of the Swiss Alps, and used in traditional medicine.
Problem
Current accelerators concentrate fast model state behind an HBM boundary. HBM improves bandwidth, but fast-memory capacity remains package-bound. At long context lengths, active state spills across packages and scale-out fabrics; the cost is paid on the critical path of serving.
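To make the capacity pressure concrete, the following is a rough sizing sketch of key/value cache growth with context length, using the standard estimate for a decoder-only transformer (two tensors per layer, one for keys and one for values). Every number in it is a hypothetical assumption chosen for illustration: the layer count, KV-head count, head dimension, 16-bit elements, and the 80 GiB per-package capacity do not describe Gentian or any specific accelerator, and model weights are ignored.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2, batch=1):
    """Estimate key/value cache size in bytes for a decoder-only transformer."""
    # Factor 2: each layer stores one key tensor and one value tensor.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem * batch


def packages_needed(total_bytes, package_capacity_bytes):
    """Minimum number of fixed-capacity memory packages that can hold total_bytes."""
    return -(-total_bytes // package_capacity_bytes)  # ceiling division on integers


# Hypothetical model shape and package capacity (assumptions, not Gentian data).
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128
PACKAGE_BYTES = 80 * 2**30  # assumed 80 GiB of fast memory per package

if __name__ == "__main__":
    for seq_len in (8_192, 131_072, 1_048_576):
        size = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len)
        count = packages_needed(size, PACKAGE_BYTES)
        print(f"{seq_len:>9,} tokens: {size / 2**30:7.1f} GiB, {count} package(s)")
```

Under these assumed values each token of context costs 320 KiB of cache per sequence, so a single 1,048,576-token sequence needs 320 GiB and no longer fits in one assumed package. Batching multiplies the figure further.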
Position
Gentian takes the opposite physical position: fast local memory and compute scale together across commodity logic tiles. Each added tile contributes local memory and compute capacity. The design target is bounded-latency inference at context lengths where monolithic HBM packages become structurally inefficient.
Efficiency target
Internal calibrated projections indicate one to two orders of magnitude lower energy per token for selected long-context inference regimes compared with HBM-centric scale-out. Detailed assumptions, calibration artifacts, comparator methodology, and implementation evidence are reviewed only under mutual NDA.
Architecture boundary
Public materials describe Gentian at the substrate level only. The detailed execution model, protocol structure, scheduling semantics, RTL evidence, simulation archive, and physical-design reports are provided only under mutual NDA.
Public-level properties:
- SRAM-first distributed substrate.
- Commodity logic tiles.
- Local memory scales with tile count.
- No HBM on the critical path.
- Designed for long-context transformer serving.
- Engineering review available under NDA.
Diligence package
Gentian is an active cybiont engineering programme. The diligence package available under mutual NDA includes:
- Architecture manuscript.
- Reference-model and RTL parity summary.
- FPGA bring-up status.
- Physical-design artifact summary.
- Simulation archive index.
- IP and FTO discussion through counsel.
What is not claimed publicly
cybiont does not make public production-silicon performance claims on this page. Commercial-node projections, detailed benchmark methodology, protocol mechanics, and implementation evidence are reviewed only in controlled technical diligence.
Strategic diligence for AI-inference infrastructure
Gentian addresses the memory-capacity and data-movement limits of HBM-centric AI inference.
Architecture, RTL, FPGA bring-up, physical-design evidence, simulation artifacts, and IP/FTO discussion are available under mutual NDA to semiconductor companies, foundries, hyperscalers, and strategic deep-tech investors.