Envise is a general-purpose machine learning accelerator that combines photonics and transistor-based systems in a single, compact module. Using a silicon photonics processing core for most computational tasks, Envise provides offload acceleration for high-performance AI inference workloads with never-before-seen performance and efficiency.
Envise provides flexible features for machine learning workloads:
- Common activation functions such as ReLU, GELU, sigmoid, tanh, and support for customizable activation functions
- Convolution acceleration for common filter sizes
- Flexible number format support including INT8, INT16, and bfloat16
- Dynamic scaling for maximum precision
- Matrix transform hardware including transpose, swap, reductions, and other custom data movement operators
- Flexcore modes for 256×256, 128×512, or dual-core 128×256 operation
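To illustrate what "dynamic scaling for maximum precision" can mean for a fixed-point format such as INT8, here is a minimal sketch of per-tensor scaling in plain Python. It is an assumption for illustration only, not a description of how Envise implements the feature: the function names `quantize_int8` and `dequantize_int8` are hypothetical, and the symmetric max-abs scaling rule is one common choice among several.

```python
def quantize_int8(values):
    """Map floats to INT8 codes using a scale derived from the data itself.

    Hypothetical helper: symmetric per-tensor scaling, assumed for illustration.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Choose the scale so the largest magnitude lands on 127.
    # An all-zero (or empty) input falls back to a scale of 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric INT8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from INT8 codes and their scale."""
    return [c * scale for c in codes]


activations = [0.02, -0.5, 0.31, 1.27]
codes, scale = quantize_int8(activations)
restored = dequantize_int8(codes, scale)
max_error = max(abs(a - r) for a, r in zip(activations, restored))
```

Because the scale is recomputed from the values actually present, the full INT8 range is used regardless of the tensor's magnitude, and the round-trip error stays within about half a quantization step.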
The Envise server features 16 Envise chips in a 4U server configuration with only 3 kW power consumption. The server is a building block for a rack-scale Envise inference system that can run the largest neural networks developed to date at unprecedented performance: 3 times the inferences/second of the Nvidia DGX-A100, with 7 times the inferences/second/Watt, on BERT-Base with the SQuAD dataset.
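The two ratios quoted above can be combined with simple arithmetic. The sketch below takes the stated figures at face value and assumes both ratios refer to the same pair of systems under the same workload; the variable names are illustrative, and the per-chip figure is only a server-level share, since the 3 kW budget also covers whatever else is in the chassis.

```python
# Figures taken from the text above.
throughput_ratio = 3.0    # inferences/second relative to the DGX-A100
efficiency_ratio = 7.0    # inferences/second/Watt relative to the DGX-A100
server_power_watts = 3000.0
chips_per_server = 16

# Efficiency is throughput divided by power, so the implied power ratio
# is the throughput ratio divided by the efficiency ratio.
implied_power_ratio = throughput_ratio / efficiency_ratio

# Server power divided evenly across chips: an upper bound on the
# per-chip share, not a chip-level power specification.
power_share_per_chip = server_power_watts / chips_per_server

print(f"Implied power relative to DGX-A100: {implied_power_ratio:.2f}x")
print(f"Server power share per chip: {power_share_per_chip:.1f} W")
```

Under these assumptions, the comparison implies a power draw of roughly 0.43 times that of the reference system, and a server-level share of 187.5 W per chip.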
of SRAM per Envise chip. Massive on-chip activation and weight storage enabling state-of-the-art neural network execution without leaving the processor.
400 Gbps Lightmatter interconnect fabric per Envise chip, enabling large-model scale-out. Running the most advanced neural networks on the planet.
RISC cores per Envise processor. Generic offload capabilities.
Ultra-high performance out-of-order superscalar processing architecture.
Deployment-grade reliability, availability, and serviceability features. Next generation compute with the reliability of standard electronics.
Standards-based host and interconnect interface. Revolutionary compute, standard communications.