Overview

Mango BoostX™ RoCE AI is an FPGA-based RoCEv2 accelerator card engineered for large-scale AI, ML, and HPC environments. It addresses the scalability and flexibility limitations of traditional RNICs by delivering 400 Gb/s of aggregate line-rate performance (2x200GbE), GPU peer-to-peer communication, and interoperability with existing data center infrastructure. Beyond standard RDMA capabilities, Mango BoostX™ RoCE AI provides configurable congestion management that can be tailored to the user's environment to improve fabric-wide bandwidth utilization for large-scale distributed AI and HPC workloads.

Highlights

High-Performance RDMA Solution

  • Deliver 2x200GbE line-rate performance
  • Support peer-to-peer communication with GPUs (e.g., GPUDirect RDMA, ROCmRDMA)

Interoperability and Standard Compatibility

  • Compatible with major Linux distributions
  • Interoperate with commercial RNICs and network switches

Scalable AI Networking

  • Provide advanced features (packet spraying, selective retransmission, programmable congestion control) to improve efficiency in large-scale network environments
  • Offer an easy-to-use SDK to implement tailored congestion management

Supported Hardware

RoCE AI 2x200GbE (HHHL)

Mango BoostX RoCE AI board

Hardware Specification

Network Interface

  • 1x QSFP-DD port
  • 2x 200GbE support
  • 8 lanes of PAM4/NRZ SerDes
  • Support active and passive cables

Host Interface

  • 2x PCIe Gen5 x8 (PCIe bifurcated)
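
As a rough check of how the host interface relates to the network interface, the sketch below computes the raw bandwidth of a PCIe Gen5 x8 link from the standard PCIe 5.0 figures (32 GT/s per lane, 128b/130b encoding) and compares it with 200GbE. It ignores TLP and protocol overhead, so usable throughput is lower. Pairing one x8 link with one 200GbE port is an assumption made for illustration; this page does not state how traffic is mapped.

```python
# PCIe 5.0 runs at 32 GT/s per lane with 128b/130b line encoding.
GEN5_GT_PER_LANE = 32.0
ENCODING_EFFICIENCY = 128 / 130

def pcie_link_gbps(lanes: int) -> float:
    """Raw bandwidth of one Gen5 link in Gb/s, per direction, after encoding."""
    return lanes * GEN5_GT_PER_LANE * ENCODING_EFFICIENCY

PORT_GBPS = 200.0               # one 200GbE port
per_link = pcie_link_gbps(8)    # one bifurcated x8 link
aggregate = 2 * per_link        # both x8 links together

# Assumption: one x8 link per 200GbE port (not stated in the specification).
print(f"per x8 link: {per_link:.1f} Gb/s vs {PORT_GBPS:.0f} Gb/s port")
print(f"aggregate:   {aggregate:.1f} Gb/s vs {2 * PORT_GBPS:.0f} Gb/s Ethernet")
```

Under these assumptions each x8 link has some raw headroom over a single 200GbE port before PCIe protocol overhead is taken into account.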

Form Factor

  • HHHL, single slot
  • PCIe add-in card

Processing Unit

  • 2x Arm Cortex-A72
  • 2x Arm Cortex-R5F

Memory

  • 8GB LPDDR4, ECC support
  • 256MB OSPI flash
  • 64GB eMMC flash

Management

  • PCIe in-band management
  • MCTP over SMBus
  • FRU (Field Replaceable Unit)
  • UART

Environmental

  • Typical Power Consumption: 55W (with full RDMA performance & passive cable)
  • 12V, 3.3V, 3.3V_AUX input voltage via PCIe Gold Finger
  • Operating Temperature: 0°C to 55°C
  • Operating Relative Humidity: 20% to 80%
  • Storage Temperature: -20°C to 60°C
  • Storage Relative Humidity: 10% to 90%

Regulatory

  • FCC/CE/KC
  • cTUVus
  • RoHS

RDMA Features

  • GPU-RNIC peer-to-peer communication (AMD/NVIDIA)
  • Reliable Connection (RC) and Unreliable Datagram (UD) QPs
  • RDMA read/write/write with immediate/send/recv operations
  • IPv4 support
  • Configurable MTU size
  • Zero-length operations
  • Event-based CQ handling in user-space
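
To illustrate what a configurable MTU trades off, the following sketch estimates how many packets a message needs and how much of each packet is payload. The MTU values are the standard InfiniBand path MTU sizes; which of them this card accepts is not stated here. The 62-byte header figure is an approximation for an untagged RoCEv2/IPv4 packet and ignores RDMA extension headers, the preamble, and the inter-frame gap.

```python
import math

# Standard InfiniBand path MTU sizes (assumption: not all may be supported).
PATH_MTUS = (256, 512, 1024, 2048, 4096)

# Ethernet (14) + IPv4 (20) + UDP (8) + BTH (12) + ICRC (4) + FCS (4)
HEADER_BYTES = 14 + 20 + 8 + 12 + 4 + 4

def packets_for(message_bytes: int, mtu: int) -> int:
    """Packets needed for one message; a zero-length operation still sends one."""
    return max(1, math.ceil(message_bytes / mtu))

def wire_efficiency(mtu: int) -> float:
    """Fraction of a full-size packet that is payload."""
    return mtu / (mtu + HEADER_BYTES)

message = 1 << 20  # 1 MiB
for mtu in PATH_MTUS:
    print(f"MTU {mtu:4d}: {packets_for(message, mtu):5d} packets, "
          f"{wire_efficiency(mtu):.1%} payload")
```

A larger MTU means fewer packets and less header overhead per message, provided every switch on the path is configured for the corresponding Ethernet frame size.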

The table below lists the number of available resources. The effective count is the number of resources that a user can actually use after the driver is loaded.

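| Resource          | Per Interface | Total |
|-------------------|---------------|-------|
| Max QPs           | 512           | 1024  |
| Effective max QPs | 508           | 1016  |
| Max CQs           | 512           | 1024  |
| Effective max CQs | 508           | 1016  |
| Max MRs           | 1024          | 2048  |
| Effective max MRs | 1023          | 2046  |
| Max PDs           | 1024          | 2048  |
| Effective max PDs | 1023          | 2046  |
| Max SRQs          | 256           | 512   |
| Max SGEs per WR   | N/A           | 2     |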
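
The relationship between maximum and effective counts can be checked with a small sketch. The numbers are copied from the table above; the helper names are hypothetical and are not part of any Mango Boost SDK. The gap between the two counts is simply what is unavailable to the user once the driver is loaded; the table does not say what those resources are used for.

```python
# (max, effective) per interface, copied from the resource table.
PER_INTERFACE = {
    "QP": (512, 508),
    "CQ": (512, 508),
    "MR": (1024, 1023),
    "PD": (1024, 1023),
}
INTERFACES = 2

def unavailable_per_interface(resource: str) -> int:
    """Resources per interface that the user cannot allocate."""
    max_count, effective = PER_INTERFACE[resource]
    return max_count - effective

def effective_total(resource: str) -> int:
    """Effective count summed over both interfaces."""
    return PER_INTERFACE[resource][1] * INTERFACES

def fits(requested: dict) -> bool:
    """True if a per-interface request stays within the effective limits."""
    return all(n <= PER_INTERFACE[name][1] for name, n in requested.items())

for name in PER_INTERFACE:
    print(name, unavailable_per_interface(name), effective_total(name))
```

For example, `fits({"QP": 512, "CQ": 512})` is false even though 512 is the nominal maximum, because only 508 QPs and CQs per interface are available to applications.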

Advanced RDMA Features for Scalable AI Networking

  • Configuration-free RoCEv2: Automated congestion management without complex infrastructure tuning
  • Packet spraying: Maximizes fabric bandwidth by distributing packets across multiple paths
  • Selective retransmission: Improves network efficiency through faster recovery from packet loss
  • Programmable congestion control: Flexible, user-defined algorithms tailored to specific network environments
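
The interaction between packet spraying and selective retransmission can be illustrated with a simplified model. This is a generic sketch, not Mango Boost's implementation: round-robin spraying, the function names, and the assumption that the whole message is in flight when a loss is detected are all choices made for illustration. It compares how many packets are resent under go-back-N recovery with how many are resent when only the lost packets are retransmitted.

```python
def spray(num_packets: int, num_paths: int) -> list:
    """Round-robin spraying: packet with sequence number i uses path i % num_paths."""
    paths = [[] for _ in range(num_paths)]
    for psn in range(num_packets):
        paths[psn % num_paths].append(psn)
    return paths

def go_back_n_resends(num_packets: int, lost: set) -> int:
    """Go-back-N resends everything from the first lost packet onward."""
    return num_packets - min(lost) if lost else 0

def selective_resends(lost: set) -> int:
    """Selective retransmission resends only the lost packets."""
    return len(lost)

paths = spray(num_packets=16, num_paths=4)
lost = {paths[2][1]}  # hypothetical: path 2 drops its second packet

print("lost:", sorted(lost))
print("go-back-N resends:", go_back_n_resends(16, lost))
print("selective resends:", selective_resends(lost))
```

In this model a single drop on one path forces go-back-N to resend packets that arrived intact over the other paths, which is why spraying traffic across paths benefits from a recovery scheme that resends only what was lost.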

See Advanced Features for AI Networking for usage details.