Trading systems depend on market data arriving at extremely low and predictable latency. Multicast lets the exchange send one copy of each market data packet while the network replicates it to many subscribers with hardware-level fanout.
There is no retransmission, no back pressure, and no per-connection overhead, which avoids latency spikes. That makes multicast the only practical way to deliver large-volume, real-time market data consistently across many servers.
In an HFT environment, jitter is worse than packet loss. Traders need every server to see the same tick at nearly the same microsecond.
TCP introduces retransmissions, congestion control, and per-flow state, which makes it very hard to scale market data to dozens of hosts with stable latency. Multicast solves this by delivering one stream with deterministic fanout and no retransmissions, letting trading infrastructure keep predictable, low-latency behavior while handling massive market data volumes.
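The "one copy, fire-and-forget" property shows up directly in socket code. A minimal sender sketch (the group, port, and payload are illustrative assumptions, not real exchange values):

```python
import socket

# Minimal multicast sender sketch: the publisher sends ONE copy of each packet
# to the group address; replication to subscribers happens in the network,
# not on the sending host. Group/port/payload are illustrative assumptions.
GROUP, PORT = "239.1.1.1", 5000
payload = b"tick: AAPL 187.25 x 100"

tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
# Pin the outgoing interface to loopback so this demo stays on the local host;
# in production this would be the NIC facing the multicast-enabled fabric.
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
              socket.inet_aton("127.0.0.1"))
# A small TTL keeps the stream from leaking beyond the local network.
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)

# No connect(), no ACKs, no retransmission: each send is fire-and-forget.
sent = tx.sendto(payload, (GROUP, PORT))
tx.close()
```

Note that nothing here tracks subscribers: the sender's cost is the same whether one host or a hundred have joined the group.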
The full multicast lifecycle in 5 steps:
1️⃣ Receivers → join the group via IGMP (e.g. 239.1.1.1)
2️⃣ Leaf switch → translates the IGMP report into a PIM join
3️⃣ PIM-SM builds a shared tree on the underlay (RPT: Rendezvous Point Tree)
4️⃣ Source sends data → forwarded via the RP, or switches over to the SPT (Shortest Path Tree)
5️⃣ Leaf → replicates the data to all receivers (done in the switch ASIC)
You need to know this process cold, because interviewers will definitely ask "how exactly does PIM-SM build the tree?"
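Step 1️⃣ is visible from ordinary socket code: joining the group with `IP_ADD_MEMBERSHIP` is what makes the kernel emit the IGMP membership report that the leaf switch then converts into a PIM join. A loopback sketch (group and port are assumptions):

```python
import socket
import struct

GROUP, PORT = "239.1.1.1", 5000  # illustrative, not a real exchange group

# --- Receiver: this setsockopt call is what triggers the IGMP membership
# report (step 1 above). On a real network the leaf switch sees that report
# and turns it into a PIM join toward the RP.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
rx.bind(("", PORT))
mreq = struct.pack("4s4s", socket.inet_aton(GROUP),
                   socket.inet_aton("127.0.0.1"))
rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
rx.settimeout(2.0)

# --- Sender: one copy of the packet, confined to loopback for this demo.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
              socket.inet_aton("127.0.0.1"))
tx.sendto(b"tick 1", (GROUP, PORT))

data, _ = rx.recvfrom(1500)
print(data)
rx.close()
tx.close()
```

On loopback there is no switch involved, so this only demonstrates the host side of the join; steps 2️⃣–5️⃣ happen in the fabric.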
Market data multicast architecture:
Exchange → cross-connect → core → spine/leaf → feed handlers → trading servers
IGMP at the edge, PIM-SM in the core
Multiple receivers + a recovery feed
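The recovery feed is usually paired with A/B feed arbitration in the feed handlers: the exchange publishes the same stream on two multicast groups, and the handler accepts whichever copy of each sequence number arrives first, so a drop on one feed is normally covered by the other. A simplified sketch (class and field names are my own, not any vendor's API):

```python
# A/B feed arbitration sketch: track the next expected sequence number and
# deliver whichever feed's copy of a packet arrives first; later duplicates
# from the other feed are discarded.
class Arbitrator:
    def __init__(self):
        self.next_seq = 1  # next sequence number we expect to deliver

    def on_packet(self, seq, payload):
        """Return payload if this packet advances the stream, else None."""
        if seq < self.next_seq:
            return None  # duplicate: already delivered from the other feed
        if seq > self.next_seq:
            # Gap on BOTH feeds: this is where a real handler would query the
            # recovery/retransmission feed; here we simply jump forward.
            pass
        self.next_seq = seq + 1
        return payload

arb = Arbitrator()
first = arb.on_packet(1, "tick1")   # from feed A: delivered
dup = arb.on_packet(1, "tick1")     # same tick from feed B: discarded
second = arb.on_packet(2, "tick2")  # next tick: delivered
```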
Reliable multicast design:
A simple, redundant L3 underlay (spine-leaf, ECMP)
IGMP + PIM-SM, with Anycast-RP / redundant RPs
Dual feeds / multiple paths
Strong observability (sequence gaps, latency, PIM/IGMP health)
Regular failover drills + runbooks
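The sequence-gap part of the observability point can be sketched as a tiny per-stream monitor whose counters feed whatever metrics system is in use (counter names here are illustrative):

```python
# Per-stream gap monitor sketch: since market data packets carry monotonically
# increasing sequence numbers, any jump past the expected value is a gap.
# Exporting gap_events / lost_packets per group makes loss visible quickly.
class GapMonitor:
    def __init__(self):
        self.expected = None   # next sequence number we expect to see
        self.gap_events = 0    # number of distinct gaps observed
        self.lost_packets = 0  # total packets missing across all gaps

    def observe(self, seq):
        if self.expected is not None and seq > self.expected:
            self.gap_events += 1
            self.lost_packets += seq - self.expected
        self.expected = seq + 1

mon = GapMonitor()
for seq in [1, 2, 5, 6, 10]:  # gaps: 3-4 (2 packets) and 7-9 (3 packets)
    mon.observe(seq)
print(mon.gap_events, mon.lost_packets)  # 2 5
```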
EVPN/VXLAN in the colo:
Used for overlay / L2 extension / tenant segmentation
BGP EVPN + VXLAN VTEPs on the leaves
Standard practice: run market data multicast on the underlay, not through the overlay
The overlay is for internal services, management, and isolation
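As a rough illustration of that split (FRR-style syntax; addresses, AS numbers, and interface names are invented, and real deployments are vendor-specific), the underlay carries PIM-SM/IGMP for market data while BGP EVPN runs alongside it as the overlay control plane:

```
! Underlay: PIM-SM toward the RP, IGMP on receiver-facing ports
ip pim rp 10.255.0.1 239.0.0.0/8
interface eth1
 ip pim
 ip igmp
!
! Overlay: BGP EVPN control plane for the VXLAN VTEPs on the leaves
router bgp 65101
 neighbor 10.0.0.1 remote-as 65000
 address-family l2vpn evpn
  neighbor 10.0.0.1 activate
  advertise-all-vni
```

The point of the separation: market data never pays the VXLAN encapsulation cost or depends on overlay state, while tenant and management traffic stays cleanly isolated in the overlay.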