Pushing packets in software is generally brutal, but multicast/broadcast should be inherently easier per packet delivered. It's less "copy this packet 27 times" and more "instead of receiving 27 packets and sending 27 packets, you receive 1 packet and send it 27 times before you free it from memory". The "hard" part becomes dealing with the transmit queues filling up, because every packet you receive turns into many packets to send, so you can churn out far more data than you take in compared with unicast.
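
To make the queue problem concrete, here's a rough toy model, not how any real forwarding path is written. The `simulate` function and all of its parameters (`fanout`, `rx_per_tick`, `tx_per_tick`, `ring_size`) are made up for illustration, and it assumes every copy goes through a single bounded transmit ring that drains at a fixed rate. Each received packet is stored once and only referenced by the per-port transmit entries, which is the "receive 1, send it 27 times" part.

```python
from collections import deque


def simulate(fanout, ticks, rx_per_tick, tx_per_tick, ring_size):
    """Toy model of replicating received packets into one bounded TX ring.

    All names and numbers are hypothetical. Returns a tuple of
    (received, sent, dropped, backlog).
    """
    tx_ring = deque()
    received = sent = dropped = 0

    for tick in range(ticks):
        # Receive phase: each packet is held in memory once.
        for i in range(rx_per_tick):
            packet = (tick, i)
            received += 1
            # Replication: one TX entry per destination port, all
            # referencing the same packet object (no payload copy).
            for port in range(fanout):
                if len(tx_ring) < ring_size:
                    tx_ring.append((port, packet))
                else:
                    dropped += 1  # ring full, this copy is lost

        # Transmit phase: the ring drains at a fixed rate per tick.
        for _ in range(min(tx_per_tick, len(tx_ring))):
            tx_ring.popleft()
            sent += 1

    return received, sent, dropped, len(tx_ring)


if __name__ == "__main__":
    print("unicast  :", simulate(1, 10, 1, 4, 16))
    print("multicast:", simulate(27, 10, 1, 4, 16))
```

With these made-up numbers, the unicast run (fanout 1) keeps up with no drops, while the fanout-27 run receives the same 10 packets but tries to queue 270 transmits, gets only 40 out, and drops 218 with 12 still sitting in the ring. The receive side is identical in both runs; all the pressure lands on the transmit queue.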