Show HN: I built an ISP infrastructure emulator from scratch with a custom vBNG
Posted by saphalpdyl 6 hours ago
Demo: https://aether.saphal.me GitHub: https://github.com/saphalpdyl/Aether
Aether is a multi-BNG (Broadband Network Gateway) ISP infrastructure lab, built almost from scratch, that emulates IPoE IPv4 subscriber management end-to-end. It supports IPoE/IPv4 networks and runs a Python-based vBNG with RADIUS AAA, per-subscriber traffic shaping, and traffic simulation, all emulated on Containerlab. It is also my first personal networking project, built over roughly a month.
Motivations behind the project
I'm a CS sophomore. About three years ago, as an intern, I was assigned to build an OSS/BSS platform for a regional ISP by myself, without mentoring. Referencing demo.splynx.com, I developed most of the BSS side (bookkeeping, accounting, inventory management), but on the networking side I only managed to install and set up RADIUS, and that was about it. I didn't have anyone to mentor me or to ask questions, so I gave up at the time.
Three years later, I decided to try cracking it again. This project is meant to serve as a learning reference for anyone who's been in that same position, i.e., staring at closed-source vendor stacks without proper guidance. This is absolutely not production-grade, but I hope it gives someone a place to start.
Architecture overview
The core component, the BNG, runs on an event-driven architecture where state changes are passed around as messages to avoid handling mutexes and locks. The session manager is the sole owner of the session state. To keep it clean and predictable, the BNG never accepts external input directly. The one exception is the Go RADIUS CoA daemon, which passes CoA messages in via IPC sockets. Everything the BNG produces (events, session snapshots) gets pushed to Redis Streams, where the bng-ingestor picks them up, processes them, and persists them.
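To illustrate the single-owner, message-passing idea described above, here is a minimal sketch in Python. It is not taken from the Aether codebase: the `Event` and `SessionManager` names, the event kinds (`dhcp_ack`, `coa_disconnect`), and the in-memory `outbox` list standing in for Redis Streams are all assumptions made for the example.

```python
import queue
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """A state-change message. The kinds used below are hypothetical."""
    kind: str
    mac: str
    payload: dict = field(default_factory=dict)


class SessionManager:
    """Sole owner of session state; other components only send it messages."""

    def __init__(self):
        self.inbox = queue.Queue()   # thread-safe mailbox for incoming events
        self.outbox = []             # stand-in for a Redis Streams producer
        self._sessions = {}          # only ever mutated inside _handle()

    def submit(self, event):
        """Called by producers (DHCP handler, CoA bridge) instead of touching state."""
        self.inbox.put(event)

    def run_once(self):
        """Drain the mailbox, applying events one at a time in arrival order."""
        while True:
            try:
                event = self.inbox.get_nowait()
            except queue.Empty:
                return
            self._handle(event)

    def _handle(self, event):
        if event.kind == "dhcp_ack":
            self._sessions[event.mac] = {"ip": event.payload["ip"], "state": "active"}
        elif event.kind == "coa_disconnect":
            self._sessions.pop(event.mac, None)
        else:
            return  # unknown events are ignored in this sketch
        # Publish what happened so a downstream consumer can persist it.
        self.outbox.append({"kind": event.kind, "mac": event.mac,
                            "active_sessions": len(self._sessions)})

    def snapshot(self):
        """Return a copy of the session table so callers cannot mutate the original."""
        return {mac: dict(session) for mac, session in self._sessions.items()}
```

Because every mutation goes through one consumer loop, no lock is needed around the session table itself; the queue is the only shared object between producers and the session manager.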
Simulation and meta-configs
I am generating traffic through a simulator node that mounts the host's Docker socket and runs docker exec commands on selected hosts. The topology.yaml used by Containerlab to define the network topology grows bigger as more BNGs and access nodes are added. So aether.config.yaml, a simpler configuration, is consumed by the configuration pipeline to generate the topology.yaml and other files (nginx.conf, kea-dhcp.conf, RADIUS clients.conf, etc.).
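As a rough sketch of the meta-config idea above, the function below expands a small config into a Containerlab-style topology structure (`name`, `topology.nodes`, `topology.links` with `endpoints`). The input schema (`lab_name`, `images`, `bngs`, `access_nodes`), the image names, and the interface numbering are assumptions for illustration; the real aether.config.yaml may look quite different. It works on plain dicts so it runs without a YAML library; a real pipeline would load and dump YAML around it.

```python
def expand_topology(config):
    """Expand a compact lab description into a Containerlab-style topology dict."""
    nodes = {}
    links = []
    for bng in config["bngs"]:
        bng_name = bng["name"]
        nodes[bng_name] = {"kind": "linux", "image": config["images"]["bng"]}
        # Give each access node its own BNG-facing interface: eth1, eth2, ...
        for index, access_name in enumerate(bng["access_nodes"], start=1):
            nodes[access_name] = {"kind": "linux", "image": config["images"]["access"]}
            links.append({"endpoints": [f"{access_name}:eth1", f"{bng_name}:eth{index}"]})
    return {"name": config["lab_name"], "topology": {"nodes": nodes, "links": links}}


# Hypothetical compact config: two BNGs, three access nodes in total.
example_config = {
    "lab_name": "aether-demo",
    "images": {"bng": "example/bng:latest", "access": "example/access:latest"},
    "bngs": [
        {"name": "bng1", "access_nodes": ["acc1", "acc2"]},
        {"name": "bng2", "access_nodes": ["acc3"]},
    ],
}

topology = expand_topology(example_config)
```

The point of the design is that adding an access node is a one-line change in the compact config, while the node and link entries that Containerlab needs are derived mechanically.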
Known Limitations
- Multiple veth hops through the emulated topology add significant overhead. Profiling with iperf3 (-P 10 -t 10, 9500 MTU, 24 vCPUs) shows BNG→upstream at ~24 Gbit/s, but host→BNG→upstream drops to ~3.5 Gbit/s. The 9500 MTU also isn't representative of real ISP deployments. It gets worse once the actual network is reintroduced, which caps my throughput at 1.6 Gbit/s locally.
- The circuit ID format (1/0/X) is non-standard. I simplified it for clarity.
- No iBGP or VLAN support.
- No IPv6 support. I wanted to target IPv4 networks from the start to avoid taking on too much breadth without enough depth.
Nearly everything I know about networking (except some sections from AWS) I learned building this. A lot was figured out on the fly, so engineers will likely spot questionable decisions in the codebase. I'd genuinely appreciate that feedback.
Questions
- Currently, the circuit a user connects to is chosen arbitrarily by the demo user. In a real system with thousands of circuits, it'd be very difficult to work out which circuit a customer will connect to. When adding a new customer to a service, how does the operator decide, based on the customer's location, which circuit to provide the service on?
Comments
Comment by yjftsjthsd-h 2 hours ago
Comment by saphalpdyl 20 minutes ago
At some point, I did use Nokia SR Linux as my access node + relay, but had issues with configuration and Option 82. Later, I wrote one myself.
Comment by john_strinlai 1 hour ago
(or whichever operators group best fits your area. i only subscribe to NANOG, so cant speak to the activity/friendliness of the other groups. you can find a pretty comprehensive list here: https://nanog.org/resources/organizations-our-community/)
Comment by saphalpdyl 2 hours ago
A better, more UX-friendly implementation would have been Netbox + aether.config.yaml -> configuration pipeline -> topology.yaml + <other generated files>.
Comment by nonameiguess 1 hour ago
On the other hand, building even a tiny subset but doing it yourself from scratch is a great way to learn. I made a very poor man's VM image builder for HyperV years back because Packer didn't have a builder for it at the time and that was a pretty interesting experience. Finally grokked the Windows object model and even though I still don't use it, I at least no longer jeer at PowerShell.
I'm interested in the answer to your question, too, but as a customer of an ISP. I don't work for one. I was the first owner of my house and when they hooked me into their network, whoever did messed up my neighbors badly, putting them on the wrong circuit and bleeding noise into adjacent neighborhoods. For three years, complaint calls would get our network cut by third-party contractors with no warning, then we'd have to call and get it reconnected. I don't know how they're supposed to do it, but know it can cause quite a mess when they do it wrong.
Comment by bikesharing 2 hours ago