Test with ECMP on Linux

I was reading an article about ECMP (Equal Cost Multipath) for traffic load sharing, and it brought back memories of my previous traffic engineering tests. It seems simple at first glance, but it’s actually more complex—especially when it comes to policy-based routing.

The challenge lies in deciding how traffic is redirected and shared per session, with or without NAT, across multiple links or circuits with different latencies. Firewalls that see asymmetric return traffic complicate things further. Together, these factors make ideal traffic load sharing quite difficult.

Of course, if tunneling is involved, things get simpler. A tunnel hides the underlay from both endpoints and lets you add two routes with the same metric in the overlay routing. However, that alone doesn’t explain why load-sharing performance behaves the way it does.

What about service enhancement? If the primary link becomes congested, should the secondary link pick up some of the traffic? That’s not exactly round-robin behavior—it would require active measurement and monitoring of the links. Maintaining session flow on the primary link while redirecting new flows to the secondary link sounds ideal, but it’s difficult to implement. For MPLS-TE, that’s straightforward—but what if you have two internet links, like one DIA (Direct Internet Access) and one mobile network? How would you handle that?

This was just for fun, and I haven’t done any serious measurements yet. But after setting up load sharing on my node, it seems to be working, though I haven’t really thought through the next steps. Running a Speedtest shows that the flows (distinguished by port) are sent over separate links. Hmm… not ideal, but not bad either. But what about other applications? If their flows leave from two different IP addresses… ahhhh…

Let’s discuss this, bro.


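Add a default route with two equal-weight next hops (multipath load sharing). If a default route already exists, use ip route replace instead of ip route add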
sudo ip route add default scope global \
nexthop via 192.168.X.X dev XXX weight 1 \
nexthop via 192.168.X.X dev XXX weight 1

Make conntrack strict about TCP state. With nf_conntrack_tcp_loose=0, conntrack no longer picks up connections that are already established; this does not disable connection tracking
sudo sysctl -w net.netfilter.nf_conntrack_tcp_loose=0

Enable Layer 4 hashing (hash on the 5-tuple instead of only the source and destination IP)
sudo sysctl -w net.ipv4.fib_multipath_hash_policy=1
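
To picture why a single Speedtest run can spread over both links once Layer 4 hashing is on, here is a toy model. It is only a sketch: cksum stands in for the kernel's real flow hash, pick_nexthop is a hypothetical helper, and the addresses are documentation placeholders. The point is that the same 5-tuple always maps to the same next hop, while a different source port may land on the other one.

```shell
# Toy model only: cksum stands in for the kernel's real flow hash.
# pick_nexthop is a hypothetical helper, not part of iproute2.
pick_nexthop() {
    # args: src_ip dst_ip proto src_port dst_port number_of_nexthops
    h=$(printf '%s|%s|%s|%s|%s' "$1" "$2" "$3" "$4" "$5" | cksum | cut -d' ' -f1)
    echo $(( h % $6 ))
}

# Four flows to the same server that differ only in source port.
for sport in 40001 40002 40003 40004; do
    echo "sport $sport -> nexthop $(pick_nexthop 192.168.1.10 203.0.113.5 tcp "$sport" 443 2)"
done
```

With the default policy (0), only the addresses feed the hash, so every flow between the same two hosts would pick the same next hop.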

Enable IP forwarding (only needed if this node routes traffic for other hosts)
sudo sysctl -w net.ipv4.ip_forward=1

Disable reverse path filtering:
Set rp_filter to 0 so the kernel won’t drop asymmetric return traffic

sudo sysctl -w net.ipv4.conf.all.rp_filter=0
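
One thing worth checking after this step: according to the kernel's ip-sysctl documentation, the value actually applied on an interface is the larger of conf.all.rp_filter and conf.<interface>.rp_filter, so setting only "all" to 0 may not be enough if the interface itself is still set to 1. A minimal sketch for checking this, where effective_rp_filter and show_rp_filter are hypothetical helper names:

```shell
# Hypothetical helpers for inspecting the rp_filter value in effect.
effective_rp_filter() {
    # args: value_of_conf_all value_of_conf_iface ; prints the larger one
    if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

show_rp_filter() {
    # arg: interface name (for example the device used in the multipath route)
    all=$(cat /proc/sys/net/ipv4/conf/all/rp_filter)
    dev=$(cat "/proc/sys/net/ipv4/conf/$1/rp_filter")
    echo "$1: all=$all iface=$dev effective=$(effective_rp_filter "$all" "$dev")"
}
```

Call show_rp_filter with each uplink interface name; if the effective value is still 1, set that interface's rp_filter to 0 (or 2 for loose mode) as well.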

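Flush the route cache (the IPv4 routing cache was removed in Linux 3.6, so on current kernels this has little effect, but it is harmless)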
sudo ip route flush cache
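
To check the result without running a full Speedtest, you can ask the kernel which link it would pick for a given flow. This is a sketch under assumptions: it relies on ip route get accepting ipproto, sport and dport (available in reasonably recent iproute2), and route_dev and probe_flows are hypothetical helper names.

```shell
# Hypothetical helpers for checking which link a given flow would take.
route_dev() {
    # Read `ip route get` output on stdin and print the token after "dev".
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

probe_flows() {
    # args: destination_ip first_source_port last_source_port
    dst=$1
    sport=$2
    last=$3
    while [ "$sport" -le "$last" ]; do
        dev=$(ip route get "$dst" ipproto tcp sport "$sport" dport 443 | route_dev)
        echo "sport $sport -> $dev"
        sport=$((sport + 1))
    done
}
```

For example, probe_flows 1.1.1.1 40000 40007 should print a mix of both devices if Layer 4 hashing is active; if every line shows the same device, the hash policy or the multipath route is worth rechecking.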

#ECMP #Linux #Internet #Routing #IP #Firewall #Tunneling #MPLS #trafficEngineering #ChatGPT

How do you troubleshoot a network problem? Cabling? Configuration?

In a Wide Area Network (WAN), the circuit that the telecom backhaul provides between two endpoints, whether point-to-point between two sites (EVPL, SDH) or from a customer site to the provider PE (Internet, IPVPN, VPLS, etc.), has to be connected to the provider’s equipment or router to deliver the service. If you’re referring to dark fiber in a limited area… um… okay, next.

How do you verify the circuit service? Check your site router configuration? Check your IP routing?

The basic mindset: I believe we should start by checking the cabling. Yes, Layer 1, isn’t it?

If your port is UP and able to send and receive packets, WELL, at least confirm both endpoint IP addresses and perform a ping test. (Yes, a PING test—please don’t tell me you don’t know what PING is.)

PIC from #Google

From past experience, field engineers often argue that the device configuration is incorrect, but guess what? The issue ends up being the WRONG port connected.

That is why a PHOTO of the port connections is SO IMPORTANT!!!

What if the ping fails? Yes, it happens—cable quality issues, loose connectors, poor signaling, etc.

Have you ever checked the DUPLEX setting????????????? Confirm both ends have the SAME duplex setting!!

PIC from Cisco Press
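
On a Linux box, the speed, duplex and link state can be read with ethtool. A minimal sketch, assuming the usual ethtool output format (lines such as "Speed: 1000Mb/s" and "Duplex: Full"); link_summary is a hypothetical helper name:

```shell
# Hypothetical helper: summarise the Layer 1 state reported by ethtool.
link_summary() {
    # Read `ethtool <dev>` output on stdin; print "speed duplex link".
    awk -F': *' '
        { sub(/^[ \t]+/, "", $1) }          # strip the leading indentation
        $1 == "Speed"         { speed = $2 }
        $1 == "Duplex"        { duplex = $2 }
        $1 == "Link detected" { link = $2 }
        END { print speed, duplex, link }
    '
}
```

Run it as ethtool eth0 | link_summary (eth0 is a placeholder) on both ends and compare the results. "Half" on one side and "Full" on the other is the classic mismatch, often caused by one side being forced while the other auto-negotiates.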

Then you’ll mention bandwidth: “Speedtest.net, huh? Why can’t I get full bandwidth?!”

Please understand: we cannot guarantee a test server will allocate all resources for your test. The Internet is unmanaged, and you need to be aware of overhead and your device’s processing power. Do you really think your mobile can hit 2Gbps over Wi-Fi, bro?
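
On the overhead point, a rough back-of-the-envelope calculation helps set expectations. The sketch below assumes plain Ethernet with a 1500-byte MTU, IPv4 and TCP headers without options, and no VLAN tag; goodput_pct is a hypothetical helper name.

```shell
# Rough ceiling for TCP payload throughput as a share of Ethernet line rate.
goodput_pct() {
    # args: mtu_bytes ip_plus_tcp_header_bytes
    awk -v mtu="$1" -v hdr="$2" 'BEGIN {
        # On the wire: Ethernet header (14) + FCS (4) + preamble (8) + inter-frame gap (12)
        wire = mtu + 14 + 4 + 8 + 12
        printf "%.1f\n", (mtu - hdr) * 100 / wire
    }'
}

goodput_pct 1500 40   # IPv4 (20) + TCP (20); prints 94.9
```

So even on a perfect 1 Gbps circuit, a single TCP test tops out at roughly 949 Mbps before the test server, the path, or your device's processing power take their share.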

For standard testing, running tests between the client site and the ISP backhaul provides a great reference for your service quality—this is typically done during installation.

But anyway… PLEASE confirm the cabling is correct before spending too much time checking the configuration. Start with Layer 1 first!

#circuit #physical #cabling #ISP #provider #Internet #EVPL #IPLC #IPVPN #P2P #IP #testing #ping #bandwidth #speedtest #traffic #packetlost #duplexing #router #WIFI

AI Network Operator: the DeepSeek case

We all know how successful DeepSeek has been in recent months. It suggests that running AI on low-powered, CPU-only hardware is possible, so adopting this type of AI anywhere, including IoT devices or even routers, could be feasible.

Cisco, Juniper, Arista, and other network device manufacturers already produce hardware with high processing power. Some of these devices run Linux- or Unix-based platforms, allowing libraries and packages to be installed on the system. If that’s the case, can AI run on them?

Based on DeepSeek’s case, tests have shown that an ARM Linux-based Raspberry Pi can successfully run an AI model. The response time may not meet business requirements, but it still works.

Running AI on a router (perhaps within the control plane?) could enable AI to control and modify router configurations. (Skynet? Terminator?) But then, would the AI become uncontrollable?

There are several key questions to consider:

  1. What can AI do on routers and firewall devices?
  2. Can AI self-learn the network environment and take further control?
  3. Can AI troubleshoot operational issues?

It seems like an interesting topic for further research. However, before diving deeper, teaching AI about network operations should no longer be a major concern.

Paragraph proofreading by #ChatGPT

AI Picture generated by #CANVA

#AI #Network #internet #networkoperation #operation #IP #Router #RaspberryPI #PI #Cisco #Juniper #Arista #opensource #BGP #routing

Enhance Internet performance by using the right Public DNS servers

If you are thinking about how to improve your Internet performance, subscribing to a higher-bandwidth Internet service is one option. But is it the right way?

Not really.

Increasing the bandwidth cannot shorten the latency between you and the destination server, and you cannot control your provider’s network path.

In current web server deployments, using a Content Delivery Network (CDN) to deliver content to the Internet is a common approach. However, your network provider’s DNS server may not return the optimal server for the requested domain, so you may not get the lowest latency between you and that server.

HOW to?

Here is a small measurement you can do. Query several public DNS servers, for example 8.8.8.8 (Google) and 1.1.1.1 (Cloudflare); the responses may not be the same. Then run a simple ping test against each returned address and note which one has the lowest latency. Finally, you can set up a BIND server that forwards the domain to that DNS server to get better Internet performance.
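
The steps above can be sketched as a few shell helpers. This is only an illustration, not the method behind the results linked below: it assumes dig and an iputils-style ping are installed, the helper names (avg_rtt, compare_resolvers, forward_zone) are hypothetical, and ICMP latency is only a rough proxy for how a CDN edge will actually perform.

```shell
# Hypothetical helpers; resolver list and domain are placeholders you supply.
avg_rtt() {
    # Read `ping` output on stdin; print the average RTT in ms
    # from the "rtt min/avg/max/mdev = a/b/c/d ms" summary line.
    awk -F'/' '/^(rtt|round-trip)/ { print $5 }'
}

compare_resolvers() {
    # args: domain resolver_ip...
    domain=$1
    shift
    for r in "$@"; do
        # Keep only the first IPv4 answer (dig may list CNAMEs first).
        ip=$(dig +short "@$r" "$domain" A | grep -E '^[0-9.]+$' | head -n 1)
        [ -n "$ip" ] || continue
        rtt=$(ping -c 5 -q "$ip" | avg_rtt)
        echo "$r $ip ${rtt:-unreachable}"
    done
}

forward_zone() {
    # args: domain resolver_ip ; prints a BIND forward-zone stanza
    printf 'zone "%s" {\n    type forward;\n    forward only;\n    forwarders { %s; };\n};\n' "$1" "$2"
}
```

For example, compare_resolvers www.example.com 8.8.8.8 1.1.1.1 prints one line per resolver with the answer and its average RTT, and forward_zone example.com 1.1.1.1 prints a stanza you could paste into named.conf for the resolver that came out ahead.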

See our work for reference. Feel free to discuss.

https://www.bgptrace.com/DNS/running_result.html

#DNS #Internet #Measurement #Ping #Latency #CloudFlare #Google #1.1.1.1 #8.8.8.8