Test with ECMP on Linux

I was reading an article about ECMP (Equal Cost Multipath) for traffic load sharing, and it brought back memories of my previous traffic engineering tests. It seems simple at first glance, but it’s actually more complex—especially when it comes to policy-based routing.

The challenge lies in determining traffic redirection and sharing in a session-wise connection, whether with or without NAT, across multiple links or circuits with different latencies. There’s also the complication of firewall interception with asymmetric return traffic. These factors make achieving ideal traffic load sharing quite difficult.

Of course, if tunneling is involved, things get simpler. It essentially blinds both endpoints and allows you to add two routes with the same metric in overlay routing. However, it doesn’t clearly explain why load-sharing performance behaves the way it does.

What about service enhancement? If the primary link becomes congested, should the secondary link pick up some of the traffic? That’s not exactly round-robin behavior—it would require active measurement and monitoring of the links. Maintaining session flow on the primary link while redirecting new flows to the secondary link sounds ideal, but it’s difficult to implement. For MPLS-TE, that’s straightforward—but what if you have two internet links, like one DIA (Direct Internet Access) and one mobile network? How would you handle that?

Well, just for fun, I haven’t done any serious measurements yet. But after setting up load sharing on my node, it seems to be working—though I haven’t really thought through the next steps. Running a Speedtest shows that the flows (by ports) are transmitting separately. Hmm… not ideal, but not bad either. But what about other applications? If they’re using two different IP addresses for outgoing traffic—ahhhh…

Let’s discuss this, bro.


Enable multipath load sharing across two next hops
sudo ip route add default scope global \
nexthop via 192.168.X.X dev XXX weight 1 \
nexthop via 192.168.X.X dev XXX weight 1

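For multipath routing, turn off conntrack's loose TCP pickup, so mid-stream packets that arrive without a tracked handshake are not adopted as established connections (this tightens connection tracking; it does not disable it)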
sudo sysctl -w net.netfilter.nf_conntrack_tcp_loose=0

Enable Layer 4 Hashing
sudo sysctl -w net.ipv4.fib_multipath_hash_policy=1
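
One way to see the Layer 4 hash at work without running a Speedtest is to ask the kernel which next hop it would pick for flows that differ only in source port. This is a rough sketch, not a measurement: it assumes a reasonably recent iproute2 whose `ip route get` accepts `ipproto`, `sport` and `dport`, the destination 1.1.1.1 and port 443 are arbitrary examples, and the helper names `probe_flows` and `count_by_dev` are made up for illustration.

```shell
#!/bin/sh
# Tally how many looked-up flows leave through each device.
# Reads `ip route get` output on stdin and counts the word after "dev".
count_by_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") n[$(i + 1)]++ }
         END { for (d in n) print d, n[d] }' | sort
}

# Look up the route for 16 hypothetical TCP flows that differ only in
# source port. This is a routing lookup only; nothing is sent on the wire.
probe_flows() {
    dst=${1:-1.1.1.1}
    port=40000
    while [ "$port" -lt 40016 ]; do
        ip route get "$dst" ipproto tcp sport "$port" dport 443
        port=$((port + 1))
    done
}

# Usage (needs the multipath default route to be installed):
#   probe_flows 1.1.1.1 | count_by_dev
```

If the hash policy is in effect, the tally should show flows on both devices. With only 16 flows, the split will rarely be exactly even.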

Enable IP Forwarding
sudo sysctl -w net.ipv4.ip_forward=1

Allow asymmetric return traffic:
Set rp_filter to 0 (disable reverse path filtering) so the kernel won’t drop asymmetric traffic

sudo sysctl -w net.ipv4.conf.all.rp_filter=0
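
One thing worth checking after this step: for source validation the kernel uses the higher of `conf/all/rp_filter` and `conf/<interface>/rp_filter`, so an interface that still has a non-zero value keeps filtering even when `all` is 0. The sketch below prints the effective value per interface. The function name is made up, and the optional directory argument is only there so it can be tried against a test directory.

```shell
#!/bin/sh
# Print "<interface> <effective rp_filter>" for each interface.
# Effective value = max(conf/all/rp_filter, conf/<iface>/rp_filter).
effective_rp_filter() {
    conf_dir=${1:-/proc/sys/net/ipv4/conf}
    all=$(cat "$conf_dir/all/rp_filter")
    for f in "$conf_dir"/*/rp_filter; do
        iface=$(basename "$(dirname "$f")")
        # "all" and "default" are templates, not real interfaces.
        case $iface in all|default) continue ;; esac
        val=$(cat "$f")
        if [ "$all" -gt "$val" ]; then
            val=$all
        fi
        echo "$iface $val"
    done
}

# Usage:
#   effective_rp_filter
```

Any interface that still reports 1 or 2 can be set individually with `sudo sysctl -w net.ipv4.conf.<interface>.rp_filter=0`.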

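Flush the route cache (largely a no-op for IPv4 on kernels since 3.6, which removed the IPv4 route cache, but harmless to run)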
sudo ip route flush cache

#ECMP #Linux #Internet #Routing #IP #Firewall #Tunneling #MPLS #trafficEngineering #ChatGPT

Looking glass function provided by RIPE Atlas?

I performed some traceroute tests using the public looking glass of another organization/provider. I found that some test functions, like Ping and Traceroute, were launched using RIPE Atlas probes. It looks impressive and kind of funny.

Last year, the provider developed a web interface and API to launch commands from their own PE (Provider Edge) or Internet border gateway routers and return the results. A list of routers grouped by geography lets users run tests from a chosen region.

This seems to be a new method using RIPE Atlas, where queries can be made via an API. The web interface lets users select which probe to use for the measurement, deducting the web provider’s “RIPE Atlas Credits” for each test.

However, I’m wondering — since looking glass aims to provide insights into a specific network provider’s or AS owner’s network — if we’re using this method, why not just go to the official RIPE Atlas website to launch the test?

Well, I guess the more user-friendly web portal makes it easier for users.

Pingnetbox – http://www.pingnetbox.com

#ripe #atlas #lookingglass #measurement #ping #traceroute #test #internet #AS #chatgpt #proofreading

Starlink Satellites’ Movement Proven by Periodical Measurement – Part 2

With the measurement interval tuned to 5 minutes, the result portal now summarizes the data in a single file per RIPE Atlas probe ID.
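
For readers who want to pull a similar per-probe series themselves, the sketch below builds a RIPE Atlas results URL for one measurement and one probe, then flattens the ping results into "timestamp avg_rtt" lines. It is only an illustration and not the portal's actual code: the measurement and probe IDs are placeholders you must supply, the helper names are invented, and it assumes curl and jq are installed.

```shell
#!/bin/sh
# Build the results URL for one measurement, limited to one probe.
results_url() {
    msm_id=$1
    probe_id=$2
    echo "https://atlas.ripe.net/api/v2/measurements/${msm_id}/results/?probe_ids=${probe_id}&format=json"
}

# Turn the JSON array of ping results into "timestamp avg_ms" lines,
# skipping samples where every packet was lost (avg is reported as -1).
to_series() {
    jq -r '.[] | select(.avg > 0) | "\(.timestamp) \(.avg)"'
}

# Usage (MSM_ID and PROBE_ID are placeholders):
#   curl -s "$(results_url "$MSM_ID" "$PROBE_ID")" | to_series > probe.txt
```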

The results show a predictable pattern of latency changes with increases and decreases, which may indicate satellite movement. We assume that the latency between the ground station, CDN server, and client site remains constant (unless under a DDoS attack… um…).

With the current resources available on RIPE Atlas, can we compare country-based latency and service levels of Starlink? Ah, that should probably be done by the Starlink NOC…

https://www.bgptrace.com/atlas/starlink

#starlink #CDN #cloudflare #satellites #ping #latency #movement #probe #RIPE #atlas

Starlink Satellites’ Movement Proven by Periodical Measurement

Using RIPE Atlas (what a great network measurement platform!), we selected a probe that connects through Starlink and used it to continuously measure connections to CDN servers.

We assume that, no matter which Starlink satellites are passing over the area, the network service connection will still be provided to the same region. For example, if the satellites are crossing the US regions, it doesn’t matter which satellite; it will send data back to a US-based station on the ground.

The test seems a bit funny, but the latency trend appears to follow a pattern. It shows a progression from high latency to low latency and then back to high latency over time. If we assume the ground station has a fixed connection to the destination CDN server, the latency of that segment stays constant, so the variation must come from the moving satellites. When the latency decreases, it suggests that another satellite has taken over that area, and the roaming process is complete.

You can think of this like when you are using a mobile device. As you move from one cell site (A) to another (B), roaming occurs, which registers your device from Cell Site A to Cell Site B. This is a similar process with satellites.

Now, back to the Starlink client probe: if its location doesn’t change, then as the satellites move through space, the distance between the satellites and the probe site will increase, and latency can indicate this. When the latency decreases, we may assume that another satellite has taken over the service coverage (similar to the roaming process). This is because the satellites do not move backward.
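
The "slow rise, sudden drop" pattern described above can be flagged mechanically. The sketch below reads "timestamp rtt_ms" lines and prints every point where latency falls by more than a threshold between two consecutive samples. Treating such a drop as a satellite handover is our assumption, not something the data proves. The 10 ms default threshold and the function name are arbitrary choices for illustration.

```shell
#!/bin/sh
# Flag sudden latency drops in a series of "timestamp rtt_ms" lines.
# Prints "<timestamp> <previous rtt> -> <current rtt>" whenever the RTT
# falls by more than the threshold (in ms) from one sample to the next.
find_drops() {
    threshold=${1:-10}
    awk -v t="$threshold" '
        NR > 1 && (prev - $2) > t { print $1, prev, "->", $2 }
        { prev = $2 }'
}

# Usage:
#   find_drops 10 < probe.txt
```

For example, the series 30, 38, 47, 28, 35 ms with a 10 ms threshold reports one drop, from 47 to 28. Samples taken minutes apart can only show that a drop happened somewhere between two measurements, not when.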

Moreover, does the change in latency over time affect the user experience?
For instance, during a video or voice call, latency may fluctuate—increasing or decreasing.

However, live gameplay presents a different scenario. Unlike calls, it often relies on a stable connection. A fixed connection typically doesn’t exhibit the same fluctuating physical characteristics, making latency more predictable in gaming environments.

Currently, measurements are taken every 15 minutes. If we shorten this test period, we may get more accurate insights into this operation.

https://www.bgptrace.com/atlas/starlink

#starlink #satellites #probe #RIPE #atlas #internet #measurement #roaming #cellsite #cell #mobile #ping #latency

Enhance Internet performance by using the right Public DNS servers

If you are thinking about how to enhance your Internet performance, you could subscribe to a higher-bandwidth Internet service. But is that the right way?

Not really.

Increasing the bandwidth cannot shorten the latency between you and the destination server, and you cannot control your provider’s network path.

In current Web server deployments, using a Content Delivery Network (CDN) to deliver content to the Internet is a common approach. However, your network provider’s DNS server may not return the optimal server for the requested domain, so you may not get the lowest latency between you and that server.

How to?

You can do a little measurement yourself. Query several public DNS servers, for example Google's 8.8.8.8 or Cloudflare's 1.1.1.1; the responses may not be the same. Then run a simple ping test against each returned address and record the one with the lowest latency. Finally, you can set up a BIND server that forwards queries for that domain to the DNS server that gave the best result, for better Internet performance.
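
The measurement above can be sketched in a few lines of shell. This is a minimal illustration that assumes `dig` and a Linux-style `ping` are installed. The function names are made up, `www.example.com` is a placeholder domain, and the ICMP round-trip time is only a rough proxy for what an application will see.

```shell
#!/bin/sh
# Extract the average RTT from the summary line that ping prints, e.g.
# "rtt min/avg/max/mdev = 9.812/10.431/11.203/0.512 ms".
avg_rtt() {
    awk -F'=' '/min\/avg\/max/ { split($2, a, "/"); print a[2] }'
}

# For each resolver: resolve the name, ping the first A record, and print
# "resolver answer avg_ms". Output is sorted so the lowest latency is first.
compare_resolvers() {
    name=$1
    shift
    for resolver in "$@"; do
        ip=$(dig +short "@$resolver" "$name" A | grep -E '^[0-9.]+$' | head -n 1)
        [ -n "$ip" ] || continue
        rtt=$(ping -c 3 -q "$ip" | avg_rtt)
        if [ -n "$rtt" ]; then
            echo "$resolver $ip $rtt"
        fi
    done | sort -k3 -n
}

# Usage:
#   compare_resolvers www.example.com 8.8.8.8 1.1.1.1
```

Resolvers whose answers do not reply to ping are simply left out of the list. CDN answers can change between queries, so it is worth repeating the test a few times before settling on a forwarder.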

See our work for reference. Feel free to discuss.

https://www.bgptrace.com/DNS/running_result.html

#DNS #Internet #Measurement #Ping #Latency #CloudFlare #Google #1.1.1.1 #8.8.8.8