Test with ECMP on Linux – Part 2

Continuing from the previous article on ECMP on a Linux machine: the goal is to enhance overall throughput, but it will always be limited by the per-session (flow-wise) characteristics of the traffic.

Something interesting happens during a multi-flow Speedtest. When a single test generates multiple traffic flows using different Layer 4 port numbers, the aggregated result shows a higher combined throughput, even when the flows leave through two different public IPs behind NAT.

For example, consider two links:

  1. 150Mbps Download / 30Mbps Upload (DIA)
  2. 30Mbps Download / 30Mbps Upload (DIA via public WiFi)

A Linux gateway is configured with ECMP using two next-hop routes pointing to these links. When the traffic test starts, the portal or app displays only one of the links’ public IP addresses. Yet the test results show 170+ Mbps download and 40+ Mbps upload. WOOOO!!!

Of course, this is just a traffic test—similar to running iPerf with multiple flow tests and aggregating the results. So, yes, it’s possible!
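Conceptually, this is the kernel’s Layer 4 flow hashing at work: each flow’s 5-tuple is hashed, and the hash selects one of the next hops. Here is a minimal Python sketch of the idea (the hash function and all addresses are illustrative placeholders, not the kernel’s real algorithm): one flow always sticks to one link, while many flows with different source ports spread across both, so their throughput aggregates.

```python
import hashlib

LINKS = ["link1 (150/30 Mbps DIA)", "link2 (30/30 Mbps DIA via WiFi)"]

def pick_nexthop(src_ip, dst_ip, src_port, dst_port, proto="tcp"):
    """Hash the flow 5-tuple and map it to one of the ECMP next hops.
    This mimics the idea behind fib_multipath_hash_policy=1 (L4 hashing);
    the real kernel hash differs."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return LINKS[digest % len(LINKS)]

# One Speedtest connection = one flow = always the same link:
flow_a = pick_nexthop("192.168.1.10", "203.0.113.5", 50001, 8080)
assert flow_a == pick_nexthop("192.168.1.10", "203.0.113.5", 50001, 8080)

# Multiple flows (different source ports) can land on different links,
# which is why their throughput aggregates across both circuits:
chosen = {pick_nexthop("192.168.1.10", "203.0.113.5", p, 8080)
          for p in range(50000, 50050)}
print(chosen)
```

This also explains why a single-connection download never exceeds one link’s capacity: its 5-tuple never changes, so its hash never changes.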

I believe some vendors use similar techniques for load sharing. Since it’s Linux-based, that shouldn’t be an issue.

Any other ideas? I’m also thinking about UDP-based video streaming—should we be considering application-layer optimization?

#ECMP #internet #NAT #IP #loadsharing #BGP #DIA #Speedtest #IPERF #measurement #traffictest

Test with ECMP on Linux

I was reading an article about ECMP (Equal Cost Multipath) for traffic load sharing, and it brought back memories of my previous traffic engineering tests. It seems simple at first glance, but it’s actually more complex—especially when it comes to policy-based routing.

The challenge lies in determining traffic redirection and sharing in a session-wise connection, whether with or without NAT, across multiple links or circuits with different latencies. There’s also the complication of firewall interception with asymmetric return traffic. These factors make achieving ideal traffic load sharing quite difficult.

Of course, if tunneling is involved, things get simpler. The tunnel blinds both endpoints to the underlying paths and lets you add two equal-metric routes in the overlay routing. However, it still doesn’t clearly explain why load-sharing performance behaves the way it does.

What about service enhancement? If the primary link becomes congested, should the secondary link pick up some of the traffic? That’s not exactly round-robin behavior—it would require active measurement and monitoring of the links. Maintaining session flow on the primary link while redirecting new flows to the secondary link sounds ideal, but it’s difficult to implement. For MPLS-TE, that’s straightforward—but what if you have two internet links, like one DIA (Direct Internet Access) and one mobile network? How would you handle that?
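One naive way to bias new flows toward the healthier link would be to derive the ECMP nexthop weights from active measurements and re-issue the `ip route` command with the new weights. A hypothetical sketch (the RTT inputs, scaling, and thresholds are all my own assumptions, not a vendor algorithm):

```python
def ecmp_weights(rtts_ms, max_weight=10):
    """Derive integer ECMP nexthop weights inversely proportional to
    each link's measured RTT: lower latency -> higher weight."""
    inv = [1.0 / rtt for rtt in rtts_ms]
    scale = max_weight / max(inv)
    # Every link keeps at least weight 1, so existing flows still have a path.
    return [max(1, round(i * scale)) for i in inv]

# Example: DIA link measured at 8 ms, mobile link at 40 ms.
print(ecmp_weights([8.0, 40.0]))  # -> [10, 2]
```

Existing flows would still be re-hashed when weights change, which is exactly the “keep sessions on the primary link” problem described above; this sketch only addresses how new traffic could be skewed.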

Well, just for fun, I haven’t done any serious measurements yet. But after setting up load sharing on my node, it seems to be working—though I haven’t really thought through the next steps. Running a Speedtest shows that the flows (by ports) are transmitting separately. Hmm… not ideal, but not bad either. But what about other applications? If they’re using two different IP addresses for outgoing traffic—ahhhh…

Let’s discuss this, bro.


Enable multipath load sharing over two next hops
sudo ip route add default scope global \
nexthop via 192.168.X.X dev XXX weight 1 \
nexthop via 192.168.X.X dev XXX weight 1

For multipath routing, tightening TCP connection tracking so conntrack only tracks connections whose handshake it has seen (rather than picking up flows mid-stream) helps
sudo sysctl -w net.netfilter.nf_conntrack_tcp_loose=0

Enable Layer 4 Hashing
sudo sysctl -w net.ipv4.fib_multipath_hash_policy=1

Enable IP Forwarding
sudo sysctl -w net.ipv4.ip_forward=1

Force More Aggressive Flow-Based Balancing:
Set rp_filter to 0 (disable reverse path filtering) so the kernel won’t drop asymmetric traffic

sudo sysctl -w net.ipv4.conf.all.rp_filter=0

Flush the route cache
sudo ip route flush cache

#ECMP #Linux #Internet #Routing #IP #Firewall #Tunneling #MPLS #trafficEngineering #ChatGPT

Looking glass function provided by RIPE Atlas?

I performed some traceroute tests using the public looking glass of another organization/provider. I found that some test functions, like Ping and Traceroute, were launched using RIPE Atlas probes. It looks impressive and kind of funny.

In the previous year, the provider developed a web interface and API to launch commands from their own PE (Provider Edge) or Internet BG (Border Gateway) routers and return the results. The geographical router list allows users to select region-based tests.

This seems to be a new method using RIPE Atlas, where queries can be made via an API. The web interface lets users select which probe to use for the measurement, deducting the web provider’s “RIPE Atlas Credits” for each test.
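As I understand it, such a portal would assemble a one-off measurement request for the RIPE Atlas REST API (v2) and submit it with the provider’s API key, spending their credits. A sketch of what the request payload might look like; the field names follow my reading of the public Atlas docs and should be treated as an assumption, and the target and probe IDs are placeholders:

```python
import json

def build_ping_measurement(target, probe_ids, description="LG ping via Atlas"):
    """Assemble a one-off ping measurement payload for the RIPE Atlas v2 API
    (POST https://atlas.ripe.net/api/v2/measurements/).
    Field names are based on the public Atlas docs; verify before use."""
    return {
        "definitions": [{
            "type": "ping",
            "af": 4,                  # IPv4 measurement
            "target": target,
            "description": description,
        }],
        "probes": [{
            "type": "probes",         # select explicit probe IDs
            "value": ",".join(str(p) for p in probe_ids),
            "requested": len(probe_ids),
        }],
        "is_oneoff": True,            # run once, not periodically
    }

payload = build_ping_measurement("www.pingnetbox.com", [6001, 6002])
print(json.dumps(payload, indent=2))
```

Each submitted measurement is what deducts the portal owner’s Atlas credits, which matches the behavior described above.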

However, I’m wondering: a looking glass aims to provide insight into a specific network provider’s or AS owner’s network, so if we’re using this method anyway, why not just go to the official RIPE Atlas website and launch the test there?

Well, I guess the more user-friendly web portal makes it easier for users.

Pingnetbox – http://www.pingnetbox.com

#ripe #atlas #lookingglass #measurement #ping #traceroute #test #internet #AS #chatgpt #proofreading

Starlink Satellites’ Movement Proven by Periodical Measurement – Part 2

With the measurement interval tuned to 5 minutes, the result portal now summarizes the data into a single file per RIPE Atlas probe ID.

The results show a predictable pattern of latency changes with increases and decreases, which may indicate satellite movement. We assume that the latency between the ground station, CDN server, and client site remains constant (unless under a DDoS attack… um…).

With the current resources available on RIPE Atlas, can we compare country-based latency and service levels of Starlink? Ah, that should probably be done by the Starlink NOC…

https://www.bgptrace.com/atlas/starlink

#starlink #CDN #cloudflare #satellites #ping #latency #movement #probe #RIPE #atlas

Starlink Satellites’ Movement Proven by Periodical Measurement

Using RIPE Atlas (what a great network measurement platform!), we selected a probe that uses Starlink to continuously measure connections to CDN servers.

We assume that, no matter which Starlink satellites are passing over the area, the network service connection will still be provided to the same region. For example, if the satellites are crossing the US regions, it doesn’t matter which satellite; it will send data back to a US-based station on the ground.

The test seems a bit funny, but the latency trend appears to follow a pattern: a progression from high latency to low latency and then back to high latency over time. Assuming the ground station has a fixed connection to the destination CDN server, that leg’s latency remains constant; therefore, the movement of the satellites is what changes the total latency. When the latency suddenly decreases, it suggests that another satellite has taken over that area and the roaming process is complete.

You can think of this like when you are using a mobile device. As you move from one cell site (A) to another (B), roaming occurs, which registers your device from Cell Site A to Cell Site B. This is a similar process with satellites.

Now, back to the Starlink client probe: if its location doesn’t change, then as the satellites move through space, the distance between the satellites and the probe site will increase, and latency can indicate this. When the latency decreases, we may assume that another satellite has taken over the service coverage (similar to the roaming process). This is because the satellites do not move backward.
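That “latency climbs, then drops at handover” sawtooth can be spotted mechanically. A small sketch over synthetic RTT samples (the numbers are invented; the real input would be the Atlas ping results):

```python
def find_handovers(rtts_ms, drop_threshold_ms=5.0):
    """Return indices where latency drops sharply between consecutive
    samples - candidate satellite handover points in a sawtooth series."""
    return [i for i in range(1, len(rtts_ms))
            if rtts_ms[i - 1] - rtts_ms[i] >= drop_threshold_ms]

# Synthetic sawtooth: latency climbs as the serving satellite moves away,
# then drops when the next satellite takes over the area.
samples = [30, 33, 36, 40, 44, 31, 34, 38, 43, 47, 32, 35]
print(find_handovers(samples))  # -> [5, 10]
```

The threshold would need tuning against real data, since ordinary jitter can also produce small drops between samples.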

Moreover, does the change in latency over time affect the user experience?
For instance, during a video or voice call, latency may fluctuate—increasing or decreasing.

However, live gameplay presents a different scenario. Unlike calls, it often relies on a stable connection. A fixed connection typically doesn’t exhibit the same fluctuating physical characteristics, making latency more predictable in gaming environments.

Currently, measurements are taken every 15 minutes. If we shorten this test period, we may get more accurate insights into this operation.

https://www.bgptrace.com/atlas/starlink

#starlink #satellites #probe #RIPE #atlas #internet #measurement #roaming #cellsite #cell #mobile #ping #latency

How do you troubleshoot a network problem? Cabling? Configuration?

For a Wide Area Network (WAN), the circuit provided by the telecom backhaul between two endpoints, whether point-to-point between two sites (EVPL, SDH) or customer site to provider PE (Internet, IPVPN, VPLS, etc.), must terminate on the provider’s equipment or routers to deliver the service. If you’re referring to dark fiber in a limited area… um… okay, next.

How do you verify the circuit service? Check your site router configuration? Check your IP routing?

The basic mindset: I believe we should start by checking the cabling. Yes, Layer 1, right?

If your port is UP and able to send and receive packets, WELL, at least confirm both endpoint IP addresses and perform a ping test. (Yes, a PING test—please don’t tell me you don’t know what PING is.)

PIC from #Google

From past experience, field engineers often argue that the device configuration is incorrect, but guess what? The issue ends up being the WRONG port connected.

That’s why PHOTO CAPTURE is SO IMPORTANT!!!

What if the ping fails? Yes, it happens—cable quality issues, loose connectors, poor signaling, etc.

Have you ever checked the DUPLEX setting????????????? Confirm both ends have the SAME duplex setting!!

PIC from Cisco Press

Then you’ll mention bandwidth: “Speedtest.com, huh? Why can’t I get full bandwidth?!”

Please understand: we cannot guarantee a test server will allocate all resources for your test. The Internet is unmanaged, and you need to be aware of overhead and your device’s processing power. Do you really think your mobile can hit 2Gbps over Wi-Fi, bro?

For standard testing, running tests between the client site and the ISP backhaul provides a great reference for your service quality—this is typically done during installation.

But anyway… PLEASE confirm the cabling is correct before spending too much time checking the configuration. Start with Layer 1 first!

#circuit #physical #cabling #ISP #provider #Internet #EVPL #IPLC #IPVPN #P2P #IP #testing #ping #bandwidth #speedtest #traffic #packetlost #duplexing #router #WIFI

Model Training on AMD 16-core CPU with 8GB RAM running in a virtual machine for Bitcoin Price Prediction – Part 2 – Updated

Continuing with over 500,000 data points for Bitcoin (BTC) price prediction.

Using the Python program, the first method I tried was SVR (Support Vector Regression) for prediction. However… how many steps should I use for prediction? 🤔

Previously, I used a Raspberry Pi 4B (4GB RAM) for prediction, and… OH… 😩
I don’t even want to count the time again. Just imagine training a new model on a Raspberry Pi!

So, I switched to an AMD 16-core CPU with 8GB RAM running in a virtual machine to perform the prediction.

  • 60 steps calculation: Took 7 hours 😵
  • 120 steps: …Man… still running after 20 hours! 😫 Finally finished after 33 hours!

Do I need an M4 machine for this? 💻⚡

ChatGPT provided another approach.
OK, let’s test it… I’ll let you know how it goes! 🚀

🧪 Quick Example of the Effect of More Time Steps

Time Step (X Length) | Predicted Accuracy | Notes
30   | ⭐⭐⭐    | Quick but less accurate for long-term trends.
60   | ⭐⭐⭐⭐   | Balanced context and performance.
120  | ⭐⭐⭐⭐½ | Better for long-term trends but slower.
240  | ⭐⭐      | Risk of overfitting and slower training.
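For reference, the “steps” above are the sliding-window length fed to the regressor. A minimal sketch of that setup with scikit-learn’s SVR, using a synthetic random-walk series as a stand-in for the real BTC feed (window length, kernel, and C are illustrative choices, not my tuned values):

```python
import numpy as np
from sklearn.svm import SVR

def make_windows(prices, steps):
    """Slice a 1-D price series into (X, y): each row of X holds `steps`
    consecutive prices, and y is the price that follows each window."""
    X = np.array([prices[i:i + steps] for i in range(len(prices) - steps)])
    y = np.array(prices[steps:])
    return X, y

# Synthetic stand-in for the BTC series (NOT real market data).
rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(0, 1, 500)) + 95000

steps = 60                      # the "time step" / window length from the table
X, y = make_windows(prices, steps)

model = SVR(kernel="rbf", C=100.0)
model.fit(X[:-1], y[:-1])       # hold out the last window
pred = model.predict(X[-1:])    # one-step-ahead prediction
print(round(float(pred[0]), 2))
```

Larger `steps` makes each training row longer and the kernel computations heavier, which is exactly why the 120-step run took so much longer than the 60-step one.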

#SVR #Prediction #Computing #AI #Step #ChatGPT #Python #Bitcoin #crypto #Cryptocurrency #trading #price #virtualmachine #vm #raspberrypi #ram #CPU #CUDA #AMD #Nvidia

Model Training Using TensorFlow on Raspberry Pi 4B (4GB RAM) for Bitcoin Price Prediction

The development of a CRYPTO gaming system https://www.cryptogeemu.com/ has been ongoing for around two years. What does it actually do? Well… just for fun!

The system captures data from several major crypto market sites to fetch the latest price list every minute. It then calculates the average values to determine the price. Users can create a new account and are given a default balance of $10,000 USD to buy and sell crypto—but there’s no actual real-market trading.

The Thought Process

Suddenly, I started wondering:
How can I use this kind of historical data? Can I make a prediction?

So, I simply asked ChatGPT about my idea. I shared the data structure and inquired about how to perform predictions.

ChatGPT first suggested using Linear Regression for calculations. However, the predicted values had a large difference compared to the next actual data point.

Next, it introduced me to the Long Short-Term Memory (LSTM) method for training under the TensorFlow library.

I fed 514,709 lines of BTC price data into the training program on a Raspberry Pi 4B (4GB RAM).
The first run took 7 hours to complete the model!!!

But the result… um… 😐

I’m currently running the second round of training. I’ll update you all soon!

Sample Data:

YYYY/MM/DD-hh:mm:ss  Price  
2025/02/17-20:06:09 95567.20707189501
2025/02/17-20:07:07 95582.896334665

P.S.: I’m not great at math. 😅

#BTC #Bitcoin #TensorFlow #AI #CryptoGeemu #RaspberryPi #Training #Crypto #ChatGPT #LinearRegression #LSTM #LongShortTermMemory

AI Network Operator – the DeepSeek Case

We all know how successful DeepSeek has been in recent months. It demonstrates that low-processing-power, CPU-based AI is possible. Adopting this type of AI anywhere, including on IoT devices or even routers, could be feasible.

Cisco, Juniper, Arista, and other network device manufacturers already produce hardware with high processing power. Some of these devices run Linux- or Unix-based platforms, allowing libraries and packages to be installed on the system. If that’s the case, can AI run on them?

Based on DeepSeek’s case, tests have shown that an ARM Linux-based Raspberry Pi can successfully run such a model. Although the response time may not meet business requirements, it still functions.

Running AI on a router (perhaps within the control plane?) could enable AI to control and modify router configurations. (Skynet? Terminator?) But then, would the AI become uncontrollable?

There are several key questions to consider:

  1. What can AI do on routers and firewall devices?
  2. Can AI self-learn the network environment and take further control?
  3. Can AI troubleshoot operational issues?

It seems like an interesting topic for further research. At the very least, teaching an AI about network operations no longer looks like the major obstacle before diving deeper.

Paragraph proofreading by #ChatGPT

AI Picture generated by #CANVA

#AI #Network #internet #networkoperation #operation #IP #Router #RaspberryPI #PI #Cisco #Juniper #Arista #opensource #BGP #routing

The Latency Between Satellites and CDNs – What if the CDN Were in Space?

Referencing some studies on Starlink and SpaceX, this is a great example of low-Earth orbit (LEO) satellite technology providing high-bandwidth network access. However, as you know, no matter how large the bandwidth, latency remains one of the key factors affecting user experience and application traffic performance.

Moreover, satellites are linked to ground stations, which then connect to Internet peering or exchange points to retrieve the required data via traffic routing. This total latency may not always be predictable due to satellite movement, variations in the distance between the user’s access antenna and the satellite, and the routing path between the ground station and the client machine.

Now, imagine if a CDN node were in space—embedded within the satellite itself. If a satellite operated as a Layer 3 router gateway, could we integrate a server farm with SSD storage to provide caching and content delivery services?

#ripe #atlas #starlink #cloudflare #CDN #latency

https://bgptrace.com/atlas/starlink

[1] Poster: Twinkle, Twinkle, Streaming Star: Illuminating CDN Performance over Starlink, Nitinder Mohan – Delft University of Technology – Delft, Netherlands, Rohan Bose – Technical University of Munich – Munich, Germany, Jörg Ott – Technical University of Munich – Munich, Germany, IMC ’24, November 4–6, 2024, Madrid, Spain https://www.nitindermohan.com/documents/2024/pubs/leoCDN_IMC2024_poster.pdf