cdop.pt

How to restart the GL-MT300Nv1's WISP connection automatically

(published: 2023-12-16)

Let's get rid of an annoyance by teaching a Wi-Fi router to self-soothe when it is abandoned by its uplink counterpart.

Pictured above is the first generation of the GL.iNet MT300N. I have owned two of these for a few years now. For their price tag at the time of purchase, they're quite respectable pieces of silicon.

The range is less than stellar, which is to be expected, since this unit is advertised as a travel router and therefore has no external antenna to speak of. I just deal with it, because I can't be arsed to mod the thing (or research if that's even possible). As stated, it's a good Wi-Fi router for the price, except for one annoying quirk which can be mitigated using the procedure shown later in this post.

Most of the versatility of the MT300N comes from the fact that it can run OpenWrt, allowing a competent hacker to perform all kinds of antics with it. Due to the flexibility of its operating system, this yellow half-cube is capable of being much more than the thing you put between your devices and the (probably misconfigured) free Wi-Fi from some overpriced hotel in New York.

Sitting quietly above the laptop-turned-server that feeds it power over one of its USB ports, and very far away from overpriced hotels, my MT300N serves two functions. First, it is this network's primary router. It handles all the run-of-the-mill gateway + router + switch stuff as well as a few other functions which are out of scope for this post. Second, it provides a safety buffer between the devices on the network and the mediocre (and unpatched) 3G/4G modem-router combo provided by the ISP, which I trust about as much as a timeshare sales rep.

I need both of the RJ-45 ports for the internal network, so the MT300N is configured to run in WISP mode with the ISP-supplied router as its uplink. It is precisely in this configuration that we encounter our annoyance.

When running in WISP and AP mode simultaneously, the MT300N's uplink Wi-Fi connection will hang randomly. I don't have enough data to determine if this depends in any way on what is on the other end of the uplink connection, but I have not seen any exceptions so far, and upgrading OpenWrt didn't seem to help either. At first glance it looks like a hardware bug, but I can't say for sure what causes the problem. The issue is not easy to reproduce on demand anyway.

Once this happens, the Wi-Fi interfaces on the unit will cease to function normally. Even if the uplink connection is reestablished, the AP side of the interface will be down or stuck in a restart loop, preventing any wireless devices from connecting to the network. The MT300N will remain in this state and continue to malfunction until human help arrives.

Correct operation can be restored with a reboot, or by bringing the Wi-Fi interfaces down, then up again. Doing either of these manually is tedious, slow and quickly frustrates whoever has to go upstairs and do it, particularly if it happens more than once in a given day. But we can automate ourselves out of this with very little effort.

We just need three ingredients: a way to reset the Wi-Fi interfaces to a working state programmatically, a way to check if the uplink connection is healthy, and some shell glue.

Luckily, on OpenWrt, running

/sbin/wifi

(just like that, no parameters) has the exact effect we want on the Wi-Fi interfaces. It fully reconfigures them, restoring both the uplink and internal Wi-Fi networks to a working state.

With that taken care of, the easiest way to determine if the interfaces need to be reconfigured is to ping the uplink router with a relatively short timeout:

# send one packet, wait at most 2 seconds:
/bin/ping -c 1 -W 2 "$uplink_ipaddr"

In a quiet network like this one, if a reply does not arrive in a couple of seconds we can be almost one hundred percent sure it won't arrive at all.
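
What matters here is ping's exit status, not its output: zero means a reply arrived in time, anything else means it didn't. As a minimal sketch, the check can be wrapped in a small function. The names uplink_ok and PING_CMD are made up for this example; PING_CMD is only there so the function can be dry-run without touching the network.

```shell
#!/bin/sh

# Hypothetical helper: succeeds (exit status 0) if the address in $1
# answers a single ping within 2 seconds, fails otherwise.
# PING_CMD is an assumed override hook; it defaults to the same
# /bin/ping invocation used in this post.
uplink_ok() {
  ${PING_CMD:-/bin/ping} -c 1 -W 2 "$1" >/dev/null 2>&1
}
```

It would then be used like any other command, for example: if uplink_ok "$uplink_ipaddr"; then echo up; else echo down; fi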

And now we just need that bit of shell glue, to perform the health checks a few times every minute and act when the malfunction is detected:

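#!/bin/sh

# IP address of the uplink (ISP) router
uplink_ipaddr="change this"

while true; do
  # start every round with a clean failure counter
  failcount=0

  echo failcount $failcount

  echo pinging...

  # send one packet, wait at most 2 seconds; loop only while the ping fails
  until /bin/ping -c 1 -W 2 "$uplink_ipaddr" ; do
    failcount=$((failcount+1))

    echo failcount $failcount

    # fourth consecutive failure: reconfigure all Wi-Fi interfaces
    if [ "$failcount" -ge 4 ]; then
      /sbin/wifi

      # give the interfaces time to come back up
      sleep 30
      break
    fi

    echo waiting...

    sleep 8
  done

  echo waiting...

  sleep 10
done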

For those who can't read shell, here's how this works:

  1. Reset the failure counter.
  2. Try to reach the uplink router.
  3. If 2) succeeds, skip the failure count loop. Wait 10 seconds and go to step 1).
  4. If 2) fails, increment the failure counter. Wait 10 (2+8) seconds and repeat step 2).
  5. On the fourth consecutive failure, or about 30 seconds after the connection was first detected to be lost, reconfigure all wireless interfaces. Pause the script for another 40 (30+10) seconds to prevent false alarms while the interfaces restart. Then go back to step 1).

This sequence repeats forever. We don't act on the first failure to reach the uplink router because that can be a result of random packet loss, in which case the probability that the next health checks will fail for the same reason is essentially zero (again, this is a quiet network). Obviously, the maximum number of failed checks and all the dead wait times in the script are subject to adjustment according to taste. For the sake of clarity: I made no attempt to optimize any of those values.
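
To make that adjustment less fiddly, the magic numbers can be pulled out into variables. What follows is only a sketch of the same logic, not a tested replacement for the script above. Every name in it (MAX_FAILS, RETRY_WAIT, SETTLE_WAIT, CHECK_CMD, RESET_CMD, check_cycle) is invented for this example, and the defaults simply mirror the values already used in this post.

```shell
#!/bin/sh

# Illustrative tunables; none of these names come from OpenWrt.
MAX_FAILS="${MAX_FAILS:-4}"       # consecutive failed checks before acting
RETRY_WAIT="${RETRY_WAIT:-8}"     # seconds to wait after a failed check
SETTLE_WAIT="${SETTLE_WAIT:-30}"  # seconds to let the interfaces come back
CHECK_CMD="${CHECK_CMD:-/bin/ping -c 1 -W 2}"  # health check, gets the IP as last argument
RESET_CMD="${RESET_CMD:-/sbin/wifi}"           # what to run when the uplink is gone

# One pass of the outer loop for the address in $1.
# Returns 0 if the uplink answered, 1 if the Wi-Fi had to be reconfigured.
check_cycle() {
  failcount=0
  until $CHECK_CMD "$1" >/dev/null 2>&1; do
    failcount=$((failcount + 1))
    if [ "$failcount" -ge "$MAX_FAILS" ]; then
      $RESET_CMD
      sleep "$SETTLE_WAIT"
      return 1
    fi
    sleep "$RETRY_WAIT"
  done
  return 0
}
```

The forever loop then shrinks to: while true; do check_cycle "$uplink_ipaddr"; sleep 10; done. Overriding CHECK_CMD and RESET_CMD with harmless commands also makes it possible to try the logic on a machine that is not the router.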

Save the script somewhere in the router's file system and make it executable:

chmod +x /root/wifi-auto-restart.sh

Finally, add a line like the following to /etc/rc.local so that the script runs automatically when the router starts. This can be done from the shell or via LuCI in System -> Startup:

# file: /etc/rc.local

# redirect stdout first, then point stderr at it, so both are discarded
/root/wifi-auto-restart.sh >/dev/null 2>&1 &

exit 0

At this point, we can either reboot the MT300N, or run the same line manually from the command line to reap the fruits of automation:

/root/wifi-auto-restart.sh >/dev/null 2>&1 &
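
With the script's output going to /dev/null, there is no record of when (or how often) it had to step in. If that matters, one option is to write a line to the system log right before reconfiguring the interfaces. The sketch below assumes the logger utility is available on the router; the function name restart_wifi_logged and the LOGGER_CMD and WIFI_CMD override hooks are hypothetical and exist only to keep the example testable off the router.

```shell
#!/bin/sh

# Hypothetical wrapper: leave a trace in the system log, then
# reconfigure the Wi-Fi interfaces exactly as the script does.
# LOGGER_CMD and WIFI_CMD are assumed override hooks for dry runs.
restart_wifi_logged() {
  ${LOGGER_CMD:-logger} -t wifi-auto-restart "uplink unreachable, reconfiguring Wi-Fi"
  ${WIFI_CMD:-/sbin/wifi}
}
```

Calling restart_wifi_logged in place of the bare /sbin/wifi line in the script should make each restart show up in the output of logread, tagged wifi-auto-restart.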

Note that since this is a workaround and not an actual fix for the underlying cause of the problem, the wireless devices in the internal network will still get disconnected for a minute or so when the hangup happens. The difference from the previous state of affairs is that now they'll reconnect automagically once the router's Wi-Fi interfaces are up again.

And we're done, that's all there is to it. With minimal counseling, the MT300N is now a big independent boy and will look after itself without needing anyone's help.