Best Practices | March 5, 2026 | 25 min read

WiFi Resilience Guide: Keep Your Network Running During Cloud Maintenance

Your access points already have built-in resilience features that most organizations never configure. This guide covers session timeouts, PMK caching, 802.11r, and vendor-specific failover settings for Meraki, Aruba, Ruckus, UniFi, Cisco WLC, and Juniper Mist — so your WiFi keeps working even when the RADIUS server is briefly unreachable.

To make your WiFi network resilient against RADIUS server unavailability, configure three things: (1) set Session-Timeout to 8+ hours so connected users rarely need re-authentication, (2) enable PMK caching and 802.11r so roaming skips RADIUS entirely, and (3) configure a secondary RADIUS server in a different geographic region. Most enterprise APs also have vendor-specific fallback features (Meraki's "Allow" failover policy, Aruba's cached-reauth, and Cisco WLC's Local EAP) that keep previously authenticated users connected even during extended outages.

Cloud RADIUS sits in the authentication path of your WiFi network. When a new device connects, the access point sends its credentials to the RADIUS server for validation. But here is the key insight that changes everything: RADIUS is only contacted during initial authentication and periodic re-authentication. A user who authenticated at 9 AM does not need RADIUS again until their session timeout expires.

This means that with the right configuration, a RADIUS server could be unreachable for hours and most of your users would never notice. The goal of this guide is to help you configure every available resilience mechanism so that your WiFi network can tolerate any brief cloud maintenance window without user impact.

Core Resilience Concepts

Before diving into vendor-specific settings, it helps to understand the four mechanisms that reduce your network's dependency on RADIUS availability.

Session Timeout vs Idle Timeout

Session-Timeout controls how long a user stays authenticated before the AP forces a re-authentication. With a default of 3600 seconds (1 hour), your AP contacts RADIUS 8-10 times per user per business day. Increase this to 28800 seconds (8 hours) and the AP only contacts RADIUS once per user per day — typically at the start of the workday.

Idle-Timeout disconnects users who stop sending traffic. A reasonable value is 3600 seconds (1 hour). This frees up resources without creating unnecessary RADIUS load.

The Session Timeout Multiplier

Every doubling of your Session-Timeout halves how often each connected user depends on RADIUS. Moving from 1 hour to 8 hours means 8x fewer re-authentication requests and an 8x longer window in which a RADIUS outage goes unnoticed by already-connected users. For most enterprise networks, 28800 seconds (8 hours) is the sweet spot: one authentication per business day.
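The arithmetic behind the multiplier can be sketched in a few lines of Python. This is an illustration only: the function names and the 8-hour workday default are assumptions made for the example, not part of any IronWiFi or AP vendor API.

```python
import math
from datetime import datetime, timedelta


def reauths_per_day(session_timeout_s: int, workday_hours: float = 8.0) -> int:
    """Approximate RADIUS authentications one user triggers per workday.

    Counts the initial login plus every forced re-authentication that
    falls inside the workday (assumed 8 hours unless overridden).
    """
    workday_s = workday_hours * 3600
    return max(1, math.ceil(workday_s / session_timeout_s))


def next_reauth(login: datetime, session_timeout_s: int) -> datetime:
    """Time at which the AP will next need RADIUS for this user."""
    return login + timedelta(seconds=session_timeout_s)


if __name__ == "__main__":
    for timeout in (3600, 7200, 14400, 28800):
        print(timeout, "s ->", reauths_per_day(timeout), "auth(s) per workday")
    # A 9 AM login with an 8-hour timeout is not re-checked until 5 PM.
    print(next_reauth(datetime(2026, 3, 5, 9, 0), 28800))
```

Under these assumptions a 1-hour timeout yields 8 authentications per 8-hour workday and an 8-hour timeout yields 1, which matches the figures above.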

PMK/OKC Caching

After a successful 802.1X authentication, the AP and client derive a Pairwise Master Key (PMK). With PMK caching enabled, this key is stored and reused when the client roams to another AP or reconnects. The client presents the cached PMK ID, and the AP accepts it without contacting RADIUS at all.

Opportunistic Key Caching (OKC) extends this further — the original AP shares the PMK with neighboring APs so that even a first-time roam to a new AP can skip RADIUS. Most enterprise APs enable PMK caching by default, but it is worth verifying.
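To make the PMK ID concrete, here is a minimal sketch of how a PMKID is computed for the classic WPA2-Enterprise key management suite: the first 128 bits of HMAC-SHA1 over the string "PMK Name", the AP's MAC address, and the client's MAC address, keyed with the PMK. The PmkCache class is a hypothetical stand-in for an AP's cache, written only to show the lookup; SHA-256 based suites use a different hash, and real APs also enforce cache lifetimes.

```python
import hashlib
import hmac
from typing import Optional


def pmkid(pmk: bytes, ap_mac: bytes, client_mac: bytes) -> bytes:
    """PMKID = first 16 bytes of HMAC-SHA1(PMK, "PMK Name" || AP MAC || client MAC)."""
    if len(ap_mac) != 6 or len(client_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    digest = hmac.new(pmk, b"PMK Name" + ap_mac + client_mac, hashlib.sha1).digest()
    return digest[:16]


class PmkCache:
    """Hypothetical AP-side cache: maps a PMKID to the PMK it identifies."""

    def __init__(self) -> None:
        self._entries: dict = {}

    def store(self, pmk: bytes, ap_mac: bytes, client_mac: bytes) -> None:
        self._entries[pmkid(pmk, ap_mac, client_mac)] = pmk

    def lookup(self, presented_pmkid: bytes) -> Optional[bytes]:
        # A hit means the 4-way handshake can start without contacting RADIUS.
        return self._entries.get(presented_pmkid)


if __name__ == "__main__":
    pmk = bytes(32)  # placeholder key for illustration only
    ap = bytes.fromhex("020000000001")
    sta = bytes.fromhex("0200000000aa")
    cache = PmkCache()
    cache.store(pmk, ap, sta)
    print("cache hit:", cache.lookup(pmkid(pmk, ap, sta)) is not None)
```

Because the AP's MAC address is part of the input, the same PMK produces a different PMKID on every AP. That is why plain PMK caching only helps when a client returns to an AP it has already used, while OKC lets a neighboring AP that holds the shared PMK recognize the PMKID the client computes for it.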

802.11r Fast Transition

802.11r (Fast BSS Transition) is an IEEE standard that pre-negotiates security keys during roaming. Instead of a full 802.1X exchange (which takes 300-700ms and requires RADIUS), a device roams in under 50ms using pre-distributed keys. This is particularly important for Apple devices (iPhones, iPads, MacBooks), which strongly prefer 802.11r networks.

When 802.11r is enabled, roaming within the same mobility domain does not contact RADIUS; it is handled entirely between the client and the AP infrastructure.

RADIUS Failover

Every enterprise AP supports configuring a primary and secondary RADIUS server. When the primary becomes unreachable, the AP automatically fails over to the secondary. The key parameters are:

Authentication timeout — how long the AP waits for a RADIUS response (typically 3-10 seconds)
Retry count — how many times the AP retries before marking the server dead (typically 1-3)
Dead-time — how long a failed server stays marked as dead before the AP tries it again (typically 5-30 minutes)

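As an approximation, failover_time = timeout × retries. With a 5-second timeout and 3 retries, failover takes about 15 seconds. Some platforms count retries in addition to the initial attempt, which adds one more timeout interval. We recommend keeping total failover time under 20 seconds.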
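The formula is easy to check against a planned configuration. The sketch below is a small helper written for this guide; the function names and the count_initial_attempt flag are assumptions. The flag models platforms that send one initial request and then the configured number of retransmissions, so it is worth checking which convention your vendor uses.

```python
def failover_time(timeout_s: float, retries: int,
                  count_initial_attempt: bool = False) -> float:
    """Worst-case seconds before the AP gives up on the primary server.

    Default follows the guide's formula: timeout x retries.
    With count_initial_attempt=True, the first request is counted
    separately, giving timeout x (retries + 1).
    """
    if timeout_s <= 0 or retries < 0:
        raise ValueError("timeout must be positive and retries non-negative")
    attempts = retries + 1 if count_initial_attempt else retries
    return timeout_s * attempts


def within_budget(timeout_s: float, retries: int, budget_s: float = 20.0,
                  count_initial_attempt: bool = False) -> bool:
    """True if failover completes in under the recommended budget."""
    return failover_time(timeout_s, retries, count_initial_attempt) < budget_s


if __name__ == "__main__":
    print(failover_time(5, 3))                 # the guide's example
    print(within_budget(5, 3))                 # under the 20-second target
    print(within_budget(10, 3))                # too slow
    print(failover_time(5, 3, count_initial_attempt=True))
```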

IronWiFi Multi-Region Setup

IronWiFi operates RADIUS servers in multiple geographic regions. Configure your closest region as the primary server and a different region as the secondary. Both share the same user database and policies, so failover is seamless. You can find your region-specific server IPs in the IronWiFi Console under Networks.

Vendor-Specific Configuration

Each AP vendor implements resilience features differently. Jump to your vendor below for specific configuration steps.

Cisco Meraki | Aruba (HPE) | Ruckus | Ubiquiti UniFi | Cisco WLC / C9800 | Juniper Mist

Cisco Meraki (Excellent Resilience)

Meraki has one of the strongest resilience stories of any cloud-managed AP. Its Failover Policy feature is the standout — it can grant temporary network access to new clients even when RADIUS is completely unreachable.

1. Set Failover Policy to "Allow"

In the Meraki Dashboard, navigate to Wireless > Configure > Access control for your WPA2-Enterprise SSID. Under RADIUS settings, set the RADIUS Failover Policy to "Allow". When all configured RADIUS servers are unreachable, Meraki will grant new clients a 1-hour temporary session instead of blocking them.

2. Configure Multiple RADIUS Servers

Add 2-3 RADIUS servers using IronWiFi endpoints from different regions. Set the RADIUS load balancing policy to "Strict priority order" so the closest server is always tried first.

3. Enable RADIUS Testing

Under the RADIUS server configuration, enable RADIUS testing for proactive health monitoring. Meraki will periodically send test authentication requests and mark servers as dead before real users are affected.

4. Set Session-Timeout

In IronWiFi, set the Session-Timeout attribute to 28800 (8 hours). This is returned in the RADIUS Access-Accept and Meraki respects it for session timing.

Meraki "Allow" Policy Caveat

The "Allow" failover policy grants a temporary 1-hour session with the default VLAN. This means new users get network access but without any custom VLAN assignment or group policies. For most environments, this is far better than a complete outage. After RADIUS recovers, the next re-authentication applies the correct policies.

Aruba (HPE) (Excellent Resilience)

Aruba's standout resilience feature is cached-reauth, which allows previously authenticated clients to re-authenticate from the local cache even when all RADIUS servers are unreachable. This is the strongest vendor-specific resilience mechanism available.

1. Enable Cached Re-authentication

On ArubaOS controllers or Aruba Central:

    # Enable cached re-authentication on your AAA profile
    aaa authentication dot1x <profile-name>
      cached-reauth enable
      cached-reauth-period 86400

The cached-reauth-period of 86400 seconds (24 hours) means any client who authenticated within the last 24 hours can re-authenticate locally without RADIUS.
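The decision this setting implies can be sketched as follows. This is an illustration of the rule described above, not Aruba's implementation; the function and parameter names are invented for the example.

```python
def can_use_cached_reauth(last_auth_epoch: float, now_epoch: float,
                          radius_reachable: bool,
                          cached_reauth_period_s: int = 86400) -> bool:
    """Return True if a client may re-authenticate from the local cache.

    Assumed rule: the cache is only a fallback (RADIUS must be unreachable),
    and the client's last successful authentication must fall within the
    cached-reauth period.
    """
    if radius_reachable:
        return False  # normal path: ask RADIUS
    age_s = now_epoch - last_auth_epoch
    return 0 <= age_s <= cached_reauth_period_s


if __name__ == "__main__":
    now = 1_000_000.0
    print(can_use_cached_reauth(now - 3600, now, radius_reachable=False))       # authenticated 1 hour ago
    print(can_use_cached_reauth(now - 2 * 86400, now, radius_reachable=False))  # authenticated 2 days ago
```

A longer period covers more returning clients during an outage, at the cost of a longer window in which a revoked account could still be admitted from the cache.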

2. Configure Server Tracking and Dead-Time

    # Configure RADIUS server group with failover
    aaa server-group <group-name>
      auth-server <primary-server> position 1
      auth-server <secondary-server> position 2
      set dead-time 10

The dead-time of 10 minutes means a failed server is not retried for 10 minutes, preventing repeated timeout delays.

3. Set Server Response Timeout

    # Tune RADIUS timeout and retries
    aaa authentication-server radius <server-name>
      timeout 5
      retransmit 3

With 5-second timeout and 3 retransmissions, failover to the secondary server happens within 15 seconds.

Ruckus (CommScope) (Good Resilience)

Ruckus offers a practical fallback mechanism through its auth-timeout-action setting. When RADIUS is unreachable, you can configure the AP to either deny access or place users on a default VLAN with limited connectivity.

1. Set Auth-Timeout-Action

In SmartZone or Ruckus controller:

    # Allow clients onto a default VLAN when RADIUS times out
    set auth-timeout-action success
    set auth-default-vlan 100

When RADIUS is unreachable, new clients are placed on VLAN 100 (configure this as a restricted-but-functional VLAN with internet access).

2. Configure MAC Auth as Fallback

    # Set authentication order: try 802.1X first, fall back to MAC auth
    set auth-order dot1x mac-auth

If 802.1X fails (including RADIUS timeout), the AP tries MAC-based authentication as a fallback.

3. Set Grace Period

Configure a grace period of 5-10 minutes in the WLAN settings. During this window, clients that lose their session can reconnect without full re-authentication.

4. Multiple RADIUS Servers

Add primary and secondary RADIUS servers in the AAA configuration. Ruckus automatically fails over to the secondary when the primary stops responding.

Ubiquiti UniFi (Limited Resilience)

UniFi has the most limited resilience options of the major AP vendors. There is no "allow" fallback policy and no local credential caching. Your primary defense is long session timeouts and a properly configured secondary RADIUS server.

1. Always Configure a Secondary RADIUS Server

In the UniFi Controller or UniFi Network app, navigate to Settings > WiFi > [SSID] > Advanced. Add both a primary and secondary RADIUS server using IronWiFi endpoints from different regions. This is your only automatic failover mechanism.

2. Set Long Session Timeouts

In IronWiFi, set Session-Timeout to 43200 (12 hours) or even 86400 (24 hours) for UniFi deployments. Since UniFi has no fallback mechanism, longer timeouts are your primary resilience lever.

3. Use Geographically Diverse Servers

Ensure your primary and secondary RADIUS servers are in different IronWiFi regions (e.g., US-East primary and EU-West secondary). This protects against regional outages.

UniFi Resilience Limitation

If both RADIUS servers are unreachable, new UniFi clients cannot authenticate — there is no bypass. Existing authenticated clients remain connected until their session expires. For mission-critical UniFi deployments, consider pairing with a local RADIUS proxy that caches credentials.

Cisco WLC / Catalyst 9800 (Excellent Resilience)

Cisco's enterprise wireless controllers have the deepest set of resilience features. Local EAP can serve as a complete fallback authentication server, and FlexConnect enables branch offices to authenticate locally when the WAN link to the controller is down.

1. Configure Dead-Criteria and Deadtime

    # Set RADIUS dead-server detection (Catalyst 9800)
    radius-server dead-criteria time 10 tries 3
    radius-server deadtime 15

The controller marks a server as dead after 3 failed attempts within 10 seconds, then waits 15 minutes before retrying it.
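The behavior described here (a failure threshold inside a time window, followed by a hold-down period) can be modelled in a few lines. The class below is a simplified sketch of that description with invented names. It is not Cisco's algorithm, whose exact dead-criteria semantics are defined in the IOS-XE documentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DeadServerTracker:
    """Toy model: N timeouts inside a window mark the server dead for a while."""

    criteria_time_s: float = 10.0   # window, like "dead-criteria time 10"
    criteria_tries: int = 3         # threshold, like "tries 3"
    deadtime_s: float = 15 * 60     # hold-down, like "deadtime 15" (minutes)
    _failures: List[float] = field(default_factory=list)
    _dead_until: float = 0.0

    def record_timeout(self, now: float) -> None:
        self._failures.append(now)
        # Keep only the timeouts that fall inside the sliding window.
        self._failures = [t for t in self._failures
                          if now - t <= self.criteria_time_s]
        if len(self._failures) >= self.criteria_tries:
            self._dead_until = now + self.deadtime_s
            self._failures.clear()

    def record_success(self) -> None:
        self._failures.clear()

    def is_dead(self, now: float) -> bool:
        # While dead, requests should go straight to the secondary server.
        return now < self._dead_until


if __name__ == "__main__":
    tracker = DeadServerTracker()
    for t in (0.0, 4.0, 8.0):
        tracker.record_timeout(t)
    print("dead at t=8s:", tracker.is_dead(8.0))
    print("dead at t=16min:", tracker.is_dead(16 * 60))
```

The hold-down period is what keeps users from paying the full timeout on every login attempt while the primary server is down.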

2. Disable Aggressive Failover

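    # AireOS WLC: prevent premature failover that causes auth loops
    config radius aggressive-failover disable

With aggressive failover disabled, an AireOS controller fails over only after three consecutive clients fail to get a response from the RADIUS server, not after a single client times out. On the Catalyst 9800, the dead-criteria settings from step 1 determine when a server is marked dead. Note that the IOS-XE command radius-server attribute 6 on-for-login-auth only controls whether the Service-Type attribute is included in Access-Request packets; it does not change failover behavior.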

3. Configure Local EAP as Fallback

    # Create a Local EAP profile
    eap profile LOCAL-EAP-FALLBACK
      method peap
      method tls
    # Add local to AAA method list as last resort
    aaa authentication dot1x default group radius local

When all RADIUS servers are unreachable, the WLC authenticates users against a local credential store. Pre-populate this with critical users or use certificate-based authentication.

4. FlexConnect for Branch Offices

For branch office deployments, enable FlexConnect with local authentication. APs cache credentials and authenticate locally even when the WAN link to the central controller is down.

5. Active Fallback Testing

    # Enable active probing to detect server recovery
    radius-server dead-criteria time 10 tries 3
    radius-server deadtime 15
    # Use a test user for proactive monitoring
    radius server IRONWIFI-PRIMARY
      automate-tester username radius-test idle-time 5

The controller sends periodic test authentications with a dummy username to detect when the RADIUS server recovers, enabling automatic failback.

Juniper Mist (Good Resilience)

Juniper Mist provides resilience primarily through Mist Edge, which acts as a local RADIUS proxy and can cache authentication state. For organizations without Mist Edge, longer session timeouts and multiple RADIUS servers are the primary mechanisms.

1. Deploy Mist Edge as RADIUS Proxy

Mist Edge sits on-premises and proxies RADIUS traffic to the cloud. If the cloud RADIUS server is unreachable, Mist Edge can cache and replay previous authentication decisions for known clients.

2. Configure Multiple Mist Edges

Deploy Mist Edges in an active/passive or active/active failover configuration. If one Mist Edge fails, the other takes over RADIUS proxy duties.

3. Enable Configuration Persistence on APs

Mist APs store their configuration locally. Even if the Mist cloud is unreachable, the APs continue operating with their last-known configuration, including RADIUS server settings and WLAN profiles.

4. Use Longer RADIUS Session Timeouts

Set Session-Timeout to 28800 (8 hours) in IronWiFi. This reduces the frequency of RADIUS re-authentication and extends the window during which users remain connected without RADIUS contact.

IronWiFi Recommended Settings

Configure these settings in the IronWiFi Console for maximum resilience.

Setting	Enterprise (802.1X)	Guest (Captive Portal)
Session-Timeout	28800s (8 hours)	86400s (24 hours)
Idle-Timeout	3600s (1 hour)	3600s (1 hour)
Termination-Action	RADIUS-Request (re-auth, don't disconnect)	RADIUS-Request
Primary RADIUS	Closest regional server	Closest regional server
Secondary RADIUS	Different region server	Different region server

Termination-Action Matters

Always set Termination-Action = RADIUS-Request (value 1) in your IronWiFi RADIUS reply attributes. This tells the AP to attempt re-authentication when the session expires, rather than immediately disconnecting the user. If re-authentication fails because RADIUS is unreachable, the AP's vendor-specific fallback behavior kicks in — which is exactly what you want.
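For readers who want to see what these reply attributes look like on the wire, the sketch below encodes them in the standard RADIUS attribute layout from RFC 2865: one byte of type, one byte of length, then a 4-byte big-endian integer. The attribute numbers (27, 28, 29) come from the RFC; the helper name is made up for this example, and the values are the enterprise recommendations from the table above.

```python
import struct

# Attribute type numbers defined in RFC 2865
SESSION_TIMEOUT = 27
IDLE_TIMEOUT = 28
TERMINATION_ACTION = 29

# Termination-Action values defined in RFC 2865
TERMINATION_DEFAULT = 0
TERMINATION_RADIUS_REQUEST = 1


def encode_int_attr(attr_type: int, value: int) -> bytes:
    """Encode an integer attribute: Type (1) | Length (1, always 6) | Value (4)."""
    return struct.pack("!BBI", attr_type, 6, value)


if __name__ == "__main__":
    reply_attrs = b"".join([
        encode_int_attr(SESSION_TIMEOUT, 28800),                          # 8 hours
        encode_int_attr(IDLE_TIMEOUT, 3600),                              # 1 hour
        encode_int_attr(TERMINATION_ACTION, TERMINATION_RADIUS_REQUEST),  # re-auth, don't disconnect
    ])
    print(reply_attrs.hex())
```

In a packet capture of an Access-Accept, the Session-Timeout attribute for 28800 seconds appears as the bytes 1b 06 00 00 70 80, which is a quick way to confirm that the server is actually returning the value you configured.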

Finding Your Region-Specific Server IPs

1. Log into the IronWiFi Console and navigate to Networks.
2. Select your network. The RADIUS server IP, ports, and shared secret are displayed on the network details page.
3. For your secondary server, create a second network in a different region (or contact support to get the secondary server endpoint for your account).
4. Enter both server IPs in your AP configuration: the closest region as primary, the other as secondary.

Resilience Score Checklist

Use this checklist to assess your network's resilience posture. Each item you configure reduces the impact of any RADIUS server unavailability.

Secondary RADIUS server configured: geographic redundancy with a different IronWiFi region
Session-Timeout ≥ 8 hours: at most one authentication per business day
Vendor-specific fallback enabled: Meraki "Allow", Aruba cached-reauth, Ruckus auth-timeout-action, or Cisco Local EAP
PMK/OKC caching enabled: roaming and reconnection skip RADIUS
802.11r Fast Transition enabled: sub-50ms roaming for Apple devices without RADIUS
RADIUS health monitoring active: proactive dead-server detection, not just timeout-based
Tested failover scenario: simulate a RADIUS outage and verify behavior

Scoring: 7/7 = bulletproof, 5-6 = well-protected, 3-4 = basic resilience, <3 = vulnerable to outages. Most organizations score 2-3 out of the box without configuration changes.
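If you track this checklist across several sites, the scoring rule is simple to automate. The sketch below is one possible way to do it; the dictionary keys are shorthand labels chosen for this example and do not correspond to any IronWiFi setting names.

```python
def resilience_rating(score: int) -> str:
    """Map a checklist score (0-7) to the guide's rating bands."""
    if not 0 <= score <= 7:
        raise ValueError("score must be between 0 and 7")
    if score == 7:
        return "bulletproof"
    if score >= 5:
        return "well-protected"
    if score >= 3:
        return "basic resilience"
    return "vulnerable to outages"


if __name__ == "__main__":
    # Hypothetical site audit: True means the item is configured.
    checklist = {
        "secondary_radius_server": True,
        "session_timeout_8h_or_more": True,
        "vendor_fallback_enabled": False,
        "pmk_okc_caching": True,
        "dot11r_fast_transition": False,
        "radius_health_monitoring": False,
        "failover_tested": False,
    }
    score = sum(checklist.values())
    print(f"{score}/7: {resilience_rating(score)}")
```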

Frequently Asked Questions

What happens to connected users during an IronWiFi maintenance window?

Users who are already connected and authenticated remain connected. RADIUS is only contacted during initial authentication and session re-authentication. If your Session-Timeout is set to 8 hours (28800 seconds), a user who authenticated at 9 AM won't need RADIUS again until 5 PM. With PMK caching and 802.11r enabled, roaming between APs also skips RADIUS entirely.

How do I configure multiple IronWiFi RADIUS servers?

IronWiFi provides RADIUS server endpoints in multiple geographic regions. In your AP configuration, set your closest regional server as the primary RADIUS server and a different region's server as the secondary. For example, use the US-East server as primary and US-West as secondary. Both servers share the same user database and authentication policies, so failover is seamless.

What Session-Timeout should I use?

For enterprise networks with WPA-Enterprise (802.1X), we recommend a Session-Timeout of 28800 seconds (8 hours) — roughly one business day. For guest networks using captive portals, 86400 seconds (24 hours) is appropriate. Longer timeouts reduce RADIUS dependency because clients re-authenticate less frequently, but they also mean access revocation takes longer to take effect.

Does my AP cache credentials locally?

Most enterprise APs cache PMK (Pairwise Master Key) data from successful authentications. This cache allows clients to roam between APs or reconnect without a full RADIUS exchange. Aruba goes further with its "cached-reauth" feature, which lets previously authenticated clients re-authenticate from the local cache even when RADIUS is unreachable. Cisco WLC has Local EAP as a similar fallback mechanism.

How long does RADIUS failover take?

RADIUS failover time depends on three settings: the authentication timeout (how long the AP waits for a response), the retry count (how many times it retries), and the dead-time (how long a failed server stays marked as dead). The formula is: failover_time = timeout × retries. With a 5-second timeout and 3 retries, failover takes 15 seconds. We recommend keeping total failover time under 20 seconds for a good user experience.

Should I enable 802.11r Fast Transition?

Yes, if your environment has Apple devices (iPhones, iPads, MacBooks) or other 802.11r-capable clients. Fast Transition allows devices to roam between APs without a full RADIUS re-authentication, reducing roaming time from hundreds of milliseconds to under 50ms. This also means roaming does not depend on RADIUS availability at all. Enable it alongside PMK caching for maximum resilience.
