Saturday, November 20, 2010

Netscreen - NSRP

HA Setups
There are three main types of HA setup:
  • Active / Passive - All traffic passes through the active node. In the event of failure the backup firewall is activated, and traffic flow is resumed.
  • Active / Active - Both Firewalls share the network load. In the event of failure all traffic is passed through the working node.
  • Active / Active Full Mesh - This setup eliminates any single point of failure. Every link to each node, switch, and router is cabled twice to allow for complete redundancy.
HA Feature Sets

SOHO
This allows you to configure a secondary untrust interface. If the primary link fails, the secondary link becomes active in order to restore connectivity. You can use either the available serial port or an ethernet port for your secondary link, allowing you to connect ADSL modems or routers.
By default you must manually initiate a failover from the CLI.

The various commands are below,

ns5gt-> exec failover force <- manual failover
ns5gt-> exec failover revert <- revert back
ns5gt-> exec failover auto <- enable automatic failover
To allow the link to stabilize there is a default hold-down timer of 30 seconds. If required you can modify this by using the command,

ns5gt-> set failover hold-down [number of seconds]
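
The idea behind a hold-down timer can be sketched in a few lines of Python. This is only an illustration and not how ScreenOS implements it: the `HoldDown` class and its `observe` method are hypothetical names, and the sketch assumes the timer works as a simple debounce, where a change in link state is acted on only once it has lasted for the whole hold-down period.

```python
class HoldDown:
    """Debounce link-state changes (hypothetical helper, not a ScreenOS API)."""

    def __init__(self, hold_down=30.0, initial_up=True):
        self.hold_down = hold_down      # seconds a change must persist
        self.acted_state = initial_up   # link state currently being acted on
        self.pending_since = None       # time the pending change was first seen

    def observe(self, link_up, now):
        """Record a link observation at time `now` (seconds).

        Returns the link state that should be acted on.
        """
        if link_up == self.acted_state:
            # The link flapped back, so any pending change is cancelled.
            self.pending_since = None
        elif self.pending_since is None:
            # First sign of a change: start the hold-down timer.
            self.pending_since = now
        elif now - self.pending_since >= self.hold_down:
            # The change outlasted the hold-down period: act on it.
            self.acted_state = link_up
            self.pending_since = None
        return self.acted_state
```

With a 30 second hold-down, a link that drops and recovers within a few seconds never triggers a failover, which is the stabilizing effect described above.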
SOHO only monitors the link between the Netscreen and the modem or router, so if there is a problem with the ISP service the Netscreen will not fail over.
To allow you to configure this setup (dual untrust) you will need to be using a port mode of "dual-untrust" or "combined".
To confirm which mode you are running use the command `get system`. You can change the mode with the `exec port-mode dual-untrust` command, but be warned that this will erase the entire configuration.
NSRP-Lite
This allows for an Active/Passive setup with configuration synchronisation, but does not provide Run-Time Object synchronisation (discussed later) or an Active/Active setup.

NSRP
NSRP is the protocol that allows clustered Netscreens to communicate with each other and exchange state information, which in turn allows them to make the decisions required to keep traffic flowing in the event of failure.
When NSRP is enabled a VSD (Virtual Security Device) is created, and the configuration of the physical interfaces is applied to VSIs (Virtual Security Interfaces). Each VSD belongs to a VSD group, and a member of each VSD group sits on each firewall. In each VSD group, one VSD is nominated as the master VSD. Only the master VSD (the active firewall) will pass traffic, and the IP addresses assigned to a VSI follow the master VSD. The management IPs stay static on each firewall.


NSRP States
At any one time each VSD can be in one of 6 states.
  1. Master
  2. Primary Backup
  3. Backup
  4. Initial
  5. Ineligible
  6. Inoperable
Initial - Occurs when a VSD is first created due to a reboot or configuration change. While in this state the VSD learns the other devices in the VSD group, syncs its state with the other VSDs, and takes part in the election of the master VSD.
Master or Backup - Each VSD then goes into either a master or a backup state.
Primary backup - If the backup node finds there is no primary backup VSD it sets itself as the Primary Backup for the VSD group. When in this state the VSD can either be promoted to master when the old master disappears, or go into an inoperable state.
Inoperable - The VSD will go into this state if it detects a failure that stops it from passing traffic. When in this state the VSD isn't included in elections.
Ineligible - This is an administratively down state of a VSD, which is set manually with `set nsrp vsd-group id [number] mode ineligible`.
The master VSD is determined as follows,
  • if there is no other VSD then the device wins and becomes active
  • if there are 2 VSDs the device with the lowest priority value wins ( `set nsrp vsd-group id X priority N` )
  • if both devices have the same priority, or it's not set, then the VSD with the lowest MAC address wins.
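
The election rules above can be sketched in Python. This is a minimal illustration rather than the actual NSRP implementation: the `Vsd` class, the `elect_master` function and the example MAC addresses are all made up for this sketch, and it assumes that VSDs in the ineligible or inoperable state are simply left out of the election.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vsd:
    """One member of a VSD group (hypothetical model for illustration)."""
    name: str
    priority: int          # lower value wins the election
    mac: str               # e.g. "00:10:db:00:00:01"
    state: str = "initial"


def mac_key(mac):
    """Turn a MAC address string into an integer so it can be compared."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def elect_master(vsds):
    """Return the VSD that wins the election, or None if nobody can stand."""
    candidates = [v for v in vsds
                  if v.state not in ("ineligible", "inoperable")]
    if not candidates:
        return None
    # Lowest priority value first, lowest MAC address as the tie-breaker.
    return min(candidates, key=lambda v: (v.priority, mac_key(v.mac)))
```

For example, a VSD with priority 50 beats one with priority 100 regardless of MAC address, while two VSDs that both have priority 100 are separated by MAC address alone.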
A failover can be caused by any of the following,
  • Software crashes
  • Hardware or power failure
  • Link failure on monitored interfaces or zones
  • Unavailability of one or more tracked IPs

Cluster Traffic
Two types of packets are exchanged over HA links. These are control messages and data packets.
  • Control messages : Consist of heartbeats, link probes, VSD state information and session synchronizations.
  • Data packets : This is normal user traffic which is passed from one firewall to another. This happens in an Active/Active HA setup.
To check if both devices are in sync run the commands,

ns5gt-> clear db
ns5gt-> exec nsrp sync global-config check-sum
ns5gt-> get db str

NSRP Track IP
Interface Track IP and VPN monitoring are not included with NSRP. NSRP Track IP allows you to fail across your cluster in the event of IPs, such as a router IP, becoming unreachable. This allows for failovers in the event of a Netscreen interface or switch port failing.
If, in the event of failure, you require your traffic to take an alternative route, one configuration option would be to,

  1. Disable the default VSD group
  2. Create a new VSD group but leave out the interfaces that you require to stay local.
  3. Set Track IP to poll an IP address (such as your Router)
In the event of failover this would prevent the failed interface from moving to the other VSD.
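
The polling behaviour behind Track IP can be sketched as follows. This is a simplified Python illustration under stated assumptions and not the ScreenOS logic: the `TrackIp` class, the `max_misses` parameter and the injected `probe` function are hypothetical, and the sketch assumes a failover is triggered once any tracked IP has missed a set number of consecutive probes.

```python
class TrackIp:
    """Poll tracked IPs and decide when to fail over (hypothetical sketch)."""

    def __init__(self, addresses, probe, max_misses=3):
        self.probe = probe              # callable: address -> True if reachable
        self.max_misses = max_misses    # consecutive misses before failover
        self.misses = {addr: 0 for addr in addresses}

    def poll(self):
        """Probe every tracked IP once.

        Returns True if a failover should be triggered.
        """
        for addr in self.misses:
            if self.probe(addr):
                self.misses[addr] = 0       # a reply resets the counter
            else:
                self.misses[addr] += 1
        return any(n >= self.max_misses for n in self.misses.values())
```

Passing the probe in as a function keeps the sketch testable without sending real pings. In practice the probe would be whatever reachability check fits your environment.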
RTO Mirroring
Run-Time Object (RTO) mirroring allows dynamic information, such as DHCP leases and VPN sessions, to be synchronized between the cluster nodes.
To enable RTO use the following commands,
ns5gt-> set nsrp cluster id 1
ns5gt-> set nsrp rto-mirror sync
With some insecure protocols you may wish to prevent sessions created by a certain policy from being mirrored when dealing with DoS attacks. To change this,
  1. Go into the policy
  2. Select "Advanced"
  3. Deselect "HA Session Backup" and click "Return".
  4. Click OK

Split Brain
Split Brain is a situation where the HA link fails, so each device believes the other has failed and promotes itself to master.
There are three methods by which you can prevent this situation from arising,
  1. Dual HA links.
  2. Connect the HA links directly using crossover cables.
  3. Add a secondary path for the HA link. This uses an existing traffic interface and is enabled via the command below.

ns5gt-> set nsrp secondary-path eth1 <- adding a secondary path
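
Why a second HA path helps can be shown with a tiny Python sketch. It is an illustration only: the function name and the path names are hypothetical, and it assumes both firewalls are healthy and that a node treats its peer as dead only when heartbeats are lost on every HA path.

```python
def roles_after_link_failure(paths_up):
    """Return the roles of two healthy nodes as (original master, original backup).

    `paths_up` maps an HA path name to True if that path still carries
    heartbeats.
    """
    if any(paths_up.values()):
        # At least one path still carries heartbeats, so the backup
        # can see that the master is alive and stays a backup.
        return ("master", "backup")
    # No heartbeats at all: the backup assumes the master is dead and
    # promotes itself, while the master carries on. This is split brain.
    return ("master", "master")
```

With a single HA link, `roles_after_link_failure({"ha1": False})` ends with two masters. With a second link or a secondary path that is still up, the backup stays a backup.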

"No Brain" Situation
In this situation both switches/switch ports fail. Both firewalls may be plugged into the same switch, or into different switches which both fail due to a power failure etc. Each firewall places itself into an inoperable state and then into a backup state, leaving both firewalls in a backup state.
To ensure that one device is always master you can use the command,
ns5gt-> set nsrp vsd-group master-always-exist
The main issue with this occurs in a situation where both switches/switch ports fail for one network (e.g. trust) and then a switch/switch port fails on the active node. In this case the cluster will not fail across to the secondary node even though it is the best candidate for master.
