Asterisk clusters with the foneBRIDGE2

At work we have an Asterisk cluster made up of two ProLiant servers and a Redfone foneBRIDGE2 that handles the ISDN lines. The heartbeat daemon is installed on both servers and monitors them; if the master fails, it switches the service to the backup server, migrating the main IP and starting all the needed daemons. I’ll briefly explain the whole setup here as a reference.

Overview

As I’ve said, we have two Asterisk servers, asterisk00.example.com and asterisk01.example.com, the former being the master. Each has its own IP address (say, 10.10.10.1 and 10.10.10.2), and there’s an additional “virtual” address (10.10.10.3) that “jumps” from one server to the other if the primary crashes.

Our foneBRIDGE2 is a quad model, but we only use two ISDN lines: one to our telco and the other to a legacy PBX. Besides the ISDN interfaces, the foneBRIDGE (FB from here on) has two Ethernet ports to connect it to the servers, but only the first one accepts configuration commands to set up the FB, switch servers, etc. You’d usually put a switch on that interface so that every server can reach it and configure the FB, but my boss saw the switch as a single point of failure (SPOF) and refused to use one. It’s an opinion I don’t share, as doing without the switch has its own drawbacks, as we’ll see. So our setup is a bit unusual: asterisk01 is connected to the primary FB interface and asterisk00 to the secondary one. The logic is that asterisk00 will be running 99% of the time, and if it crashes, asterisk01 has to reconfigure the FB, so asterisk01 needs access to the config port. Of course, now asterisk01 is a SPOF: if our backup server goes down for any reason, we risk losing control of the FB, rendering our cluster unusable!

We use the FreePBX web GUI, which in turn uses a MySQL DB to store all the settings. If you don’t use it, you can skip all instructions referring to MySQL and Apache.

MySQL synchronization

MySQL’s native NDB clustering is quite useful here. Set it up, keep the service running at all times on both nodes, and the DB system automatically handles the synchronization across the cluster.

Setting up a MySQL cluster is outside the scope of this document; check the official docs or look for a howto on Google. :)

Filesystem synchronization

All of Asterisk’s config files, libraries, modules, the users’ voicemail dirs… need to be synchronized over the cluster’s nodes. There are several alternatives here:

  • A SAN. Expensive but convenient. We don’t have one so it’s out of the question. :)
  • DRBD. If you don’t know it, think of it as a partition-level RAID1 system over the network. It works great and we use it on several other clusters, but not here. DRBD’s only drawback is that (in a standard primary/secondary setup) the synchronized partition can’t be mounted on both servers at once, so you can access the files only on the active node. We wanted to have everything accessible on both servers so that we could use the backup one as a testing ground for new configurations, software upgrades, etc., so DRBD wasn’t an option.
  • csync2. It’s like rsync on steroids. Similar to unison, but can synchronize files over more than two nodes. We’re using it for our Asterisk cluster.

Our csync2.cfg file looks like this:

group asterisk
{
host asterisk00.example.com asterisk01.example.com;

key /etc/csync2.key_asterisk;

backup-directory /var/backups/csync2;
backup-generations 10;

auto none;

exclude  *~ .* ok lock control;
include /etc/csync2.cfg;

include /etc/hosts;
include /etc/ha.d/ha.cf;
include /etc/ha.d/haresources;

include /etc/asterisk;
include /etc/redfone*;

include /var/www;
include /var/lib/asterisk;
include /var/spool/asterisk;
include /usr/lib/asterisk;
include /etc/amportal.conf;
include /var/log/asterisk;
}

We run the synchronization every five minutes. There’s no need to sync more frequently, as the configuration rarely changes (it’s a stable system, with maybe a new phone added every few weeks) and we seldom use the voicemail. The synchronization is launched from /etc/cron.d/FB-csync2:

*/5 * * * * root [ -f /tmp/.FB-master ] && /usr/sbin/csync2 -xv

This /tmp/.FB-master file is just a “flag” that marks the master server, so that the synchronization only runs there. In the heartbeat section we’ll see how and when this file is created.
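
The same guard can be wrapped in a small helper if you want to reuse it from other scripts. This is only a minimal sketch: run_if_master is a hypothetical name, and the only thing taken from our setup is the flag file path.

```shell
#!/bin/sh
# Sketch: run a command only on the node that holds the master flag.
# FLAG defaults to the flag file used in this article;
# run_if_master is a hypothetical helper name.
FLAG="${FLAG:-/tmp/.FB-master}"

run_if_master() {
    # On the backup node (no flag file) do nothing and return 0,
    # so callers that check the exit status don't see a failure.
    [ -f "$FLAG" ] || return 0
    "$@"
}

# Example: what the cron job could call instead of the inline test
# run_if_master /usr/sbin/csync2 -xv
```

Unlike the inline “[ -f … ] &&” test in the cron line, the helper returns 0 on the backup node, which is handy if something else checks the exit status.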

fonulator

fonulator is Redfone’s utility to configure the foneBRIDGE. As I’ve explained before, only asterisk01 (the backup system) can configure the FB in our setup, and each server is connected to a different ethernet port on the FB. So in the event of a crash, we need to change the destination server AND the interface used to send it the TDMoE frames.

To this end, we have two different redfone.conf files (redfone_asterisk00.conf and redfone_asterisk01.conf). They look the same except for the “serverX” and “fbX” directives on the spans:

[globals]
fb1=00:50:C2:65:D0:68
fb2=00:50:C2:65:D0:69

# asterisk00.example.com
server1=00:80:5A:61:E7:FF
# asterisk01.example.com
server2=00:04:76:11:A3:EC

card=eth1,fb1

# Telco
[span1]
span=1,0,0,ccs,hdb3,crc4
server1
fb2
pri

# Legacy PBX
[span2]
span=2,0,0,ccs,hdb3,crc4
server1
fb2
pri

That was redfone_asterisk00.conf. It instructs the FB to send the ISDN traffic to asterisk00 (server1 here) over the second ethernet interface (fb2). The redfone_asterisk01.conf file uses server2 and fb1.
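
Since the two files differ only in those span directives, one can be derived from the other instead of being maintained by hand. A minimal sketch, assuming the layout shown above (bare “server1” and “fb2” lines inside the spans); swap_spans is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: derive the asterisk01 variant from the asterisk00 file by
# swapping only the bare "server1" and "fb2" lines inside the spans.
# The [globals] entries ("server1=...", "fb2=...") contain "=" and
# therefore don't match the anchored patterns, so they stay untouched.
swap_spans() {
    sed -e 's/^server1$/server2/' -e 's/^fb2$/fb1/' "$1"
}

# Example (paths as used in this article):
# swap_spans /etc/redfone_asterisk00.conf > /etc/redfone_asterisk01.conf
```

Check the result by hand before pointing fonulator at it; the patterns assume there is no trailing whitespace on those lines.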

heartbeat

And now, the final piece that ties the rest together: heartbeat. Our haresources file looks like this:

asterisk00.example.com MailTo::asterisk@example.com::Asterisk 10.10.10.3 FB_fonulator FB_master FB_asterisk apache2

Meaning that:

  • asterisk00.example.com is the master server
  • in the event of a service takeover, send a mail to asterisk@example.com
  • the service’s virtual IP is 10.10.10.3
  • start (stop) the FB_fonulator, FB_master, FB_asterisk and apache2 services (remember to unlink the apache2 link from /etc/rc2.d, we don’t want it to be started at system bootup as heartbeat will handle it)
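
The order of the resources on that line matters: as far as I know, heartbeat in haresources mode starts them left to right on a takeover and stops them right to left on a release. The following sketch only prints that order; RESOURCES and run_resources are hypothetical names for illustration, and the MailTo and IP entries are left out:

```shell
#!/bin/sh
# Sketch of the order in which heartbeat walks the resource list:
# left to right on "start", right to left on "stop".
# Nothing is executed here, the function only prints what would run.
RESOURCES="FB_fonulator FB_master FB_asterisk apache2"

run_resources() {
    action="$1"
    list=""
    for r in $RESOURCES; do
        if [ "$action" = "stop" ]; then
            list="$r $list"    # prepend: reversed order for stop
        else
            list="$list $r"    # append: original order for start
        fi
    done
    for r in $list; do
        echo "$r $action"
    done
}

run_resources start
run_resources stop
```

So on a takeover the FB is reconfigured first and Apache comes up last, and on a release Asterisk is stopped before FB_master does its final sync.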

Now, the scripts. FB_fonulator runs fonulator in order to configure the FB and send the TDMoE traffic to the appropriate server. One important thing here is that, although this script will be run on both servers, it will only have an effect when run from asterisk01 as this is the server on the FB’s config interface:

#!/bin/sh

# Check who I am and who the other host is
THISHOST="$(hostname | cut -d. -f1)"
if [ "$THISHOST" = "asterisk00" ]
then
OTHERHOST="asterisk01"
else
OTHERHOST="asterisk00"
fi

# Bail out if there is no config file
F="/etc/redfone_$THISHOST.conf"
[ ! -f "$F" ] && exit 0
# Guess the appropriate interface card
export ETH="$(grep -E "^card=" "$F" | cut -d= -f2 | cut -d, -f1)"

case "$1" in
start)
echo "Fonulating…"
/usr/local/bin/fonulator -s -t 1 "/etc/redfone_$THISHOST.conf"
;;
stop)
/usr/local/bin/fonulator -s -t 1 "/etc/redfone_$OTHERHOST.conf"
;;
restart|status)
echo "Fonulator $1"
exit 0
;;
esac
exit 0
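
The part that is easy to get wrong is which config file gets loaded: on “start” the FB is pointed at this host, on “stop” at the other one. A minimal sketch of just that decision, assuming the file naming used above (fb_conf_for is a hypothetical helper, not part of the real script):

```shell
#!/bin/sh
# Sketch: which redfone config FB_fonulator loads for a host and action.
# start -> this host's file, stop -> the other host's file.
fb_conf_for() {
    host="$1"
    action="$2"
    case "$host" in
        asterisk00) other="asterisk01" ;;
        *)          other="asterisk00" ;;
    esac
    case "$action" in
        start) echo "/etc/redfone_$host.conf" ;;
        stop)  echo "/etc/redfone_$other.conf" ;;
        *)     return 1 ;;   # restart/status don't touch the FB
    esac
}

fb_conf_for asterisk01 start   # prints /etc/redfone_asterisk01.conf
fb_conf_for asterisk01 stop    # prints /etc/redfone_asterisk00.conf
```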

FB_master creates the /tmp/.FB-master “flag” file we talked about before, and forces a sync both on the start (to make sure both servers have the same data) and on the stop (to sync back to the primary server any changes after a takeover-and-back):

#!/bin/sh

F=/tmp/.FB-master

case "$1" in
start)
touch "$F"
# Activate log rotation
ln -sf /etc/asterisk/asterisk.logrotate /etc/logrotate.d/asterisk
# Force sync of these dirs
csync2 -fr /var/
csync2 -fr /etc/asterisk/
csync2 -xv
;;
stop)
if [ -f "$F" ]
then
# De-activate log rotation
rm -f /etc/logrotate.d/asterisk
# Force a last minute sync to the new master
csync2 -fr /var/
csync2 -fr /etc/asterisk/
csync2 -xv
rm -f "$F"
fi
;;
esac
exit 0

Finally, FB_asterisk starts the Asterisk service. We run Asterisk via daemontools using my scripts available here, so basically what this FB_asterisk script has to do is “svc -u/-d /service/asterisk”:

#!/bin/sh

case "$1" in
start)
echo "Starting Asterisk…"
# Check if Asterisk is already running
if /usr/sbin/asterisk -r -x "quit"
then
echo "Already running"
exit 0
fi
# Just in case…
rm -f /service/*
# Link services and start them up
ln -sf /etc/asterisk/services/asterisk/ /service/asterisk
ln -sf /etc/asterisk/services/fopserver/ /service/fopserver
svc -u /service/*
;;
stop)
echo "Stopping Asterisk …"
svc -d /service/*
rm -f /service/*
;;
restart)
echo "Restarting Asterisk …"
svc -t /service/*
;;
reload)
echo "Reloading Asterisk …"
/usr/sbin/asterisk -r -x "reload"
;;
status)
echo "Checking Asterisk’s status …"
/usr/sbin/asterisk -r -x "quit" && exit 0 || exit 1
;;
esac
exit 0
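
Note that “svc -u” returns right away, so the script above doesn’t know whether Asterisk actually came up. If you need that, a small retry loop can poll the same “asterisk -r -x” check used in the status branch. This is a sketch only: wait_until is a hypothetical helper, and the number of tries and the delay are made-up values.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or we run out of tries.
# Usage: wait_until TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    tries="$1"
    delay="$2"
    shift 2
    while [ "$tries" -gt 0 ]; do
        # Success of the probed command ends the wait immediately
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        # Only sleep if another attempt is coming
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example: wait up to ~10 seconds for Asterisk to answer on its console
# wait_until 10 1 /usr/sbin/asterisk -r -x "quit" || echo "Asterisk did not start"
```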

Download

All the aforementioned scripts and config files are available here. Think of them as a base to make your own Asterisk/foneBRIDGE setup. And feel free to mail me back any improvements, errors you may find, etc.

6 thoughts on “Asterisk clusters with the foneBRIDGE2”

  1. Pingback: SinoLogic » Como configurar un FoneBridge2 (redfone)

  2. Hi,

    I’m trying to build a similar setup, but with an ISDNGuard instead of a RedFone. I’m using csync2 for file synchronization. Even though I followed the configuration guide in the csync2 manual and the advice you give in the post, I have a problem:

    When I run csync2 on one of the nodes I get the following:

    produccion:/etc# csync2 -Tvv
    My hostname is produccion.
    Database-File: /var/lib/csync2/produccion.db
    Config-File: /etc/csync2.cfg
    Running in-sync check for produccion madridcluster.
    Connecting to host madridcluster (SSL) …
    Can’t connect to remote host.
    ERROR: Connection to remote host failed.
    SQL: SELECT command, logfile FROM action GROUP BY command, logfile
    SQL Query finished.
    Finished with 1 errors.

    Even though the other host is reachable:

    MadridCluster:/etc# tcpdump -i eth0 port 30865
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
    18:39:30.758298 IP produccion.33378 > madridcluster.csync2: S 3701654960:3701654960(0) win 5840
    18:39:30.773641 IP madridcluster.csync2 > produccion.33378: R 0:0(0) ack 3701654961 win 0

    2 packets captured
    2 packets received by filter
    0 packets dropped by kernel

    Do you know what could be causing this?

    Thanks and regards,

    Mariña

  3. Hi Mariña

    If the port is open and the service is running, the only thing I can think of is SSL. csync2 uses encryption AND AUTHENTICATION via SSL, so you have to create certificates for both machines and configure them. It’s quite a pain, so I skipped it entirely in the article; otherwise I would have spent more time on this than on everything else.

    Have a look at the csync2 documentation on the subject. In any case, there was a directive to disable SSL completely. I don’t have the docs at hand right now, but I think it was:

    nossl host1 host2
    nossl host2 host1

    If I remember correctly it had to be done like this, stating that SSL won’t be used in either direction; otherwise you’d disable it in one direction but not the other, and it would keep failing.

    Try that, and if it works, it’s then a matter of deciding whether you really need the extra bit of security that encryption and authentication with SSL certificates would give you. The more sugar, the sweeter.

    Regards

  4. In the end the problem was that I hadn’t run csync2 -i on the slave server.

    Now it does try to synchronize, but I get errors:

    Updating /etc/asterisk/.svn/entries on madridcluster …
    While syncing file /etc/asterisk/.svn/entries:
    ERROR from peer madridcluster:
    ERROR: Auto-resolving failed. Giving up.
    File stays in dirty state. Try again later…
    Match (+): /etc/asterisk on /etc/asterisk/.svn/format
    Updating /etc/asterisk/.svn/format on madridcluster …
    File is different on peer (cktxt char #0).
    >>> PEER: OK (data_follows).
    >>> LOCAL: v1:mtime=0:mode=33060:uid=5060:gid=1001:type=reg:size=2
    Format-error while receiving data.
    SQL: COMMIT TRANSACTION

    Do you know what could be causing this, or where I can find information about it? I can’t find anything…

    Regards,

    Mariña

  5. If the file has been modified on both machines, csync2 by default washes its hands of it and leaves it to you to decide which of the two is “the good one”.

    From one server you can force its copy to be the one that gets synchronized next time with “csync2 -f FILE”. There are also directives for the configuration file that state which server takes priority when there’s a conflict of this kind: the leftmost one in the list, the rightmost one, the one with the most recent file, etc.

    To be honest, there’s little documentation on csync2. The only thing I’ve found is this PDF on its website. From there on it’s all a matter of wrestling with it and testing, testing, testing:

    http://oss.linbit.com/csync2/paper.pdf

  6. Pingback: Enlace en la portada de Red-Fone | Jompeich d’er Bisente
