I'm using the net-snmp-lvs module to interface LVS statistics to SNMP so I can graph them (I'm using OpenNMS).

I have a virtual HTTP service that is balanced across eight real servers. In testing, everything seemed to work just fine and I got some nice graphs that show the Connection Rate, Packet Rate, and Byte Rate for the virtual service and each of the real servers.

This morning, we attempted a cutover, i.e. we redirected real traffic to the new service. Sadly, our perimeter firewall hit more than 90% CPU, so we had to revert. But in the time that we were live, I noticed that the Connection Rate statistics were missing for both the virtual service and the real servers for the period in which the service was under high load:

[Graphs: LVS Connection Rate, LVS Packet Rate, LVS Byte Rate]

Notice the gap in the Connection Rate graph when the Packet & Byte rate graphs show high values.

I am currently investigating the cause of this issue.

As I mentioned in a previous post, the MySQL RPMs that Percona provides for RHEL/CentOS are not actually compatible with the RHEL/CentOS packages. They use the same package layout as the MySQL-provided RPMs.

Here's how I create my own RPMs that have the same package layout as the RHEL/CentOS packages but with the Percona highperf patchset applied.


OurDelta provides MySQL packages for various platforms, built with assorted performance/feature patchsets.

Sadly, like the Percona builds, the RPM packages for RHEL/CentOS are not upstream-compatible, i.e. they package MySQL differently.

I was planning to rebuild the OurDelta packages to use the upstream RPM package layout, but I've decided to stick with rebuilding the Percona packages as I've already done the work for that.

Anyway, in case it helps someone, here's how to rebuild the OurDelta packages from the SRPM:

rpmbuild --rebuild \
  MySQL-OurDelta-5.0.87.d10-65.el5.src.rpm \
  --define 'ourdelta 1' \
  --define 'mysqlversion 5.0.87' \
  --define 'elversion 5' \
  --define 'patchset d10'

I use Puppet to distribute my sshd configuration, including pre-generated SSH key pairs.

Here's how I bulk create the keys for a bunch of new nodes named b001-b034:

for n in $(seq -w 1 34); do
    # seq -w pads to two digits, so b0$n gives b001..b034
    ssh-keygen -q -t rsa -f b0$n -C '' -N ''
done
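The loop above depends on seq -w padding to exactly two digits, which only holds while the upper bound has two digits. As a minimal sketch, assuming nothing beyond POSIX sh, the padding can be made explicit with printf. The helper name node_names is hypothetical and is not part of the original script.

```shell
# node_names PREFIX FIRST LAST
# Print PREFIX followed by each number from FIRST to LAST, zero-padded to
# three digits (hypothetical helper; pass plain integers such as 1 and 34).
node_names() {
    prefix=$1
    i=$2
    while [ "$i" -le "$3" ]; do
        printf '%s%03d\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

# List the names the key-generation loop would use.
node_names b 1 34
```

The output can then drive the key generation, for example: for n in $(node_names b 1 34); do ssh-keygen -q -t rsa -f $n -C '' -N ''; done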

Having got racadm working on my workstation (see my previous post), the next step is to perform the initial DRAC configuration, i.e. change the root password, set the SSL cert values, etc.

First I checked that all DRACs were pingable:

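for h in $(seq -w 1 34); do
    hn=b0$h    # assumption: the DRAC hostname is derived from the node name; adjust to your naming
    if ping -q -c 1 $hn > /dev/null 2>&1 ; then
        echo OK
    else
        echo failed
    fi
done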
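With 34 devices, a bare OK or failed is hard to match back to a host. The following is a minimal sketch, not the original script, that prints the hostname next to each result and returns a non-zero status if any host failed. The names check_hosts and ping_once are hypothetical, and the probe command is passed in as an argument so that the loop can be tried without touching the network.

```shell
# check_hosts PROBE HOST...
# Run PROBE against each HOST, print "<host> OK" or "<host> failed",
# and return 1 if any probe failed (hypothetical helper).
check_hosts() {
    probe=$1
    shift
    failed=0
    for h in "$@"; do
        if $probe "$h" > /dev/null 2>&1; then
            echo "$h OK"
        else
            echo "$h failed"
            failed=1
        fi
    done
    return "$failed"
}

# Probe that sends a single ping, using the same flags as the loop above.
ping_once() {
    ping -q -c 1 "$1"
}

# Usage (hostnames are placeholders):
#   check_hosts ping_once b001 b002 b003
```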

Next, I created a drac config file (named drac.cfg) containing the settings that are common to all devices:

# cfgUserAdminIndex=2
# cfgRacSecCsrCommonName=
cfgRacSecCsrOrganizationUnit=Web Services
cfgRacSecCsrLocalityName=My City
cfgRacSecCsrStateName=My State

I then ran a script to apply the common configuration to all devices. I also set the device-specific settings in the same script:

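for n in $(seq -w 1 34); do
    host=b0$n                     # assumption: DRAC name derived from the node name
    fullname=$host.example.com    # assumption: placeholder domain, substitute your own
    racadm -r $fullname -u root -p calvin config -g cfgLanNetworking -o cfgDNSRacName $host
    racadm -r $fullname -u root -p calvin config -g cfgRacSecurity -o cfgRacSecCsrCommonName $fullname
    racadm -r $fullname -u root -p calvin config -f drac.cfg
done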
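Before pointing a loop like this at real hardware, it can help to print the commands and read them over first. The sketch below is a dry run only: it echoes the three racadm invocations for one device, using just the options that appear in the script above. The function name drac_commands and the example hostnames are hypothetical.

```shell
# drac_commands HOST FULLNAME
# Print (without running) the racadm commands for one DRAC.
# The config file is applied last because it changes the default password
# that the first two commands still log in with.
drac_commands() {
    host=$1
    fullname=$2
    base="racadm -r $fullname -u root -p calvin"
    echo "$base config -g cfgLanNetworking -o cfgDNSRacName $host"
    echo "$base config -g cfgRacSecurity -o cfgRacSecCsrCommonName $fullname"
    echo "$base config -f drac.cfg"
}

# Preview the commands for a single device (placeholder names).
drac_commands b001 b001.example.com
```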

Notice that I don't change the default password until last.

Now, I just need to work out how to generate the CSR, sign it, and upload the new cert…

I wanted to install racadm on my local (non-Dell) workstation running Fedora 11. I tried installing via the repos as per the instructions, but without success. I eventually managed by downloading manually:

wget http://linux.dell.com/repo/hardware/OMSA_6.1/platform_independent/rh50_64/racadm/mgmtst-racadm-6.1.0-648.i386.rpm
yum localinstall mgmtst-racadm-6.1.0-648.i386.rpm

The next problem was that racadm wouldn't run – it failed with the error:

ERROR: Failed to initialize transport

Running it under strace showed that it seemed to be having problems finding a suitable SSL library – probably because racadm is i686 and I'm running it on x86_64. Using yum provides, I determined that a suitable SSL lib was shipped with Adobe Reader, installed in /opt/Adobe/ on my workstation.

The following couple of symlinks fixed it:

ln -s ../../opt/Adobe/Reader9/Reader/intellinux/lib/libssl.so.0.9.8 /lib/i686/
ln -s ../../opt/Adobe/Reader9/Reader/intellinux/lib/libcrypto.so.0.9.8 /lib/i686/
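An alternative to reading strace output is to ask the dynamic linker which libraries it cannot resolve. This is a minimal sketch, assuming the racadm found on the PATH is the ELF binary itself rather than a wrapper script, and that the library is linked directly rather than loaded at run time (in which case ldd will not list it). The helper name missing_libs is hypothetical; it only filters ldd output read from stdin.

```shell
# missing_libs
# Read ldd output on stdin and print the name of every library that the
# dynamic linker reports as "not found" (hypothetical helper).
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Usage:
#   ldd "$(command -v racadm)" | missing_libs
```

If the symlinks are in place and the library is a direct dependency, the command should print nothing.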

I needed to check the integrity of the file systems on several Xen domU guests while the guests were shut down, i.e. I needed to do it from the dom0.

I use LVM logical volumes, named $host-disk0, as the block devices for the guests' disks. These are stored in a volume group named vg_guests. I use kpartx to access the partitions on the block device.

Each guest disk has a small physical partition for /boot; the rest of the disk is allocated to a 2nd partition which is used as an LVM volume group named vg_$host.

Here's a script I knocked up to do the job:

for host in host1 host2 host3 ; do
    # create devices from the LVs
    kpartx -av /dev/mapper/vg_guests-$host--disk*
    # Activate the VGs for the host
    for vg in $(vgs --noheadings | grep $host | awk '{print $1}' ) ; do
        echo Activating $vg
        vgchange -ay $vg
    done
    # check the file systems (skipping swap)
    for p in $(ls /dev/mapper/vg_$host* | grep -v swap); do
        e2fsck -p $p
    done
    # Deactivate the VGs
    for vg in $(vgs --noheadings | grep $host | awk '{print $1}' ) ; do
        echo De-activating $vg
        vgchange -an $vg
    done
    # Remove the devices
    kpartx -dv /dev/mapper/vg_guests-$host--disk0
done
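The double hyphen in the /dev/mapper paths above is not a typo: device-mapper joins the volume group and logical volume names with a single hyphen, so any hyphen inside either name is doubled to keep the result unambiguous. A minimal sketch of that mapping, with a hypothetical helper named dm_name:

```shell
# dm_name VG LV
# Print the /dev/mapper path for logical volume LV in volume group VG,
# doubling any hyphen inside either name (hypothetical helper).
dm_name() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

# The guest disk for host1 from the script above.
dm_name vg_guests host1-disk0
```

The same logical volume is also reachable as /dev/vg_guests/host1-disk0, which avoids the escaping altogether.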