Me in IT UNIX/Linux Consultancy is based in Utrecht, The Netherlands, and specializes in UNIX and Linux consultancy. Experience with Red Hat Enterprise Linux (Red Hat Certified Architect), the Fedora Project, CentOS, OpenBSD and related Open Source products makes Me in IT UNIX/Linux Consultancy a great partner in implementing, maintaining and upgrading your environment.

Open Source software is an important aspect of any Linux distribution. Me in IT UNIX/Linux Consultancy tries to use Open Source software where possible and tries to share experiences actively. In the articles section you will find many UNIX/Linux adventures shared for others to benefit.

Zabbix triggers with "flap-detection" and a grace period.

Monitoring an environment gives you control, so it's pretty important. But it can be a challenge to set up a monitoring system: it should not alert too quickly, but also not too slowly.

Nagios uses "flap detection" to prevent many ERRORs and OKs being sent right after each other. Zabbix calls this "hysteresis". Zabbix's hysteresis is rather difficult to understand, so I'd like to share some triggers that I have set up for Zabbix that implement both flap detection/hysteresis and grace.

Grace can be defined like this: "Once a value has crossed the threshold that caused the trigger to alert, make sure it is back a little below (or above) that threshold before the trigger recovers." I know; it's not easy to understand... Let's look at some examples.

Values that should stay below a threshold

With values that need to stay below a threshold, like CPU load, number of users logged in or number of processes running:

({TRIGGER.VALUE}=0&{TEMPLATE:CHECK[ITEM].min(300)}>ALERTVALUE)|({TRIGGER.VALUE}=1&{TEMPLATE:CHECK[ITEM].max(300)}>RECOVERYVALUE)

Just to clarify the different parts of the trigger:

  1. {TRIGGER.VALUE} selects which part applies: the part before the | is evaluated when there is no alert ({TRIGGER.VALUE}=0), the part after the | when the trigger is already in alert ({TRIGGER.VALUE}=1).
  2. .min(300) makes sure the alert is only raised when all values were higher than ALERTVALUE for 300 seconds.
  3. The last part (after the |) keeps the trigger in alert for as long as .max(300) is higher than RECOVERYVALUE, so the trigger only recovers once the measured value has been at or below RECOVERYVALUE for 300 seconds.

For example CPU load with an ALERTVALUE of 5 and a RECOVERYVALUE of 4:

({TRIGGER.VALUE}=0&{Template_Linux:system.cpu.load[,avg1].min(300)}>5)|({TRIGGER.VALUE}=1&{Template_Linux:system.cpu.load[,avg1].max(300)}>4)
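To see the hysteresis at work outside Zabbix, here is a minimal sketch of the intended behaviour in plain shell. It only handles whole numbers, and the function name next_state and its arguments are made up for this illustration; it is not something Zabbix provides.

```shell
#!/bin/sh
# next_state PREV MIN MAX ALERT RECOVERY
#   PREV      current trigger value: 0 (OK) or 1 (in alert)
#   MIN, MAX  lowest and highest value seen in the last 300 seconds
# Prints the new trigger value (integers only; a sketch, not Zabbix code).
next_state() {
    prev=$1; min=$2; max=$3; alert=$4; recovery=$5
    if [ "$prev" -eq 0 ] && [ "$min" -gt "$alert" ]; then
        echo 1    # all samples were above ALERT: raise the alert
    elif [ "$prev" -eq 1 ] && [ "$max" -gt "$recovery" ]; then
        echo 1    # still a sample above RECOVERY: stay in alert
    else
        echo 0    # no alert, or recovered
    fi
}
```

With an ALERT of 5 and a RECOVERY of 4, "next_state 0 4 7 5 4" stays at 0 (a spike to 7 with a minimum of 4 is ignored), while "next_state 1 3 5 5 4" stays at 1 until the maximum no longer exceeds 4.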

Values that should stay above a threshold

With values that need to stay above a threshold, like percentage of disk space free, number of inodes free or number of httpd processes running:

({TRIGGER.VALUE}=0&{TEMPLATE:CHECK[ITEM].max(300)}<ALERTVALUE)|({TRIGGER.VALUE}=1&{TEMPLATE:CHECK[ITEM].min(300)}<RECOVERYVALUE)

Here the part after the | keeps the trigger in alert for as long as .min(300) is lower than RECOVERYVALUE.

For example disk space of /var free in percent with an ALERTVALUE of 10 and a RECOVERYVALUE of 11:

({TRIGGER.VALUE}=0&{Template_Linux:vfs.fs.size[/var,pfree].max(300)}<10)|({TRIGGER.VALUE}=1&{Template_Linux:vfs.fs.size[/var,pfree].min(300)}<11)

These rather complex triggers will prevent spikes of load or disk usage from causing an alert, but the drawback is that you might miss certain interesting spikes too. Overall, my opinion is that a monitoring system should not drive people crazy, because alerts will be ignored when too many are received.

Examples for batch on Linux

Linux has a few ways to schedule jobs to be executed. I am sure most are familiar with crontab and at, but batch is less well known.

From the man page, batch:
executes commands when system load levels permit; in other words, when the load average drops below 0.8, or the value specified in the invocation of atrun.

So:

  • crontab is used for periodic scheduling.
  • at is used for executing something once at a specific time.
  • batch can be used to execute commands when your system has resources.
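To make the load condition concrete, here is a small sketch of the kind of check that batch takes off your hands. It is not how atd is implemented; the function name load_below is made up, and using the 1-minute figure from /proc/loadavg is an assumption of this example.

```shell
#!/bin/sh
# load_below THRESHOLD [LOAD]
# Succeeds (exit status 0) when LOAD is below THRESHOLD. Without LOAD, the
# 1-minute load average is read from /proc/loadavg (Linux only).
load_below() {
    threshold=$1
    load=${2:-$(cut -d' ' -f1 /proc/loadavg)}
    # awk does the comparison because the shell cannot compare decimals.
    awk -v l="$load" -v t="$threshold" 'BEGIN { exit !(l < t) }'
}
```

Something like "load_below 0.8 && /usr/local/bin/some-job.sh" would skip the job when the system is busy, where batch would keep it queued until the load drops.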

You can also combine crontab and batch. Imagine you need to run a sequence of commands in a specific order every hour. crontab does not guarantee one command is finished when it executes the next command.
batch reads the commands to run from standard input, so it can be used from crontab like so:

crontab -l
0 * * * * echo /usr/local/bin/prepare-something.sh | /usr/bin/batch
1 * * * * echo /usr/local/bin/process-something.sh | /usr/bin/batch
2 * * * * echo /usr/local/bin/report-something.sh | /usr/bin/batch

This batches these three commands in a specific order, one after the other, when the system load is not too high.

One specific situation where I use this: Drupal needs to run a program (cron.php) every hour. crontab would be perfect for that, but when the load is too high, it's not a problem if this program is executed a little later. This is what I have set up:

0 * * * * echo "/usr/bin/wget -o /dev/null -O /dev/null http://1.example.com/cron.php" | /usr/bin/batch
1 * * * * echo "/usr/bin/wget -o /dev/null -O /dev/null http://2.example.com/cron.php" | /usr/bin/batch
2 * * * * echo "/usr/bin/wget -o /dev/null -O /dev/null http://3.example.com/cron.php" | /usr/bin/batch

This ensures that cron.php is run every hour, but not while the system load is too high (0.8 or more). One disadvantage of this solution: when your system is overloaded for a long period of time, these batch jobs pile up, and when the load drops below 0.8, all batched commands will be executed. Happily, Drupal's cron.php does not consume that many resources when it's run twice.

Release scheme for RPM based Linux distributions

It can be rather confusing what the differences and similarities are between Fedora, Red Hat Enterprise Linux and CentOS, especially across different versions. This article explains the release schedules of the various RPM based Linux distributions and how they relate to each other.

Fedora is a Red Hat sponsored community project. Fedora is released approximately every 6 months. Fedora "supports" (supplies updates for) a release for about 13 months only. Clearly this is a development distribution.

Red Hat picks up a Fedora version, adds a few patches and calls that "Red Hat Enterprise Linux". Red Hat releases more conservatively, about every 2 years, and supports a release for about 5 years after releasing it, making this distribution much more "enterprise".

Fedora - Red Hat release relationship

Fedora release    Red Hat release
Fedora Core 3     Red Hat Enterprise Linux 4
Fedora Core 6     Red Hat Enterprise Linux 5
Fedora 13         Red Hat Enterprise Linux 6

CentOS picks up the source code that Red Hat publishes for Red Hat Enterprise Linux. The CentOS community patches the artwork and very few other things. CentOS "supports" (provides updates) for as long as Red Hat supplies updates to Red Hat Enterprise Linux.
Interesting to know: once you have chosen a certain main version of CentOS, you'll automatically update to the most recent minor version of that release when using "yum update". So if you install "CentOS 5.0" and run "yum update", you will automatically have "CentOS 5.7" (at the time of this writing).

So; this makes this image:

Apache Tomcat 7 spec file RPM

I tried to find an RPM for Apache Tomcat version 7, but could not find one. You can use this one; it requires the source code, downloadable from Apache Tomcat's website, under "Source Code Distributions".

This SPEC file creates an RPM "apache-tomcat" that installs in /opt/apache-tomcat, plus separate RPMs for the default web applications (apache-tomcat-manager, apache-tomcat-ROOT, apache-tomcat-docs, apache-tomcat-examples, apache-tomcat-host-manager). An init script is included at the bottom; it needs to be available in the SOURCES directory, named "apache-tomcat-initscript".

Downloads:
SOURCE:
Apache Tomcat SRC rpm

x86_64:
Apache Tomcat x86_64 rpm
Apache Tomcat ROOT application x86_64 rpm
Apache Tomcat docs x86_64 rpm
Apache Tomcat example application x86_64 rpm
Apache Tomcat host manager application x86_64 rpm
Apache Tomcat manager application x86_64 rpm

So far it's been fine, but any comments would be appreciated.

apache-tomcat.spec:

Name: apache-tomcat
Version: 7.0.20
Release: 1
Summary: Open source software implementation of the Java Servlet and JavaServer Pages technologies.
Group: Productivity/Networking/Web/Servers
License: Apache Software License.
Url: http://tomcat.apache.org
Source: %{name}-%{version}-src.tar.gz

BuildRoot: %{_tmppath}/%{name}-%{version}-build
BuildRequires: ant
BuildRequires: ant-trax
Requires: java-1.6.0-openjdk
BuildArch: x86_64

%description
Apache Tomcat is an open source software implementation of the Java Servlet and JavaServer Pages technologies. The Java Servlet and JavaServer Pages specifications are developed under the Java Community Process.

%package manager
Summary: The management web application of Apache Tomcat.
Group: System Environment/Applications
Requires: %{name} = %{version}-%{release}
BuildArch: noarch

%description manager
The management web application of Apache Tomcat.

%package ROOT
Summary: The ROOT web application of Apache Tomcat.
Group: System Environment/Applications
Requires: %{name} = %{version}-%{release}
BuildArch: noarch

%description ROOT
The ROOT web application of Apache Tomcat.

%package docs
Summary: The docs web application of Apache Tomcat.
Group: System Environment/Applications
Requires: %{name} = %{version}-%{release}
BuildArch: noarch

%description docs
The docs web application of Apache Tomcat.

%package examples
Summary: The examples web application of Apache Tomcat.
Group: System Environment/Applications
Requires: %{name} = %{version}-%{release}
BuildArch: noarch

%description examples
The examples web application of Apache Tomcat.

%package host-manager
Summary: The host-manager web application of Apache Tomcat.
Group: System Environment/Applications
Requires: %{name} = %{version}-%{release}
BuildArch: noarch

%description host-manager
The host-manager web application of Apache Tomcat.

%prep

%setup -q -n %{name}-%{version}-src
# This tells ant to install software in a specific directory.
cat << EOF >> build.properties
base.path=%{buildroot}/opt/apache-tomcat
EOF

%build
ant

%install
rm -Rf %{buildroot}
mkdir -p %{buildroot}/opt/apache-tomcat
mkdir -p %{buildroot}/opt/apache-tomcat/pid
mkdir -p %{buildroot}/opt/apache-tomcat/webapps
mkdir -p %{buildroot}/etc/init.d/
mkdir -p %{buildroot}/var/run/apache-tomcat
%{__cp} -Rip ./output/build/{bin,conf,lib,logs,temp,webapps} %{buildroot}/opt/apache-tomcat
%{__cp} %{_sourcedir}/apache-tomcat-initscript %{buildroot}/etc/init.d/apache-tomcat

%clean
rm -rf %{buildroot}

%pre
getent group tomcat > /dev/null || groupadd -r tomcat
getent passwd tomcat > /dev/null || useradd -r -g tomcat tomcat

%post
chkconfig --add %{name}

%preun
if [ "$1" = "0" ] ; then
service %{name} stop > /dev/null 2>&1
chkconfig --del %{name}
fi

%files
%defattr(-,tomcat,tomcat,-)
%dir /opt/apache-tomcat
%config /opt/apache-tomcat/conf/*
/opt/apache-tomcat/bin
/opt/apache-tomcat/lib
/opt/apache-tomcat/logs
/opt/apache-tomcat/temp
/opt/apache-tomcat/pid
%dir /opt/apache-tomcat/webapps
/var/run/apache-tomcat
%attr(0755,root,root) /etc/init.d/apache-tomcat

%files manager
/opt/apache-tomcat/webapps/manager

%files ROOT
/opt/apache-tomcat/webapps/ROOT

%files docs
/opt/apache-tomcat/webapps/docs

%files examples
/opt/apache-tomcat/webapps/examples

%files host-manager
/opt/apache-tomcat/webapps/host-manager

%changelog
* Fri Aug 19 2011 - robert (at) meinit.nl
- Updated to apache tomcat 7.0.20
- Split (example) applications into their own RPM.
* Mon Jul 4 2011 - robert (at) meinit.nl
- Initial release.
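The Source: tag in the spec file decides which tarball has to be present in the SOURCES directory before building. As a rough sketch, the helper below works out that file name from a spec file. The function name spec_source is made up for this example, and it only expands %{name} and %{version}; any other macro is left as it is.

```shell
#!/bin/sh
# spec_source SPECFILE
# Prints the value of the Source: (or Source0:) tag with %{name} and
# %{version} replaced by the values of the Name: and Version: tags.
spec_source() {
    awk '
        $1 == "Name:"    { name = $2 }
        $1 == "Version:" { version = $2 }
        $1 == "Source:" || $1 == "Source0:" { source = $2 }
        END {
            gsub(/%[{]name[}]/, name, source)
            gsub(/%[{]version[}]/, version, source)
            print source
        }
    ' "$1"
}
```

For the spec file above, "spec_source apache-tomcat.spec" prints apache-tomcat-7.0.20-src.tar.gz. With that tarball and apache-tomcat-initscript in SOURCES, "rpmbuild -ba apache-tomcat.spec" should then build the source and binary RPMs.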

apache-tomcat-initscript:

#!/bin/sh
#
# apache-tomcat
#
# chkconfig: - 85 15
# description: Jakarta Tomcat Java Servlets and JSP server
# processname: java
# pidfile: /var/run/apache-tomcat/pid

. /etc/rc.d/init.d/functions

# Set Tomcat environment.
USER=tomcat
LOCKFILE=/var/lock/apache-tomcat
export BASEDIR=/opt/apache-tomcat
export TOMCAT_HOME=$BASEDIR
export CATALINA_PID=/var/run/apache-tomcat/pid
export CATALINA_OPTS="-DHOME=$BASEDIR/home -Xmx512m -Djava.awt.headless=true"

case "$1" in
  start)
        echo -n "Starting apache-tomcat: "
        status -p $CATALINA_PID apache-tomcat > /dev/null && failure || (su -p -s /bin/sh $USER -c "$TOMCAT_HOME/bin/catalina.sh start" > /dev/null && (touch $LOCKFILE ; success))
        echo
        ;;
  stop)
        echo -n "Shutting down apache-tomcat: "
        status -p $CATALINA_PID apache-tomcat > /dev/null && su -p -s /bin/sh $USER -c "$TOMCAT_HOME/bin/catalina.sh stop" > /dev/null && (rm -f $LOCKFILE ; success) || failure
        echo
        ;;
  restart)
        $0 stop
        $0 start
        ;;
  condrestart)
        [ -e $LOCKFILE ] && $0 restart
        ;;
  status)
        status -p $CATALINA_PID apache-tomcat
        ;;
  *)
        echo "Usage: $0 {start|stop|restart|condrestart|status}"
        exit 1
        ;;
esac

service status somedaemon - status: unrecognized service

I recently created an initscript for a daemon. It works well for starting and stopping, but "service status somedaemon" would return "status: unrecognized service".

After a bit of googling I found that the top few lines of the initscript are critical. They should look as described here:

#!/bin/sh
#
# somedaemon
#
# chkconfig: - 85 15
# description: Some daemon
# processname: somedaemon
# pidfile: /var/run/somedaemon.pid

Here is what these lines do.

  1. #!/bin/sh - This is a shell script.
  2. # somedaemon - This is the name of the service used with "service status somedaemon". If this is missing, you'll see: "status: unrecognized service".
  3. chkconfig: - 85 15 - This tells chkconfig three things: firstly, in which runlevels to start the service by default (a dash means it is not started in any runlevel by default); secondly, at what priority to start; and lastly, at what priority to stop.
  4. description: Some daemon - Not used.
  5. processname: somedaemon - Not used.
  6. pidfile: /var/run/somedaemon.pid - Used in /etc/rc.d/init.d/functions to determine if the process is running or not.
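To check what chkconfig will read from a script, the header line can be pulled out with a few lines of shell. This is only a sketch: the function name chkconfig_header is made up, and it assumes the header is written exactly as "# chkconfig: RUNLEVELS START STOP" with a space after the #.

```shell
#!/bin/sh
# chkconfig_header INITSCRIPT
# Prints "RUNLEVELS START STOP" from the first "# chkconfig:" line.
chkconfig_header() {
    awk '$1 == "#" && $2 == "chkconfig:" { print $3, $4, $5; exit }' "$1"
}
```

For the script above this prints "- 85 15": no default runlevels, start priority 85, stop priority 15.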

Linux interview questions

From time to time you might need to interview somebody for a Linux role. It's hard to think of good questions; you don't want to scare somebody with your questions, but you do want to know if the person is knowledgeable.

Questions for a junior.

"People that use your Linux server complain it's slow. What tools would you use to check resource usage?"
top, sar, netstat -an, iostat, free, df.

"You discover a disk is full on your Linux server. What do you use to discover where the biggest files/directories are?"

  • To create a short report on the largest directories: du -sk * | sort -nr | head -n10
  • You might use "find" to find large files: find / -size +100M
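The du pipeline above can be wrapped in a small function so that it works on any directory. This is a sketch; the name largest_entries is made up for this example.

```shell
#!/bin/sh
# largest_entries DIR [COUNT]
# Lists the COUNT (default 10) largest files and directories directly
# under DIR, biggest first, with their size in kilobytes.
largest_entries() {
    dir=$1
    count=${2:-10}
    # The subshell keeps the cd from changing the caller's directory.
    ( cd "$dir" && du -sk -- * | sort -nr | head -n "$count" )
}
```

Running "largest_entries /var 5" and then repeating it on the biggest directory it reports is a quick way to drill down to the culprit.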

"A system needs more disk space. You want to add a partition /var/log. You add a drive to the machine, it becomes /dev/sdb. What actions do you take to use this disk?"
That depends on whether you use LVM or not. If you don't use LVM:

  1. "fdisk /dev/sdb" to add one partition so /dev/sdb1 becomes available.
  2. "mkfs /dev/sdb1" to create a filesystem on it.
  3. "mount /dev/sdb1 /var/log". Before that, the existing data in /var/log needs to be copied onto the new filesystem.

If you do use LVM:

  1. "fdisk /dev/sdb" to add one partition so /dev/sdb1 becomes available.
  2. "pvcreate /dev/sdb1" to make it an LVM device.
  3. "vgextend SomeVolumeGroup /dev/sdb1" to extend an existing Volume Group.
  4. "lvcreate -L 1G -n SomeLogicalVolumeName SomeVolumeGroup" to create a new Logical Volume.
  5. "mkfs /dev/SomeVolumeGroup/SomeLogicalVolumeName" to create a filesystem on it.
  6. "mount /dev/SomeVolumeGroup/SomeLogicalVolumeName /var/log". Before that, the existing data in /var/log needs to be copied onto the new filesystem.

Questions for a medior.

"People that use your Linux server complain it's slow. You've seen the disk usage is high. What can you do to improve this?"

  • Move services away; divide them over different servers.
  • Add more memory so disk caching can be used better.
  • Look into the application: why is it reading/writing so much?
  • Use faster disks.

"What happens in relation to DNS, SMTP and IMAP when I send an email to [email protected]?"
My computer is likely to be configured to send the email to a mail server on port 25, using the SMTP protocol. That mail server will query DNS for the MX records of example.com. The mail server listed with the lowest preference value (which means the highest priority) will be contacted to deliver the email. That mail server at example.com can accept the email and put it into the IMAP folder for the user or alias of [email protected]
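The MX selection can be tried out on the command line. The sketch below reads lines in the "PREFERENCE HOST" format that "dig +short MX example.com" prints and picks the host a sending mail server would try first. The function name best_mx is made up for this example, as are the host names in the usage note.

```shell
#!/bin/sh
# best_mx
# Reads "PREFERENCE HOST" lines on standard input and prints the host with
# the lowest preference value, which is the one to contact first.
best_mx() {
    sort -n -k1,1 | head -n 1 | awk '{ print $2 }'
}
```

Given the records "20 backup.example.com." and "10 mail.example.com.", it prints mail.example.com. On a machine with dig installed, "dig +short MX example.com | best_mx" does the same for real records.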

"You need to setup 100 Red Hat Enterprise Linux systems. If you don't want to walk around and eject and insert a boot CD 100 times, what options would you have?"
Kickstarting would help out. Install one machine as you like it and save /root/anaconda-ks.cfg to a webserver. Set up a PXE (DHCP, TFTP, HTTP, DNS) environment and use that kickstart file to install the rest.

Questions for a senior.

"You have destroyed /etc/pam.d/system-auth and can't login anymore. Another machine has a proper version of /etc/pam.d/system-auth. How would you fix that broken machine?"
The machine needs to be booted in single user mode so you don't get a login prompt. After that, here are some options:

  • Start network, use "nc" to get and replace that file.
  • Mount a CD that has the package to fix the broken file.
  • Manually repair it.

"You have installed apache, php and mysql and a webapplication such as Drupal. The webapplication tries to send emails to an external mailserver but fails. What could be the cause when these items have been verified:

  • The web application is correctly configured to use the external mailserver.
  • It's possible to connect to the mailserver on the specified port from the command line.
  • The logs of the mailserver don't record anything when the web application tries to send email.
  • IPtables allows connections out to port 25."

SELinux could be blocking apache from using port 25 on an external system. The log file /var/log/messages might inform you about it. To fix it, issue "setsebool -P httpd_can_network_connect=1".

"What determines the load (w, uptime) of a system?"
The number of processes that are running or waiting to run; these processes are in the run queue. On Linux, processes in uninterruptible sleep, usually waiting for disk or network I/O, are counted as well.
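On Linux the numbers behind w and uptime come from /proc/loadavg. As a small illustration, the sketch below takes one line in that format and prints the 1-minute load average together with the number of currently runnable tasks (the part before the slash in the fourth field). The function name parse_loadavg is made up for this example.

```shell
#!/bin/sh
# parse_loadavg
# Reads one line in /proc/loadavg format on standard input, for example
# "0.52 0.61 0.70 2/345 12345", and prints the 1-minute load average and
# the number of currently runnable tasks.
parse_loadavg() {
    awk '{ split($4, q, "/"); print "load1=" $1, "runnable=" q[1] }'
}
```

On a live system it can be used as "parse_loadavg < /proc/loadavg".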

Drupal on Amazon's Elastic Compute Cloud (EC2)

Hosting a Drupal site in the Amazon EC2 cloud is not difficult. Here is a recipe I have used. My first attempt was with a Fedora 14 EC2 AMI, but Fedora 14 comes with PHP 5.3, which can't be combined with Drupal 5.x. If you only have Drupal 6 (or Drupal 7) sites to host, you can use Fedora 14. If you want to use a "small" instance, please read this bug about readdir64_r. The fix for that bug is easy:

echo "hwcap 1 nosegneg" > /etc/ld.so.conf.d/libc6-xen.conf

Let's continue with Drupal on CentOS. Rightscale provides perfect CentOS AMIs that can be used on Amazon's EC2 platform. If you install one, these are the steps I took to make it Drupal 5, Drupal 6 and Drupal 7 ready:

# Update the software.
yum -y update

# Set the timezone for this machine.
cp /usr/share/zoneinfo/Europe/Amsterdam /etc/localtime

# This image came with 10 GB of EBS storage. I added another 64 GB volume, here it's called "/dev/sdc".
# Use LVM to be ready to grow in the future.
pvcreate /dev/sdc
vgcreate vg0 /dev/sdc
lvcreate vg0 -L 32G -n var-www
lvcreate vg0 -L 1G -n var-lib-mysql
lvcreate vg0 -L 2G -n root

# Put filesystems on the logical volumes.
mke2fs -j /dev/vg0/var-www
mke2fs -j /dev/vg0/var-lib-mysql
mke2fs -j /dev/vg0/root

# Add the mountpoints to fstab.
echo "/dev/vg0/var-www /var/www ext3 defaults 0 0" >> /etc/fstab
echo "/dev/vg0/var-lib-mysql /var/lib/mysql ext3 defaults 0 0" >> /etc/fstab
echo "/dev/vg0/root /root ext3 defaults 0 0" >> /etc/fstab

# Create the mountpoints (-p skips the ones that already exist, like /root).
mkdir -p /var/www /var/lib/mysql /root

# Mount all mountpoints in /etc/fstab.
mount -a

# Install the webserver.
yum -y install httpd
service httpd start
chkconfig httpd on

# Install the database server.
yum -y install mysql-server
service mysqld start
chkconfig mysqld on
/usr/bin/mysqladmin -u root password 'YourPassWord'

# Install PHP and all required Drupal php modules.
yum -y install php php-mysql php-mcrypt php-xml php-mbstring php-gd

# Add a single new file in the EBS root filesystem that includes configurations from /var/www/conf.d
# Using this trick allows you to easily remount the volume on another host in case of troubles.
echo 'Include /var/www/conf.d/*.conf' >> /etc/httpd/conf.d/virtualhosts.conf
mkdir /var/www/conf.d

# Rightscale CentOS images come with postfix and sendmail. Postfix is enabled, but sendmail is fine for me.
# First erase postfix.
yum -y erase postfix

# Now reinstall sendmail to fix a few permissions.
yum -y reinstall sendmail
service sendmail start
chkconfig sendmail on

# Reboot the box to make sure it's working properly.
reboot
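With the Include line from the recipe in place, adding a site comes down to dropping one file into /var/www/conf.d. The sketch below generates a minimal virtual host file. The function name write_vhost, the directory layout under /var/www/virtualhosts and the contents of the virtual host are assumptions of this example, not part of the setup above.

```shell
#!/bin/sh
# write_vhost DOMAIN [CONF_DIR] [DOCROOT_BASE]
# Writes CONF_DIR/DOMAIN.conf with a minimal Apache virtual host.
# The default directories are assumptions; adjust them to your own layout.
write_vhost() {
    domain=$1
    conf_dir=${2:-/var/www/conf.d}
    docroot_base=${3:-/var/www/virtualhosts}
    cat > "$conf_dir/$domain.conf" <<EOF
<VirtualHost *:80>
    ServerName $domain
    DocumentRoot $docroot_base/$domain
</VirtualHost>
EOF
}
```

After "write_vhost example.com", running "apachectl configtest" and reloading httpd should make the new site available.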

Fedora 14 to be released

So it's not so long before Fedora 14 is released. Let's take a look at some features Fedora 14 will bring that look interesting to me.

  • An Amazon Machine Image (AMI) for Fedora will be released! Finally we can move on from that old Fedora 8 image that was available on Amazon EC2.
  • For Desktop Virtualization, Spice is released. This is a KVM feature that enables super fast access to virtualized desktop systems, Windows, Linux or any other.
  • Perl, Python and Ruby will be updated, nice but not very important to me.
  • OpenSCAP can ensure security compliance. Should be helpful to many customers.

All in all, a pretty good release although it's kind of hard to improve an already good distribution.

Many of the changes done in Fedora will end up in RHEL. The current estimate is that RHEL 7 will be based on Fedora 16 (earliest) up to Fedora 19 (latest).

Moving a single Drupal installation into a multisite environment.

If you'd like to move a single installation of Drupal into a multisite environment, use these steps, and replace example.com with your website's name. In this example, the multisite Drupal is installed in /var/www/drupal/ and the single installation in /var/www/example.com/html/ .

1. Change the webserver configuration.

This one is obvious: the website was first pointing to an individual installation of Drupal, and it now needs to be directed to the multisite installation of Drupal.

2. Copy templates, settings.php and files into the multisite environment.

mkdir -p /var/www/drupal/sites/example.com/
cd /var/www/example.com
cp -Rip html/sites/all/themes html/sites/default/settings.php html/sites/default/files/ /var/www/drupal/sites/example.com/

3. Update the MySQL database with the new paths.

If you have used files (including images) on your website, the path needs to be updated. Earlier, files were located in "sites/default/files/", but this will become "sites/example.com/files/".

mysql -u root -pPaSsWoRd
USE examplecom;
UPDATE files SET filepath=REPLACE(filepath, 'default', 'example.com');
UPDATE files SET filepath=REPLACE(filepath, 'images', 'sites/example.com/images') WHERE filepath REGEXP '^images.*';
QUIT;
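Before touching the database, the effect of the two UPDATE statements can be previewed on a few sample paths. The sketch below mimics them with sed; the function name rewrite_path and the sample paths in the usage note are made up for this example.

```shell
#!/bin/sh
# rewrite_path SITE
# Reads file paths on standard input and rewrites them the way the two
# UPDATE statements do: every "default" becomes SITE, and in paths that
# start with "images" every "images" becomes "sites/SITE/images".
rewrite_path() {
    site=$1
    sed -e "s|default|$site|g" \
        -e "/^images/ s|images|sites/$site/images|g"
}
```

For example, "echo sites/default/files/logo.png | rewrite_path example.com" prints sites/example.com/files/logo.png. Note that, like the SQL, it also changes the word "default" anywhere else in a path.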

4. Change the location of the icons for the selected theme.

Go to Administer - Site building - Themes - Your Theme "configure" and change the path to reflect the right one. Mostly this means changing the word "files/" to "sites/example.com/files/".

5. Restart the web server and clean up the old environment.

For Apache, that would be:

apachectl configtest
apachectl restart

Check the website; everything should work. Maybe you have to reselect your template to make it look better. If all works well, remove the old code.

rm -R /var/www/example.com

Shrinking a filesystem with LVM

After an installation you might find some file systems are too large; they are almost empty. When you want to use that space for another file system, here are the steps you can take.
Imagine /opt is now 10 GB, but 1 GB would be sufficient.

  1. Check if the file system is in use. Using lsof /opt you will get a list of processes that currently use /opt. Stop these processes.
  2. Find out what device is used for /opt with df -h /opt or mount. In my example, I found /dev/mapper/VolGroup-opt holds the files on /opt.
  3. Unmount the filesystem, using umount /opt
  4. Resize the filesystem using resize2fs /dev/mapper/VolGroup-opt 1G. (It might tell you that you need to run e2fsck -f /dev/mapper/VolGroup-opt before you can continue.) This frees the "right" part of the disk that LVM will un-allocate in a moment. All data from the file system is on the "left hand side".
  5. Run lvreduce -L 1G /dev/mapper/VolGroup-opt to shrink the logical volume.
  6. Remount the filesystem with a command such as mount /opt.
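Before step 3 it is worth checking that the data actually fits in the new size. Here is a minimal sketch of such a check; the function names used_kb and fits_in are made up, and the 10% headroom is an arbitrary safety margin chosen for this example.

```shell
#!/bin/sh
# used_kb MOUNTPOINT
# Prints the number of kilobytes in use on MOUNTPOINT, taken from df -Pk.
used_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# fits_in USED_KB TARGET_KB
# Succeeds when USED_KB plus 10% headroom fits in TARGET_KB.
fits_in() {
    [ $(( $1 + $1 / 10 )) -le "$2" ]
}
```

For the 1 GB target (1048576 kilobytes), something like: fits_in "$(used_kb /opt)" 1048576 || echo "too much data in /opt".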

For /opt or any other filesystem that can easily be freed from open file handles, the above procedure works fine, but for "busy" filesystems, like /, /var, /usr, and so on, you'd have to boot the machine without mounting filesystems. One way to do this is using the installation CD and starting up the "rescue" environment.
