Next: Shared Memory II; Threads, Up: Linux & Cluster Previous: Hardware Performance Counters   Contents


Tuning with /proc

The /proc file system is not really a file system at all, but a window on the running kernel. It contains handles that can be used to extract information from the kernel or, in some cases, change parameters deep inside the kernel. In a properly configured Beowulf node, nearly all of the available CPU cycles and memory are devoted to the scientific application. Trimming down the kernel and removing unneeded daemons and processes provides slightly more room for the host application. Tuning up the remaining very small kernel can further refine the results. Occasionally, a performance bottleneck can be dislodged with some simple kernel tuning.
For a look at the Ethernet devices, read /proc/net/dev:
% cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 3030052   16423    0    0    0     0          0         0  3030052   16423    0    0    0     0       0          0
  eth0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
irlan0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
vmnet3:       0       0    0    0    0     0          0         0        0     521    0    0    0     0       0          0
vmnet1:       0       0    0    0    0     0          0         0        0     521    0    0    0     0       0          0
vmnet8:       0       0    0    0    0     0          0         0        0     521    0    0    0     0       0          0
  eth1:475834626  101601  774 44956    0     0          0         0 168328047  108335    0    0    0     0       0          0
One important set of values is the total bytes and total packets sent or received on each interface. A little basic scientific observation and data gathering can go a long way here. Are the numbers reasonable? Is application traffic using the correct interface? (You may need to change the default route so that it uses the high-speed interface.) Is something flooding your network? What is the size of the average packet? Another key set of values is collisions (colls), errs, drop, and frame. Each of these represents some degree of inefficiency in the Ethernet, and ideally they will all be zero. In the example output, eth1 has logged 774 receive errors and 44956 dropped packets, which would merit a closer look.
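The questions above can be answered mechanically. The following is a minimal sketch, assuming the two-header-line layout shown in the example output; the function name netdev_summary and its optional file argument are illustrative and not part of any standard tool.

```shell
# Summarize per-interface counters from /proc/net/dev (or a saved copy).
# netdev_summary is a hypothetical helper name; the column positions assume
# the layout shown above (8 receive fields followed by 8 transmit fields).
netdev_summary() {
    awk '
    NR > 2 {
        sub(/:/, " ")    # "eth1:4758..." becomes "eth1 4758..."; fields are re-split
        # $1=name  $2=rx bytes  $3=rx packets  $4=rx errs  $5=rx drop
        # $12=tx errs  $13=tx drop  $15=colls
        avg = ($3 > 0) ? $2 / $3 : 0
        bad = $4 + $5 + $12 + $13 + $15
        printf "%s avg_rx_bytes=%d problems=%d\n", $1, avg, bad
    }' "${1:-/proc/net/dev}"
}
```

Called with no argument, netdev_summary reads the live /proc/net/dev; given a file name, it reads a saved copy, which is convenient for comparing two snapshots taken before and after a run. Any interface that reports a nonzero problems count deserves a closer look.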

Tunable kernel parameters are in /proc/sys, and network parameters are generally in /proc/sys/net. Many of them can be changed at run time. Some administrators tweak a Beowulf kernel by modifying TCP parameters such as tcp_sack, tcp_timestamps, and tcp_window_scaling (under /proc/sys/net/ipv4), or socket buffer sizes such as rmem_default, rmem_max, wmem_default, and wmem_max (under /proc/sys/net/core).
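Before changing any of these parameters it helps to record their current values. A small sketch follows, assuming the TCP handles live under the ipv4 subdirectory and the buffer sizes under the core subdirectory of /proc/sys/net; show_net_tunables is a hypothetical helper name, and its optional argument exists only so that a saved copy of the tree can be inspected.

```shell
# Print the current value of each network tunable named above.
# show_net_tunables is a hypothetical helper; the optional argument lets it
# read a saved copy of the tree instead of the live /proc/sys/net.
show_net_tunables() {
    base="${1:-/proc/sys/net}"
    for p in ipv4/tcp_sack ipv4/tcp_timestamps ipv4/tcp_window_scaling \
             core/rmem_default core/rmem_max core/wmem_default core/wmem_max
    do
        if [ -r "$base/$p" ]; then
            printf '%s = %s\n' "$p" "$(cat "$base/$p")"
        else
            printf '%s = (not available)\n' "$p"
        fi
    done
}
```

Saving this output before and after a tuning session gives a record of exactly what was changed. Values written into these handles by root last only until the next reboot.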
For memory, the meminfo handle provides many useful data points:

% cat /proc/meminfo
        total:     used:      free:      shared:   buffers:  cached:
Mem:    263380992  152883200  110497792  64057344  12832768  44445696
Swap:   271392768  17141760   254251008
MemTotal: 257208 kB
MemFree: 107908 kB
MemShared: 62556 kB
Buffers: 12532 kB
Cached: 43404 kB
SwapTotal: 265032 kB
SwapFree: 248292 kB
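The "Name: value kB" lines at the end of this output lend themselves to scripting. The sketch below, which uses the hypothetical helper name meminfo_summary, reports free and buffer memory as a share of MemTotal; it assumes only that the MemTotal, MemFree, and Buffers lines are present.

```shell
# Report free and buffer memory as a share of MemTotal.
# meminfo_summary is a hypothetical helper; it relies only on the
# "Name: value kB" lines, which appear in the example output above.
meminfo_summary() {
    awk '
    $1 == "MemTotal:" { total = $2 }
    $1 == "MemFree:"  { free  = $2 }
    $1 == "Buffers:"  { buf   = $2 }
    END {
        if (total > 0)
            printf "free=%.1f%% buffers=%.1f%%\n", 100 * free / total, 100 * buf / total
    }' "${1:-/proc/meminfo}"
}
```

With no argument the function reads the live /proc/meminfo; a file name can be passed to examine a saved snapshot from a compute node.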
In the example output, the system has 256 megabytes of RAM, of which about 12.5 megabytes are allocated for buffers and about 108 megabytes are free. The tunable virtual memory parameters are in /proc/sys/vm. Some Beowulf administrators may wish to tune the amount of memory used for buffering.
% cat /proc/sys/vm/buffermem
2 10 60
The first value is the minimum percentage of total system memory used for buffering on the Beowulf node. With the value 2, a 256-megabyte node will use no less than about 5 megabytes (2% of 256) for buffering. Changing the value is simple (root privileges are required):
% echo 4 10 60 > /proc/sys/vm/buffermem
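As a worked check of the arithmetic, the sketch below converts the first buffermem field into a minimum buffer size. The buffermem handle belongs to the older kernels this section describes and may be absent on newer ones, so the example only does the calculation; min_buffer_kb is a hypothetical helper name, and 257208 is the MemTotal value from the meminfo example.

```shell
# Minimum buffer memory implied by the first buffermem field.
# min_buffer_kb is a hypothetical helper: $1 is the percentage,
# $2 is MemTotal in kB as reported by /proc/meminfo.
min_buffer_kb() {
    echo $(( $2 * $1 / 100 ))
}

min_buffer_kb 2 257208   # the original setting, 2 10 60
min_buffer_kb 4 257208   # after the echo shown above
```

The two calls print 5144 and 10288 (kB), that is, roughly 5 and 10 megabytes, so doubling the first field doubles the memory held back for buffering.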
Probing the file system.
Like networking and virtual memory, the file system has many /proc handles for tuning or probing. A node spawning many tasks can use many file handles. A standard ssh connection to a remote machine that is maintained rather than dropped requires four file handles. The number of file handles permitted can be displayed with the command
% cat /proc/sys/fs/file-max
102106
The command for a quick look at the current system is
% cat /proc/sys/fs/file-nr
4160    0       102106
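The three fields are the number of allocated file handles (on older kernels, which never freed handles once allocated, this is effectively a high-water mark; in this case, we have nothing to worry about), the number of allocated but unused handles, and the maximum.
The utility /sbin/hdparm is especially handy for querying, testing, and even setting hard disk parameters: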
% /sbin/hdparm -t /dev/sda1

/dev/sda1:
 Timing buffered disk reads:  118 MB in  3.04 seconds =  38.85 MB/sec
From this you can judge whether your disk is performing as it should and as you expect. Finally, some basic parameters of the kernel itself can be displayed or modified through the handles in /proc/sys/kernel. For some message-passing codes, the key may be /proc/sys/kernel/shmmax, which can be used to get or set the maximum size of a shared-memory segment. For example,
% cat /proc/sys/kernel/shmmax
33554432
shows that the largest shared-memory segment available is 32 megabytes. On an SMP node especially, some messaging layers may use shared-memory segments to pass messages within the node, and for some systems and applications 32 megabytes may be too small.
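Whether the limit is too small depends on what the messaging layer asks for. The sketch below compares a required segment size with a shmmax value; shm_fits is a hypothetical helper name, and the 64-megabyte requirement is purely illustrative and does not come from any particular messaging layer.

```shell
# Check whether a shared-memory segment of a given size (in megabytes) fits
# under a given shmmax (in bytes). shm_fits is a hypothetical helper name.
shm_fits() {
    need_mb=$1
    shmmax_bytes=$2
    limit_mb=$(( shmmax_bytes / 1024 / 1024 ))
    if [ "$need_mb" -le "$limit_mb" ]; then
        echo "ok: ${need_mb} MB fits within ${limit_mb} MB"
    else
        echo "too small: need ${need_mb} MB, limit is ${limit_mb} MB"
    fi
}

shm_fits 64 33554432   # 64 MB is an illustrative requirement, not from the text
```

On a live node the second argument would come from cat /proc/sys/kernel/shmmax. If the limit turns out to be too small, root can raise it in the same way as the other handles in this section, for instance by echoing 67108864 (64 megabytes) into /proc/sys/kernel/shmmax.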


Cem Ozdogan 2009-01-05