Saturday, December 19, 2009

Process Concepts
















One of the appealing characteristics of Solaris and other UNIX-like systems is that applications can execute (or spawn) other applications: after all, user shells are nothing more than applications themselves. A shell can spawn another shell or application, which can spawn another shell or application, and so on. Instances of applications, such as the sendmail mail transport agent or the telnet remote access application, can be uniquely identified as individual processes and are associated with a unique process identifier (PID), which is an integer.


You may be wondering why process identifiers are not content addressable, that is, why the sendmail process cannot be identified as simply sendmail. Such a scheme would be quite sensible if it were impossible to execute multiple, independent instances of the same application (as was the case in early versions of Mac OS). However, Solaris allows the same user or different users to concurrently execute the same application independently, which means that an independent identifier is required for each process. This also means that each PID is related to a user identifier (UID) and to that user’s group identifier (GID). The UID in this case can be either the real UID of the user who executed the process or the effective UID if the file executed is setUID. Similarly, the GID in this case can be either the real GID of the group to which the user who executed the process belongs, or the effective GID if the file executed is setGID.






Tip 

When an application can be executed as setUID and setGID, other users can execute such a program as the user and group that own the file. This means that setting a file as setUID for root can be dangerous in some situations, although necessary in others.



An application, such as a shell, can spawn another application by calling the system() library function in a C program. This is expensive performance-wise, however, because a new shell process is spawned in addition to the target application. An alternative is to use the fork() system call, which creates a child process directly, and then to execute the target application in the child using one of the exec() family of calls. Each child process is linked back to its parent process: if the parent process exits, the child is automatically reparented to PID 1 (init), which exits only when the system is shut down or rebooted.
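The parent and child relationship can also be observed from the shell, which forks and execs on your behalf whenever it runs a command. The following is a minimal sketch rather than anything taken from the original text: run_child is a hypothetical helper name, and the exit status of 3 is an arbitrary value chosen for illustration.

```shell
# Start a child process in the background, then wait for it to finish.
run_child() {
    # The child is a separate sh process; it reports its own PID and its parent's PID.
    sh -c 'echo "child $$ started by parent $PPID"; exit 3' &
    child_pid=$!          # PID of the most recently started background process
    wait "$child_pid"     # block until the child exits; wait returns the child's status
    echo "child $child_pid exited with status $?"
}
```

Calling run_child prints two lines: one written by the child itself and one written by the parent after it has collected the child's exit status.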


In this section, you’ll look at ways to determine which processes are currently running on your system and how to examine process lists and tables to determine what system resources are being used by specific processes.


The main command used to list processes is ps, which is highly configurable and has many command-line options. These options, and the command format, changed substantially from Solaris 1.x to Solaris 2.x: the former used BSD-style options, like ps aux, while the latter uses System V–style parameters, like ps -eaf. The proctool monitoring application is also supplied with Solaris 9.



ps takes a snapshot of the current process list; many administrators find that they need to interactively monitor processes on systems that have a high load, so that they can kill processes that are consuming too much memory, or at least assign them a lower execution priority.






Tip 

One popular process monitoring tool is top, which is described later in this chapter in the section “Using the top Program.”



As shown in Figure 8-1, the CDE (Common Desktop Environment) also has its own graphical “process finder,” which lists currently active processes. It is possible to list processes here by PID, name, owner, percentage of CPU time consumed, physical memory used, virtual memory used, date started, parent PID, and the actual command executed. This does not provide as much information as top, but it is a useful tool within the CDE.






Figure 8-1: CDE’s graphical process finder



Sending Signals


Because each process is identified by a unique PID, the PID can be used to manage that process by means of a signal. Signals can be sent to other processes from C programs using the kill() function (the signal() function installs a handler for signals that are received), or they can be sent directly from within the shell. Solaris supports a number of standard signal types that can be used as a means of interprocess communication.


A common use for signals is to manage user applications that are launched from a shell. A “suspend” signal, for example, can be sent to an application running in the foreground by pressing CTRL-Z at any time. To run this application in the background in the C shell, for example, you would need to type bg at the command prompt. A unique background job number is then assigned to the job, and typing fg n, where n is that job number, brings the process back to the foreground.






EXAM TIP  

You can run as many applications as you like in the background.



In the following example, httpd is run in the foreground. When you press CTRL-Z, the process is suspended, and when you type bg, it is assigned the background process number 1. You can then execute other commands, such as ls, while httpd runs in the background. When you then type fg, the process is brought once again into the foreground.


client 1% httpd
^Z
Suspended
client 2% bg
[1] httpd&
client 3% ls
httpd.conf access.conf srm.conf
client 4% fg

A useful command is the kill command, which is used to send signals directly to any process on the system. It is usually called with two parameters—the signal type and the PID. For example, if you have made changes to the configuration file for the Internet superdaemon, you must send a signal to the daemon to tell it to reread its configuration file. Note that you don’t need to restart the daemon itself: This is one of the advantages of a process-based operating system that facilitates interprocess communication. If inetd had the PID 167, typing


# kill -1 167

would force inetd to reread its configuration file and update its internal settings. The -1 parameter stands for the SIGHUP signal, which means "hang up." However, imagine a situation in which you wanted to switch off inetd temporarily to perform a security check. You could send a kill signal to the process by using the -9 parameter (the SIGKILL signal):


# kill -9 167

Although SIGHUP and SIGKILL are the most commonly used signals in the shell, several others are used by programmers and are defined in the signal.h header file. Another potential consequence of sending a signal to a process is that instead of “hanging up” or “being killed,” the process could exit and dump a core file, which is a memory image of the process to which the signal was sent. This result is useful for debugging, although too many core files will quickly fill up your file system! You can always obtain a list of the signals available to the kill command by passing the -l option:


$ kill -l
HUP INT QUIT ILL TRAP ABRT EMT FPE KILL BUS SEGV SYS PIPE
ALRM TERM USR1 USR2 CLD PWR WINCH URG POLL STOP TSTP CONT
TTIN TTOU VTALRM PROF XCPU XFSZ WAITING LWP FREEZE THAW
RTMIN RTMIN+1 RTMIN+2 RTMIN+3 RTMAX-3 RTMAX-2 RTMAX-1
RTMAX
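The difference between SIGHUP and SIGKILL is easiest to see from the receiving side. The sketch below is an illustration under stated assumptions, not part of the original text: reload_demo and kill_demo are hypothetical helper names, the inner sh script merely stands in for a daemon such as inetd, and sleep is used as a harmless target process.

```shell
# A stand-in "daemon" that rereads its settings on SIGHUP instead of exiting.
reload_demo() {
    sh -c '
        trap "echo rereading configuration" HUP   # install a SIGHUP handler
        kill -HUP $$                              # equivalent to kill -1 PID
        echo "still running"
    '
}

# SIGKILL cannot be caught: the target is terminated without running any handler.
kill_demo() {
    sleep 60 &
    kill -KILL $!             # equivalent to kill -9 PID
    wait $! 2>/dev/null
    echo "exit status $?"     # greater than 128 when a signal ended the process
}
```

The first function shows why a SIGHUP leaves the process alive: the handler runs and execution continues. The second shows that a process given SIGKILL has no chance to clean up, which is why -9 is best kept as a last resort.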





Listing Processes


You can use the ps command to list all currently active processes on the local system. By default, ps prints the processes belonging to the user who issues the ps command:


$ ps
PID TTY TIME CMD
29081 pts/8 0:00 ksh

The columns in the default ps list are the process identifier (PID), the terminal from which the command was executed (TTY), the CPU time consumed by the process (TIME), and the name of the command that was executed (CMD). Command-line options passed to the program appear only in the full listing produced by the -f option.


Alternatively, if you would like more information about the current user’s processes, you can add the -f parameter:


$ ps -f
UID PID PPID C STIME TTY TIME CMD
pwatters 29081 29079 0 10:40:30 pts/8 0:00 /bin/ksh

Again, the PID, TTY, CPU time, and command are displayed. However, the username is also displayed, as is the PID of the parent process (PPID), along with the starting time of the process (STIME). In addition, a deprecated column (C) is used to display processor utilization. To obtain the maximum detail possible, you can also use the -l option, which means "long"—and long it certainly is, as shown in this example:



$ ps -l
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
8 S 6049 29081 29079 0 51 20 e11b4830 372 e11b489c pts/8 0:00 ksh
8 O 6049 29085 29081 0 51 20 e101b0d0 512 pts/8 0:00 bash


Here, you can see the following:




  • The flags (F) associated with the processes
  • The state (S) of the processes (29081 is sleeping, “S”; 29085 is running, “O”)
  • The process identifier (29081 and 29085)
  • Parent process identifier (29079 and 29081)
  • Processor utilization (C), which is deprecated
  • Process priority (PRI), which is 51
  • Nice value (NI), which is 20
  • Memory address (ADDR), which is expressed in hex (e11b4830 and e101b0d0)
  • Size (SZ), in pages of memory, which is 372 and 512
  • The memory address for sleeping process events (WCHAN), which is e11b489c for PID 29081
  • CPU time used (TIME)
  • The command executed (CMD)
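Because SZ is reported in pages, converting it to kilobytes requires the system page size. The following is a small sketch of that arithmetic; pages_to_kb is a hypothetical helper name, and the 8192-byte page size used in the usage note is an assumption for illustration (the real value can be read with getconf PAGESIZE).

```shell
# Convert a size in memory pages to kilobytes.
# Usage: pages_to_kb PAGES [PAGE_SIZE_IN_BYTES]
pages_to_kb() {
    pages=$1
    # Fall back to the running system's page size when none is supplied.
    page_bytes=${2:-$(getconf PAGESIZE)}
    echo $(( pages * page_bytes / 1024 ))
}
```

Assuming 8192-byte pages, pages_to_kb 372 8192 prints 2976 and pages_to_kb 512 8192 prints 4096, so the two shells above would occupy roughly 3MB and 4MB of virtual memory respectively.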




If you’re a system administrator, you’re probably not interested in the status of just your own processes; you probably want details about all or some of the processes actively running on the system, and you can do this in many ways. You can generate a process list using the -A or the -e option, for example, and either of these lists information for all processes currently running on the machine:


# ps -A
PID TTY TIME CMD
0 ? 0:00 sched
1 ? 0:01 init
2 ? 0:01 pageout
3 ? 9:49 fsflush
258 ? 0:00 ttymon
108 ? 0:00 rpcbind
255 ? 0:00 sac
60 ? 0:00 devfseve
62 ? 0:00 devfsadm
157 ? 0:03 automount
110 ? 0:01 keyserv
112 ? 0:04 nis_cache
165 ? 0:00 syslogd

Again, the default display of PID, TTY, CPU time, and command is generated. The processes listed relate to the scheduler, init, the system logging facility, the NIS cache, and several other standard applications and services.






Tip 

It is good practice for you to become familiar with the main processes on your system and the relative CPU times they usually consume. This can be useful information when troubleshooting or when evaluating security.



One of the nice features of the ps command is the ability to combine multiple flags to print out a more elaborate process list. For example, we can combine the -A option (all processes) with the -f option (full details) to produce a process list with full details. Here are the full details for the same process list:



# ps -Af
UID PID PPID C STIME TTY TIME CMD
root 0 0 0 Mar 20 ? 0:00 sched
root 1 0 0 Mar 20 ? 0:01 /etc/init -
root 2 0 0 Mar 20 ? 0:01 pageout
root 3 0 0 Mar 20 ? 9:51 fsflush
root 258 255 0 Mar 20 ? 0:00 /usr/lib/saf/ttymon
root 108 1 0 Mar 20 ? 0:00 /usr/sbin/rpcbind
root 255 1 0 Mar 20 ? 0:00 /usr/lib/saf/sac -t 300
root 60 1 0 Mar 20 ? 0:00 /usr/lib/devfsadm/devfseventd
root 62 1 0 Mar 20 ? 0:00 /usr/lib/devfsadm/devfsadmd
root 157 1 0 Mar 20 ? 0:03 /usr/lib/autofs/automountd
root 110 1 0 Mar 20 ? 0:01 /usr/sbin/keyserv
root 112 1 0 Mar 20 ? 0:05 /usr/sbin/nis_cachemgr
root 165 1 0 Mar 20 ? 0:00 /usr/sbin/syslogd
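The PID and PPID columns of a full listing are enough to reconstruct which process started which. As a rough sketch (children_of is a hypothetical helper name, and the filter assumes the column order shown in the listing above), the following function reads ps -ef or ps -Af output on standard input and prints the PIDs of the direct children of a given process.

```shell
# Print the PIDs whose parent is the given PID.
# Usage: ps -ef | children_of PARENT_PID
children_of() {
    # Column 2 is PID and column 3 is PPID; NR > 1 skips the header line.
    awk -v parent="$1" 'NR > 1 && $3 == parent { print $2 }'
}
```

Applied to the listing above, children_of 255 would print 258, because ttymon was started by the service access controller, sac.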


Another common use for ps is to print the scheduling information for each process, namely its scheduling class (CLS) and priority (PRI):


% ps -c
PID CLS PRI TTY TIME CMD
29081 TS 48 pts/8 0:00 ksh
29085 TS 48 pts/8 0:00 bash

This can be useful when used in conjunction with the priocntl command, which displays the parameters used for process scheduling. This allows administrators, in particular, to determine the process classes currently available on the system, or to set the class of a specific process to interactive or time-sharing. You can obtain a list of all supported classes by passing the -l parameter to priocntl:


# priocntl -l
CONFIGURED CLASSES
==================
SYS (System Class)
TS (Time Sharing)
Configured TS User Priority Range: -60 through 60
IA (Interactive)
Configured IA User Priority Range: -60 through 60

You can also add the -f (full display) flag to ps -c to obtain more information:



$ ps -cf
UID PID PPID CLS PRI STIME TTY TIME CMD
paul 29081 29079 TS 48 10:40:30 pts/8 0:00 /bin/ksh
paul 29085 29081 TS 48 10:40:51 pts/8 0:00 /usr/local/bin/bash


If you want to obtain information about the processes that belong to a particular process group, you can specify this on the command line by using the -g option, followed by the ID of the process group leader. (To select processes by the real GID of the users who own them, use the -G option instead.) In this example, all processes in process group 0 will be printed:


$ ps -g 0
PID TTY TIME CMD
0 ? 0:00 sched
1 ? 0:01 init
2 ? 0:01 pageout
3 ? 9:51 fsflush

Another common configuration option used with ps is -j, which displays the session identifier (SID) and the process group identifier (PGID), as shown here:


$ ps -j
PID PGID SID TTY TIME CMD
29081 29081 29081 pts/8 0:00 ksh
29085 29085 29081 pts/8 0:00 bash

Finally, you can print out the status of lightweight processes (LWPs) in your system. These are virtual CPU or execution resources, which are designed to make the best use of available CPU resources based on their priority and scheduling class. Here is an example:


$ ps -L
PID LWP TTY LTIME CMD
29081 1 pts/8 0:00 ksh
29085 1 pts/8 0:00 bash





Using the top Program


If you’re an administrator, you probably want to keep an eye on all processes running on a system, particularly if the system is in production use. This is because buggy programs can consume large amounts of CPU time, preventing operational applications from carrying out their duties efficiently. Monitoring the process list almost constantly is necessary, especially if performance begins to suffer on a system. Although you could keep typing ps -eaf every 5 minutes or so, a much more efficient method is to use the top program to monitor the processes in your system interactively, and to view the system’s “vital statistics,” such as CPU activity states, real and virtual memory status, and the load average. In addition, top displays the details of the leading processes that consume the greatest amount of CPU time during each sampling period. An alternative to top is prstat, which has the advantage of being bundled with the operating system.


The display of top can be customized to include any number of these leading processes at any one time, but displaying the top 10 or 20 processes is usually sufficient to keep an eye on rogue processes.






Tip 

The latest version of top can always be downloaded from ftp://ftp.groupsys.com/pub/top.



top reads the /proc file system to generate its process statistics. This usually means that top runs as a setUID process, unless you remove the read and execute permissions for nonroot users and run it only as root. Paradoxically, doing this may be just as dangerous, because any errors in top may impact the system at large if executed by the root user. Again, setUID processes are dangerous, and you should evaluate whether the tradeoff between accessibility and security is worthwhile in this case.






Caution 

One of the main problems with top running on Solaris is that top is very sensitive to changes in architecture and/or operating system versions. This is particularly the case if the GNU gcc 2.x compiler is used to build top, as it has its own set of include files. These must exactly match the version of the current operating system; otherwise, top will not work properly: the CPU state percentages may be wrong, indicating that processes are consuming all CPU time, when the system is actually idle. The solution is to rebuild gcc so that it generates header files that are appropriate for your current operating system version.



Let’s examine a printout from top:



last PID: 16630;  load averages:  0.17,  0.08,  0.06     09:33:29
72 processes: 71 sleeping, 1 on cpu
CPU states: 87.6% idle, 4.8% user, 7.5% kernel, 0.1% iowait, 0.0% swap
Memory: 128M real, 3188K free, 72M swap in use, 172M swap free


This summary tells us that the system has 72 processes, with only 1 running actively and 71 sleeping. The system was 87.6 percent idle in the previous sampling epoch, and there was little swapping or iowait activity, ensuring fast performance. The load average for the previous 1, 5, and 15 minutes was 0.17, 0.08, and 0.06 respectively—this is not a machine that is taxed by its workload. The last PID to be issued to an application, 16630, is also displayed.



  PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
259 root 1 59 0 18M 4044K sleep 58:49 1.40% Xsun
16630 pwatters 1 59 0 1956K 1536K cpu 0:00 1.19% top
345 pwatters 8 33 0 7704K 4372K sleep 0:21 0.83% dtwm
16580 pwatters 1 59 0 5984K 2608K sleep 0:00 0.24% dtterm
9196 pwatters 1 48 0 17M 1164K sleep 0:28 0.01% netscape
13818 pwatters 1 59 0 5992K 872K sleep 0:01 0.00% dtterm
338 pwatters 1 48 0 7508K 0K sleep 0:04 0.00% dtsession
112 pwatters 3 59 0 1808K 732K sleep 0:03 0.00% nis_cachemgr
157 pwatters 5 58 0 2576K 576K sleep 0:02 0.00% automountd
422 pwatters 1 48 0 4096K 672K sleep 0:01 0.00% textedit
2295 pwatters 1 48 0 7168K 0K sleep 0:01 0.00% dtfile
8350 root 10 51 0 3000K 2028K sleep 0:01 0.00% nscd
8757 pwatters 1 48 10 5992K 1340K sleep 0:01 0.00% dtterm
4910 nobody 1 0 0 1916K 0K sleep 0:00 0.00% httpd
366 pwatters 1 28 0 1500K 0K sleep 0:00 0.00% sdtvolcheck


This top listing shows a lot of information about each process running on the system, including the PID, the user who owns the process, the nice value (priority), the size of the application, the amount resident in memory, its current state (active or sleeping), the CPU time consumed, and the command name. For example, the Apache web server runs as the httpd process (PID=4910), by the user nobody, and is 1916KB in size.
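When a listing like this is saved to a file, the busiest processes can be picked out with a short filter. The sketch below is illustrative only: busy_processes is a hypothetical helper name, and it assumes the eleven-column layout shown above, with the CPU percentage in column 10 and the command name in column 11.

```shell
# Print the PID and command of every process using more than LIMIT percent CPU.
# Usage: busy_processes LIMIT < saved_top_output
busy_processes() {
    awk -v limit="$1" '
        $1 ~ /^[0-9]+$/ {          # only rows that begin with a numeric PID
            cpu = $10
            sub(/%/, "", cpu)      # strip the trailing percent sign
            if (cpu + 0 > limit + 0) print $1, $11
        }'
}
```

With a limit of 1, only Xsun (PID 259) and top itself (PID 16630) would be reported from the listing above.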


Changing the nice value of a process gives it more or less priority from the process scheduler. Reducing the nice value increases the process priority, while increasing the nice value decreases it. Ordinary users can only increase the nice value of their own processes; only the superuser can decrease a nice value. In the preceding example for top, one dtterm process (PID 8757) is running with a nice value of 10, so it is scheduled at a lower priority than the processes with a nice value of 0. Note that nice sets the nice value of a command as it is launched; the nice value of a process that is already running is changed with the renice command instead. To launch a new dtterm with its nice value reduced by 20, and therefore at a higher priority, the root user would issue the command


# nice --20 dtterm

Increasing the nice value can be performed by any user. To launch the ps command with its nice value increased by 20, and therefore at a lower priority, the following command would be used:


$ nice -20 ps

Now, if you execute an application that requires a lot of CPU power, you will be able to monitor the impact on the system as a whole by examining the changes in the processes displayed by top. If you execute the command


$ find . -name apache -print

the impact on the process distribution is immediately apparent:



last PID: 16631;  load averages: 0.10, 0.07, 0.06   09:34:08
73 processes: 71 sleeping, 1 running, 1 on cpu
CPU states: 2.2% idle, 0.6% user, 11.6% kernel, 85.6% iowait, 0.0% swap
Memory: 128M real, 1896K free, 72M swap in use, 172M swap free


This summary tells you that the system now has 73 processes, with only 1 running actively, 1 on the CPU, and 71 sleeping. The new process is the find command, which is actively running. The system is now only 2.2 percent idle, a sharp drop from the 87.6 percent of the previous sampling epoch. There is still no swapping activity, but iowait activity has risen to 85.6 percent, slowing system performance. The load average for the previous 1, 5, and 15 minutes was 0.10, 0.07, and 0.06, respectively; on average, this machine is still not taxed by its workload and wouldn’t be unless the load averages grew to greater than 1. The last PID to be issued to an application, 16631, is also displayed, and in this case it refers to the find command.



  PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
16631 pwatters 1 54 0 788K 668K run 0:00 1.10% find
259 root 1 59 0 18M 4288K sleep 58:49 0.74% Xsun
16630 pwatters 1 59 0 1956K 1536K cpu 0:00 0.50% top
9196 pwatters 1 48 0 17M 3584K sleep 0:28 0.13% netscape
8456 pwatters 1 59 0 5984K 0K sleep 0:00 0.12% dtpad
345 pwatters 8 59 0 7708K 0K sleep 0:21 0.11% dtwm
16580 pwatters 1 59 0 5992K 2748K sleep 0:00 0.11% dtterm
13838 pwatters 1 38 0 2056K 652K sleep 0:00 0.06% bash
13818 pwatters 1 59 0 5992K 1884K sleep 0:01 0.06% dtterm
112 root 3 59 0 1808K 732K sleep 0:03 0.02% nis_cachemgr
337 pwatters 4 59 0 4004K 0K sleep 0:00 0.01% ttsession
338 pwatters 1 48 0 7508K 0K sleep 0:04 0.00% dtsession
157 root 5 58 0 2576K 604K sleep 0:02 0.00% automountd
2295 pwatters 1 48 0 7168K 0K sleep 0:01 0.00% dtfile
422 pwatters 1 48 0 4096K 0K sleep 0:01 0.00% textedit



find now uses 1.1 percent of CPU power, which is the highest of any active process (that is, in the "run" state) on the system. It uses 788K of RAM, less than most other processes; however, most other processes are in the "sleep" state, and do not occupy much resident memory.
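The rule of thumb used above, that a machine is not taxed until its load average exceeds 1, can be checked automatically. This is a minimal sketch under stated assumptions: check_load is a hypothetical helper name, and it expects a line containing "load average:" or "load averages:" followed by the 1-minute figure, as printed by top or uptime.

```shell
# Read a top or uptime summary line on standard input and report whether the
# 1-minute load average exceeds the threshold (default 1).
# Usage: uptime | check_load 1
check_load() {
    threshold=${1:-1}
    # Extract the first number after "load average:" or "load averages:".
    sed -n 's/.*load averages*: *\([0-9.]*\).*/\1/p' |
        awk -v t="$threshold" 'NR == 1 { if ($1 + 0 > t + 0) print "high"; else print "ok" }'
}
```

Fed the summary line from the second top display, the function prints ok, because 0.10 is well below the threshold of 1.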






Using the truss Program


If you’ve identified a process that appears to be having problems, and you suspect it’s an application bug, it’s not just a matter of going back to the source to debug the program or making an educated guess about what’s going wrong. In fact, one of the great features of Solaris is the ability to trace system calls for every process running on the system. This means that if a program is hanging—for example, because it can’t find its initialization file—the failed system call revealed using truss would display this information. truss prints out each system call, line by line, as it is executed by the system. The syntax is rather like a C program, making it easy for C programmers to interpret the output. The arguments are displayed by retrieving information from the appropriate headers, and any file information is also displayed.


As an example, let’s look at the output from the cat command, which we can use to display the contents of /etc/resolv.conf, which is used by the Domain Name Service (DNS) to identify domains and name servers. Let’s look at the operations involved in running this application:



# truss cat /etc/resolv.conf
execve("/usr/bin/cat", 0xEFFFF740, 0xEFFFF74C) argc = 2
open("/dev/zero", O_RDONLY) = 3
mmap(0x00000000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, 3, 0) =
0xEF7B0000
open("/usr/lib/libc.so.1", O_RDONLY) = 4
fstat(4, 0xEFFFF2DC) = 0
mmap(0x00000000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0xEF7A0000
mmap(0x00000000, 704512, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) =
0xEF680000
munmap(0xEF714000, 57344)
= 0
mmap(0xEF722000, 28368, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|
MAP_FIXED, 4, 598016) = 0xEF722000
mmap(0xEF72A000, 2528, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|
MAP_FIXED, 3, 0) = 0xEF72A000
close(4) = 0
open("/usr/lib/libdl.so.1", O_RDONLY) = 4
fstat(4, 0xEFFFF2DC) = 0
mmap(0xEF7A0000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|
MAP_FIXED, 4, 0) = 0xEF7A0000
close(4) = 0
open("/usr/platform/SUNW,Ultra-2/lib/libc_psr.so.1", O_RDONLY) = 4
fstat(4, 0xEFFFF0BC) = 0
mmap(0x00000000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0xEF790000
mmap(0x00000000, 16384, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) =
0xEF780000
close(4) = 0
close(3) = 0
munmap(0xEF790000, 8192) = 0
fstat64(1, 0xEFFFF648) = 0
open64("resolv.conf", O_RDONLY) = 3
fstat64(3, 0xEFFFF5B0) = 0
llseek(3, 0, SEEK_CUR) = 0
mmap64(0x00000000, 98, PROT_READ, MAP_SHARED, 3, 0) = 0xEF790000
read(3, " d", 1) = 1
memcntl(0xEF790000, 98, MC_ADVISE, 0x0002, 0, 0) = 0
domain paulwatters.com
nameserver 192.56.67.16
nameserver 192.56.67.32
nameserver 192.56.68.16
write(1, " d o m a i n p a u l w a t t e r s .".., 98) = 98
llseek(3, 98, SEEK_SET) = 98
munmap(0xEF790000, 98) = 0
llseek(3, 0, SEEK_CUR) = 98
close(3) = 0
close(1) = 0
llseek(0, 0, SEEK_CUR) = 57655
_exit(0)


First, cat is called using execve(), with two arguments (that is, the application name, cat, and the file to be displayed, /etc/resolv.conf). The arguments to execve() include the name of the application (/usr/bin/cat), a pointer to the argument list (0xEFFFF740), and a pointer to the environment (0xEFFFF74C). Next, shared libraries such as /usr/lib/libc.so.1 are opened and mapped into memory with repeated calls to mmap(). The resolv.conf file is then opened as read-only, its contents are written to standard output with write(), and the file is closed.
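For longer traces, it can help to summarize which system calls dominate before reading the output line by line. The following sketch assumes that the trace has been saved to a file (truss accepts -o for this purpose); syscall_counts is a hypothetical helper name, and the filter simply counts lines that begin with a call name followed by an opening parenthesis, so wrapped continuation lines and program output are ignored.

```shell
# Count how many times each system call appears in saved truss output.
# Usage: syscall_counts < trace_file
syscall_counts() {
    awk '/^[A-Za-z_][A-Za-z0-9_]*\(/ {
             sub(/\(.*/, "")        # keep only the call name
             n[$0]++
         }
         END { for (name in n) print n[name], name }' |
        sort -k1,1nr -k2,2          # most frequent first, then alphabetical
}
```

truss can also produce a summary of its own when run with the -c option, so this filter is mainly useful when a full trace has already been captured.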






Tip 

truss can be used to trace the system calls for any process running on your system.







Automating Jobs


Many system administration tasks need to be performed on a regular basis. For example, log files for various applications need to be archived nightly and a new log file created. Often a short script is created to perform this, by following these steps:




  1. Kill the daemon affected, using the kill command.
  2. Compress the log file using the gzip or compress command.
  3. Change the log filename to include a time stamp, generated by using the date command, so that it can be distinguished from other log files.
  4. Move it to an archive directory, using the mv command.
  5. Create a new log file by using the touch command.
  6. Restart the daemon by calling the appropriate /etc/init.d script.
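The steps above can be sketched as a short shell function. This is an illustration under stated assumptions rather than a finished script: rotate_log is a hypothetical name, the time stamp format is an arbitrary choice, and steps 1 and 6 appear only as comments because the daemon and its /etc/init.d script depend on the application involved.

```shell
# Rotate one log file: compress it, archive it under a dated name, start a new one.
# Usage: rotate_log /path/to/logfile /path/to/archive_dir
rotate_log() {
    log=$1
    archive_dir=$2
    stamp=$(date +%Y%m%d)                  # for example, 20000403

    [ -f "$log" ] || return 1
    mkdir -p "$archive_dir" || return 1

    # Step 1 would stop the daemon here, e.g. /etc/init.d/mydaemon stop (hypothetical).
    gzip -f "$log" || return 1             # step 2: compress, producing $log.gz
    # Steps 3 and 4: rename with a time stamp and move to the archive directory.
    mv "$log.gz" "$archive_dir/$(basename "$log").$stamp.gz" || return 1
    touch "$log"                           # step 5: create a new, empty log file
    # Step 6 would restart the daemon, e.g. /etc/init.d/mydaemon start (hypothetical).
}
```

Renaming and moving are combined into a single mv, since both only change the path of the compressed file.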




Instead of the administrator having to execute these commands interactively at midnight, they can be scheduled to run daily using the cron scheduling facility. Alternatively, if a job needs to be run only once at a particular time, like bringing a new web site online at 7:00 A.M. one particular morning, the at scheduler can be used. The next sections look at the advantages and disadvantages of each scheduling method.



Using at


You can schedule a single system event for execution at a specified time by using the at command. The jobs are stored as files in the /var/spool/cron/atjobs directory, while access to the command is controlled by the file /etc/cron.d/at.deny. The job can be a single command, or it can refer to a script that contains a set of commands.


Suppose, for example, that you want to start up sendmail at a particular time because some maintenance of the network infrastructure is scheduled to occur until 8:30 A.M. tomorrow morning, but you really don’t feel like logging in early and starting up sendmail (you’ve switched it off completely during an outage to prevent users from filling the queue). You can add a job to the queue, which is scheduled to run at 8:40 A.M., giving the network crew a 10-minute window to do their work:


$ at 0840
at> /usr/lib/sendmail -bd
at> <EOT>
commands will be executed using /bin/ksh
job 954715200.a at Mon Apr 3 08:40:00 2000

After submitting a job using at, check that the job is properly scheduled by seeing whether an atjob has been created:



$ cd /var/spool/cron/atjobs
$ ls -l
total 8
-r-Sr--r-- 1 paul other 3701 Apr 3 08:35 954715200.a


The file exists, which is a good start. Now check that it contains the appropriate commands to run the job:


$ cat 954715200.a
: at job
: jobname: stdin
: notify by mail: no
export PWD; PWD='/home/paul'
export _; _='/usr/bin/at'
cd /home/paul
umask 22
ulimit unlimited
/usr/lib/sendmail -bd

This looks good. After 8:40 A.M. the next morning, the command should have executed at the appropriate time, and some output should have been generated and sent to you as an e-mail message.


Here’s what the message contains:


From paul Sat Apr  1 08:40:00 2000
Date: Sat Apr 1 2000 08:40:00 +1000 (EST)
From: paul <paul>
To: paul
Subject: Output from "at" job
Your "at" job on austin
"/var/spool/cron/atjobs/954715200.a"
produced the following output:
/bin/ksh[5]: sendmail: 501 Permission denied

Oops! You forgot to submit the job as root: normal users don’t have permission to start sendmail in the background daemon mode. You would need to submit this job as root to be successful.





Scheduling with cron


An at job executes only once at a particular time. However, cron is much more flexible because you can schedule system events to execute repetitively at regular intervals by using the crontab command.


Each user on the system can have a crontab file, which allows them to schedule multiple events at multiple times on multiple dates. The jobs are stored as files in the /var/spool/cron/crontabs directory, while access is controlled by the files /etc/cron.d/cron.allow and /etc/cron.d/cron.deny.


To check root’s crontab file, you can use the crontab -l command:



# crontab -l root
10 3 * * 0,4 /etc/cron.d/logchecker
10 3 * * 0 /usr/lib/newsyslog
15 3 * * 0 /usr/lib/fs/nfs/nfsfind
1 2 * * * [ -x /usr/sbin/rtc ] && /usr/sbin/rtc -c > /dev/null 2>&1
30 3 * * * [ -x /usr/lib/gss/gsscred_clean ] && /usr/lib/gss/gsscred_clean


This is the standard crontab file generated by Solaris for root, and it performs tasks like checking whether the cron log file is approaching the system limit at 3:10 A.M. on Sundays and Thursdays, creating a new system log at 3:10 A.M. only on Sundays, and reconciling time differences at 2:01 A.M. every day of the year.


The six fields in the crontab file stand for the following:




  • Minutes, in the range 0–59
  • Hours, in the range 0–23
  • Days of the month, in the range 1–31
  • Months of the year, in the range 1–12
  • Days of the week, in the range 0–6, where 0 is Sunday
  • The command to execute
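To see how a crontab line maps onto these six fields, the following sketch splits an entry and labels each part. describe_cron is a hypothetical helper name introduced for illustration; it only separates the fields and makes no attempt to validate their ranges.

```shell
# Split a crontab entry into its five time fields and the command.
# Usage: describe_cron 'CRONTAB LINE'
describe_cron() {
    # read assigns the first five words to the time fields and the rest to command;
    # a here-document is used so that the * characters are not expanded as filenames.
    read -r minute hour dom month dow command <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow" "$command"
}
```

For the first entry in root's crontab, describe_cron '10 3 * * 0,4 /etc/cron.d/logchecker' reports minute 10, hour 3, any day of the month, any month, and days 0 and 4 of the week, that is, 3:10 A.M. on Sundays and Thursdays.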




If you want to add or delete an entry from your crontab file, you can use the crontab -e command. This will start up your default editor (vi on the command line, textedit in CDE, or as defined by the EDITOR environment variable), and you can make changes interactively.






Caution 

If you edit your crontab entries in a separate file rather than with crontab -e, you need to run crontab with that filename as its argument to install the changes.















