This document is meant to help an unskilled, non-privileged user discover facts about the UNIX system they are using: its operation, resources, and status. In particular, it should let a user find out enough about their machine to discover why they are having problems or, at least, to give a system manager enough information to figure out why. In either case, figuring out what is wrong, if anything, goes a long way toward fixing the problem.
To see what is going on on your system, the most commonly used commands are ps (process status) and sar (system activity reporter).
To see which file systems are mounted on your system and how much free space each has, use df (disk free). To see how much disk space particular files and directories take up, use du (disk usage).
To kill a process, use kill. Non-privileged users can only kill their own processes.
Useful options for ps:
-e show all processes
-l show long list
-f show full list
-u username show processes owned by a specific user
For example:
d0sgi6[70]% ps -ef
UID PID PPID C STIME TTY TIME COMD
root 0 0 0 Nov 17 ? 0:00 sched
root 1 0 0 Nov 17 ? 2:23 /etc/init
root 2 0 0 Nov 17 ? 0:00 vhand
root 3 0 0 Nov 17 ? 5:27 bdflush
root 4 0 0 Nov 17 ? 2:25 vfs_sync
root 5 0 0 Nov 17 ? 0:00 pdflush
dongzhao 19183 1 0 Dec 01 ? 0:03 xwsh -name winterm
...processes were omitted to save space...
key things to check:
TIME - total CPU time the process has used so far, as mm:ss
STIME - when the process was started: hh:mm:ss, or the date if
not started today
UID - who is running the job
C - recent CPU usage of the job; the higher the number, the more
CPU cycles the job has been getting. This is not a priority,
just a sign that when the computer has nothing else to do it is
working on the jobs with the higher numbers more often.
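To see at a glance who is running jobs (the UID column above), the ps -ef listing can be summarized per user. This is only a sketch: the function name count_by_user is made up for illustration, it assumes the owner is the first column and the first line is the header, and it is written for a POSIX shell such as sh or ksh rather than the tcsh prompt shown in the examples.

```shell
# Count processes per owner from `ps -ef` output read on standard input.
# Assumption: column 1 is the UID and line 1 is the header line.
count_by_user() {
    awk 'NR > 1 { n[$1]++ } END { for (u in n) print n[u], u }' | sort -rn
}

# Typical use:
#   ps -ef | count_by_user
```

The busiest owner comes first, which is a quick way to see whether one user has an unusual number of processes.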
d0sgi6[77]% ps -el
F S UID PID PPID C PRI NI P SZ:RSS WCHAN TTY TIME COMD
39 S 0 0 0 0 39 RT * 0:0 801632c0 ? 0:00 sched
30 S 0 1 0 0 39 20 * 69:41 801632f0 ? 2:24 init
39 S 0 2 0 0 39 RT * 0:0 80163180 ? 0:00 vhand
30 S 6354 8927 8926 0 26 20 * 372:67 8025c440 pts/1 0:00 telnet
30 S 0 8151 205 0 26 20 * 294:47 8025c3c0 pts/0 0:01 rlogind
30 S 6354 13670 13669 1 39 20 * 538:222 801632f0 pts/3 0:07 tcsh
30 R 6354 9743 13670 10 65 20 0 320:57 pts/3 0:00 ps
30 S 6354 19168 1 0 26 20 * 1494:664 8025c250 ? 1:23 4Dwm
30 S 6354 20166 1 0 26 10 * 763:301 8025c2b0 ? 0:35 xwsh
...processes were omitted to save space...
key fields:
NI - nice value of the job, which adjusts its priority
20 is the standard value, 0 gives the highest priority, 39 the lowest
SZ - total size of the program in memory, in pages of 4096 bytes
RSS - the part of the program actually resident in RAM, in pages of 4096 bytes
S - state of the process, R (running) or S (sleeping), at the time the
ps command was executed
To find out how much memory a program is using, multiply its RSS by 4096.
In this listing 4Dwm is using the most memory: 664 * 4096 = 2,719,744 bytes, or about 2.7 MB.
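The same calculation can be left to awk. The sketch below assumes the IRIX ps -el layout shown above, where the tenth column is SZ:RSS and the page size is 4096 bytes; on other systems the columns differ. The function name largest_rss is made up for illustration, and it needs a POSIX shell such as sh or ksh.

```shell
# Report the process with the largest resident set from `ps -el` output
# read on standard input.
# Assumptions: column 10 has the form SZ:RSS (as in the IRIX listing) and
# sizes are in pages; the page size defaults to 4096 bytes.
# Only the last word of the command is printed as the name.
largest_rss() {
    awk -v page="${1:-4096}" '
        NR > 1 {
            split($10, m, ":")
            if (m[2] + 0 > max) { max = m[2] + 0; name = $NF }
        }
        END { printf "%s %d pages %.1f MB\n", name, max, max * page / 1000000 }
    '
}

# Typical use:
#   ps -el | largest_rss
```

Run against the listing above, this should pick out 4Dwm with 664 pages, the same figure worked out by hand.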
Useful options for sar:
-u CPU usage report
-r memory usage report
-d device report
For example:
sar -u 5 5 (system CPU usage, giving 5 samples taken 5 seconds apart)
d0sgi6[80]% sar -u 5 5
IRIX d0sgi6 5.3 11091810 IP12 12/03/97
13:39:20  %usr  %sys %intr  %wio %idle %sbrk  %wfs %wswp %wphy %wgsw %wfif
13:39:25     2     3     3     0    92     0     0     0     0     0     0
13:39:30     4     3     2     0    91     0     0     0     0     0   100
13:39:35     1     2     2     1    94     0    50     0     0     0    50
13:39:40     5     3     6     0    86     0     0     0     0     0   100
13:39:45     6     3     2     5    84     0     0     0     0     0   100
13:39:45  %usr  %sys %intr  %wio %idle %sbrk  %wfs %wswp %wphy %wgsw %wfif
Average      4     3     3     1    89     0     7     0     0     0    93
The first five columns show the percentage of time the CPU spent in user, system, interrupt, wait I/O and idle modes.
If the idle percentage is high, there is plenty of CPU time available to run your program.
Wait I/O (%wio) means that processes are waiting for data to be retrieved, most likely from disk.
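If only one figure is wanted, such as the average idle percentage, it can be picked out of the sar -u report by column name. This is a sketch under the assumption that the report has a header line naming the columns and a final line starting with Average, as in the IRIX output above; the function name sar_average is made up for illustration, and it needs a POSIX shell such as sh or ksh.

```shell
# Print the "Average" value of one column from `sar -u` output read on
# standard input. The column name defaults to %idle.
# Assumption: a header line names the columns and the summary line begins
# with the word "Average", as in the IRIX report.
sar_average() {
    awk -v col="${1:-%idle}" '
        idx == 0 { for (i = 1; i <= NF; i++) if ($i == col) idx = i }
        $1 == "Average" && idx > 0 { print $idx }
    '
}

# Typical use:
#   sar -u 5 5 | sar_average          # average %idle
#   sar -u 5 5 | sar_average %wio     # average %wio
```

Looking the column up by name rather than by position keeps the sketch usable if a system prints the columns in a different order.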
sar -r 5 2 (memory report, giving 2 samples taken 5 seconds apart)
d0sgi6[82]% sar -r 5 2
IRIX d0sgi6 5.3 11091810 IP12 12/03/97
13:42:37 freemem freeswp
13:42:42 5801 210000
13:42:47 5801 210000
13:42:47 freemem freeswp
Average 5801 210000
freemem is the number of free memory pages, at 4096 bytes per page
freeswp is the amount of free swap space, in 512-byte disk blocks
4096 * 5801 = 23.8 MB of free RAM
512 * 210000 = 107.5 MB of free swap space
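The two conversions above can be wrapped in a small helper. This is a sketch that assumes the units stated above (4096-byte pages for freemem, 512-byte blocks for freeswp) and uses 1 MB = 1,000,000 bytes, as the hand calculation does; the function name sar_mem_mb is made up for illustration, and it needs a POSIX shell such as sh or ksh.

```shell
# Convert the freemem and freeswp figures from `sar -r` into megabytes.
# Assumptions: freemem is in 4096-byte pages, freeswp is in 512-byte
# blocks, and 1 MB is taken as 1,000,000 bytes.
sar_mem_mb() {
    awk -v mem="$1" -v swp="$2" 'BEGIN {
        printf "free RAM: %.1f MB, free swap: %.1f MB\n",
               mem * 4096 / 1000000, swp * 512 / 1000000
    }'
}

# Typical use, with the averages from the report above:
#   sar_mem_mb 5801 210000
```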
d0sgi6[83]% df
Filesystem               Type   blocks      use    avail %use Mounted on
/dev/root                efs     37615    20519    17096  55% /
/dev/usr                 efs    853020   517401   335619  61% /usr
/dev/dsk/dks0d2s7        efs   1975100  1695823   279277  86% /exports/data0
/dev/dsk/dks0d1s2        efs    879625   740154   139471  84% /exports/usr/people
d0chb:/d0dist            nfs   4426512  3956200   470312  89% /d0dist
d0cha:/d0library         nfs   4319768  2972823  1346945  69% /d0library
d0sgi0:/usr/local        nfs    651875   146295   505580  22% /usr/local
d0sgi0:/usr/products     nfs   2406580  2268638   137942  94% /usr/products
d0sgi0:/exports/usr/peo  nfs   7603592  7230643   372949  95% /tmp_mnt/d0sgi0/usr0
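A nearly full file system is a common cause of trouble, so it can help to list only the file systems above some percentage. The sketch below assumes the IRIX df layout shown above, where %use is the sixth column and the mount point is the last; other systems arrange the columns differently. The function name fs_over is made up for illustration, and it needs a POSIX shell such as sh or ksh.

```shell
# List mount points whose usage is at or above a given percentage,
# reading `df` output on standard input.
# Assumption: column 6 is %use and the last column is the mount point,
# as in the IRIX listing.
fs_over() {
    awk -v limit="$1" '
        NR > 1 {
            pct = $6
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $NF, $6
        }
    '
}

# Typical use:
#   df | fs_over 90
```

With the listing above and a limit of 90, this should report /usr/products and /tmp_mnt/d0sgi0/usr0.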
Useful options for du:
-s causes only the grand total (for each of the specified names) to be given.
-k will cause du to express all block counts in terms of 1024 byte
blocks, instead of the default 512 byte blocks.
Example:
d0sgi6[92]% ls -l
total 5
drwxr-xr-x 2 berezhno D0 512 Sep 1 1994 berezhnoi/
drwxr-xr-x 2 bhat D0 512 Nov 9 1993 bhat/
drwxr-xr-x 2 diesburg D0 512 Nov 9 1993 diesburg/
drwxr-xr-x 4 dongzhao D0 512 Nov 14 11:44 dongzhao/
drwxr-xr-x 6 root sys 512 Jan 20 1994 reco_comp/
d0sgi6[93]% du -sk *
5545 berezhnoi
1 bhat
1 diesburg
11557 dongzhao
830735 reco_comp
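When a directory holds many entries, sorting the du output puts the largest ones first. This is a small sketch: the function name largest_dirs is made up for illustration, it assumes each input line starts with the size as du -sk prints it, and it needs a POSIX shell such as sh or ksh.

```shell
# Show the N largest entries from `du -sk` output read on standard input.
# N defaults to 5. Assumption: each line starts with the size in kilobytes.
largest_dirs() {
    sort -rn | head -n "${1:-5}"
}

# Typical use:
#   du -sk * | largest_dirs 3
```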
If you suspect that one of your processes is misbehaving, first use ps -fu username to get a list of your processes, then use kill -9 PID to kill the offending process.
For example:
d0chb[45]% ps -fu piaf
UID PID PPID C STIME TTY TIME COMD
piaf 26224 26206 1 14:03:52 ttyq13 0:03 -tcsh
piaf 7930 26224 8 14:20:30 ttyq13 0:00 ps -fu piaf
piaf 16056 1 0 Dec 02 ? 3:38 Netscape.real
d0chb[46]% kill -9 16056
This kills the Netscape process (PID 16056).
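Signal 9 (SIGKILL) cannot be caught, so the program gets no chance to clean up after itself. A gentler variant is to send the default signal (SIGTERM) first and use -9 only if the process is still there after a few seconds. The sketch below is one way to do that; the function name stop_process and the 5-second default are made up for illustration, and it needs a POSIX shell such as sh or ksh rather than tcsh.

```shell
# Ask a process to exit, and force it only if it does not.
# Usage: stop_process PID [SECONDS]   (SECONDS defaults to 5)
stop_process() {
    pid=$1
    grace=${2:-5}
    kill "$pid" 2>/dev/null || return 1          # polite request: SIGTERM
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # process is gone
        sleep 1
        grace=$((grace - 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    kill -9 "$pid"                               # last resort: SIGKILL
}

# Typical use, with the PID from the listing above:
#   stop_process 16056
```

kill -0 sends no signal at all; it only checks whether the process still exists and whether you are allowed to signal it.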
Last modified by Dong Zhao on December 3 1997