How does a Unix process determine the name with which it was invoked? Why is this important to know?

  • do different things depending on how it was called
  • relocatable applications

At present there is no direct way for a process to find out its own name. There is no system call that will reliably tell a process what its name is.

argv[0] conventions

By convention, a process’s argv[0] is the name it was invoked as. The exec() family of functions is usually called something like this:

execlp("/bin/ls", "ls", "-l", "/", (char *)NULL);

The first argument is the path to the executable, and the second argument becomes argv[0] of the running process. The process that invokes exec() can set this to anything:

execlp("/bin/ls", "wrong", "-l", "/", (char *)NULL);

So argv[0] is unreliable. But most of the time it can provide a clue to what the name of the executable might be. argv[0] is almost always the first step.

Try it. Compile these two programs and run ./argv0wrap.

Makefile:

all: argv0 argv0wrap

argv0: argv0.c
	cc -o argv0 argv0.c

argv0wrap: argv0wrap.c
	cc -o argv0wrap argv0wrap.c

argv0wrap.c:

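#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>     /* execlp() */

int
main(int argc, char **argv)
{
    /* Run ./argv0, but tell it that its name is "wrong". */
    execlp("./argv0", "wrong", (char *)NULL);

    /* execlp() only returns if it failed. */
    perror("execlp");
    exit(1);
}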

argv0.c:

#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char **argv)
{
    printf("my name is %s\n", argv[0]);
    exit(0);
}

A program name may actually be a symbolic link. So even if the calling process sets argv[0] honestly, it still may not be the actual name of the executable. This is not really a problem until the name is used as a point of reference in a relocatable package.

Login shells

By convention, the argv[0] of a login shell is prefixed with -, which is how the shell recognizes that it is a login shell.

Try from a login shell:

$ echo $0
-ksh
$ 

Interpreters

Interpreters such as sh, ksh, bash do not necessarily suffer from this problem because they must always know which script they are running. Whether a script is invoked from the command line as in:

$ sh script.sh

or indirectly via shebang, the interpreter always knows exactly which script file it is executing. Getting that information from an interpreter may be another story, though. ksh and bash seem to set $0 to the full path of the script if PATH was used to find the script. Otherwise, $0 is relative to the current working directory. But other interpreters such as Perl may be different.

/proc

Many systems have a pseudo-filesystem called /proc which is used as a convenient way to access process information. Unfortunately, there are problems with /proc that hinder portability:

  • /proc may not be mounted, even on a system that supports procfs.
  • There is no standard location for the type of information we want. All systems seem to present information under /proc/pid, but beyond that, there is no standard.
  • There is no standard format for information under /proc. Linux tends to support easily-parsed ASCII strings which can be used from shell scripts with echo and cat, while Solaris presents the information in a binary format. Other systems have ioctl() interfaces to files under /proc.
  • The name under /proc is probably the realpath of the executable and may not be the name the program was invoked as. For example, my login shell on a FreeBSD 4.11 system is /bin/ksh. This is a symlink to /bin/ksh.2005-02-02.freebsd.i386. /proc reports this:
$ ls -l /proc/$$/file
lr-xr-xr-x  1 dl  dl  32 Nov 17 01:40 /proc/20358/file -> /bin/ksh.2005-02-02.freebsd.i386
$

But even with the above limitations, /proc can sometimes be used to verify the contents of argv[0]. The path named in argv[0] can be compared to the executable named under /proc/pid.

System | Pathname | Description
------ | -------- | -----------
Solaris | /proc/$$/object/a.out | The inode number of a.out appears to match the real inode number in the filesystem. But which filesystem is a question.
FreeBSD | /proc/$$/file | This is a symbolic link to the real executable.
HP-UX | n/a | As of 11.11, the /proc filesystem is not supported.
Linux < 2.2 | /proc/$$/exe | Symlink to a file named for the device and inode of the executable. find -inum must be used to find the actual file. See find(1).
Linux >= 2.2 | /proc/$$/exe | Symlink to the actual path name of the executed command.
Tru64 | /proc/$$ | This is a file. It must be opened and then ioctl() calls must be made to get information about the running process. Not sure yet, but it looks like one of the ioctl() calls will return a structure with an element named pr_fname, which the man page describes as “last component of exec’d pathname”.

utssys()

Solaris supports the utssys() system call, which can return a list of processes that have a particular file open as “text”, that is, as the executable image of a running process. This can be used to verify the contents of argv[0].

See Number of processes accessing a file [broken].

kvm* Functions

Some systems have a family of library functions which can be used to access kernel virtual memory. Functions such as kvm_open() are used with /dev/kmem. These could probably give us the information we want in some system-specific sort of way. Unfortunately, the calling process usually needs to be a member of the kmem or sys group to be allowed to open /dev/kmem, and granting that privilege just to learn a program's name is rarely a good idea. However, the kvm functions might be used if the process already happens to have enough privilege to open /dev/kmem.

fuser/fstat

Some systems have fuser or fstat commands which can report which processes hold a file open. Usually some “text” flag is displayed indicating that the process is using the file as its text file. This is exactly the information we need. The trouble is that, to get this information, we must fork() and exec() to run fuser or fstat. But, if one of these programs is available, it could be used.