Wednesday, October 21, 2009

8.9. File System





Tandem uses the term file system to mean access to any system resource that can
supply data ("read") or accept it ("write"). Apart from disk files, the file system
also handles devices, such as terminals, printers, and tape units, and processes
(interprocess communication).


8.9.1. File Naming


There is a common naming
convention for devices, disk files, and processes, but unfortunately it is
complicated by many exceptions. Processes can have names, but only I/O processes
and paired processes must
have a name. In all cases, the file "name" is 24 characters long and consists of
three 8-byte components. Only the first component is required; the other two are
used only for disk files and named processes.


Unnamed processes use only the first 8
bytes of the name. Unpaired system processes, such as the monitor or memory
manager, have the format shown in Figure 8-7.



Figure 8-7. Name format for unpaired system
processes




Unpaired user processes have the format
shown in Figure
8-8.



Figure 8-8. Name format for unpaired user
processes




Together, the CPU number and the PIN form the process ID, or PID. The PIN is the
process identification number within the CPU. It is 8 bits long, which limits
each CPU to 256 processes.
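
As a rough illustration of why an 8-bit PIN caps each CPU at 256 processes, the following Python sketch packs a CPU number and a PIN into one 16-bit word. The PIN occupies the low 8 bits, matching the <8:15> field used in the security example later in this section. Placing the CPU number in the upper byte is an assumption, and the names make_pid, pid_cpu, and pid_pin are invented for illustration.

```python
PIN_BITS = 8                  # width of the PIN field (bits <8:15> of the PID)
MAX_PINS = 1 << PIN_BITS      # number of distinct PINs per CPU


def make_pid(cpu: int, pin: int) -> int:
    """Pack a CPU number and a PIN into a 16-bit PID.

    Assumption: the CPU number sits in the upper byte of the word.
    """
    if not 0 <= pin < MAX_PINS:
        raise ValueError("PIN must fit in 8 bits")
    if not 0 <= cpu < 256:
        raise ValueError("CPU number must fit in the upper byte")
    return (cpu << PIN_BITS) | pin


def pid_cpu(pid: int) -> int:
    """Extract the (assumed) CPU field from a PID."""
    return pid >> PIN_BITS


def pid_pin(pid: int) -> int:
    """Extract the PIN, the low 8 bits of a PID."""
    return pid & (MAX_PINS - 1)
```

With this layout, make_pid(3, 17) gives the word 0x0311, and any PIN above 255 is rejected because it cannot be represented.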


Real names start with a $ sign.
Devices use only the first 8 bytes, and disk files use all three components. The
individual components look like the names of the disk, the directory, and the
file, though in fact there is only one directory per disk volume. Processes can
also use the other two components for passing information to the process.


Typical names are shown in Table 8-2.































Table 8-2. Typical file names

Name                      Description
$TAPE                     Tape drive
$LP                       Printer
$SPLS                     Spooler process
$TERM15                   Terminal device
$SYSTEM                   System disk
$SYSTEM SYSTEM LOGFILE    System log file on disk $SYSTEM
$SPLS #DEFAULT            Default spooler print queue
$RECEIVE                  Incoming message queue, for interprocess communication




If a component is less than 8 bytes
long, it is padded with ASCII spaces. Externally, names are represented in ASCII
with periods, for example, $SYSTEM.SYSTEM.LOGFILE and
$SPLS.#DEFAULT.
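
The padding rule can be sketched in a few lines of Python. This is only an illustration of the basic layout described above (three 8-byte components, space-padded, periods in the external form). It ignores the quirks discussed next, such as embedded PIDs and network names, and the function names to_internal and to_external are invented here.

```python
COMPONENT_LEN = 8     # each component occupies 8 bytes
MAX_COMPONENTS = 3    # 24 bytes in total
NAME_LEN = COMPONENT_LEN * MAX_COMPONENTS


def to_internal(external: str) -> bytes:
    """Convert an external name such as '$SPLS.#DEFAULT' to 24 bytes."""
    parts = external.split(".")
    if not 1 <= len(parts) <= MAX_COMPONENTS or not parts[0]:
        raise ValueError("a name has one to three components")
    for part in parts:
        if len(part) > COMPONENT_LEN:
            raise ValueError(f"component {part!r} exceeds 8 bytes")
    # Unused trailing components are filled entirely with spaces.
    parts += [""] * (MAX_COMPONENTS - len(parts))
    return b"".join(p.encode("ascii").ljust(COMPONENT_LEN) for p in parts)


def to_external(internal: bytes) -> str:
    """Convert a 24-byte internal name back to the dotted external form."""
    if len(internal) != NAME_LEN:
        raise ValueError("internal names are exactly 24 bytes long")
    parts = [
        internal[i:i + COMPONENT_LEN].decode("ascii").rstrip(" ")
        for i in range(0, NAME_LEN, COMPONENT_LEN)
    ]
    while parts and not parts[-1]:
        parts.pop()               # drop empty trailing components
    return ".".join(parts)
```

For example, to_internal("$SYSTEM.SYSTEM.LOGFILE") yields the 24 bytes "$SYSTEM SYSTEM  LOGFILE ", and to_external reverses the conversion.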


There are still further quirks in the
naming. Process subnames must start with a hash mark (#), and user process names (but not device names, which are
really I/O process names) have the PID at the end of the first component; see Figure 8-9.



Figure 8-9. Name format for
named user processes




The PID in this example is the PID of the
primary process. It limits the length of user process names to six characters,
including the initial $.


As if that wasn't enough, there is a
separate set of names for designating processes, disk files, or devices on
remote systems. In this case, the initial $
sign is replaced by a \ symbol, and the
second byte of the name is the system number, shifting the rest of the name one
byte to the right. This limits the length of process names to five characters if
they are to be network-visible. So from another system, the spooler process we
saw earlier might have the external name \ESSG.$SPLS and have the internal format shown in Figure 8-10.



Figure 8-10. Name format for network-visible
processes




The number 173 is the node number of
system \ESSG.
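
The shift described above can be sketched as follows. The figure is not reproduced here, so the layout in this Python sketch follows the prose only: a backslash replaces the $, the second byte holds the system number, and the rest of the name moves one byte to the right. The function name to_network_component is invented for illustration, and the sketch does not model the PID that named user processes carry at the end of the component.

```python
COMPONENT_LEN = 8


def to_network_component(local: str, system_number: int) -> bytes:
    """Build the first 8-byte component of a network-visible name.

    Assumed layout (from the prose, not from the figure):
    byte 0 = backslash, byte 1 = system number, bytes 2.. = name, space-padded.
    """
    if not local.startswith("$"):
        raise ValueError("local names start with '$'")
    if not 0 <= system_number <= 255:
        raise ValueError("system number must fit in one byte")
    body = local[1:].encode("ascii")
    component = b"\\" + bytes([system_number]) + body
    if len(component) > COMPONENT_LEN:
        raise ValueError("name too long once shifted right by one byte")
    return component.ljust(COMPONENT_LEN)
```

Under these assumptions, $SPLS on node 173 becomes a backslash, the byte 173, the four characters SPLS, and two padding spaces.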


8.9.2. Asynchronous I/O


One of
the important features of the file system interface is the strong emphasis on
asynchronous I/O. We've seen that the message system is intrinsically
asynchronous in nature, so this is relatively simple to implement.


Processes can choose synchronous or
asynchronous ("no wait") I/O at the time they open a file. When a file is opened
no-wait, an I/O request will return immediately, and only errors that are
immediately apparent will be reported—for example, if the file descriptor isn't
open. At a later time the user calls awaitio to
check the status of the request. This gives rise to a programming style where a
process issues a number of no-wait requests, then goes into a central loop to
call awaitio and handle the completion of
the requests, typically issuing a new request.
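
The shape of such a central loop can be sketched in Python. This is an analogy, not the Tandem interface: a thread pool stands in for the no-wait requests, the standard library call concurrent.futures.wait plays the role of awaitio, and fake_io and run are hypothetical names.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def fake_io(request_id: int) -> str:
    """Stand-in for a device or server operation (hypothetical)."""
    return f"done {request_id}"


def run(total_requests: int, in_flight: int) -> list:
    """Keep up to `in_flight` requests outstanding until all are handled."""
    results = []
    next_id = 0
    with ThreadPoolExecutor(max_workers=in_flight) as pool:
        pending = set()
        # Issue the initial batch of "no-wait" requests.
        while next_id < min(in_flight, total_requests):
            pending.add(pool.submit(fake_io, next_id))
            next_id += 1
        # Central loop: wait() returns as soon as any request completes.
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                results.append(future.result())   # handle the completion
                if next_id < total_requests:      # typically issue a new one
                    pending.add(pool.submit(fake_io, next_id))
                    next_id += 1
    return results
```

Completions arrive in no particular order, which is why the loop handles whichever request finishes first instead of waiting for a specific one.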


8.9.3. Interprocess Communication


At a file system level,
interprocess communication is a relatively direct interface to the message
system. This causes a problem: the message system is asymmetrical. The requestor
sends a message and may receive a reply. There's nothing that corresponds to a
file system read command. On the server side,
the server reads a message and replies to it; there's nothing that corresponds
to a write command.


The file system provides
read and write procedures, but read works only with I/O processes, which map
these calls to message system requests. read doesn't work at the interprocess
communication level, and in practice write is not used much either. Instead,
the requestor uses a procedure called
writeread to first write a message
to the server and then get a reply from it. Either the message or the reply can
be null (zero length).


These messages find their way to
the server's message queue. At a file system level, the message queue is a
pseudofile called $RECEIVE. The server opens
$RECEIVE and normally uses the
procedure readupdate to read a
message. At a later point it can reply with the procedure reply.
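
The asymmetry between requestor and server can be modeled with a toy Python sketch. It is not the Tandem API: a queue stands in for $RECEIVE, the method and function names merely echo readupdate, reply, and writeread, and the uppercase transformation is a placeholder for real work.

```python
import queue
import threading


class Server:
    """Toy server whose inbox stands in for the $RECEIVE pseudofile."""

    def __init__(self):
        self.receive = queue.Queue()

    def readupdate(self):
        """Take the next message; the server must reply to it later."""
        return self.receive.get()

    def serve(self, count: int) -> None:
        """Read `count` messages and reply to each one."""
        for _ in range(count):
            message, reply_box = self.readupdate()
            reply_box.put(message.upper())    # analogue of reply


def writeread(server: Server, message: bytes) -> bytes:
    """Send a message to the server and block until its reply arrives."""
    reply_box = queue.Queue(maxsize=1)
    server.receive.put((message, reply_box))
    return reply_box.get()
```

To try it, run server.serve in a threading.Thread and call writeread from the main thread. A zero-length message produces a zero-length reply, mirroring the null messages mentioned above.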


8.9.4. System Messages


The system uses
$RECEIVE to pass messages to
processes. One of the most important is the startup message, which passes
parameters to a newly started
process. The following example is written in TAL, Tandem's low-level system
programming language (though the name stands for "Tandem Application Language").
TAL is derived from HP's SPL, and it is similar to Pascal and Algol. One of the
more unusual characteristics is the use of the caret (^) character in
identifiers; the underscore ( _ ) character is
not allowed. This example should be close enough to C to be intelligible. It
shows a process that starts a child server process and then communicates with
it.


The first piece shows the parent process (requestor):


 


call newprocess (program^file^name,,,,,, process^name); -- start the server process
call open (process^name, process^fd);                   -- open process
call writeread (process^fd, startup^message, 66);       -- write startup message
while 1 do
  begin
  -- (pseudocode) read data from terminal
  call writeread (process^fd,
                  data, data^length,                    -- write data
                  reply, max^reply,                     -- read data back
                  @reply^length);                       -- return real reply length
  if reply^length > 0 then
    -- (pseudocode) write data back to terminal
  end;



The following shows the child process (server):


 


call open (receive, receive^fd);
do
  call read (receive^fd, startup^message, 66);
until startup^message = -1;             -- first word of startup message is -1
while 1 do
  begin
  call readupdate (receive^fd, message, read^count, count^read);
  -- (pseudocode) process message received, replacing buffer contents
  call reply (message, reply^length);
  end;



The first messages
that the child receives are system messages: the parent's open of the child sends an open message to the child, and then the first call to
writeread sends the startup message. The child
process handles these messages and replies to them. It can use the open message
to keep track of requestors or receive information passed in the last 16 bytes
of the file name. Only then does the process receive the normal message traffic
from the parent. At this point, other processes can also communicate with the
child. Similarly, when a requestor closes the server, the server receives a
close system message.
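
The order of events seen by the server can be summarized in a small Python sketch. The message kinds "open", "startup", "data", and "close" are hypothetical string tags chosen for readability, since the real system messages are identified by their contents (the startup message, for instance, begins with the word -1). Reversing the payload is a placeholder for real processing.

```python
class ChildServer:
    """Tracks requestors and startup parameters for a toy server."""

    def __init__(self):
        self.requestors = set()
        self.startup = None

    def handle(self, sender: str, kind: str, payload: bytes = b"") -> bytes:
        """Return the reply for one message taken from the message queue."""
        if kind == "open":                 # a requestor opened the server
            self.requestors.add(sender)
            return b""
        if kind == "startup":              # parameters from the parent
            self.startup = payload
            return b""
        if kind == "close":                # a requestor closed the server
            self.requestors.discard(sender)
            return b""
        if kind == "data":                 # normal message traffic
            if sender not in self.requestors:
                raise PermissionError("sender has not opened the server")
            return payload[::-1]           # placeholder for real work
        raise ValueError(f"unknown message kind {kind!r}")
```

Keeping the set of requestors up to date is one way a server could use the open and close messages. Refusing data from unknown senders is a design choice of this sketch, not something the text prescribes.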


8.9.5. Device I/O


It's important to remember that device I/O, including
disk file I/O, is handled by I/O processes, so "opening a device" is really
opening the I/O process. Still, I/O to devices and files is implemented in a
slightly different manner, though the file system procedures are the same. In
particular, the typical procedures used to access files are the more
conventional read and write, and normally
disk I/O is not no-wait.


8.9.6. Security


In
keeping with the time, the T/16 is not an overly secure system. In practice,
this hasn't caused any serious problems, but one issue is worth mentioning: the
transition from nonprivileged to privileged procedures is based on the position
of the procedure entry point in the PEP table and the value of the
priv bit in the E register. Early on, exploits
became apparent. If you could get a privileged procedure to return a value via a
pointer and get it to overwrite the saved E register on the stack in such a way
that the priv bit was set, the process
would remain privileged on return from that procedure. It is the responsibility
of callable procedures to check their pointer parameters to ensure that they
don't have any addressing exceptions, and that they return values only to the
user environment. A bug in the procedure setlooptimer, which sets a watchdog timer and optionally returns the old
value, made it possible to become SUPER.SUPER
(the root user, with ID 255,255, or -1):


 


proc make^me^super main;
begin
  int .TOS = 'S';                       -- top of stack address

  call setlooptimer (%2017);            -- set a timer value
  call setlooptimer (0, @TOS [4]);      -- reset, return old value to saved E reg
  pcb [mypid.<8:15>].pcbprocaid := -1;  -- poke around in my PCB and make me super
end;



The second call to
setlooptimer returns the old value
%2017 to the saved E register contents on the stack,
in particular setting the priv
bit, which leaves the process in privileged state. Theoretically this value
could have been decremented to %2016, but this
would not make any difference (the decrement affects only the saved RP field,
which is not restored). The program then uses SG-relative addressing to modify the user
information in its own process control block (PCB). mypid is a function returning the current process's PID, and
the last 8 bits (<8:15>) are the PIN,
which is used as an index in the PCB table.
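
The mechanism can be modeled in a few lines of Python. This is a deliberately simplified toy, not the T/16 stack layout: the eight-word stack, the position of the saved E register, and the user-area boundary are all hypothetical. The mask %2000 for the priv bit is inferred from the text, since it is the only bit of %2017 outside the low-order bits.

```python
PRIV_BIT = 0o2000        # inferred from %2017; not taken from a manual
USER_AREA = range(0, 4)  # hypothetical: words a user may legitimately target
SAVED_E = 4              # hypothetical slot holding the saved E register


class Process:
    def __init__(self):
        self.stack = [0] * 8      # toy stack: user words, then saved state
        self.loop_timer = 0

    def setlooptimer_buggy(self, value, old_value_addr=None):
        """Set the timer; store the old value wherever the caller points."""
        old, self.loop_timer = self.loop_timer, value
        if old_value_addr is not None:
            self.stack[old_value_addr] = old      # no address check at all

    def setlooptimer_checked(self, value, old_value_addr=None):
        """Same, but refuse to write outside the user's own area."""
        if old_value_addr is not None and old_value_addr not in USER_AREA:
            raise PermissionError("result pointer outside user environment")
        self.setlooptimer_buggy(value, old_value_addr)

    def privileged_after_return(self) -> bool:
        """Would the restored E register have the priv bit set?"""
        return bool(self.stack[SAVED_E] & PRIV_BIT)
```

Calling setlooptimer_buggy(0o2017) and then setlooptimer_buggy(0, SAVED_E) leaves the toy process privileged, while the checked variant raises an error on the second call.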


This bug was quickly
fixed, of course, but it showed a weakness in the approach: it is up to the
programmer to check the parameters passed to callable procedures. Throughout the
life of the architecture, such problems have recurred.


8.9.7. File Access Security


Tandem's approach to file access security is similar to that
of Unix, but users can belong only to a single group, which is part of the
username. Thus my username, SUPPORT.GREG,
also written numerically as 20,102,
indicates that I belong to the SUPPORT group
(20) only, and that within that group my user ID is 102. Each of these fields is
8 bits long, so the complete user ID fits in a word. If I wanted to be a member
of another group, I would need another user ID, possibly with a different
number—for example, SUPER.GREG with user ID
255,17.
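
The arithmetic behind these numbers is easy to check with a short Python sketch. It assumes the group number occupies the upper byte of the word and the user number the lower byte, which matches the written order "group,user" but is not stated explicitly. The function names are invented for illustration.

```python
def pack_user_id(group: int, user: int) -> int:
    """Pack two 8-bit fields into one 16-bit word (group in the upper byte)."""
    if not (0 <= group <= 255 and 0 <= user <= 255):
        raise ValueError("group and user numbers are 8 bits each")
    return (group << 8) | user


def as_signed_word(word: int) -> int:
    """Interpret a 16-bit word as a two's-complement signed integer."""
    return word - 0x10000 if word & 0x8000 else word
```

Under that assumption SUPPORT.GREG (20,102) packs to 20 * 256 + 102 = 5222, and SUPER.SUPER (255,255) packs to 65535, which reads as -1 when the word is treated as signed.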


Each file has a number of bits describing
what access the owner, the group, or all users have to the file. The bits are
organized differently from Unix, however: the four permissions are read, write, execute, and purge.
Purge is the Tandem name for
delete, and it's
necessary because directories don't have their own security settings.


For each of these access modes, there
is a choice of who is allowed to use them:




  • Owner
    means only the owner of the file.



  • Group means
    anybody in the same group.



  • All
    means anybody.


All of these relate only to the same
system as the one in which the file is located. A second set of modes was
introduced with networking to regulate access from users on other systems:




  • User means
    only a user with the same user and group number as the owner of the
    file.



  • Class
    means anybody with the same group number as the owner of the file.



  • Network means anybody,
    anywhere.
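
The six modes can be expressed as a small decision function. The following Python sketch is one reading of the lists above and not the system's actual algorithm: it assumes that the three original modes admit only local users and that the three network modes admit both local and remote users. The names Requester and may_access are hypothetical.

```python
from dataclasses import dataclass

LOCAL_MODES = {"owner", "group", "all"}


@dataclass(frozen=True)
class Requester:
    group: int
    user: int
    remote: bool = False    # True if the request comes from another system


def may_access(mode: str, owner_group: int, owner_user: int,
               req: Requester) -> bool:
    """Decide one access mode (read, write, execute, or purge) for a file."""
    if mode in LOCAL_MODES and req.remote:
        return False                        # local-only modes (assumption)
    if mode in ("owner", "user"):
        return (req.group, req.user) == (owner_group, owner_user)
    if mode in ("group", "class"):
        return req.group == owner_group
    if mode in ("all", "network"):
        return True
    raise ValueError(f"unknown mode {mode!r}")
```

A file would carry one such mode for each of the four permissions, for example read set to "network" and purge set to "owner".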


There is no security whatsoever
for devices, and user processes have to roll their own. The former is a
particular disadvantage in a networked environment. At a security seminar in
early 1989, I was able to demonstrate stealing the SUPER.SUPER (root)
password on system \TSII, which was in the
middle of the management area in Cupertino, simply by putting a fake prompt on
the system console. I was in Düsseldorf (Germany) at the time.


 

