TECO-C
Programmer's Guide
(last updated February 18, 1991 to reflect TECO-C version 140)
1 INTRODUCTION
These notes apply to TECOC version 135, which runs under VAX/VMS,
MS-DOS, and Unix (SunOS, which is BSD). See file AAREADME.TXT for the
specifics of operating system and compilers which have been used.
TECO-C is meant to be a complete implementation of TECO as defined by
the Standard TECO User's Guide and Language Reference Manual, (in file
TECO.DOC). It was written so that the author could move to many machines
without knowing many editors.
2 COMPILING AND LINKING
Conditional compilation directives are used to build TECO-C correctly
for different environments. Identifiers automatically defined by
the different compilers are used. Some identifiers defined in file ZPORT.H
control whether video support or extra debugging code is included. See
"VIDEO" and "DEBUGGING". Files are provided which "build" TECO-C in various
environments. See file AAREADME.TXT for details.
3 RUNNING TECO-C
When you run TECO, the command line used to invoke TECO is parsed for
an input/output file name and several optional switches. TECO-11 parses the
command line using a TECO macro imbedded in the program. TECO-C does the
same thing. Actually, the imbedded macro used to parse the command line
was stolen from TECO-11. I commented it and then modified it to repair
minor inconsistencies. Use of TECO-11's macro makes TECO-C invocation
identical to TECO-11's, even responding to "make love" with "Not war?".
The macro is in file CLPARS.TES. The compressed version (no comments
or whitespace) is in file CLPARS.TEC. The GENCLP program converts
CLPARS.TEC into CLPARS.H, an include file suitable for compiling into
TECO-C.
4 CODE CONVENTIONS
The code is not modular. Communication between almost all functions
is through global variables, not argument lists. There is a reason: the
nature of the basic parsing algorithm is to use the characters in the
command string as indices into a table of functions. This makes for very
fast command parsing, but it means that all the functions have to modify
global values, because no arguments are passed in. In other words, there
were going to be 130 or so un-modular functions anyway, so I gave up on
modularity. This explanation does not explain some of the complications in
the search code, like the global variable SrcTyp. Oh, well.
Here's a brief list of some of the conventions followed by the code:
1. TECO-C is portable, so some convention was needed to separate
portable code from system-dependent code. There is one file
containing the system-dependent code for each platform TECO-C
supports. These files have names that start with a "Z": ZVMS.C,
ZMSDOS.C and ZUNIX.C.
All the system-dependent functions in those files start with a
"Z". For example, the function that allocates memory is called
ZAlloc. A VMS version of ZAlloc can be found in ZVMS.C, and an
MS-DOS version can be found in ZMSDOS.C.
An extra file called ZUNKN.C exists to help efforts to port TECO-C
to a new environment. This file contains stubs for all the
system-dependent functions.
2. All system-independent global variables are declared
alphabetically in file TECOC.C. They are defined in DEFEXT.H,
which is included by all modules.
3. File TECOC.H contains the "global" definitions, including those
for most structures.
4. Variables and functions are defined using the portability
identifiers defined in the ZPORT.H file. Functions which do not
return a value are defined as VOID. TECO-C should compile and
link on different machines by changing only the environment
definitions in the ZPORT.H file.
5. At one time, every function was in a file with the same name
as the function. This made it easy to find the code for a
function. The problem was that some groups of functions use data
not needed by the other functions. This was especially true of
the system-dependent functions. Also, some functions were called
only by one other function, so it made sense for them to be in the
same module as the caller and be made "static". So now, most
functions are in a file named the same as the function, with the
following exceptions:
1. All the "Z" functions are in the "Z" file for the given
system.
2. The conditionally-compiled functions (ZCpyBl in ZINIT.C, the
"Dbg" functions at the bottom of TECOC.C, the "v" functions in
EXEW.C) aren't in their own files. If they were, then the
command procedures/makefiles that compile the files would need
to contain logic to conditionally compile the files.
3. The functions for the "E" and "F" commands are in EXEE.C and
EXEF.C, respectively. So if you want to find function ExeEX,
don't look for a file named EXEEX.C.
6. Symbols are 6 characters long or less. The way I remember it,
this was caused by the first system I wrote TECOC for: CP/M-68k,
which had a limit of 8 characters for file names. The last two
characters had to be ".C", so 6 characters were left for the file
name. Since the file name was the same as the function it
contained, functions were limited to 6 characters in length. When
I saw how nicely the function declarations looked (they fit in one
tab slot), I used 6 characters for other symbols too.
I've since been told that CP/M-68k has 8-character file names
followed by 3-character file types, so CP/M-68k can't be blamed.
So shoot me.
This standard has prevented problems from occurring with compilers
that don't support very many characters of uniqueness.
In order to make up for the resultant cryptic names, upper and
lower case are mixed. An uppercase letter indicates a new word.
For example, "EBfEnd" stands for "Edit Buffer End". If you need
to know what a variable name means, look at the definition of the
variable in DEFEXT.H. The expanded version of the abbreviated
name appears in the comment on the same line as the variable
definition. A detailed description can be found with the
declaration of the variable in TECOC.C.
The limit of 6 letters in variable names is relaxed in
system-dependent code.
7. Variable and function names follow patterns where possible. For
instance, "EBfBeg" and "EBfEnd" are the beginning and end of the
edit buffer. If you see a variable named "BBfBeg", you can assume
that it is the beginning of some other buffer, and that a "BBfEnd"
exists which is the end of that buffer.
8. Character strings are usually represented in C by a pointer to a
sequence of bytes terminated with a null character. I didn't do
that in TECO-C because I thought it was too inefficient. To get
the length of a string, you have to count characters. Most
strings in TECO-C are therefore represented by two pointers, one to
the first character and one to the character following the last
character. With this representation, it's easy to add characters
to a string and trivial to get the length.
9. Each file has a consistent format, which is:
1. a comment describing the function
2. include directives
3. the function declaration
4. local variable definitions, in alphabetical order
5. code
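The two-pointer string representation described in item 8 can be sketched
as follows. This is only an illustration: the TStr type and the StrLen and
StrAdd functions are hypothetical names, and the guide's own variables
(EBfBeg, EBfEnd and so on) are separate globals, not a structure.

```c
#include <stddef.h>

/* Hypothetical type, for illustration only. */
typedef struct {
    char *Beg;   /* first character */
    char *End;   /* character following the last character */
} TStr;

/* Length is a pointer subtraction; nothing is scanned or counted. */
size_t StrLen(const TStr *s)
{
    return (size_t)(s->End - s->Beg);
}

/* Append c if there is room before Lim (one past the end of the storage).
   Returns 1 on success, 0 if the storage is full. */
int StrAdd(TStr *s, const char *Lim, char c)
{
    if (s->End >= Lim)
        return 0;
    *s->End++ = c;
    return 1;
}
```

Because no terminator is needed, a string held this way can contain null
characters like any other character.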
5 TOP LEVEL EXECUTION AND COMMAND PARSING
The top level code for TECO-C is contained in file TECOC.C. It is
very simple: after initializing, a loop is entered which reads a command
string from the user, executes it, and loops back to read another command
string. If the user executes a command which causes TECO-C to exit, the
program is exited directly via a call to the TAbort function. TECO-C never
exits by "falling out the bottom" of the main function.
After a command string is read, the ExeCSt function is called to
execute the command string. ExeCSt contains the top-level parsing code.
The parse is trivial: each command character is used as an index into a
table of functions. The table contains one entry for each of the 128
possible characters. Each function is responsible for "consuming" its
command so that when it returns, the command string pointer points to the
next command.
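A minimal sketch of this kind of table-driven dispatch is shown below. It
is not the TECO-C source: the names CmdTab, CBfPtr, CStEnd, Accum, ExeDig,
ExeNop, ExeIll and RunCmd are assumptions made for the example, and only
digits and spaces are given real handlers.

```c
#define SUCCESS 0
#define FAILURE (-1)

const char *CBfPtr;   /* hypothetical: next unread command character */
const char *CStEnd;   /* hypothetical: one past the end of the command string */
long Accum;           /* toy global state that the handlers modify */

typedef int (*CmdFn)(void);

int ExeDig(void)      /* consume a whole run of digits */
{
    long n = 0;
    while (CBfPtr < CStEnd && *CBfPtr >= '0' && *CBfPtr <= '9')
        n = n * 10 + (*CBfPtr++ - '0');
    Accum += n;
    return SUCCESS;
}

int ExeNop(void) { CBfPtr++; return SUCCESS; }   /* consume one character */
int ExeIll(void) { return FAILURE; }             /* no such command */

CmdFn CmdTab[128];    /* one entry per possible character */

void InitTb(void)
{
    int c;
    for (c = 0; c < 128; c++)
        CmdTab[c] = ExeIll;
    for (c = '0'; c <= '9'; c++)
        CmdTab[c] = ExeDig;
    CmdTab[' '] = ExeNop;
}

/* Execute the command string [Beg, End). No arguments are passed to the
   handlers; each one leaves CBfPtr pointing at the next command. */
int RunCmd(const char *Beg, const char *End)
{
    InitTb();
    CBfPtr = Beg;
    CStEnd = End;
    Accum = 0;
    while (CBfPtr < CStEnd)
        if (CmdTab[*CBfPtr & 0x7F]() == FAILURE)
            return FAILURE;
    return SUCCESS;
}
```

The loop itself knows nothing about any command; all of the knowledge
lives in the table entries, which is why the handlers end up communicating
through globals.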
5.1 Error Handling
When an error is detected, an error message is displayed at the point
that the error is detected, and the function in which the error was
detected returns a FAILURE status to its caller. Almost always, the caller
returns a FAILURE status to its caller, which returns a FAILURE status to
its caller, etc. When a FAILURE status is returned to the main command
string parser, parsing of the command string stops and the user is prompted
for a new command string.
This style tends to cause all function calls to follow the same form,
which is
    if (function() == FAILURE)
        return(FAILURE);
Things get more complicated in the system-dependent code (in the files
with names that start with a "Z"). I extended TECO's error reporting
slightly to allow the user to see the operating system's reason for an
error, as this is often useful. For example, under VAX/VMS there are many
reasons why an attempt to create an output file might fail. They include:
errors in file name syntax, destination directory non-existence, file
protection violations or disk quota violation. In order to supply enough
information to the user, TECO-C outputs multiple-line error messages when a
system error occurs.
Multiple-line error messages contain one line that describes the
operating system's perception of the error and one line that describes
TECO's perception of the error. For instance, if a user of VAX/VMS does a
"EW[abc]test.txt$$" command when the directory [abc] does not exist, the
error message generated by TECO-C is:
?SYS %RMS-F-DNF, directory not found
?UFO unable to open file "[abc]test.txt" for output
System errors are therefore reported in a system-dependent fashion,
using whatever messages the operating system can supply. Under VAX/VMS,
the system service $GETMSG provides human-readable messages that TECO-C can
use in the "SYS" part of the error message. Under UNIX, sys_errlist[errno]
is a pointer to these messages.
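As an illustration of the two-line format, the sketch below builds such a
message with the standard C strerror function, which returns the system's
text for an error number. The ErrFmt function and its arguments are
hypothetical, and strerror is used here only because it is portable; it is
not claimed to be what the "Z" files call.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Format a two-line message into Buf: the system's view of the error,
   then TECO's view. Returns the value returned by snprintf. */
int ErrFmt(char *Buf, size_t Len, int SysErr, const char *FName)
{
    return snprintf(Buf, Len,
                    "?SYS %s\n?UFO unable to open file \"%s\" for output\n",
                    strerror(SysErr), FName);
}
```

The text of the "SYS" line depends on the operating system, so only the
"?UFO" line is the same everywhere.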
There is another way in which error reporting in the system-dependent
code is tricky. Under VAX/VMS, some system calls may return a code that is
"successful" but contains extra information. For instance, when a user has
set his directories so that only a limited number of versions of a file can
exist, RMS will automatically purge the oldest version of the file when the
user creates a file. This only happens if the newly created file would
cause too many versions of the file to exist. When this happens, the VMS
service returns a FILEPURGED status, which is successful. TECO-C informs
the user about these things by displaying the message in brackets.
5.2 Command Modifiers (CmdMod)
Command parsing is complicated by command modifiers and numeric
arguments, which may precede some commands. These are implemented in a way
that maintains the basic "jump table" idea. For instance, when an at-sign
(@) modifier is encountered in a command string, the at-sign command
function (ExeAtS) is called. The only thing ExeAtS does is set a flag
indicating that an at-sign has been encountered. Commands which are
affected by an at-sign modifier check this flag and behave accordingly.
The flags which indicate command modifiers are contained in global
variable CmdMod. A bit in CmdMod is reserved for each command modifier.
The modifiers are "@", ":" and "::". Of course, once the flag has been
set, it must be cleared. With this parsing algorithm, the only way to do
that is to make every command function explicitly reset CmdMod before a
successful return. This is not too bad: clearing all the flags in CmdMod
is done with one statement: "CmdMod = '\0';".
For numeric arguments to commands, an expression stack is used (see
Stacks). The EStTop variable is the pointer to the top of the expression
stack. Commands which handle numeric arguments check EStTop to see if the
expression stack contains a value.
A special case of numeric arguments is "m,n". The "m" part is
encountered and causes the value to be pushed onto the expression stack.
The comma causes the ExeCom function to move the value into a special
"m-argument" global variable (MArgmt), clear the expression stack and set
another flag in CmdMod indicating that the "m" part of an "m,n" pair is
defined. Then the "n" is encountered and pushed onto the stack. Commands
which can take "m,n" pairs check the flag in CmdMod.
To summarize, CmdMod and EStTop are variables which describe the
context of a command. Each command function tests these variables to see
if it was preceded by modifiers or numbers. For this to work, it is
important that the expression stack and the flags in CmdMod are cleared at
the right times. It is the responsibility of each command function to
leave CmdMod and EStTop with the proper values before successfully
returning. The rules are:
1. If the command function is returning FAILURE, don't worry about
clearing CmdMod or EStTop. They will be cleared before the next
command string is executed.
2. If the command function leaves a value on the expression stack, do
not clear EStTop before returning SUCCESS. If the command calls
GetNmA, do not clear EStTop, as GetNmA does it for you.
Otherwise, clear EStTop before returning SUCCESS.
3. Clear CmdMod unless the command function sets flags or needs to
leave them alone. ExeDgt, for example, handles digit strings and
doesn't clear CmdMod because the MARGIS bit may be set.
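The interplay of CmdMod, EStTop and MArgmt can be sketched as below. The
variable names CmdMod, EStTop and MArgmt, the function names ExeAtS and
ExeCom, and the flag name MARGIS come from this guide; everything else
(the bit values, the other flag names, PushN, ExeDif, and the use of -1 to
mean "stack empty") is an assumption made for the example. The real code
also goes through GetNmA, which this sketch leaves out.

```c
#define SUCCESS 0
#define FAILURE (-1)

/* Hypothetical bit assignments. */
#define ATSIGN 0x01   /* "@" seen */
#define COLON  0x02   /* ":" seen */
#define DCOLON 0x04   /* "::" seen */
#define MARGIS 0x08   /* the "m" part of "m,n" is defined */

char CmdMod = '\0';   /* command modifier flags */
long EStack[8];       /* simplified: operands only */
int  EStTop = -1;     /* -1 means the expression stack is empty */
long MArgmt;          /* saved "m" argument */

int PushN(long n)     /* a digit string pushes its value */
{
    if (EStTop >= 7)
        return FAILURE;
    EStack[++EStTop] = n;
    return SUCCESS;
}

int ExeAtS(void)      /* "@": only sets a flag */
{
    CmdMod |= ATSIGN;
    return SUCCESS;
}

int ExeCom(void)      /* the "," in "m,n" */
{
    if (EStTop < 0)
        return FAILURE;            /* no "m" value to save */
    MArgmt = EStack[EStTop];
    EStTop = -1;                   /* clear the expression stack */
    CmdMod |= MARGIS;
    return SUCCESS;
}

/* A made-up command taking "n" or "m,n": yields n - m, or n alone. */
int ExeDif(long *Out)
{
    if (EStTop < 0)
        return FAILURE;            /* rule 1: leave the state alone */
    *Out = EStack[EStTop];
    if (CmdMod & MARGIS)
        *Out -= MArgmt;
    EStTop = -1;                   /* rule 2: the value was consumed */
    CmdMod = '\0';                 /* rule 3: clear all modifiers */
    return SUCCESS;
}
```

Note that ExeAtS and ExeCom deliberately leave CmdMod set on return; they
are the kind of function that rule 3 exempts.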
6 SEARCHING
The search algorithm in TECO-C is complex. The war between the desire
for a fast search and the need to handle all the features of TECO's search
commands has produced code which can be a real pain to follow. This
section attempts to explain how things got the way they are. The code is
explained in a bottom-up fashion, to follow the way it evolved in the
author's twisted mind.
The basic search idea is to scan a contiguous edit buffer for a search
string. The steps are:
1. Search the edit buffer for the first character in the search
string. If you reach the end of the edit buffer without matching,
the search fails.
2. When the first character of the search string matches a character
in the edit buffer, try to match successive characters in the
search string with the characters which follow the found character
in the edit buffer. If they all match, the search succeeds. If
one doesn't, go back to step 1.
This is basically what TECO-C does. The features of TECO's search
commands have buried these steps deep within some confusing code.
The first complication is introduced by pattern matching characters.
TECO has 17 "match constructs", which are indicated in the search string
by the special characters ^X, ^S, ^N and ^Ex where "x" can be several other
characters. For instance, a ^X in the search string means that any
character is to be accepted as a match in place of the ^X. Characters
other than the match constructs represent themselves. An example: the
search string "a^Xb" contains 3 match constructs: a, ^X and b.
TECO also supports forward or backward searching. When searching
backwards, only the search for the first match construct in the search
string is done in a backwards direction. When the character is found, the
characters following it are compared in a forward direction to the edit
buffer characters. This means that once the first match construct has been
found, a single piece of code can be used to compare successive characters
in the search string with successive characters in the edit buffer,
regardless of whether the search is forwards or backwards.
Adding these new features, the new description of searching is:
1. Search the edit buffer forwards or backwards for a character which
matches the first match construct in the search string. If you
reach the end of the edit buffer without matching, the search
fails.
2. When the first match construct of the search string matches a
character in the edit buffer, try to match successive match
constructs in the search string with the characters which follow
the found character in the edit buffer. If they all match, the
search succeeds. If one doesn't, go back to step 1.
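The two steps can be sketched in C as below. This is a simplified stand-in
and not the ZFrSrc/BakSrc/CMatch code: the names Srch and CMat1 are
hypothetical, only literal characters and ^X are handled, and case
handling, repeat counts and the edit buffer gap are ignored.

```c
#include <assert.h>
#include <stddef.h>

#define CTRL_X '\030'   /* ^X: accept any character */

/* Does match construct Mc accept buffer character c? */
int CMat1(char Mc, char c)
{
    return Mc == CTRL_X || Mc == c;
}

/* Search the buffer [BBeg, BEnd) for the match constructs [SBeg, SEnd).
   Dir is +1 to search forwards or -1 to search backwards. Returns a
   pointer to the first character of the match, or NULL on failure. */
const char *Srch(const char *BBeg, const char *BEnd,
                 const char *SBeg, const char *SEnd, int Dir)
{
    ptrdiff_t SLen = SEnd - SBeg;
    ptrdiff_t Last = (BEnd - BBeg) - SLen;   /* last possible start offset */
    ptrdiff_t i, k;

    assert(Dir == 1 || Dir == -1);
    if (SLen <= 0 || Last < 0)
        return NULL;
    for (i = (Dir > 0) ? 0 : Last; i >= 0 && i <= Last; i += Dir) {
        /* step 1: look for the first match construct, in direction Dir */
        if (!CMat1(SBeg[0], BBeg[i]))
            continue;
        /* step 2: compare the remaining constructs, always forwards */
        for (k = 1; k < SLen && CMat1(SBeg[k], BBeg[i + k]); k++)
            ;
        if (k == SLen)
            return BBeg + i;
        /* mismatch: go round again, which is "back to step 1" */
    }
    return NULL;
}
```

Only the outer loop depends on the direction; the inner comparison is the
same piece of code for forward and backward searches.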
To begin a description of which routines implement the above steps,
and in order to have a reference for later discussion, the following
hierarchy chart of "who calls who" is presented.
ExeEUn ExeFB ExeFC ExeFD ExeFK ExeFN ExeFS ExeFUn ExeN ExeS ExeUnd
  |      |     |     |     |     |     |     |     |    |     |
  |      |     |     |     |     |     |     |     |    |     |
  -------------------------------------------------------------
                                 |
                                 V
                              Search
                                 |
                                 V
                              SrcLop
                                 |
                                 V
                              SSerch
                               | | |
                        +------+ | +------+
                  +---+ |        |        | +---+
                  |   V V        |        V V   |
                  | ZFrSrc       |       BakSrc |
                  |   | |        |        | |   |
                  +---+ |        |        | +---+
                        +------+ | +------+
                               V V V
                              CMatch <--+
                                 |      |
                                 +------+
At the top are the functions that implement search commands (E_, FB,
FC, FD, FK, FN, FS, F_, N, S and _). All of these functions call the main
search function: Search.
At the lower level are the functions which implement steps 1 and 2
described above. ZFrSrc searches forwards in the edit buffer for
characters which match the first character in the search string. BakSrc
does the same thing, but searches backwards. SSerch calls one of these two
functions and then executes a loop which calls CMatch to compare successive
match constructs in the search string to characters following the found
character in the edit buffer. The reason that ZFrSrc, BakSrc and CMatch
call themselves is to handle some of the more esoteric match constructs.
Case dependence in TECO is controlled by the search mode flag (see the
^X command). The variable SMFlag holds the value of the search mode flag,
and is used by ZFrSrc, BakSrc and CMatch.
One final point to help confuse things: ZFrSrc is system-dependent.
It contains a VAX/VMS-specific version which uses the LIB$SCANC run-time
library routine to access the SCANC instruction. The SCANC instruction
looks like it was designed to handle TECO's match constructs. I couldn't
resist using it, but it was a mistake, as it needlessly complicates an
already messy algorithm. I have decided to remove the VMS-specific code
some time in the future.
Further complications of the search algorithm arise because of the
following capabilities of TECO searches:
1. If there is no text argument, use the previous search argument.
2. If colon modified, return success/failure and no error message
3. If the search fails and we're in a loop and a semicolon follows
the search command, exit the loop without displaying an error
message.
4. Handle optional repeat counts
5. If the ES flag is non-zero, verify the search based on the value
of the flag.
6. If bit 64 of the ED flag is set, move dot by one on multiple
searches.
7. If bit 16 of the ED flag is set, don't move after a failing
search.
8. Be fast.
7 MEMORY MANAGEMENT
7.1 The Edit Buffer And Input Buffer
TECO-C is based on TECO-11, but it uses a different form of edit
buffer memory management. Here's why.
The edit buffer in TECO-11 is implemented as a continuous block of
memory. This allows rapid movement through the edit buffer (by just
maintaining a pointer to the current spot) and makes searches very
straightforward. Insertion and deletion of text is expensive, because each
insertion or deletion requires moving the text following the spot where the
insertion or deletion occurs in order to maintain a continuous block of
memory. This gets to be a real pain when a video editing capability is
added to TECO, because in video mode text is added/deleted one character at
a time very rapidly.
TECO-C uses an edit buffer gap scheme. The edit buffer occupies a
continuous piece of memory, but there is a gap at the "current spot" in the
edit buffer. When the user moves around the edit buffer, the gap is moved
by shuffling text from one side of the gap to the other. This means that
moving around the text buffer is slower than for TECO-11's scheme, but text
insertion and deletion is very fast. Searches are still fast because most
searches start at the current spot and go forwards or backwards, so a
continuous piece of memory is searched. In the future, when some kind of
video mode is added, insertion and deletion one-character-at-a-time will be
fast using the gap scheme.
The variables that maintain pointers to the edit buffer and the gap
within the buffer can be confusing, so here are some examples. Suppose that
10000 bytes are allocated for the edit buffer when TECO-C is initialized.
Suppose the allocated memory starts at address 3000.
Empty edit buffer (the gap spans the whole edit buffer):
    EBfBeg =  3000         (edit buffer beginning)
    GapBeg =  3000         (gap beginning)
    GapEnd = 13000         (gap end)
    EBfEnd = 13000         (edit buffer end)
Buffer contains "test", character pointer is before the first 't':
    EBfBeg =  3000         (edit buffer beginning)
    GapBeg =  3000         (gap beginning)
    GapEnd = 12996         (gap end)
             12997   't'
             12998   'e'
             12999   's'
    EBfEnd = 13000   't'   (edit buffer end)
Buffer contains "test", character pointer is after the last 't':
    EBfBeg =  3000   't'   (edit buffer beginning)
              3001   'e'
              3002   's'
              3003   't'
    GapBeg =  3004         (gap beginning)
    GapEnd = 13000         (gap end)
    EBfEnd = 13000         (edit buffer end)
Buffer contains "test", character pointer is after the 'e':
    EBfBeg =  3000   't'   (edit buffer beginning)
              3001   'e'
    GapBeg =  3002         (gap beginning)
    GapEnd = 12998         (gap end)
             12999   's'
    EBfEnd = 13000   't'   (edit buffer end)
When an insertion command is executed, the text is inserted starting
at GapBeg. When a deletion command is executed, GapEnd is incremented for
a forward delete or GapBeg is decremented for a backwards delete. When the
character pointer is moved forwards, the gap is moved forwards by copying
text from the end of the gap to the beginning. When the character pointer
is moved backwards, the gap is moved backwards by copying text from the
area just before the gap to the area at the end of the gap.
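The operations just described can be sketched as follows. The variable
names are the ones used in the examples above, and the convention is read
off those examples: GapBeg is the first byte of the gap, while GapEnd and
EBfEnd are the last byte of the gap and of the buffer. The sketch uses
array offsets where the guide describes addresses, and the array EBuf, its
tiny size and the function names are all assumptions.

```c
#include <assert.h>

#define EBSIZE 8                 /* tiny, for illustration */

char EBuf[EBSIZE];               /* hypothetical backing store */
int EBfBeg = 0;                  /* edit buffer beginning */
int GapBeg = 0;                  /* first byte of the gap */
int GapEnd = EBSIZE - 1;         /* last byte of the gap */
int EBfEnd = EBSIZE - 1;         /* last byte of the edit buffer */

/* Text before the gap plus text after the gap. */
int TxtLen(void)
{
    return (GapBeg - EBfBeg) + (EBfEnd - GapEnd);
}

int InsChr(char c)               /* insert at the character pointer */
{
    if (GapBeg > GapEnd)
        return 0;                /* the gap is used up */
    EBuf[GapBeg++] = c;
    return 1;
}

void DelFwd(int n)               /* forward delete: the gap grows at its end */
{
    assert(n >= 0 && n <= EBfEnd - GapEnd);
    GapEnd += n;
}

void DelBak(int n)               /* backward delete: the gap grows at its start */
{
    assert(n >= 0 && n <= GapBeg - EBfBeg);
    GapBeg -= n;
}

void MovFwd(int n)               /* move the character pointer forwards */
{
    assert(n >= 0 && n <= EBfEnd - GapEnd);
    while (n-- > 0)
        EBuf[GapBeg++] = EBuf[++GapEnd];
}

void MovBak(int n)               /* move the character pointer backwards */
{
    assert(n >= 0 && n <= GapBeg - EBfBeg);
    while (n-- > 0)
        EBuf[GapEnd--] = EBuf[--GapBeg];
}
```

Insertion and deletion touch only a pointer and at most one byte; only
moving the character pointer copies text, one byte per position moved.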
There are a few messy cases, such as when a bounded search is executed
and the bounded text area includes the edit buffer gap. In this case, the
gap is temporarily moved so that the search can proceed over a continuous
memory area.
In order to confuse things a little, TECO-C has one addition to the
basic edit buffer gap management. Following the end of the edit buffer
(EBfEnd) is the current input stream buffer. Since file input commands
always cause text to be appended to the end of the edit buffer, this is
natural. Thus, no input buffer is needed: text is input directly into the
edit buffer. This makes the code a little confusing, but it avoids the
problem of having an input buffer. When you have an input buffer, you have
to deal with the question of how large the buffer should be and what to do
with it when it's too small. This scheme is fast and saves some
memory. (see File Input)
7.2 Q-registers
Q-registers have two parts: a numeric part and a text part. Each
q-register is represented by a structure containing three fields: one to
hold the numeric part and two to point to the beginning and end of the
memory holding the text part. If the text part of the q-register is empty,
then the pointer to the beginning of the text is NULL.
There are 36 global q-registers, one for each letter of the alphabet
and 1 for each digit from 0 to 9. These q-registers are accessible from
any macro level. There are 36 local q-registers for each macro level. The
names for local q-registers are preceded by a period. Thus the command
"1xa" inserts a line into global q-register "a", while the command "1x.a"
inserts a line into local q-register ".a". Storage for the data structure
defining local q-registers is not allocated until a local q-register is
first used. This saves space and time, because local q-registers are
rarely used, and doing things this way avoids allocating and freeing memory
every time a macro is executed.
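A possible shape for these structures is sketched below. The three-field
layout and the allocate-on-first-use behaviour follow the description
above; the type and field names, the QIndex and FindQR functions, and the
decision to treat upper and lower case letters alike are assumptions. The
index arithmetic also assumes a character set such as ASCII in which the
letters are contiguous.

```c
#include <stdlib.h>

#define NQREG 36                 /* 26 letters plus 10 digits */

typedef struct {
    long  Number;                /* numeric part */
    char *TxtBeg;                /* start of the text part, NULL if empty */
    char *TxtEnd;                /* end of the text part */
} QReg;

QReg  GlbQRg[NQREG];             /* global q-registers */
QReg *LclQRg = NULL;             /* locals for one macro level, made on demand */

/* Map a q-register name to 0..35, or -1 if it is not a valid name. */
int QIndex(int c)
{
    if (c >= 'a' && c <= 'z') return c - 'a';
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= '0' && c <= '9') return 26 + (c - '0');
    return -1;
}

/* Find a q-register. IsLcl is nonzero when the name was preceded by a
   period. Returns NULL for a bad name or if memory cannot be obtained. */
QReg *FindQR(int c, int IsLcl)
{
    static const QReg Empty = { 0, NULL, NULL };
    int i = QIndex(c);
    int k;

    if (i < 0)
        return NULL;
    if (!IsLcl)
        return &GlbQRg[i];
    if (LclQRg == NULL) {        /* first use of a local at this level */
        LclQRg = malloc(NQREG * sizeof *LclQRg);
        if (LclQRg == NULL)
            return NULL;
        for (k = 0; k < NQREG; k++)
            LclQRg[k] = Empty;
    }
    return &LclQRg[i];
}
```

A macro that never names a local q-register never reaches the malloc call,
which is the saving the text describes.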
8 STACKS
8.1 Expression Stack
An expression stack is used to parse TECO's expressions. Consider the
command string QA+50=$$. When the command string is executed, the value of
QA is pushed on the expression stack, then the operator "+" is pushed on
the expression stack, and then the value "50" is pushed on the expression
stack. Whenever a full expression that can be reduced is on the expression
stack, it is reduced. For the above example, the stack is reduced when the
value "50" is pushed.
The expression stack is implemented in the following variables:
EStack the stack itself, containing saved operators and operands
EStTop index of the top element in EStack
EStBot index of the current "bottom" of the stack in EStack
The "bottom" of the expression stack can change because an expression
can include a macro invocation. For example, the command QA+M3=$$ causes
the value of "QA" to be pushed on the expression stack, then the "+" is
pushed, and then the macro contained in q-register 3 is executed. The
macro in q-register 3 returns a value to be used in the expression. When
the macro is entered, a new expression stack "bottom" is established. This
allows the macro to have a "local" expression stack bottom while
maintaining the stack outside the macro.
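The reduce-on-push behaviour and the floating bottom can be sketched as
below. EStack, EStTop and EStBot are the names listed above; the EItem
layout, the functions PushEx, Reduce, MacBeg and MacEnd, and the
restriction to "+" and "-" are assumptions made to keep the example short.

```c
#include <assert.h>

#define ESTSIZ 16

typedef struct {
    int  IsOp;                   /* nonzero: Val holds an operator character */
    long Val;
} EItem;

EItem EStack[ESTSIZ];
int EStTop = -1;                 /* index of the top element */
int EStBot = -1;                 /* stack is empty when EStTop == EStBot */

/* Reduce "operand operator operand" while it sits above the bottom. */
void Reduce(void)
{
    while (EStTop - EStBot >= 3 && !EStack[EStTop].IsOp &&
           EStack[EStTop - 1].IsOp && !EStack[EStTop - 2].IsOp) {
        long r = EStack[EStTop].Val;
        long l = EStack[EStTop - 2].Val;
        switch ((char)EStack[EStTop - 1].Val) {
        case '+': l += r; break;
        case '-': l -= r; break;
        default:  return;        /* other operators are left out */
        }
        EStTop -= 2;
        EStack[EStTop].Val = l;
    }
}

void PushEx(int IsOp, long Val)
{
    assert(EStTop + 1 < ESTSIZ);
    EStack[++EStTop].IsOp = IsOp;
    EStack[EStTop].Val = Val;
    Reduce();
}

/* Entering a macro: everything already on the stack is hidden. */
int MacBeg(void)
{
    int OldBot = EStBot;
    EStBot = EStTop;
    return OldBot;
}

/* Leaving a macro: hand its value, if any, to the outer expression. */
void MacEnd(int OldBot)
{
    int  HasVal = EStTop > EStBot;
    long v = HasVal ? EStack[EStTop].Val : 0;
    EStTop = EStBot;
    EStBot = OldBot;
    if (HasVal)
        PushEx(0, v);
}
```

With QA holding 7 and a macro that computes 2+3, the expression QA-M3
gives 2 under this scheme. Without the local bottom, the 2 pushed inside
the macro would be reduced against the outer "7 -" immediately and the
result would be 8.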
8.2 Loop Stack
The loop stack contains the loop count and the address of the first
command in the loop. For example, in the command 5<FSMP$mt$>$$, the loop
stack contains the loop count (5) and the address of the first command in
the loop (F). Whenever the end-of-loop character (>) is encountered, the
loop count is decremented. If the loop count is still greater than zero
after it has been decremented, then the command string pointer is reset to
point to the first character in the loop (F).
The loop stack is implemented in the following variables:
LStack the stack itself, containing saved counts and addresses
LStTop index of the top element in LStack
LStBot index of the current "bottom" of the stack in LStack
The loop stack needs a "floating" bottom for the same reason that the
expression stack needs one: macros. Consider the command string
4<Smp$M7$>$$. When the "<" is encountered, the loop count (4) and the
address of the first character in the loop (S) are placed on the loop
stack. Command execution continues, and the "M7" command is encountered.
Suppose that q-register 7 contains the erroneous command string 10>DL>$$.
When the ">" command is encountered in the macro, TECO expects the loop
stack to contain a loop count and an address for the first character in the
loop. In this example, there is no matching "<" command in the macro which
would have set up the loop stack. It would be very bad if TECO were to
think that the loop count was 4 and the first command in the loop was "S".
In this situation, what TECO should do is generate the error message "BNI >
not in iteration". In order to implement this, the variable LStBot is
adjusted each time a macro is entered or exited. LStBot represents the
bottom of the loop stack for the current macro level.
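A sketch of the loop stack with its floating bottom follows. LStack,
LStTop and LStBot are the names listed above; the LItem layout, the
function names ExeLst, ExeGtr, MacBeg and MacEnd, and the CBfPtr command
pointer are hypothetical. FAILURE from ExeGtr stands for the "BNI > not in
iteration" error.

```c
#define SUCCESS 0
#define FAILURE (-1)
#define LSTSIZ 8

typedef struct {
    long        Count;           /* iterations still to run */
    const char *Start;           /* first command in the loop */
} LItem;

LItem LStack[LSTSIZ];
int LStTop = -1;                 /* index of the top element */
int LStBot = -1;                 /* no open loops when LStTop == LStBot */
const char *CBfPtr;              /* hypothetical command string pointer */

int ExeLst(long n)               /* "<": CBfPtr points at the '<' */
{
    if (LStTop + 1 >= LSTSIZ)
        return FAILURE;
    LStack[++LStTop].Count = n;
    LStack[LStTop].Start = ++CBfPtr;
    return SUCCESS;
}

int ExeGtr(void)                 /* ">": CBfPtr points at the '>' */
{
    if (LStTop == LStBot)
        return FAILURE;          /* no loop open at this macro level */
    if (--LStack[LStTop].Count > 0) {
        CBfPtr = LStack[LStTop].Start;   /* go round again */
    } else {
        LStTop--;                /* loop finished */
        CBfPtr++;
    }
    return SUCCESS;
}

/* Entering a macro hides the caller's loops; leaving restores them. */
int MacBeg(void)
{
    int OldBot = LStBot;
    LStBot = LStTop;
    return OldBot;
}

void MacEnd(int OldBot)
{
    LStTop = LStBot;
    LStBot = OldBot;
}
```

Inside the macro the caller's entry is still physically on the stack, but
because LStTop equals LStBot the ">" handler cannot see it.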
8.3 Macro Stack
The macro stack is used to preserve context each time a macro is
entered. All important values are pushed onto the stack before a macro is
entered and popped off the stack when the macro is exited. The macro stack
is also used by the EI command, which means it's used when executing
initialization files and mung files.
9 HELP
This section discusses on-line HELP, which is available only under
VAX/VMS.
The HELP command is not documented in the TECO manual distributed by
DEC, even though it is supported in TECO-11 and TECO-32. To get help,
simply type "HELP" followed by a carriage return. HELP is the only TECO
command that is not terminated by double escapes.
Help in TECOC is different than help in TECO-11. In TECO-C,
interactive help mode is entered, so that a user can browse through a help
tree, as he can from DCL. In TECO-C, access is provided to only two
libraries: the library specific to TECO-C (pointed to by logical name
TEC$INIT) and the system help library. To get help on TECO-C, just say
"HELP", with or without arguments. To get help from the system library,
say "HELP/S". I find this easier to use than TECO-11's syntax.
The help library for TECO-C is contained in file TECOC.HLB, which is
generated from TECOC.HLP, which is generated from TECOC.RNH. See file
TECOC.RNH for a description of how to do it. This help library is far
broader than the library for TECO-11, but much of it has yet to be filled
in.
The help library is also the repository for verbose error messages,
which are displayed when the help flag (EH) is set to 3. For systems other
than VMS, the ZHelp function displays verbose text contained in static
memory (see file ZHELP.C).
10 FILE INPUT
TECO has an elegant design that allows high speed input. There are no
linked list data structures to keep track of, and most file input goes
directly to the end of the edit buffer.
TECO-C takes advantage of this by reading normal file input directly
to the end of the edit buffer. After each input call, nothing needs to be
moved; the pointer to the end of the edit buffer is simply adjusted to
point to the end of the new record. The pointer to the end of the edit
buffer (EBfEnd) serves two purposes: it points to the end of the edit
buffer and to the beginning of the input buffer.
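The idea can be sketched with standard C file I/O as below. This is not
the ZRdLin code: the Mem array, its size and the RdLine function are
hypothetical, the edit buffer gap is left out, and EBfEnd is treated here
as the address of the first free byte after the text, which simplifies the
convention used in the examples of the Memory Management section.

```c
#include <stdio.h>
#include <string.h>

#define MEMSIZ 64

char  Mem[MEMSIZ];       /* edit buffer text followed by the input area */
char *EBfEnd = Mem;      /* end of the text, start of the input area */

/* Read one line from f straight into the input area, then make it part
   of the edit buffer by advancing EBfEnd. Returns the number of bytes
   appended, or 0 at end-of-file or when there is no room. */
size_t RdLine(FILE *f)
{
    size_t Room = (size_t)(Mem + MEMSIZ - EBfEnd);
    size_t n;

    if (Room < 2 || fgets(EBfEnd, (int)Room, f) == NULL)
        return 0;
    n = strlen(EBfEnd);  /* fgets adds a terminator; it is not kept */
    EBfEnd += n;         /* nothing is copied, only the pointer moves */
    return n;
}
```

The only work done after the read is the pointer adjustment, which is the
property the text relies on for speed.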
A side effect of this scheme is the sharing of memory between the edit
buffer and the input buffer. When the edit buffer is empty, it can be made
smaller by shrinking the edit buffer gap in order to make the input buffer
larger. Obviously, if the edit buffer needs to be expanded, the input
buffer can suffer before more memory is actually requested from the
operating system. This is easily achieved by moving the pointer to the
"end-of-the-edit-buffer"/ "beginning-of-the-input-buffer".
This scheme works, but provides no support for the other forms of file
input. The EP and ER$ commands provide a complete secondary input stream
which can be open at the same time as the primary stream (two input files
at once). The EI command reads and executes files containing TECO
commands, and is used to execute the initialization file, if one exists.
The EQq command, if implemented, reads the entire contents of a file
directly into a Q-register.
A second problem arises: on each of the open files, the quantum unit
of input is not standard. For A, Y and P commands, a form feed or
end-of-file "terminate" the read. For n:A commands, form feed, end-of-line
or end-of-file "terminate" each read. For EI commands, two escapes or
end-of-file "terminate" the read. The input code must "save" the portion
of an input record following a special character and yield the saved text
when the next command for the file is executed.
The scheme used in TECO-C is to read text from the current input
stream directly to the end of the edit buffer. When the input stream is
switched via an EP or ER$ command, the obvious switching of file descriptors
happens, and any text that's "leftover" from the last read is explicitly
saved elsewhere. Note that this happens VERY rarely, so a malloc/free is
acceptable.
For EI and EQq commands, the input memory following the edit buffer is
used as a temporary input buffer. After the file is read, the text is
copied to a Q-register in the case of EQq and to a separate buffer in the
case of EI.
11 VIDEO
As of 18-Feb-1991, TECO-C supports video only under Unix. The code
was written by Mark Henderson, using the CURSES package. See file
VIDEO.TXT for a discussion of how it works.
12 PORTABILITY
TECO-C was written with portability in mind. The first development
machine was "minimal": a SAGE IV (68000) running CP/M-68k. In that
environment, there was no "make" utility.
Initially, the system-independent code (files that don't start with a
"Z") had absolutely no calls to standard C runtime functions. This was
because I had several problems with the "standard" functions not being
"standard" on different machines. With the onset of ANSI C I've grown less
timid, but the main code still references almost no standard functions.
This is less of a limitation than you might think: TECO-C doesn't use
null-terminated strings. It also doesn't use unions, floating point or bit
fields.
13 PORTING TO A NEW ENVIRONMENT
1. Move the source code to the target machine.
2. Inspect file ZPORT.H. You need to select the compiler you want
the code compiled for. For instance, if you are porting to a Unix
system, then fix ZPORT.H so that the unix identifier is defined
(it is usually defined by default by the compiler). If your
compiler is nothing like anything supported by ZPORT.H, then set
the UNKNOWN identifier.
3. Compile and link. See file AAREADME.TXT for descriptions of how
TECO-C is built in supported environments, and steal like mad.
The problem here is that you need a "Z" file for your environment,
containing all the "Z" functions needed by TECO-C. The easiest
thing to do is copy ZUNKN.C to your own "Z" file and link against
that. For instance, if I ever port TECO-C to a Macintosh, I'll
copy ZUNKN.C to ZMAC.C.
4. Fix things so the compile/link is successful. If you have
compiled with UNKNOWN set, you should get an executable file that
displays a message and dies when the first system-dependent
function is called. The strategy is to fix that function (often
by stealing from the code for other operating systems), relink and
deal with the next message until you have something that works.
Functions should be implemented in roughly the following order:
ZInit, ZTrmnl, ZExit, ZDspCh, ZAlloc, ZRaloc, ZFree, ZChin. This
will give you a TECO with everything but file I/O. You can run
it, add text to the edit buffer, delete text, search, use
expressions and the = sign command (a calculator). Then do file
input: ZOpInp, ZRdLin, ZIClos. Then do file output: ZOpOut,
ZWrBfr, ZOClos, ZOClDe. Use the test macros (*tst*.tec) to test
how everything works (see Testing).
14 TESTING
Testing of TECO-C is performed by executing macros. The macros are
contained in files named TSTxxx.TEC, where xxx is some kind of indication
as to what is tested. For instance, TSTQR.TEC tests q-registers. The test
macros do not test all the functions provided by TECO. They were
originally used to verify that TECO-C performs exactly the same as TECO-11
under the VMS operating system. When I needed to test a chunk of code, I
sometimes did it the right way and wrote a macro.
15 DEBUGGING
A debugging system (very ugly, very useful) is imbedded within the
code. It is conditionally compiled into the code by turning on or off an
identifier (DEBUGGING) defined in the TECOC.H file. When debugging code is
compiled in, you can access it using the ^P command, which is not used by
regular TECO. The ^P command with no argument will display help about how
to use ^P.
If you are working under VMS, it sometimes helps to compare the
execution of TECO-C with TECO-11. Put a test command string into a file.
Use DEFINE/USER_MODE to redirect the output of TECO-C to a file and execute
the macro with TECO-C. Then do the same thing with TECO-11. Use the
DIFFERENCES command to compare the two output files. They should be 100
percent identical.