


                                     TECO-C

                               Programmer's Guide
         (last updated February 18, 1991 to reflect TECO-C version 140)


1  INTRODUCTION

     These notes apply to TECOC version  135,  which  runs  under  VAX/VMS,
MS-DOS,  and  Unix  (SunOS,  which  is BSD).  See file AAREADME.TXT for the
specifics of the operating systems and compilers which have been used.

     TECO-C is meant to be a complete implementation of TECO as defined  by
the  Standard  TECO  User's  Guide  and Language Reference Manual  (in  file
TECO.DOC).  It was written so that the author could move to  many  machines
without knowing many editors.


2  COMPILING AND LINKING

     Conditional compilation directives are used to build TECO-C  correctly
for   different   environments.    Identifiers   automatically  defined  by
the different compilers are used.  Some identifiers defined in file ZPORT.H
control  whether  video  support  or extra debugging code is included.  See
"VIDEO" and "DEBUGGING". Files are provided which "build" TECO-C in various
environments.  See file AAREADME.TXT for details.


3  RUNNING TECO-C

     When you run TECO, the command line used to invoke TECO is parsed  for
an input/output file name and several optional switches. TECO-11 parses the
command line using a TECO macro imbedded in the program.  TECO-C  does  the
same  thing.   Actually,  the imbedded macro used to parse the command line
was stolen from TECO-11.  I commented it and then  modified  it  to  repair
minor  inconsistencies.   Use  of  TECO-11's  macro makes TECO-C invocation
identical to TECO-11's, even responding to "make love" with "Not war?".

     The macro is in file CLPARS.TES.  The compressed version (no  comments
or  whitespace)  is  in  file  CLPARS.TEC.   The  GENCLP  program  converts
CLPARS.TEC into CLPARS.H, an  include  file  suitable  for  compiling  into
TECO-C.


4  CODE CONVENTIONS

     The code is not modular.  Communication between almost  all  functions
is  through  global variables, not argument lists.  There is a reason:  the
nature of the basic parsing algorithm is  to  use  the  characters  in  the
command  string  as indices into a table of functions.  This makes for very
fast command parsing, but it means that all the functions  have  to  modify
global  values,  because no arguments are passed in.  In other words, there
were going to be 130 or so un-modular functions anyway,  so  I gave  up  on
modularity.  This explanation does not explain some of the complications in
the search code, like the global variable SrcTyp.  Oh, well.

     Here's a brief list of some of the conventions followed by the code:

     1.  TECO-C is portable, so some  convention  was  needed  to  separate
         portable  code  from  system-dependent  code.   There  is one file
         containing the system-dependent  code  for  each  platform  TECO-C
         supports.   These files have names that start with a "Z":  ZVMS.C,
         ZMSDOS.C and ZUNIX.C.

         All the system-dependent functions in those  files  start  with  a
         "Z".   For  example,  the function that allocates memory is called
         ZAlloc.  A VMS version of ZAlloc can be found in  ZVMS.C,  and  an
         MS-DOS version can be found in ZMSDOS.C.

         An extra file called ZUNKN.C exists to help efforts to port TECO-C
         to  a  new  environment.   This  file  contains  stubs for all the
         system-dependent functions.

     2.  All   system-independent    global    variables    are    declared
         alphabetically  in  file  TECOC.C.   They are defined in DEFEXT.H,
         which is included by all modules.

     3.  File TECOC.H contains the "global"  definitions,  including  those
         for most structures.

     4.  Variables  and  functions  are  defined  using   the   portability
         identifiers  defined  in the ZPORT.H file.  Functions which do not
         return a value are defined as VOID.   TECO-C  should  compile  and
         link  on  different  machines  by  changing  only  the environment
         definitions in the ZPORT.H file.

     5.  At one time, every function was in a file with the same  name
         as  the  function.   This  made  it  easy  to  find the code for a
         function.  The problem was that some groups of functions use  data
         not  needed  by  the other functions.  This was especially true of
         the system-dependent functions.  Also, some functions were  called
         only by one other function, so it made sense for them to be in the
         same module as the caller and be  made  "static".   So  now,  most
         functions  are  in a file named the same as the function, with the
         following exceptions:

         1.  All the "Z" functions are in the "Z" file for  the  given
             system.

         2.  The conditionally-compiled functions (ZCpyBl in  ZINIT.C,  the
             "Dbg" functions at the bottom of TECOC.C, the "v" functions in
             EXEW.C) aren't in their own files.  If  they  were,  then  the
             command procedures/makefiles that compile the files would need
             to contain logic to conditionally compile the files.

         3.  The functions for the "E" and "F" commands are in  EXEE.C  and
             EXEF.C,  respectively.  So if you want to find function ExeEX,
             don't look for a file named EXEEX.C.


     6.  Symbols are 6 characters long or less.  The  way  I  remember  it,
         this  was caused by the first system I wrote TECOC for:  CP/M-68k,
         which had a limit of 8 characters for file names.   The  last  two
         characters  had to be ".C", so 6 characters were left for the file
         name.  Since the file  name  was  the  same  as  the  function  it
         contained, functions were limited to 6 characters in length.  When
         I saw how nicely the function declarations looked (they fit in one
         tab slot), I used 6 characters for other symbols too.

         I've since been told that  CP/M-68k  has  8-character  file  names
         followed  by  3-character file types, so CP/M-68k can't be blamed.
         So shoot me.

         This standard has prevented problems from occurring with compilers
         that don't support very many characters of uniqueness.

         In order to make up for the resultant  cryptic  names,  upper  and
         lower  case  are mixed.  An uppercase letter indicates a new word.
         For example, "EBfEnd" stands for "Edit Buffer End".  If  you  need
         to  know what a variable name means, look at the definition of the
         variable in DEFEXT.H.  The expanded  version  of  the  abbreviated
         name  appears  in  the  comment  on  the same line as the variable
         definition.   A  detailed  description  can  be  found  with   the
         declaration of the variable in TECOC.C.

         The  limit  of  6  letters  in  variable  names  is   relaxed   in
         system-dependent code.

     7.  Variable and function names follow patterns where  possible.   For
         instance,  "EBfBeg"  and "EBfEnd" are the beginning and end of the
         edit buffer.  If you see a variable named "BBfBeg", you can assume
         that it is the beginning of some other buffer, and that a "BBfEnd"
         exists which is the end of that buffer.

     8.  Character strings are usually represented in C by a pointer  to  a
         sequence of bytes terminated with a null character.  I didn't  do
         that in TECO-C because I thought it was too inefficient:  to  get
         the length of such a string, you have to count  characters.   Most
         strings in TECO-C are therefore represented by two  pointers,  one
         to the first character and one to the character following the last
         character.  With this representation, it's easy to add  characters
         to a string and trivial to get the length.

     9.  Each file has a consistent format, which is:

         1.  a comment describing the function
         2.  include directives
         3.  the function declaration
         4.  local variable definitions, in alphabetical order
         5.  code
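
     As an illustration of convention 8 above, here is a minimal sketch of
a two-pointer string.  It is not taken from the TECO-C source:  the type
TStr and the functions StrLen and StrAdd are hypothetical names, chosen
only to fit the 6-character convention.

```c
#include <stddef.h>

/* Hypothetical type: a string held as the range [Beg, End), where End
 * points one past the last character.  No terminating null is needed. */
typedef struct {
    char *Beg;          /* first character */
    char *End;          /* character following the last character */
} TStr;

/* The length is a pointer subtraction, not a scan for '\0'. */
static size_t StrLen(const TStr *s)
{
    return (size_t)(s->End - s->Beg);
}

/* Append one character if there is room before Lim, which points one
 * past the end of the storage.  Returns 1 on success, 0 if full. */
static int StrAdd(TStr *s, const char *Lim, char c)
{
    if (s->End >= Lim)
        return 0;
    *s->End++ = c;
    return 1;
}
```

     With this layout an empty string is simply one where Beg equals End,
and appending a character never requires finding the end first.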


5  TOP LEVEL EXECUTION AND COMMAND PARSING

     The top level code for TECO-C is contained in  file  TECOC.C.   It  is
very  simple:   after initializing, a loop is entered which reads a command
string from the user, executes it, and loops back to read  another  command
string.   If  the  user executes a command which causes TECO-C to exit, the
program is exited directly via a call to the TAbort function.  TECO-C never
exits by "falling out the bottom" of the main function.

     After a command string is read,  the  ExeCSt  function  is  called  to
execute  the  command  string.  ExeCSt contains the top-level parsing code.
The parse is trivial:  each command character is used as an  index  into  a
table  of  functions.   The  table  contains  one entry for each of the 128
possible characters.  Each function  is  responsible  for  "consuming"  its
command  so  that when it returns, the command string pointer points to the
next command.
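
     The table-driven parse described above can be sketched as follows.
This is a minimal illustration under stated assumptions, not the code in
TECOC.C:  the names CmdTab, CBfPtr, CStEnd, InitTab, RunCSt and the three
sample command functions are hypothetical, as are the values chosen for
SUCCESS and FAILURE.

```c
#define SUCCESS 0               /* assumed values, for illustration only */
#define FAILURE (-1)

typedef int (*CmdFn)(void);     /* command functions take no arguments */

static const char *CBfPtr;      /* hypothetical: next command character */
static const char *CStEnd;      /* hypothetical: one past the last one */
static int NCount;              /* demo state changed by the '+' command */

/* Each function "consumes" its own command by advancing CBfPtr. */
static int ExeNop(void) { CBfPtr++; return SUCCESS; }
static int ExeInc(void) { CBfPtr++; NCount++; return SUCCESS; }
static int ExeIll(void) { return FAILURE; }     /* unknown command */

static CmdFn CmdTab[128];       /* one entry per 7-bit character */

static void InitTab(void)
{
    int i;

    for (i = 0; i < 128; i++)
        CmdTab[i] = ExeIll;
    CmdTab[' '] = ExeNop;
    CmdTab['+'] = ExeInc;
}

/* Execute the command string [Beg, End): index the table with each
 * command character and stop at the first FAILURE. */
static int RunCSt(const char *Beg, const char *End)
{
    CBfPtr = Beg;
    CStEnd = End;
    while (CBfPtr < CStEnd) {
        if (CmdTab[*CBfPtr & 0x7F]() == FAILURE)
            return FAILURE;
    }
    return SUCCESS;
}
```

     Note that the functions receive no arguments, so everything they
need (here CBfPtr and NCount) has to be global, which is the point made
under CODE CONVENTIONS.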


5.1  Error Handling

     When an error is detected, an error message is displayed at the  point
that  the  error  is  detected,  and  the  function  in which the error was
detected returns a FAILURE status to its caller.  Almost always, the caller
returns  a FAILURE status to its caller, which returns a FAILURE status  to
its caller, etc.  When a FAILURE status is returned to  the  main  command
string parser, parsing of the command string stops and the user is prompted
for a new command string.

     This style tends to cause all function calls to follow the same  form,
which is

                    if (function() == FAILURE)
                            return(FAILURE);

     Things get more complicated in the system-dependent code (in the files
with  names  that  start  with  a  "Z").  I extended TECO's error reporting
slightly to allow the user to see the  operating  system's  reason  for  an
error,  as this is often useful.  For example, under VAX/VMS there are many
reasons why an attempt to create an output file might fail.  They  include:
errors  in  file  name  syntax,  destination  directory non-existence, file
protection violations or disk quota violation.  In order to  supply  enough
information to the user, TECO-C outputs multiple-line error messages when a
system error occurs.

     Multiple-line error messages  contain  one  line  that  describes  the
operating  system's  perception  of  the error and one line that  describes
TECO's perception of the error.  For instance, if a user of VAX/VMS does  a
"EW[abc]test.txt$$"  command  when  the directory [abc] does not exist, the
error message generated by TECO-C is:

            ?SYS   %RMS-F-DNF, directory not found
            ?UFO   unable to open file "[abc]test.txt" for output

     System errors are therefore reported in  a  system-dependent  fashion,
using  whatever  messages  the operating system can supply.  Under VAX/VMS,
the system service $GETMSG provides human-readable messages that TECO-C can
use  in the "SYS" part of the error message.  Under UNIX, sys_errlist[errno]
is a pointer to these messages.
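
     A two-line message of this kind can be sketched in portable C as
shown below.  This is an illustration only:  it uses the standard
strerror function in place of sys_errlist, and the names FmtErr and
OpnOut, the message buffer size and the SUCCESS/FAILURE values are
assumptions, not the code found in the "Z" files.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define SUCCESS 0               /* assumed values, for illustration only */
#define FAILURE (-1)

/* Build the two-line message: the system's view, then TECO's view. */
static void FmtErr(char *Buf, size_t Len, int SysErr, const char *FName)
{
    snprintf(Buf, Len,
             "?SYS   %s\n?UFO   unable to open file \"%s\" for output\n",
             strerror(SysErr), FName);
}

/* Try to open an output file.  On failure, display the message at the
 * point of detection and return FAILURE for the caller to propagate. */
static int OpnOut(const char *FName, FILE **Out)
{
    char Msg[512];

    *Out = fopen(FName, "w");
    if (*Out == NULL) {
        FmtErr(Msg, sizeof Msg, errno, FName);
        fputs(Msg, stderr);
        return FAILURE;
    }
    return SUCCESS;
}
```

     A caller would use it in the usual form, returning FAILURE itself
when OpnOut returns FAILURE.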

     There is another way in which error reporting in the  system-dependent
code is tricky.  Under VAX/VMS, some system calls may return a code that is
"successful" but contains extra information.  For instance, when a user has
set his directories so that only a limited number of versions of a file can
exist, RMS will automatically purge the oldest version of the file when the
user  creates  a  file.   This only happens if the newly created file would
cause too many versions of the file to exist.  When this happens,  the  VMS
service  returns  a FILEPURGED status, which is successful.  TECO-C informs
the user about these things by displaying the message in brackets.


5.2  Command Modifiers (CmdMod)

     Command parsing  is  complicated  by  command  modifiers  and  numeric
arguments, which may precede some commands.  These are implemented in a way
that maintains the basic "jump table" idea.  For instance, when an  at-sign
(@)  modifier  is  encountered  in  a  command  string, the at-sign command
function (ExeAtS) is called.  The only thing ExeAtS  does  is  set  a  flag
indicating  that  an  at-sign  has  been  encountered.   Commands which are
affected by an at-sign modifier check this flag and behave accordingly.

     The flags which indicate command modifiers  are  contained  in  global
variable  CmdMod.   A  bit in CmdMod is reserved for each command modifier.
The modifiers are "@", ":" and "::".  Of course, once  the  flag  has  been
set,  it  must be cleared.  With this parsing algorithm, the only way to do
that is to make every command function explicitly  reset  CmdMod  before  a
successful  return.  This is not too bad:  clearing all the flags in CmdMod
is done with one statement:  "CmdMod = '\0';".

     For numeric arguments to commands, an expression stack  is  used  (see
Stacks).   The  EStTop variable is the pointer to the top of the expression
stack.  Commands which handle numeric arguments check EStTop to see if  the
expression stack contains a value.

     A special case of  numeric  arguments  is  "m,n".   The  "m"  part  is
encountered  and  causes  the value to be pushed onto the expression stack.
The comma causes the ExeCom function to  move  the  value  into  a  special
"m-argument"  global  variable (MArgmt), clear the expression stack and set
another flag in CmdMod indicating that the "m" part of  an  "m,n"  pair  is
defined.   Then the "n" is encountered and pushed onto the stack.  Commands
which can take "m,n" pairs check the flag in CmdMod.

     To summarize, CmdMod and  EStTop  are  variables  which  describe  the
context  of  a command.  Each command function tests these variables to see
if it was preceded by modifiers or  numbers.   For  this  to  work,  it  is
important  that the expression stack and the flags in CmdMod are cleared at
the right times.  It is the responsibility  of  each  command  function  to
leave  CmdMod  and  EStTop  with  the  proper  values  before  successfully
returning.  The rules are:

     1.  If the command function is returning FAILURE,  don't  worry  about
         clearing  CmdMod  or EStTop.  They will be cleared before the next
         command string is executed.

     2.  If the command function leaves a value on the expression stack, do
         not  clear  EStTop before returning SUCCESS.  If the command calls
         GetNmA,  do  not  clear  EStTop,  as  GetNmA  does  it  for   you.
         Otherwise, clear EStTop before returning SUCCESS.

     3.  Clear CmdMod unless the command function sets flags  or  needs  to
         leave  them alone.  ExeDgt, for example, handles digit strings and
         doesn't clear CmdMod because the MARGIS bit may be set.
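
     The rules above can be illustrated with a small sketch.  Everything
here is an assumption made for the example except the names CmdMod,
EStTop, MArgmt, MARGIS, ExeAtS and ExeCom, which appear in the text:
the bit values, the "-1 means empty" convention for EStTop, the helper
PshNum and the sample command ExeDmo are hypothetical.

```c
#define SUCCESS 0               /* assumed values, for illustration only */
#define FAILURE (-1)

/* Hypothetical bit assignments for the modifier flags. */
#define ATSIGN 0x01             /* "@" was seen */
#define COLON  0x02             /* ":" was seen */
#define DCOLON 0x04             /* "::" was seen */
#define MARGIS 0x08             /* the "m" of an "m,n" pair is defined */

#define ESTSIZ 16

static char CmdMod;             /* command modifier flags */
static long EStack[ESTSIZ];     /* simplified: operands only */
static int  EStTop = -1;        /* assumption: -1 means "stack empty" */
static long MArgmt;             /* the "m" of "m,n" */
static long DmoLo, DmoHi;       /* results of the demo command */

static int PshNum(long n)       /* hypothetical helper: push a number */
{
    if (EStTop + 1 >= ESTSIZ)
        return FAILURE;
    EStack[++EStTop] = n;
    return SUCCESS;
}

/* "@": only sets a flag, and so must not clear CmdMod (rule 3). */
static int ExeAtS(void) { CmdMod |= ATSIGN; return SUCCESS; }

/* ",": move the value to MArgmt, clear the stack, set MARGIS. */
static int ExeCom(void)
{
    if (EStTop < 0)
        return FAILURE;
    MArgmt = EStack[EStTop];
    EStTop = -1;
    CmdMod |= MARGIS;
    return SUCCESS;
}

/* A made-up command that accepts "n" or "m,n". */
static int ExeDmo(void)
{
    if (EStTop < 0)
        return FAILURE;         /* rule 1: no cleanup on FAILURE */
    DmoHi = EStack[EStTop];
    DmoLo = (CmdMod & MARGIS) ? MArgmt : 0;
    EStTop = -1;                /* rule 2: nothing is left on the stack */
    CmdMod = '\0';              /* rule 3: this command sets no flags */
    return SUCCESS;
}
```

     Running the equivalent of "3,7" followed by the demo command leaves
DmoLo at 3 and DmoHi at 7, with CmdMod and EStTop back in their cleared
state for the next command.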


6  SEARCHING

     The search algorithm in TECO-C is complex.  The war between the desire
for  a  fast search and the need to handle all the features of TECO's search
commands has produced code which can  be  a  real  pain  to  follow.   This
section  attempts  to explain how things got the way they are.  The code is
explained in a bottom-up fashion, to follow  the  way  it  evolved  in  the
author's twisted mind.

     The basic search idea is to scan a contiguous edit buffer for a search
string.  The steps are:

     1.  Search the edit buffer for  the  first  character  in  the  search
         string.  If you reach the end of the edit buffer without matching,
         the search fails.

     2.  When the first character of the search string matches a  character
         in  the  edit  buffer,  try  to match successive characters in the
         search string with the characters which follow the found character
         in  the  edit buffer.  If they all match, the search succeeds.  If
         one doesn't, go back to step 1.
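
     The two steps above can be sketched directly in C.  This is a
simplified illustration rather than the TECO-C search code:  the function
name BasSrc and its arguments are hypothetical, and both the buffer and
the search string are passed as [begin, end) pointer pairs.

```c
#include <stddef.h>

/* Search the buffer [BBeg, BEnd) for the string [SBeg, SEnd).
 * Returns a pointer to the start of the match, or NULL on failure. */
static const char *BasSrc(const char *BBeg, const char *BEnd,
                          const char *SBeg, const char *SEnd)
{
    const char *p, *b, *s;

    if (SBeg == SEnd)
        return NULL;            /* empty string: treated as failure here */

    for (p = BBeg; p < BEnd; p++) {
        if (*p != *SBeg)        /* step 1: find the first character */
            continue;
        b = p + 1;              /* step 2: compare what follows */
        s = SBeg + 1;
        while (s < SEnd && b < BEnd && *b == *s) {
            b++;
            s++;
        }
        if (s == SEnd)
            return p;           /* every character matched */
    }
    return NULL;                /* reached the end without matching */
}
```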


     This is basically what TECO-C does.  The  features  of  TECO's  search
commands have buried these steps deep within some confusing code.

     The first complication is introduced by pattern  matching  characters.
TECO  has  17 "match constructs", which  are indicated in the search string
by the special characters ^X, ^S, ^N and ^Ex where "x" can be several other
characters.   For  instance,  a  ^X  in  the  search  string means that any
character is to be accepted as a match in  place  of  the  ^X.   Characters
other  than  the  match  constructs represent themselves.  An example:  the
search string "a^Xb" contains 3 match constructs:  a, ^X and b.

     TECO also supports forward  or  backward  searching.   When  searching
backwards,  only  the  search  for  the first match construct in the search
string is done in a backwards direction.  When the character is found,  the
characters  following  it  are  compared in a forward direction to the edit
buffer characters.  This means that once the first match construct has been
found,  a single piece of code can be used to compare successive characters
in the search  string  with  successive  characters  in  the  edit  buffer,
regardless of whether the search is forwards or backwards.

     Adding these new features, the new description of searching is:

     1.  Search the edit buffer forwards or backwards for a character which
         matches  the  first  match construct in the search string.  If you
         reach the end of the edit  buffer  without  matching,  the  search
         fails.

     2.  When the first match construct of  the  search  string  matches  a
         character  in  the  edit  buffer,  try  to  match successive match
         constructs in the search string with the characters  which  follow
         the  found  character  in the edit buffer.  If they all match, the
         search succeeds.  If one doesn't, go back to step 1.
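
     The revised steps can be sketched as follows.  This is an
illustration under stated assumptions, not the SSerch/CMatch code:  only
the ^X construct is handled (as the single character CTRL/X), the buffer
is addressed by offsets, and the names OneMat and DirSrc are hypothetical.

```c
#define CTRL_X '\030'           /* ^X: matches any single character */

/* Does one match construct accept the buffer character c? */
static int OneMat(char Cons, char c)
{
    return Cons == CTRL_X || Cons == c;
}

/* Search Buf[0..Len-1] for Str[0..SLen-1], starting at offset Start.
 * Dir < 0 scans backwards, otherwise forwards.  Returns the offset of
 * the match, or -1 if the search fails. */
static long DirSrc(const char *Buf, long Len, long Start, int Dir,
                   const char *Str, long SLen)
{
    long i, j;
    long Step = (Dir < 0) ? -1 : 1;

    if (SLen <= 0)
        return -1;
    for (i = Start; i >= 0 && i < Len; i += Step) {
        /* step 1: find the first match construct, in either direction */
        if (!OneMat(Str[0], Buf[i]))
            continue;
        /* step 2: compare the rest, always in a forward direction */
        for (j = 1; j < SLen && i + j < Len; j++)
            if (!OneMat(Str[j], Buf[i + j]))
                break;
        if (j == SLen)
            return i;
    }
    return -1;
}
```

     Only the outer loop knows about the direction.  The inner loop is
the "single piece of code" that serves both forward and backward
searches.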


     To begin a description of which routines implement  the  above  steps,
and  in  order  to  have  a  reference  for later discussion, the following
hierarchy chart of "who calls who" is presented.


 ExeEUn ExeFB ExeFC ExeFD ExeFK ExeFN ExeFS ExeFUn ExeN ExeS ExeUnd
    |     |     |     |     |     |     |     |      |    |    |
    |     |     |     |     |     |     |     |      |    |    |
    ------------------------------------------------------------
                                  |
                                  V
                                Search
                                  |
                                  V
                                SrcLop
                                  |
                                  V
                                SSerch
                               |  |  |
                        +------+  |  +------+
                 +---+  |         |         |  +---+
                 |   V  V         |         V  V   |
                 |  ZFrSrc        |        BakSrc  |
                 |   |  |         |         |  |   |
                 +---+  |         |         |  +---+
                        +------+  |  +------+
                               V  V  V
                                CMatch  <--+
                                  |        |
                                  +--------+


     At the top are the functions that implement search commands  (E_,  FB,
FC,  FD, FK, FN, FS, F_, N, S and _).  All of these functions call the main
search function:  Search.

     At the lower level are the functions which implement  steps  1  and  2
described   above.   ZFrSrc  searches  forwards  in  the  edit  buffer  for
characters which match the first character in the  search  string.   BakSrc
does the same thing, but searches backwards.  SSerch calls one of these two
functions and then executes a loop which calls CMatch to compare successive
match  constructs  in  the  search string to characters following the found
character in the edit buffer.  The reason that ZFrSrc,  BakSrc  and  CMatch
call themselves is to handle some of the more esoteric match constructs.

     Case dependence in TECO is controlled by the search mode flag (see the
^X  command).  The variable SMFlag holds the value of the search mode flag,
and is used by ZFrSrc, BakSrc and CMatch.

     One final point to help confuse things:  ZFrSrc  is  system-dependent.
It  contains  a  VAX/VMS-specific version which uses the LIB$SCANC run-time
library routine to access the SCANC  instruction.   The  SCANC  instruction
looks  like  it was designed to handle TECO's match constructs.  I couldn't
resist using it, but it was a mistake,  as  it  needlessly  complicates  an
already  messy  algorithm.   I have decided to remove the VMS-specific code
some time in the future.

     Further complications of the search algorithm  arise  because  of  the
following capabilities of TECO searches:

     1.  If there is no text argument, use the previous search argument.

     2.  If colon-modified, return success or failure without displaying an
         error message.

     3.  If the search fails and we're in a loop and  a  semicolon  follows
         the  search  command,  exit  the  loop without displaying an error
         message.

     4.  Handle optional repeat counts.

     5.  If the ES flag is non-zero, verify the search based on  the  value
         of the flag.

     6.  If bit 64 of the ED flag is set,  move  dot  by  one  on  multiple
         searches.

     7.  If bit 16 of the ED flag  is  set,  don't  move  after  a  failing
         search.

     8.  Be fast.


7  MEMORY MANAGEMENT

7.1  The Edit Buffer And Input Buffer

     TECO-C is based on TECO-11, but it  uses  a  different  form  of  edit
buffer memory management.  Here's why.

     The edit buffer in TECO-11 is implemented as  a  continuous  block  of
memory.   This  allows  rapid  movement  through  the  edit buffer (by just
maintaining a  pointer  to  the  current  spot)  and  makes  searches  very
straightforward.  Insertion and deletion of text is expensive, because each
insertion or deletion requires moving the text following the spot where the
insertion  or  deletion  occurs  in order to maintain a continuous block of
memory.  This gets to be a real pain when a  video  editing  capability  is
added to TECO, because in video mode text is added/deleted one character at
a time very rapidly.

     TECO-C uses an edit buffer gap scheme.  The  edit  buffer  occupies  a
continuous piece of memory, but there is a gap at the "current spot" in the
edit buffer.  When the user moves around the edit buffer, the gap is  moved
by  shuffling  text from one side of the gap to the other.  This means that
moving around the edit buffer is slower than for TECO-11's scheme, but text
insertion  and deletion is very fast.  Searches are still fast because most
searches start at the current spot and  go  forwards  or  backwards,  so  a
continuous  piece  of memory is searched.  In the future, when some kind of
video mode is added, insertion and deletion one-character-at-a-time will be
fast using the gap scheme.

     The variables that maintain pointers to the edit buffer  and  the  gap
within the buffer can be confusing, so here are some examples. Suppose that
10000 bytes are allocated for the edit buffer when TECO-C  is  initialized.
Suppose the allocated memory starts at address 3000.

        Empty edit buffer (the gap spans the whole edit buffer):

                EBfBeg = 3000           (edit buffer beginning)
                GapBeg = 3000           (gap beginning)
                GapEnd = 13000          (gap end)
                EBfEnd = 13000          (edit buffer end)

        Buffer contains "test",  character pointer is before the first 't':

                EBfBeg = 3000           (edit buffer beginning)
                GapBeg = 3000           (gap beginning)
                GapEnd = 12996          (gap end)
                         12997  't'
                         12998  'e'
                         12999  's'
                EBfEnd = 13000  't'     (edit buffer end)


        Buffer contains "test",  character pointer is after the last 't':

                EBfBeg = 3000   't'     (edit buffer beginning)
                         3001   'e'
                         3002   's'
                         3003   't'
                GapBeg = 3004           (gap beginning)
                GapEnd = 13000          (gap end)
                EBfEnd = 13000          (edit buffer end)


        Buffer contains "test",  character pointer is after the 'e':

                EBfBeg = 3000   't'     (edit buffer beginning)
                         3001   'e'
                GapBeg = 3002           (gap beginning)
                GapEnd = 12998          (gap end)
                         12999  's'
                EBfEnd = 13000  't'     (edit buffer end)

     When an insertion command is executed, the text is  inserted  starting
at  GapBeg.  When a deletion command is executed, GapEnd is incremented for
a forward delete or GapBeg is decremented for a backwards delete.  When the
character  pointer  is moved forwards, the gap is moved forwards by copying
text from the end of the gap to the beginning.  When the character  pointer
is moved backwards, the gap is moved backwards by copying  text  from  the
area just before the gap to the area at the end of the gap.
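
     The operations just described can be sketched with a tiny buffer.
This is an illustration, not the TECO-C code:  it uses offsets into a
static array instead of pointers, it assumes (from the examples above)
that GapEnd and EBfEnd refer to the last byte of the gap and of the
buffer, and the names EBuf, EBFSIZ, EBfIni, GapSiz, InsChr, DelChr and
MovDot are hypothetical.

```c
#include <string.h>

#define EBFSIZ 16               /* hypothetical, tiny for illustration */

static char EBuf[EBFSIZ];
static long EBfBeg, GapBeg, GapEnd, EBfEnd;     /* offsets into EBuf */

static void EBfIni(void)        /* empty buffer: the gap spans it all */
{
    EBfBeg = GapBeg = 0;
    GapEnd = EBfEnd = EBFSIZ - 1;
}

static long GapSiz(void) { return GapEnd - GapBeg + 1; }

static int InsChr(char c)       /* insert at the character pointer */
{
    if (GapSiz() == 0)
        return 0;
    EBuf[GapBeg++] = c;
    return 1;
}

static int DelChr(long n)       /* n > 0 forwards, n < 0 backwards */
{
    if (n > 0) {
        if (n > EBfEnd - GapEnd) return 0;
        GapEnd += n;            /* the gap swallows text after it */
    } else {
        if (-n > GapBeg - EBfBeg) return 0;
        GapBeg += n;            /* the gap swallows text before it */
    }
    return 1;
}

static int MovDot(long n)       /* move the character pointer by n */
{
    if (n > 0) {                /* copy from after the gap to before it */
        if (n > EBfEnd - GapEnd) return 0;
        memmove(EBuf + GapBeg, EBuf + GapEnd + 1, (size_t)n);
        GapBeg += n;
        GapEnd += n;
    } else if (n < 0) {         /* copy from before the gap to after it */
        n = -n;
        if (n > GapBeg - EBfBeg) return 0;
        GapBeg -= n;
        GapEnd -= n;
        memmove(EBuf + GapEnd + 1, EBuf + GapBeg, (size_t)n);
    }
    return 1;
}
```

     Inserting "test" and then calling MovDot(-2) reproduces the last
example above at a smaller scale:  't' and 'e' stay before the gap while
's' and 't' end up in the last two bytes of the buffer.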

     There are a few messy cases, such as when a bounded search is executed
and  the bounded text area includes the edit buffer gap.  In this case, the
gap is temporarily moved so that the search can proceed over  a  continuous
memory area.

     In order to confuse things a little, TECO-C has one  addition  to  the
basic  edit  buffer  gap  management.  Following the end of the edit buffer
(EBfEnd) is the current input stream buffer.   Since  file  input  commands
always  cause  text  to  be appended to the end of the edit buffer, this is
natural.  Thus, no input buffer is needed:  text is input directly into the
edit  buffer.   This  makes  the code a little confusing, but it avoids the
problem of having an input buffer.  When you have an input buffer, you have
to  deal with the question of how large the buffer should be and what to do
with it when it's too small.  This  scheme  is  fast  and  saves  some
memory.  (See File Input.)


7.2  Q-registers

     Q-registers have two parts:  a numeric part and  a  text  part.   Each
q-register  is  represented by a structure containing three fields:  one to
hold the numeric part and two to point to the  beginning  and  end  of  the
memory holding the text part.  If the text part of the q-register is empty,
then the pointer to the beginning of the text is NULL.

     There are 36 global q-registers, one for each letter of  the  alphabet
and  one for each digit from 0 to 9.  These q-registers are accessible from
any macro level.  There are 36 local q-registers for each macro level.  The
names  for  local  q-registers  are preceded by a period.  Thus the command
"1xa" inserts a line into global q-register "a", while the  command  "1x.a"
inserts  a line into local q-register ".a".  Storage for the data structure
defining local q-registers is not allocated until  a  local  q-register  is
first  used.   This  saves  space  and  time, because local q-registers are
rarely used, and doing things this way avoids allocating and freeing memory
every time a macro is executed.
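
     The structure and the deferred allocation of local q-registers can
be sketched as follows.  The names QReg, MLevel, GlbQRg, QIndex and
FndQRg are hypothetical, as are the field names.  The sketch assumes
ASCII, treats upper and lower case names as the same q-register, and
assumes the end pointer is "one past the end" in keeping with the string
convention described earlier.

```c
#include <stdlib.h>

#define NQREGS 36               /* 26 letters plus 10 digits */

typedef struct {
    long  Number;               /* numeric part */
    char *TxtBeg;               /* start of text part, NULL when empty */
    char *TxtEnd;               /* one past the end of the text part */
} QReg;

static QReg GlbQRg[NQREGS];     /* global q-registers */

typedef struct {
    QReg *LclQRg;               /* NULL until a local one is first used */
} MLevel;

/* Map a q-register name to 0..35, or -1 if it is not a valid name. */
static int QIndex(char Name)
{
    if (Name >= 'A' && Name <= 'Z') return Name - 'A';
    if (Name >= 'a' && Name <= 'z') return Name - 'a';
    if (Name >= '0' && Name <= '9') return 26 + (Name - '0');
    return -1;
}

/* Find a q-register.  Storage for the local set of a macro level is
 * allocated only when one of them is first referenced. */
static QReg *FndQRg(MLevel *Lvl, int IsLocal, char Name)
{
    int i = QIndex(Name);

    if (i < 0)
        return NULL;
    if (!IsLocal)
        return &GlbQRg[i];
    if (Lvl->LclQRg == NULL) {
        /* calloc zero-fills; on common platforms that gives NULL
         * text pointers and a zero numeric part */
        Lvl->LclQRg = calloc(NQREGS, sizeof(QReg));
        if (Lvl->LclQRg == NULL)
            return NULL;
    }
    return &Lvl->LclQRg[i];
}
```

     A macro that never touches a local q-register leaves LclQRg NULL,
so entering and leaving that macro costs no allocation at all.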


8  STACKS

8.1  Expression Stack

     An expression stack is used to parse TECO's expressions.  Consider the
command string QA+50=$$.  When the command string is executed, the value of
QA is pushed on the expression stack, then the operator "+"  is  pushed  on
the  expression  stack, and then the value "50" is pushed on the expression
stack.  Whenever a full expression that can be reduced is on the expression
stack, it is reduced.  For the above example, the stack is reduced when the
value "50" is pushed.

     The expression stack is implemented in the following variables:

        EStack  the stack itself,  containing saved operators and operands
        EStTop  index of the top element in EStack
        EStBot  index of the current "bottom" of the stack in EStack

     The "bottom" of the expression stack can change because an  expression
can  include  a macro invocation.  For example, the command QA+M3=$$ causes
the value of "QA" to be pushed on the expression stack,  then  the  "+"  is
pushed,  and  then  the  macro  contained in q-register 3 is executed.  The
macro in q-register 3 returns a value to be used in the  expression.   When
the macro is entered, a new expression stack "bottom" is established.  This
allows  the  macro  to  have  a  "local"  expression  stack  bottom   while
maintaining the stack outside the macro.
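
     The effect of the floating bottom can be shown with a small sketch.
It is an illustration only:  the element layout, the stack size, the
functions Reduce, Push, MacBeg and MacEnd, and the convention that EStBot
is the index just below the current level's first element are all
assumptions.  Only the names EStack, EStTop and EStBot come from the text.

```c
#define SUCCESS 0               /* assumed values, for illustration only */
#define FAILURE (-1)
#define EXSSIZ  32

typedef struct {
    int  IsOp;                  /* nonzero: Val holds an operator char */
    long Val;
} ESElem;

static ESElem EStack[EXSSIZ];
static int EStBot = -1;         /* just below this level's first element */
static int EStTop = -1;         /* top element, equals EStBot when empty */

/* Reduce "operand operator operand" while it sits above EStBot. */
static void Reduce(void)
{
    while (EStTop - EStBot >= 3 && !EStack[EStTop].IsOp &&
           EStack[EStTop - 1].IsOp && !EStack[EStTop - 2].IsOp) {
        long r = EStack[EStTop].Val;
        long l = EStack[EStTop - 2].Val;

        switch ((char)EStack[EStTop - 1].Val) {
        case '+': l += r; break;
        case '-': l -= r; break;
        case '*': l *= r; break;
        default:  return;       /* operator not handled in this sketch */
        }
        EStTop -= 2;
        EStack[EStTop].Val = l;
    }
}

static int Push(int IsOp, long Val)
{
    if (EStTop + 1 >= EXSSIZ)
        return FAILURE;
    EStTop++;
    EStack[EStTop].IsOp = IsOp;
    EStack[EStTop].Val = Val;
    if (!IsOp)
        Reduce();               /* reduce whenever an operand arrives */
    return SUCCESS;
}

/* Macro entry: the current top becomes the new bottom. */
static int MacBeg(void) { int Old = EStBot; EStBot = EStTop; return Old; }

/* Macro exit: restore the old bottom and hand back the macro's value. */
static int MacEnd(int OldBot)
{
    int  HasVal = (EStTop > EStBot);
    long v = HasVal ? EStack[EStTop].Val : 0;

    EStTop = EStBot;            /* discard the macro's own entries */
    EStBot = OldBot;
    return HasVal ? Push(0, v) : SUCCESS;
}
```

     If QA holds 10 and the macro computes 4+5, the 4 pushed inside the
macro is not combined with the pending "10 +" outside it.  The addition
to 19 happens only when MacEnd pushes the macro's result onto the outer
level.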


8.2  Loop Stack

     The loop stack contains the loop count and the address  of  the  first
command  in  the loop.  For example, in the command 5<FSMP$mt$>$$, the loop
stack contains the loop count (5) and the address of the first  command  in
the  loop  (F).  Whenever the end-of-loop character (>) is encountered, the
loop count is decremented.  If the loop count is still  greater  than  zero
after  it has been decremented, then the command string pointer is reset to
point to the first character in the loop (F).

     The loop stack is implemented in the following variables:

        LStack  the stack itself,  containing saved counts and addresses
        LStTop  index of the top element in LStack
        LStBot  index of the current "bottom" of the stack in LStack

     The loop stack needs a "floating" bottom for the same reason that  the
expression   stack   needs  one:   macros.   Consider  the  command  string
4<Smp$M7$>$$.  When the "<" is encountered,  the  loop count (4)  and  the
address  of  the  first  character  in  the loop (S) are placed on the loop
stack.  Command execution continues, and the "M7" command  is  encountered.
Suppose  that  q-register 7 contains the erroneous command string 10>DL>$$.
When the ">" command is encountered in the macro,  TECO  expects  the  loop
stack to contain a loop count and an address for the first character in the
loop.  In this example, there is no matching "<" command in the macro which
would  have  set  up  the loop stack.  It would be very bad if TECO were to
think that the loop count was 4 and the first command in the loop was  "S".
In this situation, what TECO should do is generate the error message "BNI >
not in iteration".  In order to implement  this,  the  variable  LStBot  is
adjusted  each  time  a  macro is entered or exited.  LStBot represents the
bottom of the loop stack for the current macro level.
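
     A sketch of this check follows.  It is a simplified illustration,
not the TECO-C code:  the element layout, the stack size, the function
names ExeLst, ExeGtr, MacBeg and MacEnd, the ErrMsg variable and the
convention that LStBot is the index just below the current level's first
element are assumptions.  Infinite loops and zero counts are not handled.

```c
#include <stddef.h>

#define SUCCESS 0               /* assumed values, for illustration only */
#define FAILURE (-1)
#define LPSSIZ  16

typedef struct {
    long        Count;          /* remaining loop count */
    const char *First;          /* first command in the loop */
} LSElem;

static LSElem LStack[LPSSIZ];
static int LStBot = -1;         /* just below this level's first element */
static int LStTop = -1;         /* top element, equals LStBot when empty */
static const char *CBfPtr;      /* hypothetical command string pointer */
static const char *ErrMsg;      /* last error text, NULL if none */

/* Start of loop: CBfPtr points at the loop-start character. */
static int ExeLst(long Count)
{
    if (LStTop + 1 >= LPSSIZ)
        return FAILURE;
    LStTop++;
    LStack[LStTop].Count = Count;
    LStack[LStTop].First = CBfPtr + 1;
    CBfPtr++;
    return SUCCESS;
}

/* End of loop: CBfPtr points at the loop-end character. */
static int ExeGtr(void)
{
    if (LStTop == LStBot) {     /* no loop open at this macro level */
        ErrMsg = "BNI > not in iteration";
        return FAILURE;
    }
    if (--LStack[LStTop].Count > 0) {
        CBfPtr = LStack[LStTop].First;  /* go around again */
    } else {
        LStTop--;               /* loop finished: pop and move on */
        CBfPtr++;
    }
    return SUCCESS;
}

/* Macro entry and exit adjust the floating bottom. */
static int  MacBeg(void)    { int Old = LStBot; LStBot = LStTop; return Old; }
static void MacEnd(int Old) { LStTop = LStBot; LStBot = Old; }
```

     With the bottom raised on macro entry, an end-of-loop command inside
the macro sees an empty loop stack and fails, and the caller's count of 4
is left untouched.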


8.3  Macro Stack

     The macro stack is used to preserve  context  each  time  a  macro  is
entered.   All important values are pushed onto the stack before a macro is
entered and popped off the stack when the macro is exited.  The macro stack
is  also  used  by  the  EI  command,  which means it's used when executing
initialization files and mung files.


9  HELP

     This section discusses on-line HELP, which  is  available  only  under
VAX/VMS.

     The HELP command is not documented in the TECO manual  distributed  by
DEC,  even  though  it  is  supported in TECO-11 and TECO-32.  To get help,
simply type "HELP" followed by a carriage return.  HELP is  the  only  TECO
command that is not terminated by double escapes.

     Help  in  TECO-C  is  different  from  help  in  TECO-11.  In  TECO-C,
interactive  help mode is entered, so that a user can browse through a help
tree, as he can from DCL.  In  TECO-C,  access  is  provided  to  only  two
libraries:   the  library  specific  to  TECO-C (pointed to by logical name
TEC$INIT) and the system help library.  To get help  on  TECO-C,  just  say
"HELP",  with  or  without arguments.  To get help from the system library,
say "HELP/S".  I find this easier to use than TECO-11's syntax.

     The help library for TECO-C is contained in file TECOC.HLB,  which  is
generated  from  TECOC.HLP,  which  is  generated from TECOC.RNH.  See file
TECOC.RNH for a description of how to do it.   This  help  library  is  far
broader  than  the library for TECO-11, but much of it has yet to be filled
in.

     The help library is also the repository for  verbose  error  messages,
which are displayed when the help flag (EH) is set to 3.  For systems other
than VMS, the ZHelp function displays  verbose  text  contained  in  static
memory (see file ZHELP.C).


10  FILE INPUT

     TECO has an elegant design that allows high speed input.  There are no
linked  list  data  structures  to  keep track of, and most file input goes
directly to the end of the edit buffer.

     TECO-C takes advantage of this by reading normal file  input  directly
to  the end of the edit buffer.  After each input call, nothing needs to be
moved; the pointer to the end of the edit  buffer  is  simply  adjusted  to
point  to  the  end  of the new record.  The pointer to the end of the edit
buffer (EBfEnd) serves two purposes:  it points to  the  end  of  the  edit
buffer and to the beginning of the input buffer.
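
     The idea can be sketched as below.  This is a simplified illustration,
not the TECO-C input code:  EBfEnd is the pointer named above, while the fixed
Memory arena, the IBfEnd limit and the AppendRecord function are assumptions
made for the example, and a "record" is taken to be one line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static char  Memory[256];                      /* hypothetical fixed arena */
static char *EBfEnd = Memory;                  /* end of edit buffer and
                                                  start of input buffer */
static char *IBfEnd = Memory + sizeof Memory;  /* assumed end of input buffer */

/*
 * Read one record (here: one line) directly into the memory that follows
 * the edit buffer.  Returns the number of characters appended, which is 0
 * at end-of-file or when the input buffer has no room left.
 */
size_t AppendRecord(FILE *in)
{
    size_t room;
    size_t n = 0;
    int    c;

    assert(EBfEnd <= IBfEnd);
    room = (size_t)(IBfEnd - EBfEnd);
    while (n < room && (c = getc(in)) != EOF) {
        EBfEnd[n++] = (char)c;
        if (c == '\n')
            break;
    }
    EBfEnd += n;        /* nothing is moved; the pointer simply advances */
    return n;
}
```

     Because the record is read into place, appending it to the edit buffer
costs only the pointer adjustment on the last line of the function.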

     A side effect of this scheme is the sharing of memory between the edit
buffer and the input buffer.  When the edit buffer is empty, it can be made
smaller by shrinking the edit buffer gap in order to make the input  buffer
larger.   Conversely,  if  the  edit buffer needs to be expanded, the input
buffer can be shrunk before more memory  is  actually  requested  from  the
operating  system.   Either adjustment is made simply by moving the pointer
that marks both the end of the edit buffer and the beginning of  the  input
buffer.

     This scheme works, but provides no support for the other forms of file
input.   The  EP and ER$ commands provide a complete secondary input stream
which can be open at the same time as the primary stream (two  input  files
at  once).   The  EI  command  reads  and  executes  files  containing TECO
commands, and is used to execute the initialization file,  if  one  exists.
The  EQq  command,  if  implemented,  reads  the  entire contents of a file
directly into a Q-register.

     A second problem arises:  on each of the open files, the quantum  unit
of  input  is  not  standard.   For A, Y and P commands, a form feed or an
end-of-file terminates the read.  For n:A commands, a form feed, end-of-line
or  end-of-file  terminates  each  read.   For  EI commands, two escapes or
end-of-file terminates the read.  The input code must "save"  the  portion
of  an  input record following a special character and yield the saved text
when the next command for the file is executed.
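
     The per-command terminators can be expressed as a small scanning routine.
The sketch below is an assumption-laden illustration, not code from TECO-C:
the ReadKind enumeration and the SplitAtTerminator function are hypothetical,
end-of-line is taken to be a line feed, and the returned length includes the
terminator.  It does not handle a pair of escapes split across two records.

```c
#include <assert.h>
#include <stddef.h>

enum ReadKind {
    READ_PAGE,      /* A, Y and P commands: form feed ends the read */
    READ_LINE,      /* n:A commands: form feed or end-of-line ends it */
    READ_EI         /* EI commands: two escapes end the read */
};

/*
 * Return how many characters of rec[0..len-1] belong to the current read.
 * Anything from that offset onward is "leftover" text which the caller
 * must save for the next command on the same file.
 */
size_t SplitAtTerminator(const char *rec, size_t len, enum ReadKind kind)
{
    size_t i;

    assert(rec != NULL || len == 0);
    for (i = 0; i < len; i++) {
        char c = rec[i];

        if (kind != READ_EI && c == '\f')
            return i + 1;
        if (kind == READ_LINE && c == '\n')
            return i + 1;
        if (kind == READ_EI && c == 0x1B &&
            i + 1 < len && rec[i + 1] == 0x1B)
            return i + 2;
    }
    return len;     /* no terminator: the whole record is consumed */
}
```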

     The scheme used in TECO-C is to  read  text  from  the  current  input
stream  directly  to  the end of the edit buffer.  When the input stream is
switched via an EP or ER$ command, the obvious switching of file descriptors
happens,  and  any  text that's "leftover" from the last read is explicitly
saved elsewhere.  Note that this happens VERY rarely, so a  malloc/free  is
acceptable.

     For EI and EQq commands, the input memory following the edit buffer is
used  as  a  temporary  input  buffer.  After the file is read, the text is
copied to a Q-register in the case of EQq and to a separate buffer  in  the
case of EI.


11  VIDEO

     As of 18-Feb-1991, TECO-C supports video only under  Unix.   The  code
was  written  by  Mark  Henderson,  using  the  CURSES  package.   See file
VIDEO.TXT for a discussion of how it works.


12  PORTABILITY

     TECO-C was written with portability in mind.   The  first  development
machine  was  "minimal":   a  SAGE  IV  (68000)  running CP/M-68k.  In that
environment, there was no "make" utility.

     Initially, the system-independent code (files that don't start with  a
"Z")  had  absolutely  no  calls to standard C runtime functions.  This was
because I had several problems with  the  "standard"  functions  not  being
"standard" on different machines.  With the onset of ANSI C I've grown less
timid, but the main code still references  almost  no  standard  functions.
This  is  less  of  a  limitation than you might think:  TECO-C doesn't use
null-terminated strings.  It also doesn't use unions, floating point or bit
fields.


13  PORTING TO A NEW ENVIRONMENT


     1.  Move the source code to the target machine.

     2.  Inspect file ZPORT.H.  You need to select the  compiler  you  want
         the code compiled for.  For instance, if you are porting to a Unix
         system, then fix ZPORT.H so that the unix  identifier  is  defined
         (it  is  usually  defined  by  default  by the compiler).  If your
         compiler is nothing like anything supported by ZPORT.H,  then  set
         the UNKNOWN identifier.

     3.  Compile and link.  See file AAREADME.TXT for descriptions  of  how
         TECO-C  is  built  in  supported environments, and steal like mad.
         The problem here is that you need a "Z" file for your environment,
         containing  all  the  "Z" functions needed by TECO-C.  The easiest
         thing to do is copy ZUNKN.C to your own "Z" file and link  against
         that.   For  instance,  if I ever port TECO-C to a Macintosh, I'll
         copy ZUNKN.C to ZMAC.C.

     4.  Fix things  so  the  compile/link  is  successful.   If  you  have
         compiled  with UNKNOWN set, you should get an executable file that
         displays a  message  and  dies  when  the  first  system-dependent
         function  is  called.  The strategy is to fix that function (often
         by stealing from the code for other operating systems), relink and
         deal  with  the  next message until you have something that works.
         Functions should be implemented in roughly  the  following  order:
         ZInit,  ZTrmnl, ZExit, ZDspCh, ZAlloc, ZRaloc, ZFree, ZChin.  This
         will give you a TECO with everything but file I/O.   You  can  run
         it,  add  text  to  the  edit  buffer,  delete  text,  search, use
         expressions and the = sign command (a calculator).  Then  do  file
         input:   ZOpInp,  ZRdLin,  ZIClos.   Then do file output:  ZOpOut,
         ZWrBfr, ZOClos, ZOClDe.  Use the test macros (*tst*.tec)  to  test
         how everything works (see Testing).


14  TESTING

     Testing of TECO-C is performed by executing macros.   The  macros  are
contained  in  files named TSTxxx.TEC, where xxx is some kind of indication
as to what is tested.  For instance, TSTQR.TEC tests q-registers.  The test
macros  do  not  test  all  the  functions  provided  by  TECO.   They were
originally used to verify that TECO-C performs exactly the same as  TECO-11
under  the  VMS operating system.  When I needed to test a chunk of code, I
sometimes did it the right way and wrote a macro.


15  DEBUGGING

     A debugging system (very ugly, very useful)  is  imbedded  within  the
code.   It  is conditionally compiled into the code by turning on or off an
identifier (DEBUGGING) defined in the TECOC.H file.  When debugging code is
compiled  in,  you can access it using the ^P command, which is not used by
regular TECO.  The ^P command with no argument will display help about  how
to use ^P.

     If you are working under  VMS,  it  sometimes  helps  to  compare  the
execution  of  TECO-C with TECO-11.  Put a test command string into a file.
Use DEFINE/USER_MODE to redirect the output of TECO-C to a file and execute
the  macro  with  TECO-C.   Then  do  the same thing with TECO-11.  Use the
DIFFERENCES command to compare the two output files.  They  should  be  100
percent identical.