TECO-C Programmer's Guide
(last updated February 18, 1991 to reflect TECO-C version 140)

1 INTRODUCTION

These notes apply to TECO-C version 135, which runs under VAX/VMS, MS-DOS, and Unix (SunOS, which is BSD). See file AAREADME.TXT for the specific operating systems and compilers which have been used.

TECO-C is meant to be a complete implementation of TECO as defined by the Standard TECO User's Guide and Language Reference Manual (in file TECO.DOC). It was written so that the author could move to many machines without knowing many editors.

2 COMPILING AND LINKING

Conditional compilation directives are used to build TECO-C correctly for different environments. Identifiers automatically defined by the different compilers are used. Some identifiers defined in file ZPORT.H control whether video support or extra debugging code is included. See "VIDEO" and "DEBUGGING".

Files are provided which "build" TECO-C in various environments. See file AAREADME.TXT for details.

3 RUNNING TECO-C

When you run TECO, the command line used to invoke TECO is parsed for an input/output file name and several optional switches. TECO-11 parses the command line using a TECO macro embedded in the program. TECO-C does the same thing. Actually, the embedded macro used to parse the command line was stolen from TECO-11. I commented it and then modified it to repair minor inconsistencies. Use of TECO-11's macro makes TECO-C invocation identical to TECO-11's, even responding to "make love" with "Not war?".

The macro is in file CLPARS.TES. The compressed version (no comments or whitespace) is in file CLPARS.TEC. The GENCLP program converts CLPARS.TEC into CLPARS.H, an include file suitable for compiling into TECO-C.

4 CODE CONVENTIONS

The code is not modular. Communication between almost all functions is through global variables, not argument lists. There is a reason: the nature of the basic parsing algorithm is to use the characters in the command string as indices into a table of functions.
This makes for very fast command parsing, but it means that all the functions have to modify global values, because no arguments are passed in. In other words, there were going to be 130 or so un-modular functions anyway, so I gave up on modularity. This explanation does not explain some of the complications in the search code, like the global variable SrcTyp. Oh, well.

Here's a brief list of some of the conventions followed by the code:

1. TECO-C is portable, so some convention was needed to separate portable code from system-dependent code. There is one file containing the system-dependent code for each platform TECO-C supports. These files have names that start with a "Z": ZVMS.C, ZMSDOS.C and ZUNIX.C. All the system-dependent functions in those files start with a "Z". For example, the function that allocates memory is called ZAlloc. A VMS version of ZAlloc can be found in ZVMS.C, and an MS-DOS version can be found in ZMSDOS.C.

   An extra file called ZUNKN.C exists to help efforts to port TECO-C to a new environment. This file contains stubs for all the system-dependent functions.

2. All system-independent global variables are declared alphabetically in file TECOC.C. They are defined in DEFEXT.H, which is included by all modules.

3. File TECOC.H contains the "global" definitions, including those for most structures.

4. Variables and functions are defined using the portability identifiers defined in the ZPORT.H file. Functions which do not return a value are defined as VOID. TECO-C should compile and link on different machines by changing only the environment definitions in the ZPORT.H file.

5. At one time, every function was in a file with the same name as the function. This made it easy to find the code for a function. The problem was that some groups of functions use data not needed by the other functions. This was especially true of the system-dependent functions.
   Also, some functions were called only by one other function, so it made sense for them to be in the same module as the caller and be made "static". So now, most functions are in a file named the same as the function, with the following exceptions:

   1. All the "Z" functions are in the "Z" file for the given system.

   2. The conditionally-compiled functions (ZCpyBl in ZINIT.C, the "Dbg" functions at the bottom of TECOC.C, the "v" functions in EXEW.C) aren't in their own files. If they were, then the command procedures/makefiles that compile the files would need to contain logic to conditionally compile the files.

   3. The functions for the "E" and "F" commands are in EXEE.C and EXEF.C, respectively. So if you want to find function ExeEX, don't look for a file named EXEEX.C.

6. Symbols are 6 characters long or less. The way I remember it, this was caused by the first system I wrote TECO-C for: CP/M-68k, which had a limit of 8 characters for file names. The last two characters had to be ".C", so 6 characters were left for the file name. Since the file name was the same as the function it contained, functions were limited to 6 characters in length. When I saw how nice the function declarations looked (they fit in one tab slot), I used 6 characters for other symbols too. I've since been told that CP/M-68k has 8-character file names followed by 3-character file types, so CP/M-68k can't be blamed. So shoot me.

   This standard has prevented problems from occurring with compilers that don't support very many characters of uniqueness. In order to make up for the resultant cryptic names, upper and lower case are mixed. An uppercase letter indicates a new word. For example, "EBfEnd" stands for "Edit Buffer End". If you need to know what a variable name means, look at the definition of the variable in DEFEXT.H. The expanded version of the abbreviated name appears in the comment on the same line as the variable definition.
   A detailed description can be found with the declaration of the variable in TECOC.C. The limit of 6 letters in variable names is relaxed in system-dependent code.

7. Variable and function names follow patterns where possible. For instance, "EBfBeg" and "EBfEnd" are the beginning and end of the edit buffer. If you see a variable named "BBfBeg", you can assume that it is the beginning of some other buffer, and that a "BBfEnd" exists which is the end of that buffer.

8. Character strings are usually represented in C by a pointer to a sequence of bytes terminated with a null character. I didn't do that in TECO-C because I thought it was too inefficient. To get the length of a null-terminated string, you have to count characters. Most strings in TECO-C are therefore represented by two pointers, one to the first character and one to the character following the last character. With this representation, it's easy to add characters to a string and trivial to get the length.

9. Each file has a consistent format, which is:

   1. a comment describing the function
   2. include directives
   3. the function declaration
   4. local variable definitions, in alphabetical order
   5. code

5 TOP LEVEL EXECUTION AND COMMAND PARSING

The top level code for TECO-C is contained in file TECOC.C. It is very simple: after initializing, a loop is entered which reads a command string from the user, executes it, and loops back to read another command string. If the user executes a command which causes TECO-C to exit, the program is exited directly via a call to the TAbort function. TECO-C never exits by "falling out the bottom" of the main function.

After a command string is read, the ExeCSt function is called to execute the command string. ExeCSt contains the top-level parsing code. The parse is trivial: each command character is used as an index into a table of functions. The table contains one entry for each of the 128 possible characters.
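The table-driven parse can be illustrated with a small sketch. It is not taken from the TECO-C source: every name in it (ToyTbl, ToyRun, ToyDgt, ToyAdd, ToyNop, ToyIll, ToyIni, CmdPtr, CmdEnd, NumArg, Total) and the three toy commands are assumptions made up for illustration. Only the shape follows the description above: one function pointer per 7-bit character, state passed through globals rather than arguments, and a SUCCESS/FAILURE return.

```c
#include <assert.h>

#define SUCCESS 0
#define FAILURE (-1)

/* Hypothetical globals standing in for the command string state. */
const char *CmdPtr;             /* next command character to execute */
const char *CmdEnd;             /* one past the last command character */
int NumArg;                     /* value of the most recent digit string */
int Total;                      /* toy state: sum built by the '+' command */

/* Digit string: one call consumes ALL adjacent digits. */
int ToyDgt(void)
{
    NumArg = 0;
    while (CmdPtr < CmdEnd && *CmdPtr >= '0' && *CmdPtr <= '9')
        NumArg = NumArg * 10 + (*CmdPtr++ - '0');
    return SUCCESS;
}

/* '+': add the pending number to Total, then forget it. */
int ToyAdd(void) { CmdPtr++; Total += NumArg; NumArg = 0; return SUCCESS; }

/* Space: consume one character and do nothing. */
int ToyNop(void) { CmdPtr++; return SUCCESS; }

/* Anything else is an illegal command in this toy. */
int ToyIll(void) { return FAILURE; }

/* One entry per 7-bit character; filled in by ToyIni(). */
int (*ToyTbl[128])(void);

void ToyIni(void)
{
    int i;

    for (i = 0; i < 128; i++)
        ToyTbl[i] = ToyIll;
    for (i = '0'; i <= '9'; i++)
        ToyTbl[i] = ToyDgt;
    ToyTbl['+'] = ToyAdd;
    ToyTbl[' '] = ToyNop;
}

/* Execute the command string [Beg, End): each character indexes the table. */
int ToyRun(const char *Beg, const char *End)
{
    assert(Beg <= End);
    CmdPtr = Beg;
    CmdEnd = End;
    while (CmdPtr < CmdEnd) {
        if (ToyTbl[*CmdPtr & 0x7F]() == FAILURE)
            return FAILURE;     /* stop parsing at the first failure */
    }
    return SUCCESS;
}
```

The loop in ToyRun never advances CmdPtr itself. That is the point of the design: the dispatcher stays trivial, and each command function decides how many characters belong to it.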
Each function is responsible for "consuming" its command so that when it returns, the command string pointer points to the next command.

5.1 Error Handling

When an error is detected, an error message is displayed at the point that the error is detected, and the function in which the error was detected returns a FAILURE status to its caller. Almost always, the caller returns a FAILURE status to its caller, which returns a FAILURE status to its caller, etc. When a FAILURE status is returned to the main command string parser, parsing of the command string stops and the user is prompted for a new command string. This style tends to cause all function calls to follow the same form, which is

    if (function() == FAILURE)
        return(FAILURE);

Things get more complicated in the system-dependent code (in the files with names that start with a "Z"). I extended TECO's error reporting slightly to allow the user to see the operating system's reason for an error, as this is often useful. For example, under VAX/VMS there are many reasons why an attempt to create an output file might fail. They include: errors in file name syntax, destination directory non-existence, file protection violations or disk quota violation.

In order to supply enough information to the user, TECO-C outputs multiple-line error messages when a system error occurs. Multiple-line error messages contain one line that describes the operating system's perception of the error and one line that describes TECO's perception of the error. For instance, if a user of VAX/VMS does an "EW[abc]test.txt$$" command when the directory [abc] does not exist, the error message generated by TECO-C is:

    ?SYS    %RMS-F-DNF, directory not found
    ?UFO    unable to open file "[abc]test.txt" for output

System errors are therefore reported in a system-dependent fashion, using whatever messages the operating system can supply.
Under VAX/VMS, the system service $GETMSG provides human-readable messages that TECO-C can use in the "SYS" part of the error message. Under UNIX, sys_errlist[errno] is a pointer to these messages.

There is another way in which error reporting in the system-dependent code is tricky. Under VAX/VMS, some system calls may return a code that is "successful" but contains extra information. For instance, when a user has set up directories so that only a limited number of versions of a file can exist, RMS will automatically purge the oldest version of the file when the user creates a file. This only happens if the newly created file would cause too many versions of the file to exist. When this happens, the VMS service returns a FILEPURGED status, which is successful. TECO-C informs the user about these things by displaying the message in brackets.

5.2 Command Modifiers (CmdMod)

Command parsing is complicated by command modifiers and numeric arguments, which may precede some commands. These are implemented in a way that maintains the basic "jump table" idea. For instance, when an at-sign (@) modifier is encountered in a command string, the at-sign command function (ExeAtS) is called. The only thing ExeAtS does is set a flag indicating that an at-sign has been encountered. Commands which are affected by an at-sign modifier check this flag and behave accordingly.

The flags which indicate command modifiers are contained in global variable CmdMod. A bit in CmdMod is reserved for each command modifier. The modifiers are "@", ":" and "::".

Of course, once the flag has been set, it must be cleared. With this parsing algorithm, the only way to do that is to make every command function explicitly reset CmdMod before a successful return. This is not too bad: clearing all the flags in CmdMod is done with one statement: "CmdMod = '\0';".

For numeric arguments to commands, an expression stack is used (see Stacks). The EStTop variable is the pointer to the top of the expression stack.
Commands which handle numeric arguments check EStTop to see if the expression stack contains a value.

A special case of numeric arguments is "m,n". The "m" part is encountered and causes the value to be pushed onto the expression stack. The comma causes the ExeCom function to move the value into a special "m-argument" global variable (MArgmt), clear the expression stack and set another flag in CmdMod indicating that the "m" part of an "m,n" pair is defined. Then the "n" is encountered and pushed onto the stack. Commands which can take "m,n" pairs check the flag in CmdMod.

To summarize, CmdMod and EStTop are variables which describe the context of a command. Each command function tests these variables to see if it was preceded by modifiers or numbers. For this to work, it is important that the expression stack and the flags in CmdMod are cleared at the right times. It is the responsibility of each command function to leave CmdMod and EStTop with the proper values before successfully returning. The rules are:

1. If the command function is returning FAILURE, don't worry about clearing CmdMod or EStTop. They will be cleared before the next command string is executed.

2. If the command function leaves a value on the expression stack, do not clear EStTop before returning SUCCESS. If the command calls GetNmA, do not clear EStTop, as GetNmA does it for you. Otherwise, clear EStTop before returning SUCCESS.

3. Clear CmdMod unless the command function sets flags or needs to leave them alone. ExeDgt, for example, handles digit strings and doesn't clear CmdMod because the MARGIS bit may be set.

6 SEARCHING

The search algorithm in TECO-C is complex. The war between the desire for a fast search and the need to handle all the features of TECO's search commands has produced code which can be a real pain to follow. This section attempts to explain how things got the way they are.
The code is explained in a bottom-up fashion, to follow the way it evolved in the author's twisted mind.

The basic search idea is to scan a contiguous edit buffer for a search string. The steps are:

1. Search the edit buffer for the first character in the search string. If you reach the end of the edit buffer without matching, the search fails.

2. When the first character of the search string matches a character in the edit buffer, try to match successive characters in the search string with the characters which follow the found character in the edit buffer. If they all match, the search succeeds. If one doesn't, go back to step 1.

This is basically what TECO-C does. The features of TECO's search commands have buried these steps deep within some confusing code.

The first complication is introduced by pattern matching characters. TECO has 17 "match constructs", which are indicated in the search string by the special characters ^X, ^S, ^N and ^Ex where "x" can be several other characters. For instance, a ^X in the search string means that any character is to be accepted as a match in place of the ^X. Characters other than the match constructs represent themselves. An example: the search string "a^Xb" contains 3 match constructs: a, ^X and b.

TECO also supports forward or backward searching. When searching backwards, only the search for the first match construct in the search string is done in a backwards direction. When the character is found, the characters following it are compared in a forward direction to the edit buffer characters. This means that once the first match construct has been found, a single piece of code can be used to compare successive characters in the search string with successive characters in the edit buffer, regardless of whether the search is forwards or backwards.

Adding these new features, the new description of searching is:

1.
   Search the edit buffer forwards or backwards for a character which matches the first match construct in the search string. If you reach the end of the edit buffer without matching, the search fails.

2. When the first match construct of the search string matches a character in the edit buffer, try to match successive match constructs in the search string with the characters which follow the found character in the edit buffer. If they all match, the search succeeds. If one doesn't, go back to step 1.

To begin a description of which routines implement the above steps, and in order to have a reference for later discussion, the following hierarchy chart of "who calls who" is presented.

ExeEUn ExeFB ExeFC ExeFD ExeFK ExeFN ExeFS ExeFUn ExeN ExeS ExeUnd
  |      |     |     |     |     |     |     |     |    |     |
  |      |     |     |     |     |     |     |     |    |     |
------------------------------------------------------------------
                                 |
                                 V
                              Search
                                 |
                                 V
                              SrcLop
                                 |
                                 V
                              SSerch
                               | | |
                        +------+ | +------+
                +---+   |        |        |   +---+
                |   V   V        |        V   V   |
                |   ZFrSrc       |       BakSrc   |
                |   |   |        |        |   |   |
                +---+   |        |        |   +---+
                        +------+ | +------+
                               V V V
                              CMatch <--+
                                 |      |
                                 +------+

At the top are the functions that implement search commands (E_, FB, FC, FD, FK, FN, FS, F_, N, S and _). All of these functions call the main search function: Search.

At the lower level are the functions which implement steps 1 and 2 described above. ZFrSrc searches forwards in the edit buffer for characters which match the first character in the search string. BakSrc does the same thing, but searches backwards. SSerch calls one of these two functions and then executes a loop which calls CMatch to compare successive match constructs in the search string to characters following the found character in the edit buffer.

The reason that ZFrSrc, BakSrc and CMatch call themselves is to handle some of the more esoteric match constructs.

Case dependence in TECO is controlled by the search mode flag (see the ^X command). The variable SMFlag holds the value of the search mode flag, and is used by ZFrSrc, BakSrc and CMatch.
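Steps 1 and 2 can be sketched in C as follows. This is a deliberately reduced illustration, not the code in SSerch, ZFrSrc or CMatch: it searches forwards only, ignores SMFlag, and understands a single match construct (^X, any character) besides literal characters. The names MatchC and SrcFwd are assumptions invented for the sketch. Strings are passed as begin/end pointer pairs, following convention 8 above.

```c
#include <assert.h>
#include <stddef.h>

#define CTRL_X '\030'           /* ^X: accepts any character */

/* Does one match construct (a literal or ^X) accept character c? */
int MatchC(char Constr, char c)
{
    return Constr == CTRL_X || Constr == c;
}

/*
 * Search the buffer [BufBeg, BufEnd) for the search string
 * [SStBeg, SStEnd). Returns a pointer to the first character of the
 * match, or NULL if the search fails.
 */
const char *SrcFwd(const char *BufBeg, const char *BufEnd,
                   const char *SStBeg, const char *SStEnd)
{
    const char *Start, *b, *s;

    assert(BufBeg <= BufEnd && SStBeg < SStEnd);
    for (Start = BufBeg; Start < BufEnd; Start++) {
        /* Step 1: find a character matching the first match construct. */
        if (!MatchC(*SStBeg, *Start))
            continue;
        /* Step 2: compare the remaining constructs, always forwards. */
        b = Start + 1;
        s = SStBeg + 1;
        while (s < SStEnd && b < BufEnd && MatchC(*s, *b)) {
            s++;
            b++;
        }
        if (s == SStEnd)
            return Start;       /* every construct matched */
        /* mismatch: back to step 1 at the next buffer position */
    }
    return NULL;                /* reached the end of the buffer */
}
```

A backwards search would change only the direction of the outer loop; the inner comparison stays forwards, which is why one comparison routine can serve both directions.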
One final point to help confuse things: ZFrSrc is system-dependent. It contains a VAX/VMS-specific version which uses the LIB$SCANC run-time library routine to access the SCANC instruction. The SCANC instruction looks like it was designed to handle TECO's match constructs. I couldn't resist using it, but it was a mistake, as it needlessly complicates an already messy algorithm. I have decided to remove the VMS-specific code some time in the future.

Further complications of the search algorithm arise because of the following capabilities of TECO searches:

1. If there is no text argument, use the previous search argument.

2. If colon modified, return success/failure and no error message.

3. If the search fails and we're in a loop and a semicolon follows the search command, exit the loop without displaying an error message.

4. Handle optional repeat counts.

5. If the ES flag is non-zero, verify the search based on the value of the flag.

6. If bit 64 of the ED flag is set, move dot by one on multiple searches.

7. If bit 16 of the ED flag is set, don't move after a failing search.

8. Be fast.

7 MEMORY MANAGEMENT

7.1 The Edit Buffer And Input Buffer

TECO-C is based on TECO-11, but it uses a different form of edit buffer memory management. Here's why.

The edit buffer in TECO-11 is implemented as a continuous block of memory. This allows rapid movement through the edit buffer (by just maintaining a pointer to the current spot) and makes searches very straightforward. Insertion and deletion of text is expensive, because each insertion or deletion requires moving the text following the spot where the insertion or deletion occurs in order to maintain a continuous block of memory. This gets to be a real pain when a video editing capability is added to TECO, because in video mode text is added/deleted one character at a time very rapidly.

TECO-C uses an edit buffer gap scheme.
The edit buffer occupies a continuous piece of memory, but there is a gap at the "current spot" in the edit buffer. When the user moves around the edit buffer, the gap is moved by shuffling text from one side of the gap to the other. This means that moving around the text buffer is slower than for TECO-11's scheme, but text insertion and deletion is very fast. Searches are still fast because most searches start at the current spot and go forwards or backwards, so a continuous piece of memory is searched. In the future, when some kind of video mode is added, insertion and deletion one-character-at-a-time will be fast using the gap scheme.

The variables that maintain pointers to the edit buffer and the gap within the buffer can be confusing, so here are some examples. Suppose that 10000 bytes are allocated for the edit buffer when TECO-C is initialized. Suppose the allocated memory starts at address 3000.

Empty edit buffer (the gap spans the whole edit buffer):

    EBfBeg =  3000          (edit buffer beginning)
    GapBeg =  3000          (gap beginning)
    GapEnd = 13000          (gap end)
    EBfEnd = 13000          (edit buffer end)

Buffer contains "test", character pointer is before the first 't':

    EBfBeg =  3000          (edit buffer beginning)
    GapBeg =  3000          (gap beginning)
    GapEnd = 12996          (gap end)
             12997   't'
             12998   'e'
             12999   's'
    EBfEnd = 13000   't'    (edit buffer end)

Buffer contains "test", character pointer is after the last 't':

    EBfBeg =  3000   't'    (edit buffer beginning)
              3001   'e'
              3002   's'
              3003   't'
    GapBeg =  3004          (gap beginning)
    GapEnd = 13000          (gap end)
    EBfEnd = 13000          (edit buffer end)

Buffer contains "test", character pointer is after the 'e':

    EBfBeg =  3000   't'    (edit buffer beginning)
              3001   'e'
    GapBeg =  3002          (gap beginning)
    GapEnd = 12998          (gap end)
             12999   's'
    EBfEnd = 13000   't'    (edit buffer end)

When an insertion command is executed, the text is inserted starting at GapBeg. When a deletion command is executed, GapEnd is incremented for a forward delete or GapBeg is decremented for a backwards delete.
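The gap bookkeeping can be sketched in a few lines of C. This is an illustration under stated assumptions, not the TECO-C implementation: the buffer is a tiny fixed array (EBuf, EBFSIZ), the four variables are offsets into that array rather than addresses, and the function names EBInit, EBIns, EBMove and EBDel are invented. The sketch does follow the convention of the tables above, in which GapEnd is the last byte of the gap and EBfEnd is the last byte of the buffer.

```c
#include <assert.h>
#include <string.h>

#define EBFSIZ 16               /* tiny buffer, for illustration only */

char EBuf[EBFSIZ];              /* hypothetical backing store */
long EBfBeg, GapBeg, GapEnd, EBfEnd;    /* offsets into EBuf */

/* Empty buffer: the gap spans the whole allocation. */
void EBInit(void)
{
    EBfBeg = GapBeg = 0;
    GapEnd = EBfEnd = EBFSIZ - 1;
}

/* Insert Len characters at the character pointer (start of the gap). */
int EBIns(const char *Txt, long Len)
{
    assert(Txt != NULL);
    if (Len < 0 || Len > GapEnd - GapBeg + 1)
        return -1;              /* gap too small; a real editor would grow it */
    memcpy(&EBuf[GapBeg], Txt, (size_t)Len);
    GapBeg += Len;
    return 0;
}

/* Move the character pointer by Delta characters (negative = backwards). */
void EBMove(long Delta)
{
    for (; Delta > 0 && GapEnd < EBfEnd; Delta--)
        EBuf[GapBeg++] = EBuf[++GapEnd];    /* text after gap goes before it */
    for (; Delta < 0 && GapBeg > EBfBeg; Delta++)
        EBuf[GapEnd--] = EBuf[--GapBeg];    /* text before gap goes after it */
}

/* Delete forwards (Count > 0) or backwards (Count < 0): nothing is copied. */
void EBDel(long Count)
{
    for (; Count > 0 && GapEnd < EBfEnd; Count--)
        GapEnd++;
    for (; Count < 0 && GapBeg > EBfBeg; Count++)
        GapBeg--;
}
```

Calling EBInit(), then EBIns("test", 4), then EBMove(-2) reproduces the last table above at a smaller scale: 't' and 'e' sit before the gap, 's' and 't' occupy the final two bytes of the array.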
When the character pointer is moved forwards, the gap is moved forwards by copying text from the end of the gap to the beginning. When the character pointer is moved backwards, the gap is moved backwards by copying text from the area just before the gap to the area at the end of the gap.

There are a few messy cases, such as when a bounded search is executed and the bounded text area includes the edit buffer gap. In this case, the gap is temporarily moved so that the search can proceed over a continuous memory area.

In order to confuse things a little, TECO-C has one addition to the basic edit buffer gap management. Following the end of the edit buffer (EBfEnd) is the current input stream buffer. Since file input commands always cause text to be appended to the end of the edit buffer, this is natural. Thus, no separate input buffer is needed: text is input directly into the edit buffer. This makes the code a little confusing, but it avoids the problem of having an input buffer. When you have an input buffer, you have to deal with the question of how large the buffer should be and what to do with it when it's too small. This scheme is fast and saves some memory (see File Input).

7.2 Q-registers

Q-registers have two parts: a numeric part and a text part. Each q-register is represented by a structure containing three fields: one to hold the numeric part and two to point to the beginning and end of the memory holding the text part. If the text part of the q-register is empty, then the pointer to the beginning of the text is NULL.

There are 36 global q-registers, one for each letter of the alphabet and one for each digit from 0 to 9. These q-registers are accessible from any macro level. There are 36 local q-registers for each macro level. The names for local q-registers are preceded by a period. Thus the command "1xa" inserts a line into global q-register "a", while the command "1x.a" inserts a line into local q-register ".a".
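The three-field q-register structure might look like the sketch below. The field names, the array name GlbQRs, the helper functions QRIndx, QRSetT and QRLen, and the slot order (letters first, then digits) are all assumptions made for illustration; the real definitions are in TECOC.H and may differ. The name mapping assumes a character set in which letters and digits are contiguous, such as ASCII.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUMQRS 36               /* 26 letters + 10 digits */

struct QReg {
    long  Number;               /* numeric part */
    char *TxtBeg;               /* first character of text, NULL if empty */
    char *TxtEnd;               /* one past the last character of text */
};

struct QReg GlbQRs[NUMQRS];     /* global q-registers, zero-initialized */

/* Map a q-register name to a slot: letters 0..25, digits 26..35, else -1. */
int QRIndx(int Name)
{
    if (Name >= 'a' && Name <= 'z')
        return Name - 'a';
    if (Name >= 'A' && Name <= 'Z')
        return Name - 'A';
    if (Name >= '0' && Name <= '9')
        return 26 + (Name - '0');
    return -1;
}

/* Replace the text part of a q-register with a copy of [Beg, End). */
int QRSetT(struct QReg *Q, const char *Beg, const char *End)
{
    size_t Len;
    char *New = NULL;

    assert(Q != NULL && Beg <= End);
    Len = (size_t)(End - Beg);
    if (Len != 0) {
        New = malloc(Len);
        if (New == NULL)
            return -1;          /* out of memory: old text is kept */
        memcpy(New, Beg, Len);
    }
    free(Q->TxtBeg);            /* free(NULL) is a no-op */
    Q->TxtBeg = New;
    Q->TxtEnd = (New == NULL) ? NULL : New + Len;
    return 0;
}

/* Length of the text part: a subtraction, no character counting. */
size_t QRLen(const struct QReg *Q)
{
    return (Q->TxtBeg == NULL) ? 0 : (size_t)(Q->TxtEnd - Q->TxtBeg);
}
```

Because the text is delimited by two pointers rather than a terminating null, a q-register can hold any byte value, including null characters.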
Storage for the data structure defining local q-registers is not allocated until a local q-register is first used. This saves space and time, because local q-registers are rarely used, and doing things this way avoids allocating and freeing memory every time a macro is executed.

8 STACKS

8.1 Expression Stack

An expression stack is used to parse TECO's expressions. Consider the command string QA+50=$$. When the command string is executed, the value of QA is pushed on the expression stack, then the operator "+" is pushed on the expression stack, and then the value "50" is pushed on the expression stack. Whenever a full expression that can be reduced is on the expression stack, it is reduced. For the above example, the stack is reduced when the value "50" is pushed.

The expression stack is implemented in the following variables:

    EStack   the stack itself, containing saved operators and operands
    EStTop   index of the top element in EStack
    EStBot   index of the current "bottom" of the stack in EStack

The "bottom" of the expression stack can change because an expression can include a macro invocation. For example, the command QA+M3=$$ causes the value of "QA" to be pushed on the expression stack, then the "+" is pushed, and then the macro contained in q-register 3 is executed. The macro in q-register 3 returns a value to be used in the expression. When the macro is entered, a new expression stack "bottom" is established. This allows the macro to have a "local" expression stack bottom while maintaining the stack outside the macro.

8.2 Loop Stack

The loop stack contains the loop count and the address of the first command in the loop. For example, in the command 5$$, the loop stack contains the loop count (5) and the address of the first command in the loop (F). Whenever the end-of-loop character (>) is encountered, the loop count is decremented.
If the loop count is still greater than zero after it has been decremented, then the command string pointer is reset to point to the first character in the loop (F).

The loop stack is implemented in the following variables:

    LStack   the stack itself, containing saved counts and addresses
    LStTop   index of the top element in LStack
    LStBot   index of the current "bottom" of the stack in LStack

The loop stack needs a "floating" bottom for the same reason that the expression stack needs one: macros. Consider the command string 4$$. When the "<" is encountered, the loop count (4) and the address of the first character in the loop (S) are placed on the loop stack. Command execution continues, and the "M7" command is encountered. Suppose that q-register 7 contains the erroneous command string 10>DL>$$. When the ">" command is encountered in the macro, TECO expects the loop stack to contain a loop count and an address for the first character in the loop. In this example, there is no matching "<" command in the macro which would have set up the loop stack. It would be very bad if TECO were to think that the loop count was 4 and the first command in the loop was "S". In this situation, what TECO should do is generate the error message "BNI > not in iteration". In order to implement this, the variable LStBot is adjusted each time a macro is entered or exited. LStBot represents the bottom of the loop stack for the current macro level.

8.3 Macro Stack

The macro stack is used to preserve context each time a macro is entered. All important values are pushed onto the stack before a macro is entered and popped off the stack when the macro is exited. The macro stack is also used by the EI command, which means it's used when executing initialization files and mung files.

9 HELP

This section discusses on-line HELP, which is available only under VAX/VMS. The HELP command is not documented in the TECO manual distributed by DEC, even though it is supported in TECO-11 and TECO-32.
To get help, simply type "HELP" followed by a carriage return. HELP is the only TECO command that is not terminated by double escapes.

Help in TECO-C is different from help in TECO-11. In TECO-C, interactive help mode is entered, so that a user can browse through a help tree, just as from DCL. In TECO-C, access is provided to only two libraries: the library specific to TECO-C (pointed to by logical name TEC$INIT) and the system help library. To get help on TECO-C, just say "HELP", with or without arguments. To get help from the system library, say "HELP/S". I find this easier to use than TECO-11's syntax.

The help library for TECO-C is contained in file TECOC.HLB, which is generated from TECOC.HLP, which is generated from TECOC.RNH. See file TECOC.RNH for a description of how to do it. This help library is far broader than the library for TECO-11, but much of it has yet to be filled in.

The help library is also the repository for verbose error messages, which are displayed when the help flag (EH) is set to 3.

For systems other than VMS, the ZHelp function displays verbose text contained in static memory (see file ZHELP.C).

10 FILE INPUT

TECO has an elegant design that allows high speed input. There are no linked list data structures to keep track of, and most file input goes directly to the end of the edit buffer. TECO-C takes advantage of this by reading normal file input directly to the end of the edit buffer. After each input call, nothing needs to be moved; the pointer to the end of the edit buffer is simply adjusted to point to the end of the new record. The pointer to the end of the edit buffer (EBfEnd) serves two purposes: it points to the end of the edit buffer and to the beginning of the input buffer.

A side effect of this scheme is the sharing of memory between the edit buffer and the input buffer. When the edit buffer is empty, it can be made smaller by shrinking the edit buffer gap in order to make the input buffer larger.
Obviously, if the edit buffer needs to be expanded, the input buffer can suffer before more memory is actually requested from the operating system. This is easily achieved by moving the pointer to the "end-of-the-edit-buffer"/"beginning-of-the-input-buffer".

This scheme works, but provides no support for the other forms of file input. The EP and ER$ commands provide a complete secondary input stream which can be open at the same time as the primary stream (two input files at once). The EI command reads and executes files containing TECO commands, and is used to execute the initialization file, if one exists. The EQq command, if implemented, reads the entire contents of a file directly into a Q-register.

A second problem arises: on each of the open files, the quantum unit of input is not standard. For A, Y and P commands, a form feed or end-of-file "terminates" the read. For n:A commands, form feed, end-of-line or end-of-file "terminates" each read. For EI commands, two escapes or end-of-file "terminate" the read. The input code must "save" the portion of an input record following a special character and yield the saved text when the next command for the file is executed.

The scheme used in TECO-C is to read text from the current input stream directly to the end of the edit buffer. When the input stream is switched via an EP or ER$ command, the obvious switching of file descriptors happens, and any text that's "leftover" from the last read is explicitly saved elsewhere. Note that this happens VERY rarely, so a malloc/free is acceptable.

For EI and EQq commands, the input memory following the edit buffer is used as a temporary input buffer. After the file is read, the text is copied to a Q-register in the case of EQq and to a separate buffer in the case of EI.

11 VIDEO

As of 18-Feb-1991, TECO-C supports video only under Unix. The code was written by Mark Henderson, using the CURSES package. See file VIDEO.TXT for a discussion of how it works.
12 PORTABILITY

TECO-C was written with portability in mind. The first development machine was "minimal": a SAGE IV (68000) running CP/M-68k. In that environment, there was no "make" utility.

Initially, the system-independent code (files that don't start with a "Z") had absolutely no calls to standard C runtime functions. This was because I had several problems with the "standard" functions not being "standard" on different machines. With the onset of ANSI C I've grown less timid, but the main code still references almost no standard functions. This is less of a limitation than you might think: TECO-C doesn't use null-terminated strings. It also doesn't use unions, floating point or bit fields.

13 PORTING TO A NEW ENVIRONMENT

1. Move the source code to the target machine.

2. Inspect file ZPORT.H. You need to select the compiler you want the code compiled for. For instance, if you are porting to a Unix system, then fix ZPORT.H so that the unix identifier is defined (it is usually defined by default by the compiler). If your compiler is nothing like anything supported by ZPORT.H, then set the UNKNOWN identifier.

3. Compile and link. See file AAREADME.TXT for descriptions of how TECO-C is built in supported environments, and steal like mad. The problem here is that you need a "Z" file for your environment, containing all the "Z" functions needed by TECO-C. The easiest thing to do is copy ZUNKN.C to your own "Z" file and link against that. For instance, if I ever port TECO-C to a Macintosh, I'll copy ZUNKN.C to ZMAC.C.

4. Fix things so the compile/link is successful. If you have compiled with UNKNOWN set, you should get an executable file that displays a message and dies when the first system-dependent function is called. The strategy is to fix that function (often by stealing from the code for other operating systems), relink and deal with the next message until you have something that works.
   Functions should be implemented in roughly the following order: ZInit, ZTrmnl, ZExit, ZDspCh, ZAlloc, ZRaloc, ZFree, ZChin. This will give you a TECO with everything but file I/O. You can run it, add text to the edit buffer, delete text, search, use expressions and the = sign command (a calculator). Then do file input: ZOpInp, ZRdLin, ZIClos. Then do file output: ZOpOut, ZWrBfr, ZOClos, ZOClDe.

   Use the test macros (*tst*.tec) to test how everything works (see Testing).

14 TESTING

Testing of TECO-C is performed by executing macros. The macros are contained in files named TSTxxx.TEC, where xxx is some kind of indication as to what is tested. For instance, TSTQR.TEC tests q-registers.

The test macros do not test all the functions provided by TECO. They were originally used to verify that TECO-C performs exactly the same as TECO-11 under the VMS operating system. When I needed to test a chunk of code, I sometimes did it the right way and wrote a macro.

15 DEBUGGING

A debugging system (very ugly, very useful) is embedded within the code. It is conditionally compiled into the code by turning on or off an identifier (DEBUGGING) defined in the TECOC.H file. When debugging code is compiled in, you can access it using the ^P command, which is not used by regular TECO. The ^P command with no argument will display help about how to use ^P.

If you are working under VMS, it sometimes helps to compare the execution of TECO-C with TECO-11. Put a test command string into a file. Use DEFINE/USER_MODE to redirect the output of TECO-C to a file and execute the macro with TECO-C. Then do the same thing with TECO-11. Use the DIFFERENCES command to compare the two output files. They should be 100 percent identical.