secondo/Tools/pd/docu

/*
//paragraph [1] Title: [{\Large \bf] [}]
//[ue] [\"{u}]
//[us] [\_]
//[_] [\_]


[1] The PD System: Integrating Programs and Documentation

Ralf Hartmut G[ue]ting

May 1995

2002-2004, Markus Spiekermann. Changes of makefiles and shell scripts.


1 Overview

The purpose of ~PDSystem~ is to allow a programmer to write ASCII program files with embedded documentation (~PD~ stands for ~Programs with Documentation~). Such files are called ~PD files~. Essentially a PD file consists of alternating ~documentation sections~ and ~program sections~. Within documentation sections, one can describe a number of paragraph formats (such as headings, displayed material, etc.), character formats (e.g. italics), and special characters (e.g. ``[ue]''). How this is done, is described in the document ``Integrating Programs and Documentation'' [G[ue]95].

The main component of PDSystem is an executable program ~maketex~ which converts a PD file into a LaTeX file. More precisely, given a file ~pdfile~, a LaTeX file ~pdfile.tex~ is created as follows: First, a standard header for LaTeX is copied from a file ~pd.header~ into ~pdfile.tex~. Then, ~pdfile~ is first run through a filter program called ~pdtabs~ which converts tabulator symbols into corresponding sequences of blanks. The output of this filter is fed into ~maketex~ (which can therefore be sure not to encounter any tab symbols) which converts material in documentation sections into LaTeX code and encloses program sections by ``verbatim'' commands to force TeX to typeset them exactly as they have been written.

A PD file may contain very long lines, because CR (end of line) symbols need only be present to delimit paragraphs. To make the output file ~pdfile.tex~ easily printable, the output of ~maketex~ is run through a further filter called ~linebreaks~ which introduces CR symbols such that no lines with more than 80 characters are in the output. In total, we have the processing steps illustrated in Figure 1.

		Figure 1: Construction of a TeX file from a PD file
				[pddocu.Figure1.eps]

The file ~pd.header~ is shown in Section 6.1. ~Cat~ is a UNIX system command. Executables ~pdtabs~ and ~linebreaks~ are created from the corresponding C programs ~pdtabs.c~ and ~linebreaks.c~ shown in Section 7. The programs leading to ~maketex~ are described below. The complete process shown in Figure 1 is executed by a command procedure called ~pd2tex~ given in Section 8.4.

There are further command procedures:

  * ~pdview~ allows one to view a PD file under unix using the ~xdvi~ viewer (after it has been processed by LaTeX). On MS-Windows the previewer is called ~yap~. The viewer is defined in the environment variable ~PD[_]DVI[_]VIEWER~.

  * ~pdshow~ shows a PD file using a postscript viewer defined in the environment variable ~PD[us]PS[us]VIEWER~. This shows embedded figures. However, the quality of the display is not as good as with ~xdvi~.

  * ~pd2pdf~ converts a PD file into the portable document format using the program ~dvipdfm~.

At a lower level there are three scripts which convert PD to other formats:

  * ~pd2tex~ is called by every script to convert PD into a LaTeX file.

  * ~pd2dvi~ is called by ~pdview~ to create a DVI file.

  * ~pd2ps~  is called by ~pdshow~ and creates a postscript file.

These scripts may be useful to recreate a DVI file while previewing an older version. The preview will detect automatically
that there exists a newer DVI file and reloads it. Another application may be to use Adobe Distiller in order to create a PDF starting
from a postscript file.

As an example, the processing steps used in ~pdshow~ are illustrated in Figure 2.

		Figure 2: Steps of ~pdshow~ [pddocu.Figure2.eps]

We now consider the construction of ~maketex~, the central component of the PD system. ~Maketex~ depends on the components shown in Figure 3.

		Figure 3: Components for building ~maketex~
			[pddocu.Figure3.eps]

These components play the following roles:

  * ~Lex.l~ is a ~lex~ specification of a lexical analyser (which is transformed by the UNIX tool ~lex~ into a C program ~lex.yy.c~). The lexical analyser reads the input PD file and produces a stream of tokens for the parser.

  * ~Parser.y~ is a ~yacc~ specification of a parser (transformed by the UNIX tool ~yacc~ into a C program ~y.tab.c~). The parser consumes the tokens produced by the lexical analyser. On recognizing parts of the structure of a PD file, it writes corresponding LaTeX code to the output file.

  * ~NestedText.c~ is a ``module'' in C providing a data structure for ``nested text'' together with a number of operations. This is needed because text for the output file cannot always be created sequentially. Sometimes it is necessary, for example, to collect a piece of text from the source file into a data structure and then to create enclosing pieces of LaTeX code before and after it. The ~NestedText~ data structure corresponds to a LISP ``list expression'' and is a binary tree with character strings in its leaves. There are operations available to create a leaf from a string, to concatenate two trees, or to write the contents of a tree in tree order to the output.

  * ~ParserDS.c~ contains a number of data structures needed by the parser. These data structures are used to keep definitions of special paragraph formats, special character formats, and special characters, which can be defined in header documentation sections of PD files.

  * ~Maketex.c~ is the main program. It does almost nothing. It just calls the parser; after completion of parsing (which includes translation to Latex) a final piece of Latex code is written to the output.

Figure 3 describes the dependencies among these components at a logical level. An edge describes the ``uses'' relationship. For example, the ~NestedText~ module is used in the lexical analyser, the parser and in the parser data structure component. Figure 4 shows these dependencies at a more technical level, as they are reflected in the ~makefile~ (see Section 9).

		Figure 4: Technical dependencies in the construction of the executable ~maketex~ [pddocu.Figure4.eps]

Here each box corresponds to a file. An unlabeled arrow means the ``include'' relationship (for example, ~NestedText.h~ is included into ~Lex.l~, ~Parser.y~, and ~ParserDS.c~. Edges labeled with ~lex~, ~yacc~, and ~cc~ mean that the tools ~lex~ and ~yacc~ or the C compiler, respectively, produce the result files. Fat edges connecting files indicate that these files are compiled together.

The rest of this document is structured as follows: Section 2 describes the ~NestedText~ module (files ~NestedText.h~ and ~NestedText.c~), Section 3 the lexical analyser (~Lex.l~), Section 4 ~ParserDS.c~, Section 5 the parser itself (~Parser.y~). Section 6 shows the header file for Latex and the rather trivial program ~Maketex.c~. Section 7 contains the auxiliary functions ~pdtabs.c~ and ~linebreaks.c~. Section 8 gives the command procedures ~pdview~, ~pdshow~, etc. Finally, Section 9 contains the ~makefile~.

*/
/**************************************************
----
This file is part of the PD system
Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
----

//[x] [$\times $]
//[->] [$\rightarrow $]
//paragraph [2] verse:	[\begin{verse}]	[\end{verse}]

2 The Module NestedText

2.1 Definition Part

(File ~PDNestedText.h~)

This module allows one to create nested text structures and to write them to standard output. It provides in principle a data type ~listexpr~ (representing such structures) and operations:

[2]	atom: string [x] int [->] listexpr		\\
	atomc: string [->] listexpr			\\
	concat: listexpr [x] listexpr [->] listexpr	\\
	print: listexpr [->] e				\\
	copyout: listexpr [->] string [x] int		\\
	release-storage					\\

However, for use by ~lex~ and ~yacc~ generated lexical analysers and parsers which only allow to associate integer values with grammar symbols, we represent a ~listexpr~ by an integer (which is, in fact, an index into an array for nodes). Hence we have a signature:

[2]	atom: string [x] int [->] int			\\
	atomc: string [->] int				\\
	concat: int [x] int [->] int			\\
	print: int [->] e				\\
	copyout: int [->] string [x] int		\\
	release-storage					\\

The module uses two storage areas. The first is a buffer for text characters, it can take up to STRINGMAX characters, currently set to 30000. The second provides nodes for the nested list structure; currently up to NODESMAX = 30000 nodes can be created.

The operations are defined as follows:

----	int atom(char *string, int length)
----

List expressions, that is, values of type ~listexpr~ are either atoms or lists. The function ~atom~ creates from a character string ~string~ of length ~length~ a list expression which is an atom containing this string. Possible errors: The text buffer or storage space for nodes may overflow.

----	int atomc(char *string)
----

The function ~atomc~ works like ~atom~ except that the parameter should be a null-terminated string. It determines the length itself. To be used in particular for string constants written directly into the function call.

----	int concat(int list1, int list2)
----

Concats the two lists; returns a list expression representing the concatenation. Possible error: the storage space for nodes may be exceeded.

----	print(int list)
----

Writes the character strings from all atoms in ~list~ in the right order to standard output.

----	copyout(int list, char *target, int lengthlimit)
----

Copies the character strings from all atoms in ~list~ in the right order into a string variable ~target~. Parameter ~lengthlimit~ ensures that the maximal available space in ~target~ is respected; an error occurs if the list expression ~list~ contains too many characters.

----	release_storage()
----

Destroys the contents of the text and node buffers. Should be used only when a complete piece of text has been recognized and written to the output. Warning: Must not be used after pieces of text have been recognized for which the parser depends on reading a look-ahead token! This token will be in the text and node buffers already and be lost. Currently this applies to lists.

The following is what is technically exported from this file:

***************************************/

int atom(const char* string, int length);
int atomc(const char* string);
int  concat(int list1, int list2);
void print(const int list);
void  copyout(int list, char* target, int lengthlimit);

void  release_storage();
void show_storage(); 		/* show_storage used only for testing */
/****************************************************************************

2.2 Implementation Part

(File ~PDNestedText.c~)

*****************************************************************************/

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "PDNestedText.h"

#define AND &&
#define TRUE 1
#define FALSE 0
#undef NULL
#define NULL -1


extern void yyerror(const char* msg);

#define STRINGMAX 300000
/*
Maximal number of characters in buffer ~text~.

*/

#define NODESMAX 30000
/*
Maximal number of nodes available from ~nodespace~.

*/

struct listexpr {
    int left;
    int right;
    int atomstring;		/* index into array text */
    int length;			/* no of chars in atomstring*/
};

/*
If ~left~ is NULL then this represents an atom, otherwise it is a list in which case ~atomstring~ must be NULL.

*/
struct listexpr nodespace[NODESMAX];
int first_free_node = 0;

char text[STRINGMAX];
int first_free_char = 0;

/***************************************

The function ~atom~ creates from a character string ~string~ of length ~length~ a list expression which is an atom containing this string. Possible errors: The text buffer or storage space for nodes may overflow.

***************************************/

int atom(const char *string, int length)
{
    int newnode;
    int i;

    /* put string into text buffer: */

    	if (first_free_char + length > STRINGMAX)
	    {fprintf(stderr, "Error: too many characters.\n"); yyerror("too many characters");exit(1);}

    	for (i = 0; i< length; i++)
	    text[first_free_char + i] = string[i];

    /* create new node */

    	newnode = first_free_node++;

    	if (first_free_node > NODESMAX)
	    {fprintf(stderr, "Error: too many nodes!!!!.\n");show_storage(); yyerror("too many nodes"); exit(1);}

   	nodespace[newnode].left = NULL;
  	nodespace[newnode].right = NULL;
  	nodespace[newnode].atomstring = first_free_char;
	    first_free_char = first_free_char + length;
	nodespace[newnode].length = length;
    	return(newnode);
}

/****************************************

The function ~atomc~ works like ~atom~ except that the parameter should be a null-terminated string. It determines the length itself. To be used in particular for string constants written directly into the function call.

******************************************/

int atomc(const char *string)
{   int length;

    length = strlen(string);
    return(atom(string, length));
}

/**************************************

The function ~concat~ concats two lists; it returns a list expression representing the concatenation. Possible error: the storage space for nodes may be exceeded.

****************************************/

int concat(int list1, int list2)
{
    int newnode;

    newnode = first_free_node++;

    if (first_free_node > NODESMAX)
	{fprintf(stderr, "Error: too many nodes.\n"); yyerror("too many nodes");exit(1);}

    nodespace[newnode].left = list1;
    nodespace[newnode].right = list2;
    nodespace[newnode].atomstring = NULL;
    nodespace[newnode].length = 0;
    return(newnode);
}


/******************************************

Function ~isatom~ obviously checks whether a list expression ~list~ is an atom.

*******************************************/

int isatom(int list)
{
    if (nodespace[list].left == NULL) return(TRUE);
    else return(FALSE);
}

/***************************************

Function ~print~ writes the character strings from all atoms in ~list~ in the right order to standard output.

*****************************************/

void print(int list)
{
    int i;

    if (isatom(list))
	for (i = 0; i < nodespace[list].length; i++)
	    putchar(text[nodespace[list].atomstring + i]);
    else
	{print(nodespace[list].left); print(nodespace[list].right);};
}

/*******************************************

The function ~copyout~ copies the character strings from all atoms in ~list~ in the right order into a string variable ~target~. Parameter ~lengthlimit~ ensures that the maximal available space in ~target~ is respected; an error occurs if the list expression ~list~ contains too many characters. ~Copyout~ just calls an auxiliary recursive procedure ~copylist~ which does the job.

*******************************************/


int copylist(int list, char *target, int lengthlimit)
{   int i, j;

    if (isatom(list))
	if (nodespace[list].length <= lengthlimit - 1)
	    {for (i = 0; i < nodespace[list].length; i++)
		target[i] = text[nodespace[list].atomstring + i];
	    return nodespace[list].length;
	    }
	else
	    {fprintf(stderr, "Error in copylist: too long text.\n"); print(list);
       yyerror("too long text");
	    exit(1);
	    }
    else
	{i = copylist(nodespace[list].left, target, lengthlimit);
	j = copylist(nodespace[list].right, &target[i], lengthlimit - i);
	return (i+j);
	}
}

void copyout(int list, char *target, int lengthlimit)
{   int i;

    i = copylist(list, target, lengthlimit);
    target[i] = '\0';
}

/****************************************

Function ~release-storage~ destroys the contents of the text and node buffers. Should be used only when a complete piece of text has been recognized and written to the output. Do not use it for text pieces whose recognition needs look-ahead!

*****************************************/

void release_storage()
{   first_free_char = 0;
    first_free_node = 0;
}

/****************************************

Function ~show-storage~ writes the contents of the text and node buffers to standard output; only used for testing.

*****************************************/

void show_storage()
{   int i;
    fprintf(stderr,"first_free_char %d\n",first_free_char);
    for (i = 0; i < first_free_char; i++) putchar(text[i]);

    fprintf(stderr,"first_free_node %d\n",first_free_node);

    for (i = 0; i < first_free_node; i++)
	 fprintf(stderr,"node: %d, left: %d, right: %d, atomstring: %d, length: %d\n",
		i, nodespace[i].left, nodespace[i].right,
		nodespace[i].atomstring, nodespace[i].length);
}
/*
//[|]	[$\mid $]
//[\]	[$\setminus $]

3 Lexical Analysis

3.1 Introduction

The file ~PDLex.l~ contains a specification of a lexical analyser. A description of the structure of ~lex~ specifications can be found in [ASU86, Section 3.5]. More detailed information is given in [SUN88, Section 10]. From such a specification, the UNIX tool ~lex~ creates a file ~lex.yy.c~ which can then be compiled to obtain a lexical analyser. This analyser is given as a function

----	int yylex()
----

On each call, this function matches a piece of the input string and returns a ~token~, which is an integer constant.

A lex specification has the following structure:

----	declarations
	%%
	translation rules
	%%
	auxiliary procedures
----

The ~declarations~ section provides definitions of tokens to be returned (within brackets of the form \%\{ ... \}\%. The remainder of this section contains definitions of regular symbols (nonterminals of a regular grammar). For example, the definitions

----	digit		[0-9]
	num		({digit}{digit}|{digit})
----

introduce two nonterminals called ~digit~ and ~num~ respectively, by giving regular expressions on the right hand side for them. Terminal symbols can be described by character classes such as [0-9] (for digits), [[\]n] (matches just the end of line symbol), [ab?!] (contains just these four characters), etc. Nonterminal symbols used in regular expressions have to be enclosed by braces. Hence here a number is defined to consist of either one or two digits. The characters ``[|]'', ``+'', and ``[*]'' have the usual meaning for the composition of regular expressions.

The ~translation rules~ section contains a list of rules. Each rule describes some action to be taken whenever a piece of the input string has been matched. For example, the rule

----	^{head1}	{yylval = atom(yytext, yyleng); return(HEAD1);}
----

states the action to be taken when a ~head1~ regular symbol has been recognized which is defined in the ~declarations~ section by:

----	head1		{num}" "
----

So the input string matching ~head1~ consists of one or two digits followed by a blank. A translation rule consists of a regular symbol or regular expression on the left hand side, and some C code in braces on the right hand side. The C code says which token is to be returned (if any, one can also just skip a part of the input string and not return a token). The rule above says that a ~HEAD1~ token is to be returned from lexical analysis on recognizing a ~head1~ regular symbol which must occur at the beginning of a line (this is specified by the leading
caret character).

In addition to returning a token, one often wants to give further information. For the communication with a ~yacc~ generated parser, a global integer variable ~yylval~ is predefined. A value assigned to this variable can be accessed in parsing (see Section 5). The input string matched by a regular expression is given by ~lex~ variables

----	char *yytext;
	int yyleng;
----

where ~yytext~ points to the first character and ~yyleng~ gives the number of characters matched. The lexical analyser specified here usually returns the text that has been matched by creating a corresponding atom in the ~NestedText~ data structure and returning its node index. This happens also in the rule shown above.

The last section ~auxiliary procedures~ contains C code that is copied directly into the program ~lex.yy.c~. Here one can define support variables or functions for the action parts of translation rules.

Some specific comments about the definitions given below:

  * The regular symbols ~open~, ~open2~, and ~close~ consume preceding and following empty lines (consisting of space and end-of-line characters). Similarly, ~epar~ (paragraph end symbol) consumes subsequent empty lines. This is to avoid superfluous space in ``verbatim'' sections of the target Latex document.

  * A ~close~ followed by an ~open~ commentary bracket is omitted completely. This means that adjacent documentation sections are merged into one. This is needed in particular for documents obtained by concatenating several files. Be aware of this in testing lexical analysis! This is the only case when parts of the input string are ``swallowed'' without returning tokens.

Note that the characters

----	~, *, ", [, ], :
----

are returned directly to the parser (each is its own token) because they are used there directly in grammar rules. These characters are matched by the rule with a ``.'' on the left hand side. The dot matches everything unmatched otherwise.


3.2 The Specification

(File ~PDLex.l~)

*/

/*
----
This file is part of the PD system
Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
----

*/

%option yylineno
%option debug

%{
#include "PDNestedText.h"

extern int yylex();

%}


/* regular definitions */

lbracket	("(*"|"/*")
rbracket	("*)"|"*/")
star		[*]
other		[-;,?!`<60>'()/@#$%_\^{}+=|<>\n&<26><><EFBFBD><EFBFBD><EFBFBD><EFBFBD>ߵ<EFBFBD>]
open		{lbracket}{star}*(" "*[\n])+
open2		([\n]" "*)*[\n]{lbracket}{star}*(" "*[\n])+
close		{star}*{rbracket}(" "*[\n])+
epar		[\n]" "*[\n](" "*[\n])*
defline1	[\n]" "*"//"
defline2	" "*"//"
digit		[0-9]
num		({digit}{digit}|{digit})
ref		"["{num}"]"
verbatim	"----"
tt              "__"

head1		{num}" "
head2		{num}"."{head1}
head3		{num}"."{head2}
head4		{num}"."{head3}
head5		{num}"."{head4}

enum1		"  "{digit}" "|" "{digit}{digit}" "
enum2		"  "{enum1}
bullet1		"  * "
bullet2		"  "{bullet1}
follow1		"    "
follow2		"  "{follow1}

display		"        "
figure		"                "

%x VERB
%%

<INITIAL>^{open}		{return(OPEN);}
<INITIAL>{open2}		{return(OPEN);}
<INITIAL>^{close}		{return(CLOSE);}
<INITIAL>^{close}{open}		{ }

<INITIAL>^{verbatim}		{
                                  BEGIN(VERB);
                                  return(VERBATIM);
                                }
<VERB>.|\n			{
                                  const char* v1="\\verb!{!";
                                  const char* v2="\\verb!}!";
                                  const char* v3="\\verb!\\!";

                                  switch (yytext[0]) {
                                    case '{'  : { yylval = atom(v1,8); break; }
                                    case '}'  : { yylval = atom(v2,8); break; }
                                    case '\\' : { yylval = atom(v3,8); break; }
                                    default   : { yylval = atom(yytext, yyleng); }
                                  }
                                  return(VCHAR);
                                }
<VERB>^{verbatim}		{
                                  BEGIN(INITIAL);
                                  return(ENDVERBATIM);
                                }

<INITIAL>{epar}			{yylval = atom(yytext, yyleng); return(EPAR);}
<INITIAL>{defline1}		{yylval = atom(yytext, yyleng); return(DEFLINE);}
<INITIAL>^{defline2}		{yylval = atom(yytext, yyleng); return(DEFLINE);}
<INITIAL>[A-Za-z]		{yylval = atom(yytext, yyleng); return(LETTER);}
<INITIAL>^{head1}		{yylval = atom(yytext, yyleng); return(HEAD1);}
<INITIAL>^{head2}		{yylval = atom(yytext, yyleng); return(HEAD2);}
<INITIAL>^{head3}		{yylval = atom(yytext, yyleng); return(HEAD3);}
<INITIAL>^{head4}		{yylval = atom(yytext, yyleng); return(HEAD4);}
<INITIAL>^{head5}		{yylval = atom(yytext, yyleng); return(HEAD5);}
<INITIAL>^{enum1}		{yylval = atom(yytext, yyleng); return(ENUM1);}
<INITIAL>^{enum2}		{yylval = atom(yytext, yyleng); return(ENUM2);}
<INITIAL>^{bullet1}		{yylval = atom(yytext, yyleng); return(BULLET1);}
<INITIAL>^{bullet2}		{yylval = atom(yytext, yyleng); return(BULLET2);}
<INITIAL>^{follow1}		{yylval = atom(yytext, yyleng); return(FOLLOW1);}
<INITIAL>^{follow2}		{yylval = atom(yytext, yyleng); return(FOLLOW2);}
<INITIAL>^{display}		{yylval = atom(yytext, yyleng); return(DISPLAY);}
<INITIAL>^{figure}		{yylval = atom(yytext, yyleng); return(FIGURE);}
<INITIAL>^({ref}" "|"[] ")	{yylval = atom(yytext, yyleng); return(STARTREF);}
<INITIAL>{ref}			{yylval = atom(yytext, yyleng); return(REF);}
<INITIAL>[0-9]			{yylval = atom(yytext, yyleng); return(DIGIT);}
<INITIAL>"[~]"			{yylval = atom(yytext, yyleng); return(TILDE);}
<INITIAL>"[*]"			{yylval = atom(yytext, yyleng); return(STAR);}
<INITIAL>"[__]"			{yylval = atom(yytext, yyleng); return(DUS);}
<INITIAL>"[\"]"			{yylval = atom(yytext, yyleng); return(QUOTE);}
<INITIAL>" ~ "			{yylval = atom(yytext, yyleng); return(BLANKTILDE);}
<INITIAL>" * "			{yylval = atom(yytext, yyleng); return(BLANKSTAR);}
<INITIAL>" __ "			{yylval = atom(yytext, yyleng); return(BLANKDUS);}
<INITIAL>" \" "			{yylval = atom(yytext, yyleng); return(BLANKQUOTE);}
<INITIAL>{other}		{yylval = atom(yytext, yyleng); return(OTHER);}
<INITIAL>.			{yylval = atom(yytext, yyleng); return(yytext[0]);}
<INITIAL>"paragraph" 		{yylval = atom(yytext, yyleng); return(PARFORMAT);}
<INITIAL>"characters" 		{yylval = atom(yytext, yyleng); return(CHARFORMAT);}
<INITIAL>{tt}          	        {yylval = atom(yytext, yyleng); return(TTFORMAT);}


%%


/*
<INITIAL>"'"	        	{ yylval = atom(yytext, yyleng); return(OTHER);}


3.3 Testing the Lexical Analyser

One can test lexical analysis separately from the rest of the system. The files ~PDTokens.h~ and ~PDLexTest.c~ are needed. The file ~PDTokens.h~ needs to be included in the ~declarations~ section of ~Lex.l~:

----	%{
	#include "PDTokens.h"
	}%
----

This file just defines each token as an integer constant:

*/
/*
----
This file is part of the PD system
Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
----

*/
# define OPEN 257
# define CLOSE 258
# define EPAR 259
# define DEFLINE 260
# define LETTER 261
# define DIGIT 262
# define OTHER 263
# define TILDE 264
# define STAR 265
# define QUOTE 266
# define BLANKTILDE 267
# define BLANKSTAR 268
# define BLANKQUOTE 269
# define HEAD1 270
# define HEAD2 271
# define HEAD3 272
# define HEAD4 273
# define HEAD5 274
# define ENUM1 275
# define ENUM2 276
# define BULLET1 277
# define BULLET2 278
# define FOLLOW1 279
# define FOLLOW2 280
# define DISPLAY 281
# define FIGURE 282
# define STARTREF 283
# define REF 284
# define VERBATIM 285
# define PARFORMAT 286
# define CHARFORMAT 287
# define TTFORMAT 288
# define DUS 289
# define BLANKDUS 289
/*
----
This file is part of the PD system
Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
----

The file ~PDLexTest.c~ contains the main function for testing lexical analysis. It prints all letters directly and prints the other tokens as integers.

*/

#include "PDTokens.h"
#include "PDNestedText.h"

main()
{

	int token;
	yylval = 0;

	token = yylex();
	while (token != 0) {

		if (token == LETTER) print(yylval);
		else printf("%d \n", token);

		token = yylex();
	}
}

/*
To produce a lexical analyser for testing one can issue the following commands (after including ~PDTokens.h~ in ~PDLex.l~):

----	lex PDLex.l
	cc PDLexTest.c lex.yy.c PDNestedText.o -ll
----

The file ~a.out~ will then contain the analyser.

*/
/*
----
This file is part of the PD system
Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
----

4 Data Structures for the Parser

(File ~PDParserDS.c~)

This file contains data structures for special paragraph and character formats (Section 4.2) and special characters (Section 4.3), also some auxiliary functions for the parser (Section 4.4).

*/
#include "PDNestedText.h"
#include <string.h>

#define AND &&


/**************************************************************************
4.1 Global Constants and Variables

**************************************************************************/

#define BRACKETLENGTH 500

/*****************************
Length of text that may occur in square brackets within normal text.

******************************/


int pindex = -1;

/*****************************
Index (in array ~definitions~, see below) of the most recently used special paragraph format.

*****************************/


int cindex = -1;

/*****************************
Index (in array ~definitions~, see below) of the most recently used special character format.

*****************************/


/**************************************************************************
4.2 Data Structure for Definitions of Special Paragraph or Character Formats

***************************************************************************/

#define DEFMAX 100
#define NAMELENGTH 30
#define COMLENGTH 100

struct def {
	int index;
	char name[NAMELENGTH];
	char open[COMLENGTH];
	char close[COMLENGTH];
} definitions[DEFMAX];

int first_free_def = 0;
int last_global_def = -1;

/******************************************
Contains definitions of special paragraph or character formats, such as

----	paragraph [1] Title: [{\bf \Large \begin{center}] [\end{center} }]
----

Here 1 would become the ~index~, ~Title~ would be the ~name~, the material enclosed in the first pair of square brackets would be the ~open~, and that in the second pair of brackets the ~close~ component.

Into this array are put first definitions from the header documentation section. When these are complete, the variable ~last-global-def~ is set to the array index of the last entry. Then, for each paragraph which has annotation lines further definitions may be appended. The lookup procedure ~lookup-def~ (see below) searches from the end so that if finds first paragraph annotations. (However, paragraph annotations are not yet implemented.)

The following function ~enter-def~ puts a quadruple into this array. The last three parameters are (indices of) list expressions.

****************************************/

void enter_def(int index, int name, int open, int close)
{   definitions[first_free_def].index = index;
    copyout(name, definitions[first_free_def].name, NAMELENGTH);

	/* This function copies the first parameter list expression into a
        string given as a second parameter. Part of NestedText.*/

    copyout(open, definitions[first_free_def].open, COMLENGTH);
    copyout(close, definitions[first_free_def].close, COMLENGTH);

    first_free_def++;

    if (first_free_def >= DEFMAX)
	{fprintf(stderr, "Error in enter_def: table full.\n");
	exit(1);
	}
}

/*****************************************

Function ~lookup-def~ finds an array index in array ~definitions~ such that its ~index~ component has value ~i~. Starts at the end and searches backwards in order to find paragraph annotations first. Returns either the array index, or -1 if entry was not found.

*******************************************/

int lookup_def(int i)
{   int j;
    j = first_free_def;
    do j--;
    while ((j>=0) AND (definitions[j].index != i));
    return j;
}

/**************************************************************************
4.3 Data Structure for Definitions of Special Characters

***************************************************************************/

#define CODELENGTH 31		/* 30 usable characters + 0C */
#define SCMAX 100

struct schar {
	char code[CODELENGTH];
	char command[COMLENGTH];
} schars[SCMAX];

int first_free_schar = 0;

/*********************************************
Contains definitions of special characters such as:

----	[ue]	[\"{u}]
----

The function ~enter-schar~ puts such a pair into the data structure. Parameters are again indices of list expressions in array ~nodespace~.

********************************************/

void enter_schar(int code, int command)
{   copyout(code, schars[first_free_schar].code, CODELENGTH);
    copyout(command, schars[first_free_schar].command, COMLENGTH);

    first_free_schar++;

    if (first_free_schar >= SCMAX)
	{printf("Error in enter_schar: table full.\n");
	exit(1);
	}
}


/**************************************
The function ~lookup-schar~ tries to find a parameter ~string~ as a ~code~ component under some index ~j~ in the array ~schars~ with special character definitions. It returns this index. If such an entry was not found, it returns a negative index value.

**************************************/

int lookup_schar(char *string)
{   int j;
    j = first_free_schar;
    do j--;
    while ((j>=0) AND (strcmp(string, schars[j].code) != 0));
    return j;
}


/**************************************************************************
4.4 Auxiliary Functions

The function ~get-startref-index~ gets as a parameter a list expression ~listexpr~ containing a special paragraph format number in square brackets followed by a blank (the number can have one or two digits), or an empty pair of square brackets (as a reference to a special format used previously). Hence examples are:

----	[15] [7] or []
----

The function returns either the numeric value (the format number) or 0 for an empty pair of brackets.

**************************************/


int get_startref_index(int listexpr)
{   char ref[6];
    int length;

    copyout(listexpr, ref, 6);
    length = strlen(ref);

    switch(length) {
	case 3:					/* empty brackets */
		return 0;
	case 4: 				/* one digit */
		return (ref[1] - '0');
	case 5: 				/* two digits */
		return (10 * (ref[1] - '0') + (ref[2] - '0'));
	default:
		{fprintf(stderr, "Error in get_startref_index: length is %d.\n",
			length);
		exit(1);
		}
    }
}

/*************************************

Function ~get-ref-index~ does the same for a reference without the trailing blank.

***************************************/

int get_ref_index(int listexpr)
{   return(get_startref_index( concat(listexpr, atom(" ", 1)) ));
}
/*
//[\]	[$\setminus $]


5 The Parser

(File ~PDParser.y~)

5.1 Introduction

This file contains a ~yacc~ specification of a parser which is transformed by the UNIX tool ~yacc~ into a program file ~y.tab.c~ which in turn is compiled to produce a parser. For an introduction to ~yacc~ specifications see [ASU86, Section 4.9]. Detailed information is given in [SUN88, Section 11].

In fact, the specification contains ``semantic rules'' (or ~actions~); hence, the generated program is a compiler (from PD files into Latex). The parsing technique used is bottom-up LR parsing. Let us briefly consider two example rules:

----	heading		: heading1
			| heading2
			| heading3
			| heading4
			| heading5
			;

	heading1	:  HEAD1 paragraph_rest	{printf("\\section {");
						print($2);
						printf("}\n\n");}
			;
----

The first is a grammar rule stating that a ~heading~ (nonterminal) can be derived into either a ~heading1~ or a ~heading2~ or a ~heading3~ (nonterminal). The second rule says that a ~heading1~ consists of a ~HEAD1~ token followed by a ~paragraph-rest~. This rule has an associated action which is C code enclosed by braces.

Attached to each grammar symbol (nonterminal, or terminal token) is an integer-valued ~attribute~. One can refer to these attributes in a grammar rule by the names ~\$\$~, ~\$1~, ~\$2~, etc. where ~\$\$~ refers to the attribute of the nonterminal on the left hand side and ~\$1~, ~\$2~, etc. to the grammar symbols on the right hand side. Hence, in the second rule,  ~\$\$~ is the attribute attached to ~heading1~, ~\$1~ belongs to ~HEAD1~, and ~\$2~ to ~paragraph-rest~.

For the terminal tokens which are generated in lexical analysis the attribute value is set by an assignment to the variable ~yylval~. The lexical analyser used here assigns always the index of a node of the ~NestedText~ data structure containing the character string matching the token.

For the nonterminals, the attribute value is set by an assignment to ~\$\$~ in the action part of a grammar rule. The bottom-up parser basically works as follows. Terminal tokens and nonterminals recognized earlier are kept on a stack. Whenever the top symbols on the stack correspond to the right hand side of a grammar rule which is applicable with the current derivation, a ~reduction~ is made: The right hand side symbols are removed from the stack and the left hand side nonterminal is put on the stack. In addition, the action associated with the rule (the C code in braces) is executed.

Therefore, in our example rule the ~HEAD1~ and ~paragraph-rest~ symbols are removed from the stack and a ~heading1~ symbol is put on top. The action is executed, namely:

  1 The text ``[\]section \{'' is written to the output file.

  2 The text associated with ~paragraph-rest~ (a ~NestedText~ node whose index is given in ~\$2~) is written (by the ~print~ function from ~NestedText~).

  3 The text ``\}[\]n[\]n'' (a closing brace followed by two end-of-line characters) is written to the output.

Hence for a heading described in the PD file by

----	5 The Parser
----

a piece of Latex code

----	\section {The Parser}
----

followed by an empty line is written to the output file. To understand the output created in the grammar rules you need to know (some) \LaTeX, see [La86].


5.2 Declaration Section: Definition of Tokens

*/

%{
#include <stdio.h>
#include <stdlib.h>
#include "PDNestedText.h"
#include "PDParserDS.c"

extern char* startProgram;
extern char* endProgram;
extern char* startVerbatim;
extern char* endVerbatim;

#define YYERROR_VERBOSE
#define YYDEBUG 1

extern void yyerror(const char *msg);
extern int yylex();

%}


%token OPEN CLOSE EPAR DEFLINE LETTER DIGIT OTHER TILDE STAR
	QUOTE BLANKTILDE BLANKSTAR BLANKQUOTE DUS BLANKDUS
	HEAD1 HEAD2 HEAD3 HEAD4 HEAD5 ENUM1 ENUM2 BULLET1 BULLET2
	FOLLOW1 FOLLOW2 DISPLAY FIGURE STARTREF REF VCHAR VERBATIM ENDVERBATIM
	PARFORMAT CHARFORMAT TTFORMAT
%%

/*
5.3 Document Structure and Program Sections

*/

document 	: doc
		| doc program_section
		;

doc		: space doc_section
		| doc program_section doc_section
		;

doc_section 	: OPEN elements CLOSE
		;

program_section : 	{ printf("%s", startProgram); }
		  chars	{ printf("%s", endProgram);   }
		;

chars 		:
		| chars text_char	{print($2);}
		| chars TILDE		{print($2);}
		| chars STAR		{print($2);}
		| chars DUS 		{print($2);}
		| chars QUOTE		{print($2);}
		| chars '\"'		{print($2);}
		| chars '*'		{print($2);}
		| chars '~'		{print($2);}
		| chars '['		{print($2);}
		| chars ']'		{print($2);}
		| chars follow_elem	{print($2);}
		| chars EPAR		{print($2);}
		| chars DEFLINE		{print($2);}
                | chars TTFORMAT        {print($2);}
		;

elements	:
		| elements definitions
		| elements element	{release_storage();}
		| elements list		/* storage cannot be released after
					lists because of look-ahead */
		;


/*
5.4 Definitions of Special Formats and Characters

*/

definitions	: defs EPAR
		;

defs		: defline
		| defs defline
		;

defline		: DEFLINE par_format
		| DEFLINE char_format
		| DEFLINE special_char_def
		;

par_format	: PARFORMAT space REF space ident space ':'
			space bracketed2 space bracketed2
				{
				/* test only:
				print($3);
				printf("paragraph definition: %d ",
					get_ref_index($3));
				print($5);
				print($9);
				print($11);
				*/
				enter_def(get_ref_index($3), $5, $9, $11);}
		;

char_format	: CHARFORMAT space REF space ident space ':'
			space bracketed2 space bracketed2
				{
				/*
				printf("characters definition: %d ",
					get_ref_index($3));
				print($5);
				print($9);
				print($11);
				*/

				enter_def(get_ref_index($3), $5, $9, $11);}
		;

special_char_def: bracketed2 space bracketed2
					{enter_schar($1, $3);}
		;

space		:
		| space ' '
		| space FOLLOW1
		| space FOLLOW2
		| space DISPLAY
		| space FIGURE
		;

ident 		: LETTER		{$$ = $1;}
		| ident LETTER		{$$ = concat($1, $2);}
		| ident DIGIT		{$$ = concat($1, $2);}
		;

/*
5.5 Text Elements

5.5.1 Predefined Paragraph Types

*/

element 	: standard_paragraph
		| heading
		| verb
		| display
		| figure
		| special_paragraph
		;

standard_paragraph : paragraph_rest	{print($1); printf("\n\n");}
		;

heading		: heading1
		| heading2
		| heading3
		| heading4
		| heading5
		;

heading1	: HEAD1 paragraph_rest	{printf("\\section {");
					print($2);
					printf("}\n\n");}
		;

heading2	: HEAD2 paragraph_rest	{printf("\\subsection {");
					print($2);
					printf("}\n\n");}
		;

heading3	: HEAD3 paragraph_rest	{printf("\\subsubsection {");
					print($2);
					printf("}\n\n");}
		;

heading4	: HEAD4 paragraph_rest	{printf("\\paragraph {");
					print($2);
					printf("}\n\n");}
		;

heading5	: HEAD5 paragraph_rest	{printf("\\subparagraph {");
					print($2);
					printf("}\n\n");}
		;

verb		: verb_start verb_end
		;

verb_start	: VERBATIM	  { printf("%s", startVerbatim); }
		;

verb_end	: vchars ENDVERBATIM  { printf("%s", endVerbatim); }
		;

vchars          : vchars VCHAR  { print($2); }
                | VCHAR         { print($1); }
                ;


display		: DISPLAY paragraph_rest {printf("\\begin{quote}\n");
					printf("        ");
					print($2);
					printf("\n\\end{quote}\n\n");}
		;

/*
5.5.2 Figures

*/

figure 		: FIGURE figure_text optional_caption
		  bracketed2 annotations
					{printf("\\begin{figure}[htb]\n");
					printf("\\begin{center}\n");
					printf("\\leavevmode\n");
					printf("        \\epsfbox{Figures/");
						print($4);
					printf("}\n");
					printf("\\end{center}\n");
					printf("        \\caption{");
						print($3);
					printf("}\n\\label{fig:");
						print($4);
					printf("}\n\\end{figure}\n");}
		;

optional_caption:			{$$ = atomc("");}
		| ':' figure_text	{$$ = $2;}
		;

figure_text 	:			{$$ = atomc("");}
		| figure_text ftext_char {$$ = concat($1, $2);}
		| figure_text TILDE 	{$$ = concat($1, atomc("~"));}
		| figure_text STAR 	{$$ = concat($1, atomc("*"));}
		| figure_text DUS 	{$$ = concat($1, atomc("__"));}
		| figure_text QUOTE 	{$$ = concat($1, atomc("\""));}
		| figure_text emphasized {$$ = concat($1, $2);}
    | figure_text typewriter {$$ = concat($1, $2);}
		| figure_text bold_face	{$$ = concat($1, $2);}
		| figure_text special_char_format {$$ = concat($1, $2);}
		| figure_text follow_elem {$$ = concat($1, $2);}
		;

/*
5.5.4 Special Paragraph Formats

*/

special_paragraph: STARTREF paragraph_rest

		{int i;

		i = get_startref_index($1);
		if (i > 0)			/* not an empty start ref */
		    pindex = lookup_def(i);
						/* otherwise use previous
						pindex value */
		if (pindex >= 0)		/* def was found */
		    {printf("%s ", definitions[pindex].open);
		    print($2);
		    printf("%s \n\n", definitions[pindex].close);
		    }
		else print($2);		/* make it a standard paragraph */
		}
		;


/*
5.6 Lists

*/

list 		: itemized1	{printf("\\begin{itemize}\n");
				print($1);
				printf("\n\n\\end{itemize}\n\n");}
		| enum1		{printf("\\begin{enumerate}\n");
				print($1);
				printf("\n\n\\end{enumerate}\n\n");}
		;

itemized1 	: bulletitem1		{$$ = $1;}
		| itemized1 bulletitem1	{$$ = concat($1, $2);}
		;

bulletitem1 	: bulletpar1		{$$ = $1;}
		| bulletitem1 followup1	{$$ = concat($1, $2);}
		| bulletitem1 list2	{$$ = concat($1, $2);}
		;

bulletpar1 	: BULLET1 paragraph_rest
				{$$ = concat(atomc("\n  \\item "),
					concat($2, atomc("\n\n")));}
		;

followup1 	: FOLLOW1 paragraph_rest
					{$$ = concat($1,
						concat($2, atomc("\n\n")));}
		;

enum1 		: enumitem1		{$$ = $1;}
		| enum1 enumitem1	{$$ = concat($1, $2);}
		;

enumitem1 	: enumpar1		{$$ = $1;}
		| enumitem1 followup1	{$$ = concat($1, $2);}
		| enumitem1 list2	{$$ = concat($1, $2);}
		;

enumpar1 	: ENUM1 paragraph_rest
				{$$ = concat(atomc("\n  \\item "),
					concat($2, atomc("\n\n")));}
		;

list2 		: itemized2 {$$ = concat(atomc("\n    \\begin{itemize}\n"),
					concat($1,
					atomc("\n    \\end{itemize}\n\n")));}
		| enum2	{$$ = concat(atomc("\n    \\begin{enumerate}\n"),
					concat($1,
					atomc("\n    \\end{enumerate}\n\n")));}
		;

itemized2 	: bulletitem2		{$$ = $1;}
		| itemized2 bulletitem2	{$$ = concat($1, $2);}
		;

bulletitem2 	: bulletpar2		{$$ = $1;}
		| bulletitem2 followup2	{$$ = concat($1, $2);}
		;

bulletpar2 	: BULLET2 paragraph_rest
				{$$ = concat(atomc("\n      \\item "),
					concat($2, atomc("\n\n")));}
		;

followup2 	: FOLLOW2 paragraph_rest
				{$$ = concat($1, concat($2, atomc("\n\n")));}
		;

enum2 		: enumitem2		{$$ = $1;}
		| enum2 enumitem2	{$$ = concat($1, $2);}
		;

enumitem2 	: enumpar2		{$$ = $1;}
		| enumitem2 followup2	{$$ = concat($1, $2);}
		;

enumpar2 	: ENUM2 paragraph_rest
				{$$ = concat(atomc("\n      \\item "),
					concat($2, atomc("\n\n")));}

		;

/*
5.7 Text Structure


Unfortunately, this gets a bit complex because we must take care of the following:

  * Most tokens that are recognized in lexical analysis may occur in normal text; we must make sure that they can be reduced there. In particular, the tokens defining paragraph formats must be allowed to occur in the middle of a paragraph and not be interpreted there.

  * We must take care of escaping characters with a special meaning. This concerns the ~TILDE~, ~STAR~, and ~QUOTE~ tokens which are formed in LA from
the corresponding characters in square brackets. In normal text, the square brackets must be stripped off. On the other hand, in program text or in definitions (given in square brackets themselves) the character strings should be left untouched.

  * Emphasized text (enclosed by tilde characters) and bold face text (enclosed by stars) may be nested, but we must make sure through the grammar rules that emphasized cannot occur within emphasized etc. Otherwise a second tilde meaning a closing bracket of the emphasized text would be shifted on the stack by the parser rather than reduced as we want.

*/

paragraph_rest 	: text annotations EPAR	{$$ = $1;}
		;

text 		: 			{$$ = atomc("");}
		| netext		{$$ = $1;}
		;

netext		: start_elem		{$$ = $1;}
		| netext start_elem	{$$ = concat($1, $2);}
		| netext follow_elem	{$$ = concat($1, $2);}
		;

start_elem	: text_char		{$$ = $1;}
		| TILDE 		{$$ = atomc("~");}
		| STAR 			{$$ = atomc("*");}
		| DUS 			{$$ = atomc("__");}
		| QUOTE 		{$$ = atomc("\"");}
		| emphasized		{$$ = $1;}
		| bold_face		{$$ = $1;}
    		| typewriter            {$$ = $1;}
		| special_char_format	{$$ = $1;}
		| bracketed		{$$ = $1;}
		;

follow_elem	: HEAD1			{$$ = $1;}
		| HEAD2			{$$ = $1;}
		| HEAD3			{$$ = $1;}
		| HEAD4			{$$ = $1;}
		| HEAD5			{$$ = $1;}
		| ENUM1			{$$ = $1;}
		| ENUM2			{$$ = $1;}
		| FOLLOW1		{$$ = $1;}
		| FOLLOW2		{$$ = $1;}
		| BULLET1		{$$ = $1;}
		| BULLET2		{$$ = $1;}
		| DISPLAY		{$$ = $1;}
		| FIGURE		{$$ = $1;}
		| STARTREF		{$$ = $1;}
		| REF			{$$ = $1;}
		;

text_char	: ftext_char		{$$ = $1;}
		| ':'			{$$ = $1;}
		;

ftext_char 	: LETTER 		{$$ = $1;}
		| DIGIT 		{$$ = $1;}
		| OTHER			{$$ = $1;}
		| BLANKTILDE 		{$$ = $1;}
		| BLANKSTAR 		{$$ = $1;}
		| BLANKDUS 		{$$ = $1;}
		| BLANKQUOTE		{$$ = $1;}
		| '\\'			{$$ = $1;}
		| ' '			{$$ = $1;}
		| '.'			{$$ = $1;}
		| PARFORMAT		{$$ = $1;}
		| CHARFORMAT		{$$ = $1;}
		;

emphasized 	: '~' unemph_list '~'	{$$ = concat(atomc("{\\em "),
						concat($2,
						atomc("\\/}")));
            }
		;

bold_face 	: '*' unbold_list '*'	{$$ = concat(atomc("{\\bf "),
						concat($2,
						atomc("}")));
            }
		;

typewriter 	: TTFORMAT untt_list TTFORMAT	{$$ = concat(atomc("{\\tt "),
						concat($2,
						atomc("}")));
           }
		;

unemph_list	:			{$$ = atomc("");}
		| unemph_list unemph	{$$ = concat($1, $2);}
		;

untt_list	:			{$$ = atomc("");}
		| untt_list untt	{$$ = concat($1, $2);}
		;

unemph		: text_char		{$$ = $1;}
		| TILDE 		{$$ = atomc("~");}
		| STAR 			{$$ = atomc("*");}
		| DUS 			{$$ = atomc("__");}
		| QUOTE 		{$$ = atomc("\"");}
		| follow_elem		{$$ = $1;}
		| special_char_format	{$$ = $1;}
		| bracketed		{$$ = $1;}
		;

untt		: text_char		{$$ = $1;}
		| TILDE 		{$$ = atomc("~");}
		| STAR 			{$$ = atomc("*");}
		| DUS 			{$$ = atomc("__");}
		| QUOTE 		{$$ = atomc("\"");}
		| follow_elem		{$$ = $1;}
		| '*' unboldemph_list '*'	{$$ = concat(atomc("{\\bf "),
						concat($2,
						atomc("}")));}
		| special_char_format	{$$ = $1;}
		| bracketed		{$$ = $1;}
		;

unbold_list	:			{$$ = atomc("");}
		| unbold_list unbold	{$$ = concat($1, $2);}
		;

unbold		: text_char		{$$ = $1;}
		| TILDE 		{$$ = atomc("~");}
		| STAR 			{$$ = atomc("*");}
		| DUS 			{$$ = atomc("__");}
		| QUOTE 		{$$ = atomc("\"");}
		| follow_elem		{$$ = $1;}
		| '~' unboldemph_list '~'	{$$ = concat(atomc("{\\em "),
						concat($2,
						atomc("\\/}")));}
		| special_char_format	{$$ = $1;}
		| bracketed		{$$ = $1;}
		;

unboldemph_list	:			{$$ = atomc("");}
		| unboldemph_list unboldemph	{$$ = concat($1, $2);}
		;

unboldemph	: text_char		{$$ = $1;}
		| TILDE 		{$$ = atomc("~");}
		| STAR 			{$$ = atomc("*");}
		| DUS 			{$$ = atomc("__");}
		| QUOTE 		{$$ = atomc("\"");}
		| follow_elem		{$$ = $1;}
		| special_char_format	{$$ = $1;}
		| bracketed		{$$ = $1;}
		;

plain_list	:			{$$ = atomc("");}
		| plain_list plain	{$$ = concat($1, $2);}
		;

plain		: text_char		{$$ = $1;}
		| TILDE 		{$$ = atomc("~");}
		| STAR 			{$$ = atomc("*");}
		| DUS 			{$$ = atomc("__");}
		| QUOTE 		{$$ = atomc("\"");}
		| bracketed		{$$ = $1;}
		;

/*
5.8 Special Character Formats

*/

special_char_format : 	'\"' plain_list '\"' REF

			{int i;

			i = get_ref_index($4);
			cindex = lookup_def(i);
			if (cindex >= 0)	/* def was found */
			    {$$ = concat(
				   atomc(definitions[cindex].open),
				   concat($2,
				     atomc(definitions[cindex].close)  ));
			    }
			else			/* ignore special format */
			    $$ = $2;
			}

		| '\"' plain_list '\"'

			{if (cindex >= 0)	/* def exists */
			    {$$ = concat(
				   atomc(definitions[cindex].open),
				   concat($2,
				     atomc(definitions[cindex].close)  ));
			    }
			else			/* ignore special format */
			    $$ = $2;
			}

		;

/*
5.9 Text in Square Brackets: Checking for Special Characters

*/

bracketed 	: '[' btext ']'

		{char bracketstring[BRACKETLENGTH];
		int i;
		int length;

		copyout($2, bracketstring, BRACKETLENGTH);
		length = strlen(bracketstring);

		if (length <= CODELENGTH - 1)
		    {i = lookup_schar(bracketstring);
		    if (i >= 0)			/* found */
			$$ = atomc(schars[i].command);
		    else
			$$ = concat($1,
				concat($2, $3));
		    }
		else
		    $$ = concat($1, concat($2, $3));
		}
		;

btext 		:			{$$ = atomc("");}
		| btext text_char	{$$ = concat($1, $2);}
		| btext TILDE 		{$$ = concat($1, atomc("~"));}
		| btext STAR 		{$$ = concat($1, atomc("*"));}
		| btext DUS 		{$$ = concat($1, atomc("__"));}
		| btext QUOTE 		{$$ = concat($1, atomc("\""));}
		| btext follow_elem	{$$ = concat($1, $2);}
		| btext '\"'		{$$ = concat($1, $2);}
		| btext '*'		{$$ = concat($1, $2);}
		| btext '~'		{$$ = concat($1, $2);}
		| btext '[' btext ']'

		{char bracketstring[BRACKETLENGTH];
		int i;
		int length;

		copyout($3, bracketstring, BRACKETLENGTH);
		length = strlen(bracketstring);

		if (length <= CODELENGTH - 1)
		    {i = lookup_schar(bracketstring);
		    if (i >= 0)			/* found */
			$$ = concat($1, atomc(schars[i].command));
		    else
			{$$ = concat($1,
				concat($2,
					concat($3, $4)));}
		    }
		else
		    {$$ = concat($1,
				concat($2,
					concat($3, $4)));}
		}
		;

/*
5.10 Uninterpreted Square Brackets (Used in Definitions)

*/

bracketed2 	: '[' btext2 ']'	{$$ = $2;}
		;

btext2 		:			{$$ = atomc("");}
		| btext2 text_char	{$$ = concat($1, $2);}
		| btext2 TILDE		{$$ = concat($1, $2);}
		| btext2 STAR		{$$ = concat($1, $2);}
		| btext2 DUS		{$$ = concat($1, $2);}
		| btext2 QUOTE		{$$ = concat($1, $2);}
		| btext2 follow_elem	{$$ = concat($1, $2);}
		| btext2 '\"'		{$$ = concat($1, $2);}
		| btext2 '*'		{$$ = concat($1, $2);}
		| btext2 '~'		{$$ = concat($1, $2);}
		| btext2 '[' btext2 ']'	{$$ = concat($1,
						concat($2,
						concat($3, $4)));}
		;

annotations 	:
		;


%%

#include "PDLex.c"

/*

6 \LaTeX\  Header and Main Program

6.1 The \LaTeX\  Header File: pd.header

*/
\documentclass[11pt,a4paper]{article}
\usepackage{epsfig, times, color, alltt}
\usepackage[a4paper, left=2.5cm, right=2.5cm, top=2.5cm, bottom=2.5cm]{geometry}
\parindent0em
\parskip0.8ex plus0.4ex minus 0.4ex
\setcounter{secnumdepth}{4}
\setcounter{tocdepth}{3}
\fussy
\oddsidemargin0mm
\evensidemargin0mm

\newenvironment{pdverbatim}{\pagebreak[0]\begin{alltt}\nopagebreak[4]\hspace{0.9cm}\rule[1.0eX]{2in}{0.1pt}\nopagebreak[4]\small}
                       {\nopagebreak[4]\hspace{0.9cm}\rule[2.0eX]{2in}{0.1pt}\nopagebreak[4]\end{alltt}}

%% --- uncomment to set up the text body area for landscape mode --
%%\usepackage{portland}
%%\topmargin-2cm
%%\textwidth26cm
%%\textheight17cm

\begin{document}
%%\landscape
%% --- It seems that the paper dimension is not kept in the dvi file.
%% --- Hence other tools (dvi viewer, dvips, etc.) must be informed about
%% --- the "unusal" paper dimensions otherwise they will use their default
%% --- one; e.g. use dvips -t landscape for setting up correct paper dimensions.
/*
----
This file is part of the PD system

Copyright (C) 1998 Ralf Hartmut Gueting,
          (C) 2006 Markus Spiekermann

Fachbereich Informatik, FernUniversitaet in Hagen

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
----

6.2 Main Program: Maketex.c

Use the parser to transform from implicitly formatted text to TeX. The auxiliary
functions CheckDebugEnv, InitHeader and PrintTail are implemented in file PDLib.c.

*/

#include <stdio.h>
#include <stdlib.h>
#include "PDLib.h"

extern int yyparse();

int main()
{

  CheckDebugEnv();
  InitHeader();

  int error=0;
  error = yyparse();

  PrintTail();
  return error;
}


/*

7 Two Auxiliary Programs

*/
/*
----
This file is part of the PD system
Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
----

7.1 Program pdtabs.c

This program converts tab symbols into corresponding sequences of blanks. With the standard text editor, each tab corresponds to 8 blanks.

*/
#include<stdio.h>


#define TABLENGTH 8
#ifndef EOF
#define EOF -1
#endif

int
main()
{
        int c, position, nblanks, i;

        position = 0;
        while ((c = getchar()) != EOF)
                if (c == '\n')
                        {position = 0; putchar(c);}
                else if (c == '\t') {
                        nblanks = TABLENGTH - (position % TABLENGTH);
                        for (i = 0; i < nblanks; i++)
                                {position++; putchar(' ');}
                        }
                else {position++; putchar(c);}
       return 0;
}
/*
----
This file is part of the PD system
Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
----

7.2 Program ~linebreaks.c~

This program reads a file from standard input and writes it to standard output. Whenever lines longer than LINELENGTH (which is 80 characters) occur, it puts a line break to the position of the last blank read before character 80 and continues in a new line. If there was no blank in such a line, it introduces a line break anyway (possibly in the middle of a word).

*/

#include <stdio.h>


#define LINELENGTH 80
#ifndef EOF
#define EOF -1
#endif

#define OR ||

int
main()
{
	int position, c, lastblank, i;
	int line[LINELENGTH];

	position = 0;
	lastblank = -1;

	while ((c = getchar()) != EOF) {
		line[position] = c;
		if (c == ' ') lastblank = position;
		position++;

		if ((c == '\n') OR (position == LINELENGTH)){
			if (c == '\n') {

				/* output a complete line */

				for (i = 0; i < position; i++)
					putchar(line[i]);
				position = 0;
				lastblank = -1;
                        } else if (lastblank > 0) { 	/* a blank exists */

				/* output line up to blank */

				for (i = 0; i < lastblank; i++)
					putchar(line[i]);
				putchar('\n');

				/* move rest of line to the front */

				for (i = lastblank + 1; i < position; i++)
					line[i - (lastblank + 1)] = line[i];
				position = position - (lastblank + 1);
				lastblank = -1;
			} else {			/* no blank exists */

				/* output line anyway */

       				for (i = 0; i < position; i++)
					putchar(line[i]);
				putchar('\n');
				position = 0;
				lastblank = -1;
			}
                 }
	}

        return 0;
}
/*

8 Command Procedures

8.1 Procedure ~pdview~

*/
# This file is part of the PD system
# Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#!/bin/sh
#
# pdview <file> - previews an PD file with LaTeX's DVI previewer
#
# May 2004, M. Spiekermann

pd2dvi $1;

pdpreview $1 dvi $PD_DVI_VIEWER &
/*

8.2 Procedure ~pdshow~

*/
# This file is part of the PD system
# Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

#
#!/bin/sh
#
# pdshow <file> - previews an PD file using a Postscript previewer
#
# May 2004, M. Spiekermann

if { ! pd2ps $1; }; then
   exit 1
fi

pdpreview $1 ps $PD_PS_VIEWER &
/*

8.3 Procedure ~pd2tex~

*/
# This file is part of the PD system
# Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#!/bin/sh
#
# pd2tex - Converts a PD file to a TEX file
#
# May 2004, M. Spiekermann

pdFile="$1"
texFile="$1.tex"
tmpTexFile=".tmp.$texFile"
errFile=".pd.err"

rm -f $errFile
pdtabs < $pdFile | maketex >> $tmpTexFile 2>$errFile
rc_maketex=$?

cat $PD_HEADER $tmpTexFile > $texFile

if [ $rc_maketex -ne 0 ]; then

 printf "\n PD-Error: Could not create $texFile completely!\n\n"
 echo "\end{document}" >> $texFile

else

  rm -f $errFile

fi

rm -f $tmpTexFile
/*

8.4 Procedure ~pd2dvi~

*/
# This file is part of the PD system
# Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#!/bin/sh
#
# pd2dvi <file> - Converts a PD file to a TEX file
#
# May 2004, M. Spiekermann

function showError()
{
  local errFile=".pd.err"
  if [ -e $errFile ]; then
    echo -e "Error in PD-File detected! \n"
    cat $errFile
  fi
}


pdFile="$1"
texFile="$1.tex"
dviFile="$1.dvi"
logFile="$1.log"
auxFile="$1.aux"

pd2tex $1;

showError

for dummy in 1 2 3
do
  if { ! latex --interaction="nonstopmode" $texFile; }; then

     printf "\n LaTeX-Error: Could not create $dviFile! "
     printf "\n              Study LaTeX's messages above and correct"
     printf "\n              the error in the PD-file $pdFile. \n\n"
     rm -f $texFile $auxFile $dviFile
     exit 2
fi
done

showError
/*

8.5 Procedure ~pd2ps~

*/
# This file is part of the PD system
# Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#!/bin/sh
#
# pd2ps <file> - Converts a PD file to a postscript file
#
# May 2004, M. Spiekermann

if { ! pd2dvi $1; }; then
   exit 1
fi

dvips -o "$1.ps" "$1.dvi"
/*

8.6 Procedure ~pd2pdf~

*/
# This file is part of the PD system
# Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#!/bin/sh
#
# pd2pdf <file> - Converts a PD file to a PDF file
#
# May 2004, M. Spiekermann

if { ! pd2dvi $1; }; then
   exit 1
fi

dvipdfm -p a4 -o "$1.pdf" "$1.dvi"
/*

9 The Makefile

*/
# This file is part of the PD system
# Copyright (C) 1998 Ralf Hartmut Gueting, Fachbereich Informatik, FernUniversitaet Hagen

# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# File: makefile
#
# 05/03 Markus Spiekermann. Makefile revised to make it Windows/Unix
#    compatible. Automatic enumeration of pd files is now done by a
#    lex generated scanner, this is a replacement for the perl solution.
#
# 10/03 Markus Spiekermann. Target install introduced.
#
# 05/04 Markus Spiekermann. Changes in documentation and script files.
# 12/06 Markus Spiekermann. More generic rules introduced.

# Uncomment next line to switch on debug mode
#OPTIONS = -g

# The first line is used when compiling with FLEX (2.5.4a) and BISON (1.35)
# The second line when compiling with LEX and YACC
LINKLIBS =  -lfl

# The first line is used when compiling with FLEX and BISON
# The second line when compiling with LEX and YACC
LEX = flex
#LEX = lex

# The first line is used when compiling with FLEX and BISON
# The second line when compiling with LEX and YACC
YACC = bison --yacc
#YACC = yacc

# specify your C-compiler
ifeq ($(platform),android)
CC := $(HOME)/toolchain-standalone/bin/arm-linux-androideabi-gcc
else
CC := gcc
endif

DOCU_FILES = PD1 PDNestedText.h PDNestedText.c \
	     PD3 PDLex.l PDTokens.h PDLexTest.c PDParserDS.c PDParser.y \
	     PD6 pd.header PDMaketex.c \
	     PD7 pdtabs.c linebreaks.c \
	     PD8 pdview \
	     PD8.2 pdshow \
	     PD8.3 pd2tex \
	     PD8.4 pd2dvi \
	     PD8.5 pd2ps \
             PD8.6 pd2pdf \
	     PD9 makefile \
	     PDRefs

DOCU_HTML_FILES = HTML1 PDParserHTML.y PDMakeHTML.c \
		  HTML4 pd2html \
		  HTML5 makefile \
		  PDRefsHTML

OBJECTS = PDNestedText.o \
	  PDLib.o

APPS = pdtabs \
	  linebreaks \
	  enumerate \
	  maketex \
          linecheck \
	  tabcheck \
	  filterpd \
	  makehtml \
	  makeascii

DOCU = docu docuhtml

SCRIPTS = pd2ascii pd2html pd2tex pd2dvi pd2ps pd2pdf pdshow pdview checkpd pdpreview

TEMPORARYS = PDLex.c enumerate.c linecheck.c tabcheck.c y.tab.c


.PHONY: all
all: $(OBJECTS) $(APPS) $(DOCU)
	@chmod ugo+x $(SCRIPTS)

maketex: PDMaketex.c PDParser.y PDLex.c PDParserDS.c $(OBJECTS)
	$(YACC) PDParser.y -o PDParser.tab.c
	$(CC) $(OPTIONS) -o $@ $< PDParser.tab.c $(OBJECTS) $(LINKLIBS)

makehtml: PDMakeHTML.c PDParserHTML.y PDLex.c PDParserDS.c $(OBJECTS)
	$(YACC) PDParserHTML.y -o PDParserHTML.tab.c
	$(CC) $(OPTIONS) -o $@ $< PDParserHTML.tab.c $(OBJECTS) $(LINKLIBS)

makeascii: PDMakeASCII.c PDParserASCII.y PDLex.c PDParserDS.c $(OBJECTS)
	$(YACC) PDParserASCII.y -o PDParserASCII.tab.c
	$(CC) $(OPTIONS) -o $@ $< PDParserASCII.tab.c $(OBJECTS) $(LINKLIBS)

pdtabs: pdtabs.c
	$(CC) -o $@ $<

linebreaks: linebreaks.c
	$(CC) -o $@ $<


# some dependencies
PDNestedText.o: PDNestedText.h

PDLib.o: PDLib.h


# some generic translation rules

%.c: %.l
	$(LEX) -o$@ $<

%.o: %.c
	$(CC) -c -g $<

%: %.c
	$(CC) -o $@ $< $(LINKLIBS)


docu: $(DOCU_FILES)
	cat $^ > $@

docuhtml: $(DOCU_HTML_FILES)
	cat $^ > $@

.PHONY: dist
dist: pd.tar.gz

pd.tar.gz:
	cvs export -r$(tag) pd
	tar -czf  pd.tgz pd/*
	rm -r pd

INST_DIR = $(prefix)/pd
EPS_FILES = $(shell find ./Figures -name "*.eps")

.PHONY: install
install: $(OBJECTS) $(SCRIPTS)
	install -d $(INST_DIR)
	install -m744 -d $(INST_DIR)/Figures
	install $(OBJECTS) $(INST_DIR)
	install -m444 $(DOCU) $(INST_DIR)
	install -m444 $(EPS_FILES) $(INST_DIR)/Figures
	install $(SCRIPTS) $(INST_DIR)
	install pd.header $(INST_DIR)

.PHONY: clean
clean:
	rm -f $(OBJECTS) $(APPS) $(DOCU) $(TEMPORARYS) *.exe *.tab.c

/*
//[ae]	[\"{a}]
//[oe]	[\"{o}]
//[ue]	[\"{u}]
//[ss]	[\ss]


\begin{thebibliography}{ABCD99}

\bibitem[ASU86]{ASU86}
Aho, A.V., R. Sethi, and J.D. Ullman, Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986.

\bibitem[G[ue]95]{Gue95} G[ue]ting, R.H., Integrating Programs and Documentation. FernUniversit[ae]t Hagen, Informatik-Report 182, May 1995.

\bibitem[La86]{La86} Lamport, L., \LaTeX : A Document Preparation System. User's Guide \& Reference Manual. Addison-Wesley, 1986.

\bibitem[SUN88]{SUN88} Sun Microsystems, Programming Utilities and Libraries. User Manual. Sun Microsystems, 1988.

\end{thebibliography}

*/