Constantin Knizhnik. Pascal to C++ compiler

Moscow Software Center
Pascal to C++ compiler
Constantin Knizhnik
Email: knizhnik@cecmow.enet.dec.com

This is yet another Pascal to C/C++ converter. Its primary goal is to produce readable and maintainable code that preserves the style of the original code as far as possible. The converter recognizes Pascal dialects compatible with the ISO Pascal standard, ISO/IEC 7185:1990(E), including conformant arrays. It is currently tuned for Oregon Pascal-2 V2.1, which has a few extensions to standard Pascal.

The converter can produce both C++ and C output. Using C++ makes it possible to encapsulate some Pascal types and constructs in C++ classes, so the mapping between Pascal and C++ is more direct than between Pascal and C. C++ templates are used to implement Pascal arrays and files. Special template classes are used for conformant arrays. C++-like streams are used to implement the Pascal I/O routines. There is a single runtime library for both C and C++ code.

A short description of the converter itself follows:

- The scanner (lex.l) is written using LEX. It produces a list of all tokens, including comments and whitespace.
- The parser (parser.y) is written using YACC. From the list of tokens created by the scanner it takes all tokens except separators and creates an object tree (the classes are described in trnod.h) whose nodes contain references to tokens. All names are inserted into a global name table (nmtbl.h).
- Attributes are assigned to tree nodes by executing the virtual method 'attrib' (trnod.cxx). The symbol table (bring.h) is created at this step. Classes for type expressions are implemented in tpexpr.cxx.
- The virtual method 'translate' is called recursively for all nodes in the tree (trnod.cxx). These methods convert the input tokens (modify values, swap tokens, add new tokens) and as a result prepare the output list of tokens.
- All tokens from the output list are printed to the target file, intelligently preserving the position and layout of tokens (token.cxx).
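To make the remark about templates and Pascal arrays more concrete, here is a minimal sketch of how a Pascal declaration such as "array [1..5] of integer" could be wrapped in a C++ template. The class name pascal_array, its members, and the helper sum are hypothetical and chosen only for illustration; they are not taken from the PTOC runtime library, whose actual classes may look quite different.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only: NOT the array class from the PTOC runtime.
// Models a Pascal "array [Low..High] of T" with the bounds kept in the type.
template <int Low, int High, typename T>
class pascal_array {
    static_assert(Low <= High, "index range must not be empty");

public:
    static constexpr int low()  { return Low; }
    static constexpr int high() { return High; }
    static constexpr std::size_t size() {
        return static_cast<std::size_t>(High - Low + 1);
    }

    // Index with the original Pascal bounds; the offset is removed here.
    T& operator[](int i) {
        assert(i >= Low && i <= High);
        return data_[i - Low];
    }
    const T& operator[](int i) const {
        assert(i >= Low && i <= High);
        return data_[i - Low];
    }

private:
    T data_[High - Low + 1];
};

// Pascal: for i := Low to High do s := s + a[i];
template <int Low, int High>
int sum(const pascal_array<Low, High, int>& a) {
    int s = 0;
    for (int i = Low; i <= High; ++i) {
        s += a[i];
    }
    return s;
}
```

Because such a class is copied as a whole when passed by value, a wrapper of this kind would keep Pascal value-parameter semantics without manual copying, which a plain C array cannot do.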

The converter can perform a global call graph analysis in order to recognize non-recursive functions and to make static those variables of such functions that are accessed by nested functions. If you specify the '-analyze' option, the converter appends information about callers and callees to the file "call.grp". After all files have been converted, the special utility 'cganal' can be used to produce the transitive closure of the call graph and to write the list of recursive procedures to the file "recur.prc". When you run the converter once again (with the '-analyze' option), the information from this file is used to mark recursive procedures. This approach greatly increases the readability of the program, as no extra arguments need to be passed to nested functions.

Resolution of name conflicts is controlled by the file "ptoc.cfg", which is read by the converter at startup. This file specifies reserved symbols (C and C++ keywords), names of functions from the C standard library, names of macros defined by the converter, and the mapping of names for some functions from the Pascal runtime. A description of the Pascal runtime library emulation can be found in the file "paslib.doc".

When the converter produces C code, it does not copy arrays that are passed by value. Instead, the converter declares such arrays as 'const', so any attempt to modify the contents of such an array causes a C compiler warning or error. It seems to me that there are usually few places in a program where a procedure modifies an array passed by value. As a rule, the absence of the VAR qualifier means that the procedure only reads the contents of the array and does not modify them. So we decided that efficient code for this most common case is more important than the small amount of manual work needed to correct the places where an array has to be copied. You only need to rename the formal parameter, create a local variable with the original name and copy the value to it:

    foo(str20 const name) { ... }
  =>
    foo(str20 name_) {
        str20 name;
        memcpy(name, name_, sizeof(name));
        ...
    }

(There is no such problem with C++.)

Some C++ compilers do not allow classes with any assignment operators to be members of unions (for a correct implementation it is only necessary that such classes do not redefine the DEFAULT assignment operator). Moreover, some compilers (DEC C++, for example) do not generate a default assignment method for template classes. Since arrays can be members of variant components in Pascal, the converter can generate code that does not use the assignment operator for arrays. If you specify the '-assign' option, the converter will use the 'assign' method of the array instead of the '=' operator. To compile the code produced this way, pass the -DNO_ARRAY_ASSIGN_OPERATOR option to the C++ compiler.

When you are porting an application from a 16-bit platform, you may want to preserve the integer size (2 bytes). In this case you can face two problems. The first is that pointers will no longer fit into such integers. The converter cannot help you here; you have to change the types of some variables and record fields. The second problem is less obvious. In C, short and char operands are converted to int before an operation takes place. So if you compare for equality variables of signed and unsigned types declared in Pascal as

    word  : -32768..32767
    uword : 0..65535

and holding the same 16-bit pattern (for example the one that represents 40000 in uword), the result will be false, unlike in the original application! This is because the variable of signed type is converted to int with sign extension, and the variable of unsigned type without sign extension. To help deal with this problem the converter provides the option "-unsigned", which forces the converter to insert the necessary type conversions in such operations.

Sometimes it is important to preserve the original size of a data structure, for example if the structure is mapped onto another structure by means of a union (a record with variants in Pascal) or is read from a file. There are two options in the converter that can help you in this case.
The first option is "-intset", which tells the converter to generate short sets (2 or 4 bytes) for sets of enumeration types. Operations on short sets are implemented by macros using bit arithmetic, so they are significantly faster than operations on universal sets. The disadvantage of short sets is that adding elements to the enumeration may cause problems in the future.

The other option is "-smallenum". The problem is that the "enum" type in C is treated by many compilers as an integer, and there is no way to make the compiler use fewer bytes for its representation. When you specify the "-smallenum" option, the converter replaces the original enumeration type definition with an "unsigned char" or "unsigned short" definition, according to the number of elements in the enumeration. So the construction

    colors = (red, green, blue);

will be translated to

    typedef unsigned char colors;
    enum {red, green, blue};

As mentioned above, the converter tries to preserve the original indentation of the converted sources. If the Pascal sources are not properly aligned, you can reformat the produced C code with an indentation utility (for example GNU indent, which is freely distributed).

This converter was used in a project porting a manufacturing management system from Pascal-2/RSX to C/OpenVMS. More than 100,000 lines of Pascal were converted to C with minimal manual changes. The directory "vms" contains VMS-specific versions of the Pascal runtime library emulation module "io.c".

PTOC is distributed in the hope that it will be useful. You are free to use this converter, modify the sources and do anything else you want with it. Also feel free to ask any questions about the converter.
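
As an illustration of the "-intset" and "-smallenum" options described above, the following sketch shows how a short set over the colors enumeration could be implemented with bit arithmetic. The type name color_set and the macro names SET_ELEM, SET_IN, SET_UNION, SET_INTERSECT and SET_DIFF are assumptions made for this example; the macros actually defined by the PTOC runtime are not shown in this document and may differ.

```cpp
#include <cassert>

// Output described above for "-smallenum": a one-byte type plus plain constants.
typedef unsigned char colors;
enum { red, green, blue };

// Hypothetical short set: one bit per enumeration element.
// unsigned short is assumed to be 2 bytes, which holds up to 16 elements.
typedef unsigned short color_set;

// Hypothetical macro names; not the ones from the PTOC runtime library.
#define SET_ELEM(e)          ((color_set)(1u << (e)))        /* [e]      */
#define SET_IN(e, s)         (((s) & SET_ELEM(e)) != 0)      /* e in s   */
#define SET_UNION(a, b)      ((color_set)((a) | (b)))        /* a + b    */
#define SET_INTERSECT(a, b)  ((color_set)((a) & (b)))        /* a * b    */
#define SET_DIFF(a, b)       ((color_set)((a) & ~(b)))       /* a - b    */

// Pascal: s := [red, blue]; result := (red in s) and not (green in s);
inline bool short_set_self_check() {
    color_set s = SET_UNION(SET_ELEM(red), SET_ELEM(blue));
    return SET_IN(red, s) && !SET_IN(green, s) && SET_IN(blue, s);
}
```

Each operation compiles to one or two machine instructions on an integer, which is why short sets are faster than a general set representation. The price is the one named above: once the enumeration grows beyond the number of bits in the chosen integer type, the set type has to change size.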

Last-modified: Fri, 27 Dec 1996 14:39:38 GMT