Fix for detecting left recursion in rules like: #2

Open
wants to merge 95 commits into
base: master
d4094c8
Fix for detecting left recursion in rules like:
mingodad May 28, 2021
d890d78
Fix segfault due to bad format string parameters
mingodad Jun 1, 2021
93fc7f9
Replace instantiations using 'new' with RAII stack instances
mingodad Jun 3, 2021
375e702
Avoid unnecessary string copy/leak in 'Comment' creation
mingodad Jun 3, 2021
8bcf918
Replace instantiation with 'new' by RAII, also remove unnecessary stri…
mingodad Jun 3, 2021
35a18bf
Replace instantiation with 'new' by RAII
mingodad Jun 3, 2021
0712997
Replace multiple copies of bitwise expression by a macro
mingodad Jun 3, 2021
94860db
Remove unnecessary string copy/delete from Scanner
mingodad Jun 3, 2021
0df8d32
Hash table now makes a copy of the key to avoid dangling pointers, al…
mingodad Jun 3, 2021
8009b1c
add missing cleanup
mingodad Jun 3, 2021
2fe0330
Replace instantiation with 'new' by RAII
mingodad Jun 3, 2021
ed2ecca
Add destructor for cleanup
mingodad Jun 3, 2021
13107cc
Added cleanup
mingodad Jun 3, 2021
a2cfa19
Added cleanup
mingodad Jun 3, 2021
7ee5eda
Add cleanup
mingodad Jun 3, 2021
d608f8d
Fix several memory leaks
mingodad Jun 3, 2021
5e9f36a
Fix several memory leaks
mingodad Jun 3, 2021
fa03b86
Fix memory leaks, and change function 'DetachAction' to return an ind…
mingodad Jun 3, 2021
8f68e61
Fix memory leaks
mingodad Jun 3, 2021
2b93528
Fix several memory leaks
mingodad Jun 3, 2021
839e7de
Fix memory leaks
mingodad Jun 3, 2021
170138f
Fix several memory leaks
mingodad Jun 3, 2021
dd477f2
Cleanup and fix several memory leaks
mingodad Jun 3, 2021
eac5f1e
Convert ArrayList to a templated one for future simplifications
mingodad Jun 3, 2021
4f22dea
Add a basic AST generator based on https://github.com/rochus-keller/E…
mingodad Jun 3, 2021
8fe04c0
Convert ArrayList to TArrayList<T>
mingodad Jun 3, 2021
1750865
Add 'const' qualifier in several places
mingodad Jun 4, 2021
076d923
Replace recursive calls to 'Scanner::NextToken()' with iteration
mingodad Jun 4, 2021
2c69a1d
Allow up to 8 characters for multiline comment delimiters
mingodad Jun 4, 2021
e0a955f
Add a limited semantic action to TokenDecl to allow for example parsi…
mingodad Jun 4, 2021
42922cc
Add column info to Node and Symbol to create better diagnostics, also…
mingodad Jun 4, 2021
6b258ca
Small code reformat
mingodad Jun 4, 2021
f2e7af5
Replace constants for node kinds by enum
mingodad Jun 4, 2021
3b4c868
Initial implementation of a kind of TreeView for LL1 errors
mingodad Jun 4, 2021
a658cca
Add the token names between comments in several places to make easier…
mingodad Jun 4, 2021
d13715b
Start the refactoring to allow compile with and without wchar_t
mingodad Jun 4, 2021
28d9809
Remove several unneeded calls to 'printf' family functions
mingodad Jun 5, 2021
d795f69
Move 'ArrayList' to Scanner.frame to use in the AST (parser tree) gen…
mingodad Jun 5, 2021
88dc67f
Replace STRL and CHL by the unified _SC macro
mingodad Jun 5, 2021
079cfa1
Close to achieving a build with and without wchar
mingodad Jun 5, 2021
5184aad
Fix the scanner generation to work without wchar_t
mingodad Jun 5, 2021
f1d4df6
Fix other places that can cause trouble when compiling without wchar_t
mingodad Jun 5, 2021
a262593
Fix AST generation to work with and without wchar_t
mingodad Jun 5, 2021
1fa17eb
Add the Taste example with memory leaks fixed
mingodad Jun 5, 2021
268e32c
Minor code layout fix
mingodad Jun 5, 2021
4ea34e7
Another memory leak fixed
mingodad Jun 5, 2021
31c62f3
Replace some magic numbers
mingodad Jun 6, 2021
b9359ff
Remove unnecessary function and its usages
mingodad Jun 6, 2021
ae044ac
Remove unnecessary string allocation/deallocation
mingodad Jun 6, 2021
ef40822
Remove unnecessary string allocation/deallocation
mingodad Jun 6, 2021
9dc7b76
Remove unnecessary string allocation/deallocation
mingodad Jun 6, 2021
0c739db
Remove unnecessary string allocation/deallocation
mingodad Jun 6, 2021
17e2ab3
Refactor code removing unnecessary layer that could leak memory
mingodad Jun 6, 2021
a05edfe
Fix memory leak
mingodad Jun 6, 2021
92f46df
Fix memory leak
mingodad Jun 6, 2021
87aed46
Add filename to error messages based on https://github.com/cviehb/Coc…
mingodad Jun 6, 2021
987595c
Put braces around token declaration semantic actions
mingodad Jun 6, 2021
ea0ff02
Add stub code to allow build CocoR parsers without dependency on libs…
mingodad Jun 6, 2021
110c390
Fix to cross compile on linux with mingw64 compiler
mingodad Jun 6, 2021
9e5a932
Start playing with compiling CocoR-CPP to wasm
mingodad Jun 6, 2021
72d6035
Implement the generation of an EBNF grammar understood by https://www…
mingodad Jun 8, 2021
b8c95fe
Small code change without functionality change
mingodad Jun 8, 2021
672e3c2
Add missing Taste.cpp and fixes for latest changes
mingodad Jun 9, 2021
f6cb7b2
Remove unnecessary 'while' loop because it's using 'goto' to loop ins…
mingodad Jun 9, 2021
25ec536
Add 'ANY' when generating RREBNF
mingodad Jun 9, 2021
60beabb
Reorganize the code removing duplication
mingodad Jun 10, 2021
f16cbd1
Remove unused include
mingodad Jun 10, 2021
d028315
Finally the last known memory leak is fixed
mingodad Jun 10, 2021
8e9f19f
Fix narrow signed char conversion when 'wchar_t' == 'char'
mingodad Jun 10, 2021
010462e
Add the TestSuite
mingodad Jun 10, 2021
5f1d5d3
Add an overview of my main changes
mingodad Jun 10, 2021
07c3244
Fix for possible narrow conversion when wchar_t == char
mingodad Jun 10, 2021
c9e56bf
Fix my mistake of forgetting to wrap a literal string used as wchar_t *
mingodad Jun 10, 2021
0c151cb
Add reference to the Java and CSharp versions
mingodad Jun 10, 2021
e6a2b21
Fix typo
mingodad Jun 10, 2021
5a04a9c
My last fix for left recursion detection didn't work for any depth,…
mingodad Jun 11, 2021
2f2beee
Fix SynTree.dump2 that is supposed to show a pruned tree
mingodad Jun 12, 2021
3ecb057
Rename SynTree::dump to SynTree::dump_all and SynTree::dump to SynTre…
mingodad Jun 14, 2021
b80f2e0
Fix to make it behave the same as the Java/CSharp version
mingodad Jul 1, 2021
ec65db3
Fix for endless loop with some ill grammars
mingodad Jul 1, 2021
530714c
Fix for when 'wchar_t' is 'char'
mingodad Jul 6, 2021
0efd1ec
Remove unused variable
mingodad Jul 6, 2021
01b226c
Add examples folder and an initial bison grammar
mingodad Jul 9, 2021
223c079
Add the suffix "_NT" to non terminal generated functions to minimize …
mingodad Aug 14, 2021
8fd041a
Add token inheritance from https://github.com/Lercher/CocoR
mingodad Aug 14, 2021
1e5715c
Add the extra features description from last commits
mingodad Aug 14, 2021
9182e47
Add column info to non terminals
mingodad Sep 4, 2021
f3b3e15
Add missing code for proper handling token inheritance
mingodad Sep 4, 2021
f3f29f5
Fix railroad EBNF generation and other fixes
mingodad Dec 25, 2021
b8b387e
Fix genRREBNF when outputting ANY
mingodad Dec 27, 2021
8bc8539
Fix trace output
mingodad Jul 11, 2022
b36a884
Change Node/Symbol type/kind to an independent header
mingodad Jul 14, 2022
84e4333
Fix memory leak
mingodad Jul 14, 2022
cde7538
Fix my mistake of calling "First" before testing, introduced here d60…
mingodad Jul 14, 2022
cd146b6
Fixes to build with https://github.com/jart/cosmopolitan
mingodad Sep 27, 2022
Add a basic AST generator based on https://github.com/rochus-keller/E…
mingodad committed Jun 3, 2021
commit 4f22deafabf11a8d25184a273af470f643552426
8 changes: 8 additions & 0 deletions src/Coco.atg
Original file line number Diff line number Diff line change
@@ -35,6 +35,7 @@ $namespace=Coco
#include "Tab.h"
#include "DFA.h"
#include "ParserGen.h"
#define COCO_FRAME_PARSER

COMPILER Coco

@@ -124,6 +125,13 @@ Coco (. Symbol *sym; Graph *g, *g1, *g2; wchar_t* gra
.)
{ ANY } (. tab->semDeclPos = new Position(beg, la->pos, 0, line); .)
[ "IGNORECASE" (. dfa->ignoreCase = true; .) ] /* pdt */
[ "TERMINALS" { ident (. sym = tab->FindSym(t->val);
if (sym != NULL) SemErr(L"name declared twice");
else {
sym = tab->NewSym(Node::t, t->val, t->line);
sym->tokenKind = Symbol::fixedToken;
}.)
} ] /*from cocoxml*/
[ "CHARACTERS" { SetDecl }]
[ "TOKENS" { TokenDecl<Node::t> }]
[ "PRAGMAS" { TokenDecl<Node::pr> }]
700 changes: 558 additions & 142 deletions src/Parser.cpp

Large diffs are not rendered by default.

126 changes: 114 additions & 12 deletions src/Parser.frame
@@ -5,24 +5,24 @@ extended by M. Loeberbauer & A. Woess, Univ. of Linz
ported to C++ by Csaba Balazs, University of Szeged
with improvements by Pat Terry, Rhodes University

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

As an exception, it is allowed to write an extension of Coco/R that is
used as a plugin in non-free software.

If not otherwise stated, any source code generated by Coco/R (other than
If not otherwise stated, any source code generated by Coco/R (other than
Coco/R itself) does not fall under the GNU General Public License.
-------------------------------------------------------------------------*/

@@ -41,6 +41,21 @@ Parser.h Specification

-->namespace_open

#ifdef PARSER_WITH_AST

struct SynTree {
SynTree(Token *t ): tok(t){}
~SynTree();

Token *tok;
ArrayList children;

void dump(int indent=0, bool isLast=false);
void dump2(int maxT, int indent=0, bool isLast=false);
};

#endif

class Errors {
public:
int count; // number of errors detected
@@ -61,6 +76,11 @@ private:
int errDist;
int minErrDist;

#ifdef PARSER_WITH_AST
void AstAddTerminal();
bool AstAddNonTerminal(eNonTerminals kind, const char *nt_name, int line);
void AstPopNonTerminal();
#endif
void SynErr(int n);
void Get();
void Expect(int n);
@@ -105,6 +125,30 @@ Parser.cpp Specification

-->namespace_open

#ifdef PARSER_WITH_AST

void Parser::AstAddTerminal() {
SynTree *st_t = new SynTree( t->Clone() );
((SynTree*)ast_stack.Top())->children.Add(st_t);
}

bool Parser::AstAddNonTerminal(eNonTerminals kind, const char *nt_name, int line) {
Token *ntTok = new Token();
ntTok->kind = kind;
ntTok->line = line;
ntTok->val = coco_string_create(nt_name);
SynTree *st = new SynTree( ntTok );
((SynTree*)ast_stack.Top())->children.Add(st);
ast_stack.Add(st);
return true;
}

void Parser::AstPopNonTerminal() {
ast_stack.Pop();
}

#endif

void Parser::SynErr(int n) {
if (errDist >= minErrDist) errors.SynErr(la->line, la->col, n);
errDist = 0;
@@ -176,7 +220,7 @@ struct ParserInitExistsRecognizer {
struct InitIsMissingType {
char dummy1;
};

struct InitExistsType {
char dummy1; char dummy2;
};
@@ -200,7 +244,7 @@ struct ParserDestroyExistsRecognizer {
struct DestroyIsMissingType {
char dummy1;
};

struct DestroyExistsType {
char dummy1; char dummy2;
};
@@ -280,8 +324,14 @@ bool Parser::StartOf(int s) {
Parser::~Parser() {
ParserDestroyCaller<Parser>::CallDestroy(this);
delete dummyToken;
#ifdef PARSER_WITH_AST
delete ast_root;
#endif

#ifdef COCO_FRAME_PARSER
coco_string_delete(noString);
coco_string_delete(tokenString);
#endif
}

Errors::Errors() {
@@ -319,8 +369,60 @@ void Errors::Warning(const wchar_t *s) {
}

void Errors::Exception(const wchar_t* s) {
wprintf(L"%ls", s);
wprintf(L"%ls", s);
exit(1);
}

#ifdef PARSER_WITH_AST

static void printIndent(int n) {
for(int i=0; i < n; ++i) wprintf(L" ");
}

SynTree::~SynTree() {
//wprintf(L"Token %ls : %d : %d : %d : %d\n", tok->val, tok->kind, tok->line, tok->col, children.Count);
delete tok;
for(int i=0; i<children.Count; ++i) delete ((SynTree*)children[i]);
}

void SynTree::dump(int indent, bool isLast) {
int last_idx = children.Count;
if(tok->col) {
printIndent(indent);
wprintf(L"%s\t%d\t%d\t%d\t%ls\n", ((isLast || (last_idx == 0)) ? "= " : " "), tok->line, tok->col, tok->kind, tok->val);
}
else {
printIndent(indent);
wprintf(L"%d\t%d\t%d\t%ls\n", children.Count, tok->line, tok->kind, tok->val);
}
if(last_idx) {
for(int idx=0; idx < last_idx; ++idx) ((SynTree*)children[idx])->dump(indent+4, idx == last_idx-1); // the last child has index last_idx-1
}
}

void SynTree::dump2(int maxT, int indent, bool isLast) {
int last_idx = children.Count;
if(tok->col) {
printIndent(indent);
wprintf(L"%s\t%d\t%d\t%d\t%ls\n", ((isLast || (last_idx == 0)) ? "= " : " "), tok->line, tok->col, tok->kind, tok->val);
}
else {
if(last_idx == 1) {
if(((SynTree*)children[0])->tok->kind < maxT) {
printIndent(indent);
wprintf(L"%d\t%d\t%d\t%ls\n", children.Count, tok->line, tok->kind, tok->val);
}
}
else {
printIndent(indent);
wprintf(L"%d\t%d\t%d\t%ls\n", children.Count, tok->line, tok->kind, tok->val);
}
}
if(last_idx) {
for(int idx=0; idx < last_idx; ++idx) ((SynTree*)children[idx])->dump2(maxT, indent+4, idx == last_idx-1); // the last child has index last_idx-1
}
}

#endif

-->namespace_close
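Reviewer sketch: the three hooks added to the frame (AstAddTerminal, AstAddNonTerminal, AstPopNonTerminal) amount to building a tree with an explicit stack of open non-terminals. The snippet below is a minimal, self-contained illustration of that idea only. TreeNode, TreeBuilder and countTerminals are hypothetical names that do not appear in this commit, and std::unique_ptr replaces the manual delete loop in SynTree::~SynTree as an assumption about how ownership could be expressed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for SynTree: a node owns its children.
struct TreeNode {
    std::string name;
    bool isTerminal;
    std::vector<std::unique_ptr<TreeNode>> children;
    TreeNode(std::string n, bool term) : name(std::move(n)), isTerminal(term) {}
};

// Hypothetical builder: the top of stack_ is the non-terminal being parsed.
class TreeBuilder {
public:
    explicit TreeBuilder(const std::string &startSymbol)
        : root_(new TreeNode(startSymbol, false)) {
        stack_.push_back(root_.get());
    }
    // Counterpart of AstAddTerminal: attach a leaf to the open non-terminal.
    void addTerminal(const std::string &text) {
        assert(!stack_.empty());
        stack_.back()->children.push_back(
            std::unique_ptr<TreeNode>(new TreeNode(text, true)));
    }
    // Counterpart of AstAddNonTerminal: attach a node and make it the open one.
    void openNonTerminal(const std::string &name) {
        assert(!stack_.empty());
        std::unique_ptr<TreeNode> node(new TreeNode(name, false));
        TreeNode *raw = node.get();
        stack_.back()->children.push_back(std::move(node));
        stack_.push_back(raw);
    }
    // Counterpart of AstPopNonTerminal; this sketch never pops the root.
    void closeNonTerminal() {
        if (stack_.size() > 1) stack_.pop_back();
    }
    const TreeNode &root() const { return *root_; }
    std::size_t depth() const { return stack_.size(); }
private:
    std::unique_ptr<TreeNode> root_;
    std::vector<TreeNode*> stack_;  // non-owning pointers into the tree
};

// Counts the leaves below a node, to check the shape of the tree.
std::size_t countTerminals(const TreeNode &node) {
    std::size_t n = node.isTerminal ? 1 : 0;
    for (const auto &child : node.children) n += countTerminals(*child);
    return n;
}
```

A generated production would call openNonTerminal on entry, addTerminal after each Get() or Expect(), and closeNonTerminal on exit, which is the call pattern ParserGen emits in this commit.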
51 changes: 49 additions & 2 deletions src/Parser.h
@@ -33,13 +33,29 @@ Coco/R itself) does not fall under the GNU General Public License.
#include "Tab.h"
#include "DFA.h"
#include "ParserGen.h"
#define COCO_FRAME_PARSER


#include "Scanner.h"

namespace Coco {


#ifdef PARSER_WITH_AST

struct SynTree {
SynTree(Token *t ): tok(t){}
~SynTree();

Token *tok;
ArrayList children;

void dump(int indent=0, bool isLast=false);
void dump2(int maxT, int indent=0, bool isLast=false);
};

#endif

class Errors {
public:
int count; // number of errors detected
@@ -62,15 +78,42 @@ class Parser {
_string=3,
_badString=4,
_char=5,
_ddtSym=42,
_optionSym=43
_ddtSym=43,
_optionSym=44
};
#ifdef PARSER_WITH_AST
enum eNonTerminals{
_Coco=0,
_SetDecl=1,
_TokenDecl=2,
_TokenExpr=3,
_Set=4,
_AttrDecl=5,
_SemText=6,
_Expression=7,
_SimSet=8,
_Char=9,
_Sym=10,
_Term=11,
_Resolver=12,
_Factor=13,
_Attribs=14,
_Condition=15,
_TokenTerm=16,
_TokenFactor=17
};
#endif
int maxT;

Token *dummyToken;
int errDist;
int minErrDist;

#ifdef PARSER_WITH_AST
void AstAddTerminal();
bool AstAddNonTerminal(eNonTerminals kind, const char *nt_name, int line);
void AstPopNonTerminal();
#endif
void SynErr(int n);
void Get();
void Expect(int n);
@@ -85,6 +128,10 @@ class Parser {
Token *t; // last recognized token
Token *la; // lookahead token

#ifdef PARSER_WITH_AST
SynTree *ast_root;
ArrayList ast_stack;
#endif
int id;
int str;

39 changes: 36 additions & 3 deletions src/ParserGen.cpp
@@ -196,11 +196,17 @@ void ParserGen::GenCode (Node *p, int indent, BitArray *isChecked) {
} else if (p->typ == Node::t) {
Indent(indent);
// assert: if isChecked[p->sym->n] is true, then isChecked contains only p->sym->n
if ((*isChecked)[p->sym->n]) fwprintf(gen, L"Get();\n");
if ((*isChecked)[p->sym->n]) {
fwprintf(gen, L"Get();\n");
//copied and pasted below
fwprintf(gen, L"#ifdef PARSER_WITH_AST\n\tAstAddTerminal();\n#endif\n");
}
else {
fwprintf(gen, L"Expect(");
WriteSymbolOrCode(gen, p->sym);
fwprintf(gen, L");\n");
//copied and pasted from above
fwprintf(gen, L"#ifdef PARSER_WITH_AST\n\tAstAddTerminal();\n#endif\n");
}
} if (p->typ == Node::wt) {
Indent(indent);
@@ -337,6 +343,19 @@ void ParserGen::GenTokensHeader() {
}

fwprintf(gen, L"\n\t};\n");

// nonterminals
fwprintf(gen, L"#ifdef PARSER_WITH_AST\n\tenum eNonTerminals{\n");
isFirst = true;
for (i=0; i<tab->nonterminals.Count; i++) {
sym = (Symbol*)tab->nonterminals[i];
if (isFirst) { isFirst = false; }
else { fwprintf(gen , L",\n"); }

fwprintf(gen , L"\t\t_%ls=%d", sym->name, sym->n);
}
fwprintf(gen, L"\n\t};\n#endif\n");

}
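Reviewer sketch: the loop added to GenTokensHeader writes one enumerator per non-terminal and puts the separator before every entry except the first. The snippet below shows the same separator handling in isolation, returning a string instead of writing to a file. genNonTerminalEnum is a hypothetical helper, not part of this commit, and it assumes each symbol's number (sym->n in the real code) equals its position in the list.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: renders the guarded eNonTerminals enum as text.
// Assumption: the value of each enumerator is its index in `names`.
std::string genNonTerminalEnum(const std::vector<std::string> &names) {
    std::ostringstream out;
    out << "#ifdef PARSER_WITH_AST\n\tenum eNonTerminals{\n";
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i > 0) out << ",\n";  // separator goes before all but the first entry
        out << "\t\t_" << names[i] << "=" << i;
    }
    out << "\n\t};\n#endif\n";
    return out.str();
}
```

Testing the index instead of keeping an isFirst flag is only a stylistic choice; the emitted text should be the same as what the committed loop produces under the stated assumption.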

void ParserGen::GenCodePragmas() {
@@ -380,9 +399,19 @@ void ParserGen::GenProductions() {
CopySourcePart(sym->attrPos, 0);
fwprintf(gen, L") {\n");
CopySourcePart(sym->semPos, 2);
fwprintf(gen, L"#ifdef PARSER_WITH_AST\n");
if(i == 0) fwprintf(gen, L"\t\tToken *ntTok = new Token(); ntTok->kind = eNonTerminals::_%ls; ntTok->line = 0; ntTok->val = coco_string_create(\"%ls\");ast_root = new SynTree( ntTok ); ast_stack.Clear(); ast_stack.Add(ast_root);\n", sym->name, sym->name);
else {
fwprintf(gen, L"\t\tbool ntAdded = AstAddNonTerminal(eNonTerminals::_%ls, \"%ls\", la->line);\n", sym->name, sym->name);
}
fwprintf(gen, L"#endif\n");
ba.SetAll(false);
GenCode(sym->graph, 2, &ba);
fwprintf(gen, L"}\n"); fwprintf(gen, L"\n");
fwprintf(gen, L"#ifdef PARSER_WITH_AST\n");
if(i == 0) fwprintf(gen, L"\t\tAstPopNonTerminal();\n");
else fwprintf(gen, L"\t\tif(ntAdded) AstPopNonTerminal();\n");
fwprintf(gen, L"#endif\n");
fwprintf(gen, L"}\n\n");
}
}

@@ -405,6 +434,10 @@ void ParserGen::InitSets() {
fwprintf(gen, L"\t};\n\n");
}

void ParserGen::CheckAstGen() {
fwprintf(gen, L"#ifdef PARSER_WITH_AST\n\tSynTree *ast_root;\n\tArrayList ast_stack;\n#endif\n");
}

void ParserGen::WriteParser () {
Generator g(tab, errors);
int oldPos = buffer->GetPos(); // Pos is modified by CopySourcePart
@@ -437,7 +470,7 @@ void ParserGen::WriteParser () {
g.CopyFramePart(L"-->constantsheader");
GenTokensHeader(); /* ML 2002/09/07 write the token kinds */
fwprintf(gen, L"\tint maxT;\n");
g.CopyFramePart(L"-->declarations"); CopySourcePart(tab->semDeclPos, 0);
g.CopyFramePart(L"-->declarations"); CheckAstGen(); CopySourcePart(tab->semDeclPos, 0);
g.CopyFramePart(L"-->productionsheader"); GenProductionsHeader();
g.CopyFramePart(L"-->namespace_close");
GenNamespaceClose(nrOfNs);
1 change: 1 addition & 0 deletions src/ParserGen.h
@@ -90,6 +90,7 @@ class ParserGen
void WriteParser();
void WriteStatistics();
void WriteSymbolOrCode(FILE *gen, const Symbol *sym);
void CheckAstGen();
ParserGen (Parser *parser);
~ParserGen();

108 changes: 60 additions & 48 deletions src/Scanner.cpp
Original file line number Diff line number Diff line change
@@ -244,6 +244,17 @@ Token::Token() {
next = NULL;
}

Token *Token::Clone() {
Token *tk = new Token();
tk->kind = kind;
tk->pos = pos;
tk->col = col;
tk->line = line;
tk->val = coco_string_create(val);
tk->next = next;
return tk;
}

Token::~Token() {
coco_string_delete(val);
}
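Reviewer sketch: Token::Clone duplicates val because Token::~Token frees it, so a clone that shared the pointer would lead to a double free. The snippet below illustrates that ownership rule in isolation. Tok and dupString are hypothetical stand-ins for Token and coco_string_create, and the use of new[]/delete[] is an assumption made for the sketch. As in the commit, the next link is copied as-is and is not owned by the clone.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical stand-in for coco_string_create: heap copy of a wide string.
wchar_t *dupString(const wchar_t *s) {
    if (s == nullptr) return nullptr;
    std::size_t n = std::wcslen(s);
    wchar_t *copy = new wchar_t[n + 1];
    std::wmemcpy(copy, s, n + 1);  // n + 1 also copies the terminator
    return copy;
}

// Hypothetical stand-in for Token: owns val, does not own next.
struct Tok {
    int kind = 0;
    int line = 0;
    int col = 0;
    wchar_t *val = nullptr;
    Tok *next = nullptr;

    Tok() = default;
    Tok(const Tok &) = delete;             // copying must go through clone()
    Tok &operator=(const Tok &) = delete;
    ~Tok() { delete[] val; }

    // Deep-copies the owned string, shallow-copies the non-owned link.
    Tok *clone() const {
        Tok *copy = new Tok();
        copy->kind = kind;
        copy->line = line;
        copy->col = col;
        copy->val = dupString(val);
        copy->next = next;
        return copy;
    }
};
```

Because each Tok frees only its own buffer, the original and the clone can be destroyed in either order.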
@@ -264,7 +275,7 @@ Buffer::Buffer(FILE* s, bool isUserStream) {
fileLen = bufLen = bufStart = 0;
}
bufCapacity = (bufLen>0) ? bufLen : COCO_MIN_BUFFER_LENGTH;
buf = new unsigned char[bufCapacity];
buf = new unsigned char[bufCapacity];
if (fileLen > 0) SetPos(0); // setup buffer to position 0 (start)
else bufPos = 0; // index 0 is already after the file, thus Pos = 0 is invalid
if (bufLen == fileLen && CanSeek()) Close();
@@ -294,7 +305,7 @@ Buffer::Buffer(const unsigned char* buf, int len) {
}

Buffer::~Buffer() {
Close();
Close();
if (buf != NULL) {
delete [] buf;
buf = NULL;
@@ -469,8 +480,8 @@ Scanner::~Scanner() {
void Scanner::Init() {
EOL = '\n';
eofSym = 0;
maxT = 41;
noSym = 41;
maxT = 42;
noSym = 42;
int i;
for (i = 65; i <= 90; ++i) start.set(i, 1);
for (i = 95; i <= 95; ++i) start.set(i, 1);
@@ -495,21 +506,22 @@ void Scanner::Init() {
start.set(Buffer::EoF, -1);
keywords.set(L"COMPILER", 6);
keywords.set(L"IGNORECASE", 7);
keywords.set(L"CHARACTERS", 8);
keywords.set(L"TOKENS", 9);
keywords.set(L"PRAGMAS", 10);
keywords.set(L"COMMENTS", 11);
keywords.set(L"FROM", 12);
keywords.set(L"TO", 13);
keywords.set(L"NESTED", 14);
keywords.set(L"IGNORE", 15);
keywords.set(L"PRODUCTIONS", 16);
keywords.set(L"END", 19);
keywords.set(L"ANY", 23);
keywords.set(L"WEAK", 29);
keywords.set(L"SYNC", 36);
keywords.set(L"IF", 37);
keywords.set(L"CONTEXT", 38);
keywords.set(L"TERMINALS", 8);
keywords.set(L"CHARACTERS", 9);
keywords.set(L"TOKENS", 10);
keywords.set(L"PRAGMAS", 11);
keywords.set(L"COMMENTS", 12);
keywords.set(L"FROM", 13);
keywords.set(L"TO", 14);
keywords.set(L"NESTED", 15);
keywords.set(L"IGNORE", 16);
keywords.set(L"PRODUCTIONS", 17);
keywords.set(L"END", 20);
keywords.set(L"ANY", 24);
keywords.set(L"WEAK", 30);
keywords.set(L"SYNC", 37);
keywords.set(L"IF", 38);
keywords.set(L"CONTEXT", 39);


tvalLength = 128;
@@ -729,14 +741,14 @@ Token* Scanner::NextToken() {
{t->kind = 5; break;}
case 10:
case_10:
recEnd = pos; recKind = 42;
recEnd = pos; recKind = 43;
if ((ch >= L'0' && ch <= L'9') || (ch >= L'A' && ch <= L'Z') || ch == L'_' || (ch >= L'a' && ch <= L'z')) {AddCh(); goto case_10;}
else {t->kind = 42; break;}
else {t->kind = 43; break;}
case 11:
case_11:
recEnd = pos; recKind = 43;
recEnd = pos; recKind = 44;
if ((ch >= L'-' && ch <= L'.') || (ch >= L'0' && ch <= L':') || (ch >= L'A' && ch <= L'Z') || ch == L'_' || (ch >= L'a' && ch <= L'z')) {AddCh(); goto case_11;}
else {t->kind = 43; break;}
else {t->kind = 44; break;}
case 12:
case_12:
if (ch <= 9 || (ch >= 11 && ch <= 12) || (ch >= 14 && ch <= L'!') || (ch >= L'#' && ch <= L'[') || (ch >= L']' && ch <= 65535)) {AddCh(); goto case_12;}
@@ -745,70 +757,70 @@ Token* Scanner::NextToken() {
else if (ch == 92) {AddCh(); goto case_14;}
else {goto case_0;}
case 13:
recEnd = pos; recKind = 42;
recEnd = pos; recKind = 43;
if ((ch >= L'0' && ch <= L'9')) {AddCh(); goto case_10;}
else if ((ch >= L'A' && ch <= L'Z') || ch == L'_' || (ch >= L'a' && ch <= L'z')) {AddCh(); goto case_15;}
else {t->kind = 42; break;}
else {t->kind = 43; break;}
case 14:
case_14:
if ((ch >= L' ' && ch <= L'~')) {AddCh(); goto case_12;}
else {goto case_0;}
case 15:
case_15:
recEnd = pos; recKind = 42;
recEnd = pos; recKind = 43;
if ((ch >= L'0' && ch <= L'9')) {AddCh(); goto case_10;}
else if ((ch >= L'A' && ch <= L'Z') || ch == L'_' || (ch >= L'a' && ch <= L'z')) {AddCh(); goto case_15;}
else if (ch == L'=') {AddCh(); goto case_11;}
else {t->kind = 42; break;}
else {t->kind = 43; break;}
case 16:
{t->kind = 17; break;}
{t->kind = 18; break;}
case 17:
{t->kind = 20; break;}
case 18:
{t->kind = 21; break;}
case 18:
{t->kind = 22; break;}
case 19:
case_19:
{t->kind = 22; break;}
{t->kind = 23; break;}
case 20:
{t->kind = 25; break;}
{t->kind = 26; break;}
case 21:
case_21:
{t->kind = 26; break;}
{t->kind = 27; break;}
case 22:
case_22:
{t->kind = 27; break;}
case 23:
{t->kind = 28; break;}
case 23:
{t->kind = 29; break;}
case 24:
{t->kind = 31; break;}
case 25:
{t->kind = 32; break;}
case 26:
case 25:
{t->kind = 33; break;}
case 27:
case 26:
{t->kind = 34; break;}
case 28:
case 27:
{t->kind = 35; break;}
case 28:
{t->kind = 36; break;}
case 29:
case_29:
{t->kind = 39; break;}
{t->kind = 40; break;}
case 30:
case_30:
{t->kind = 40; break;}
{t->kind = 41; break;}
case 31:
recEnd = pos; recKind = 18;
recEnd = pos; recKind = 19;
if (ch == L'.') {AddCh(); goto case_19;}
else if (ch == L'>') {AddCh(); goto case_22;}
else if (ch == L')') {AddCh(); goto case_30;}
else {t->kind = 18; break;}
else {t->kind = 19; break;}
case 32:
recEnd = pos; recKind = 24;
recEnd = pos; recKind = 25;
if (ch == L'.') {AddCh(); goto case_21;}
else {t->kind = 24; break;}
else {t->kind = 25; break;}
case 33:
recEnd = pos; recKind = 30;
recEnd = pos; recKind = 31;
if (ch == L'.') {AddCh(); goto case_29;}
else {t->kind = 30; break;}
else {t->kind = 31; break;}

}
AppendVal(t);
44 changes: 28 additions & 16 deletions src/Scanner.frame
@@ -5,24 +5,24 @@ extended by M. Loeberbauer & A. Woess, Univ. of Linz
ported to C++ by Csaba Balazs, University of Szeged
with improvements by Pat Terry, Rhodes University

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

As an exception, it is allowed to write an extension of Coco/R that is
used as a plugin in non-free software.

If not otherwise stated, any source code generated by Coco/R (other than
If not otherwise stated, any source code generated by Coco/R (other than
Coco/R itself) does not fall under the GNU General Public License.
-----------------------------------------------------------------------*/

@@ -95,7 +95,7 @@ char* coco_string_create_char(const wchar_t *value);
void coco_string_delete(char* &data);


class Token
class Token
{
public:
int kind; // token kind
@@ -107,6 +107,7 @@ public:
Token *next; // ML 2005-03-11 Peek tokens are kept in linked list

Token();
Token *Clone();
~Token();
};

@@ -125,18 +126,18 @@ private:
int bufPos; // current position in buffer
FILE* stream; // input stream (seekable)
bool isUserStream; // was the stream opened by the user?

int ReadNextStreamChunk();
bool CanSeek(); // true if stream can be seeked otherwise false

public:
static const int EoF = COCO_WCHAR_MAX + 1;

Buffer(FILE* s, bool isUserStream);
Buffer(const unsigned char* buf, int len);
Buffer(Buffer *b);
virtual ~Buffer();

virtual void Close();
virtual int Read();
virtual int Peek();
@@ -284,7 +285,7 @@ private:

public:
Buffer *buffer; // scanner buffer

Scanner(const unsigned char* buf, int len);
Scanner(const wchar_t* fileName);
Scanner(FILE* s);
@@ -523,6 +524,17 @@ Token::Token() {
next = NULL;
}

Token *Token::Clone() {
Token *tk = new Token();
tk->kind = kind;
tk->pos = pos;
tk->col = col;
tk->line = line;
tk->val = coco_string_create(val);
tk->next = next;
return tk;
}

Token::~Token() {
coco_string_delete(val);
}
@@ -543,7 +555,7 @@ Buffer::Buffer(FILE* s, bool isUserStream) {
fileLen = bufLen = bufStart = 0;
}
bufCapacity = (bufLen>0) ? bufLen : COCO_MIN_BUFFER_LENGTH;
buf = new unsigned char[bufCapacity];
buf = new unsigned char[bufCapacity];
if (fileLen > 0) SetPos(0); // setup buffer to position 0 (start)
else bufPos = 0; // index 0 is already after the file, thus Pos = 0 is invalid
if (bufLen == fileLen && CanSeek()) Close();
@@ -573,7 +585,7 @@ Buffer::Buffer(const unsigned char* buf, int len) {
}

Buffer::~Buffer() {
Close();
Close();
if (buf != NULL) {
delete [] buf;
buf = NULL;
11 changes: 6 additions & 5 deletions src/Scanner.h
@@ -91,7 +91,7 @@ char* coco_string_create_char(const wchar_t *value);
void coco_string_delete(char* &data);


class Token
class Token
{
public:
int kind; // token kind
@@ -103,6 +103,7 @@ class Token
Token *next; // ML 2005-03-11 Peek tokens are kept in linked list

Token();
Token *Clone();
~Token();
};

@@ -121,18 +122,18 @@ class Buffer {
int bufPos; // current position in buffer
FILE* stream; // input stream (seekable)
bool isUserStream; // was the stream opened by the user?

int ReadNextStreamChunk();
bool CanSeek(); // true if stream can be seeked otherwise false

public:
static const int EoF = COCO_WCHAR_MAX + 1;

Buffer(FILE* s, bool isUserStream);
Buffer(const unsigned char* buf, int len);
Buffer(Buffer *b);
virtual ~Buffer();

virtual void Close();
virtual int Read();
virtual int Peek();
@@ -282,7 +283,7 @@ class Scanner {

public:
Buffer *buffer; // scanner buffer

Scanner(const unsigned char* buf, int len);
Scanner(const wchar_t* fileName);
Scanner(FILE* s);
1 change: 0 additions & 1 deletion src/Tab.h
@@ -88,7 +88,6 @@ class Tab {
ArrayList classes;
int dummyName;


Tab(Parser *parser);
~Tab();