Error in walking through AST due to comment statements #98

Open
sundar-sarvam opened this issue Aug 21, 2023 · 5 comments

Comments

@sundar-sarvam

I am trying to run the parser on this COBOL file (with the full repo downloaded): https://github.com/aws-samples/aws-mainframe-modernization-carddemo/blob/main/app/cbl/CBACT02C.cbl (I think this is IBM dialect only). But I get the error below, apparently caused by the comment line "* You may obtain a copy of the License at":

Full error:

Exception in thread "main" io.proleap.cobol.asg.exception.CobolParserException: syntax error in line 12:33 mismatched input 'the' expecting {IN, OF, ON, REPLACING, SUPPRESS, '.', NEWLINE}
        at io.proleap.cobol.asg.runner.ThrowingErrorListener.syntaxError(ThrowingErrorListener.java:22)
        at org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41)
        at org.antlr.v4.runtime.Parser.notifyErrorListeners(Parser.java:544)
        at org.antlr.v4.runtime.DefaultErrorStrategy.reportInputMismatch(DefaultErrorStrategy.java:327)
        at org.antlr.v4.runtime.DefaultErrorStrategy.reportError(DefaultErrorStrategy.java:139)
        at io.proleap.cobol.CobolPreprocessorParser.copyStatement(CobolPreprocessorParser.java:3609)
        at io.proleap.cobol.CobolPreprocessorParser.startRule(CobolPreprocessorParser.java:326)
        at io.proleap.cobol.preprocessor.sub.document.impl.CobolDocumentParserImpl.processWithParser(CobolDocumentParserImpl.java:90)
        at io.proleap.cobol.preprocessor.sub.document.impl.CobolDocumentParserImpl.processLines(CobolDocumentParserImpl.java:59)
        at io.proleap.cobol.preprocessor.impl.CobolPreprocessorImpl.parseDocument(CobolPreprocessorImpl.java:66)
        at io.proleap.cobol.preprocessor.impl.CobolPreprocessorImpl.process(CobolPreprocessorImpl.java:86)
        at io.proleap.cobol.preprocessor.impl.CobolPreprocessorImpl.process(CobolPreprocessorImpl.java:78)
        at io.proleap.cobol.asg.runner.impl.CobolParserRunnerImpl.parseFile(CobolParserRunnerImpl.java:197)
        at io.proleap.cobol.asg.runner.impl.CobolParserRunnerImpl.analyzeFile(CobolParserRunnerImpl.java:97)
        at io.proleap.cobol.asg.runner.impl.CobolParserRunnerImpl.analyzeFile(CobolParserRunnerImpl.java:106)
        at CobolParserDemo2.parseCobolFile(App2.java:26)
        at CobolParserDemo2.main(App2.java:16)

This isn't expected, right?

My Java app:

import java.io.File;
import io.proleap.cobol.asg.metamodel.Program;
import io.proleap.cobol.asg.metamodel.CompilationUnit;
import io.proleap.cobol.CobolBaseVisitor;
import io.proleap.cobol.CobolParser;
import io.proleap.cobol.asg.metamodel.data.datadescription.DataDescriptionEntry;
import io.proleap.cobol.asg.runner.impl.CobolParserRunnerImpl;
import io.proleap.cobol.preprocessor.CobolPreprocessor;
import java.io.IOException;

class CobolParserDemo2 {

    public static void main(String[] args) {
        try {
            Program program = parseCobolFile("/Users/.../aws-mainframe-modernization-carddemo/app/cbl/CBACT02C.cbl");
            walkAST(program);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static Program parseCobolFile(String filePath) throws IOException {
        File inputFile = new File(filePath);
        CobolPreprocessor.CobolSourceFormatEnum format = CobolPreprocessor.CobolSourceFormatEnum.TANDEM;
        return new CobolParserRunnerImpl().analyzeFile(inputFile, format);
    }
    public static void walkAST(Program program) {
        CobolBaseVisitor<Boolean> visitor = new CobolBaseVisitor<Boolean>() {
            @Override
            public Boolean visitDataDescriptionEntryFormat1(final CobolParser.DataDescriptionEntryFormat1Context ctx) {
                DataDescriptionEntry entry = (DataDescriptionEntry) program.getASGElementRegistry().getASGElement(ctx);
                String name = entry.getName();
                System.out.println("DataDescriptionEntry Name: " + name); // This will print the name, if you want to see it.
                return visitChildren(ctx);
            }
        };

        for (final CompilationUnit compilationUnit : program.getCompilationUnits()) {
            visitor.visit(compilationUnit.getCtx());
        }
    }
}

Any clue why this might happen, and are there any workarounds? My aim is to find the line range in a source file for constructs such as IF-ELSE, PERFORM ... END-PERFORM, etc.

@uwol
Owner

uwol commented Aug 21, 2023

https://github.com/aws-samples/aws-mainframe-modernization-carddemo/blob/main/app/cbl/CBACT02C.cbl#L1 is not in line format TANDEM (which would start with the line indicator in column 1), but seems to be in line format FIXED.
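
To illustrate the difference, here is a rough heuristic sketch in plain Java. It is not part of ProLeap; the class `LineFormatGuess` and its methods are hypothetical names. It assumes that comment lines carry `*` or `/` in the indicator column, which sits in column 7 for FIXED (after the six-column sequence area) and in column 1 for TANDEM. The sample line assumes six leading blanks before the `*` of the comment quoted above.

```java
import java.util.List;

/** Heuristic sketch (not ProLeap API): guess the line format from comment indicators. */
final class LineFormatGuess {

    enum Guess { FIXED, TANDEM, UNKNOWN }

    /** True if the 1-based column exists and holds a comment indicator ('*' or '/'). */
    static boolean isCommentIndicatorAt(String line, int column) {
        if (line.length() < column) {
            return false;
        }
        char c = line.charAt(column - 1);
        return c == '*' || c == '/';
    }

    /** Counts comment indicators in column 1 versus column 7 and picks the majority. */
    static Guess guess(List<String> lines) {
        int column1Hits = 0;
        int column7Hits = 0;
        for (String line : lines) {
            if (isCommentIndicatorAt(line, 1)) {
                column1Hits++;
            } else if (isCommentIndicatorAt(line, 7)) {
                column7Hits++;
            }
        }
        if (column7Hits > column1Hits) {
            return Guess.FIXED;
        }
        if (column1Hits > column7Hits) {
            return Guess.TANDEM;
        }
        return Guess.UNKNOWN;
    }

    public static void main(String[] args) {
        // Assumed sample: six blanks, then the indicator in column 7.
        List<String> sample = List.of(
                "      * You may obtain a copy of the License at",
                "       IDENTIFICATION DIVISION.");
        System.out.println(guess(sample));
    }
}
```

If the guess comes out as FIXED, passing `CobolPreprocessor.CobolSourceFormatEnum.FIXED` instead of `TANDEM` to `analyzeFile` is the change suggested above.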

Please consult

Best
Ulrich

@sundar-sarvam
Author

Thanks, this helped! To get the PERFORM and CALL statements (which change the control flow of the program), I tried the code below:

        Program program = parseCobolFile("..aws-mainframe-modernization-carddemo/app/cbl/CBACT02C.cbl");

        // Print each paragraph's name, its statements, and the calls that reference it.
        for (CompilationUnit compilationUnit : program.getCompilationUnits()) {
            final ProgramUnit programUnit = compilationUnit.getProgramUnit();
            final ProcedureDivision procedureDivision = programUnit.getProcedureDivision();
            final List<Paragraph> paragraphList = procedureDivision.getParagraphs();
            for (Paragraph paragraph : paragraphList) {
                System.out.println("Name: " + paragraph.getName());
                System.out.println("Statements: " + paragraph.getStatements());
                System.out.println("Calls: " + paragraph.getCalls());
            }
        }

But it prints the following output:

Name: 9910-DISPLAY-IO-STATUS
Statements: [io.proleap.cobol.asg.metamodel.procedure.ifstmt.impl.IfStatementImpl@7a606260, io.proleap.cobol.asg.metamodel.procedure.exit.impl.ExitStatementImpl@5dbab232]
Calls: [name=[9910-DISPLAY-IO-STATUS], paragraph=[name=[9910-DISPLAY-IO-STATUS]], name=[9910-DISPLAY-IO-STATUS], paragraph=[name=[9910-DISPLAY-IO-STATUS]], name=[9910-DISPLAY-IO-STATUS], paragraph=[name=[9910-DISPLAY-IO-STATUS]]]

I need to find out which PERFORM statements call which paragraphs, and which programs CALL which other programs (something like a dictionary with the caller as key and the callee as value). How can I do this using the ProLeap COBOL parser? I tried different methods of Paragraph, such as getCalls, but none of them helped.

@uwol
Owner

uwol commented Aug 27, 2023

Hello @sundar-sarvam , so in my understanding you want to navigate in the ASG (1) from a called Paragraph to all ProcedureCalls calling the Paragraph, and then (2) from each ProcedureCall to the containing PerformStatement.

(1) already works in your example.

(2) should probably work by calling ASGElement.getParent() on each ProcedureCall, which might give you a PerformProcedureStatement, and a second getParent() might give you the PerformStatement. You could write a helper function that calls getParent() repeatedly until a certain condition is met or a certain ancestor class has been found.

I did not implement this to try it out, but I am quite sure it should work. Otherwise, you can paste your code and I can take a look.
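
A minimal sketch of such a helper, untested against ProLeap itself: to stay self-contained it uses a hypothetical stand-in interface `Node` with only a `getParent()` method, and the `CallNode`, `PerformProcedureNode` and `PerformNode` classes are illustrative placeholders, not ProLeap types. With the real library, the same loop would presumably walk `ASGElement.getParent()` instead.

```java
import java.util.Optional;

/** Sketch of an ancestor lookup over a parent-linked tree (stand-in types, not ProLeap API). */
final class AncestorLookup {

    /** Hypothetical stand-in for an ASG element; only the parent link matters here. */
    interface Node {
        Node getParent();
    }

    /** Walks up the parent chain and returns the first ancestor of the requested type. */
    static <T> Optional<T> findAncestor(Node start, Class<T> type) {
        Node current = (start == null) ? null : start.getParent();
        while (current != null) {
            if (type.isInstance(current)) {
                return Optional.of(type.cast(current));
            }
            current = current.getParent();
        }
        return Optional.empty();
    }

    /** Illustrative node types standing in for call, perform-procedure and perform elements. */
    static class BaseNode implements Node {
        private final Node parent;

        BaseNode(Node parent) {
            this.parent = parent;
        }

        @Override
        public Node getParent() {
            return parent;
        }
    }

    static final class PerformNode extends BaseNode {
        PerformNode(Node parent) { super(parent); }
    }

    static final class PerformProcedureNode extends BaseNode {
        PerformProcedureNode(Node parent) { super(parent); }
    }

    static final class CallNode extends BaseNode {
        CallNode(Node parent) { super(parent); }
    }

    public static void main(String[] args) {
        PerformNode perform = new PerformNode(null);
        CallNode call = new CallNode(new PerformProcedureNode(perform));
        // Two getParent() hops lead from the call to the enclosing perform node.
        System.out.println(findAncestor(call, PerformNode.class).isPresent());
    }
}
```

Searching by class rather than counting a fixed number of getParent() hops keeps the helper working if intermediate elements sit between the call and the statement.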

Best

@sundar-sarvam
Author

Thanks @uwol. I will implement what you described. Also, can you point to some examples where CALL <PROGRAM> is parsed out? Typically a program can call another program as well, right (with the two in different files)? To analyse these calls, I will need to parse out CALL statements. Also, will I be able to get the (start -> end) line numbers of a PERFORM statement in a file using the parser? Visual Studio Code (and other editors) offer folding ranges via LSP, which also give you the line numbers, but I was wondering whether it's possible through the parser itself!

@uwol
Owner

uwol commented Aug 31, 2023

Hi,

  • regarding calls, here you find a test: CallStatementTest.java

  • regarding line numbers, ASGElement.getCtx() returns the ParserRuleContext, i.e. the ANTLR AST element for every ASG element. ANTLR might offer some functionality to achieve this. However, COBOL has a preprocessor, which changes line numbers when COPY preprocessor statements are expanded, so this would be a problem.
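
As a rough sketch of the line-number idea: ANTLR's `ParserRuleContext` exposes `getStart()` and `getStop()` tokens, and `Token.getLine()` returns a 1-based line. To keep the example runnable without the ANTLR jar, the `Token` and `RuleContext` interfaces below are hypothetical stand-ins that mirror only those methods, and `LineRanges`, `LineRange` and `ctx` are illustrative names. The lines obtained this way refer to the preprocessed text, so they may be shifted wherever COPY statements were expanded.

```java
/** Sketch: derive a start/end line range from a rule context (stand-in types, not ANTLR itself). */
final class LineRanges {

    /** Stand-in mirroring ANTLR's Token.getLine(), which is 1-based. */
    interface Token {
        int getLine();
    }

    /** Stand-in mirroring ParserRuleContext.getStart() and getStop(). */
    interface RuleContext {
        Token getStart();
        Token getStop();
    }

    /** Simple value holder for an inclusive line range. */
    static final class LineRange {
        final int startLine;
        final int endLine;

        LineRange(int startLine, int endLine) {
            this.startLine = startLine;
            this.endLine = endLine;
        }

        @Override
        public String toString() {
            return startLine + " -> " + endLine;
        }
    }

    /** Returns the lines of the first and last token; falls back to the start if no stop token exists. */
    static LineRange of(RuleContext ctx) {
        Token start = ctx.getStart();
        Token stop = (ctx.getStop() != null) ? ctx.getStop() : start;
        return new LineRange(start.getLine(), stop.getLine());
    }

    /** Hypothetical helper that builds a context spanning the given lines, for demonstration only. */
    static RuleContext ctx(int startLine, int stopLine) {
        return new RuleContext() {
            @Override
            public Token getStart() { return () -> startLine; }

            @Override
            public Token getStop() { return () -> stopLine; }
        };
    }

    public static void main(String[] args) {
        // Arbitrary example lines, not taken from CBACT02C.cbl.
        System.out.println(of(ctx(120, 134)));
    }
}
```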

Best
Ulrich
