COBOL program analysis is the systematic examination of COBOL code to understand its structure, assess its quality, measure its complexity, evaluate its performance, and plan its maintenance. Done well, it helps developers understand legacy code, catch defects before they cause failures, optimize performance, and make informed decisions about refactoring and modernization.
Program analysis involves examining COBOL programs through various techniques:
| Analysis Area | Focus | Key Techniques |
|---|---|---|
| Code Structure | Program organization and layout | Division analysis, paragraph structure, data organization |
| Complexity Metrics | Maintainability and testability | Cyclomatic complexity, nesting depth, decision points |
| Performance | Execution efficiency | I/O analysis, algorithm efficiency, profiling |
| Code Quality | Standards and best practices | Naming conventions, documentation, error handling |
| Maintenance Risk | Future maintainability | Technical debt assessment, refactoring needs |
Analyzing program structure helps understand how a program is organized and identify organizational issues.
```cobol
       *> CODE STRUCTURE ANALYSIS CHECKLIST
       *>
       *> IDENTIFICATION DIVISION
       *>   - Clear program identification
       *>   - Author and date information
       *>   - Purpose and description
       *>   - Change history
       *>
       *> ENVIRONMENT DIVISION
       *>   - Proper file assignments
       *>   - Appropriate file organizations
       *>   - Correct access modes
       *>
       *> DATA DIVISION
       *>   - Logical grouping of data items
       *>   - Clear naming conventions (WS-, FD- prefixes)
       *>   - Appropriate data types and sizes
       *>   - Proper initialization
       *>
       *> PROCEDURE DIVISION
       *>   - Logical paragraph organization
       *>   - Meaningful paragraph names
       *>   - Consistent numbering (1000-, 2000-)
       *>   - Clear control flow
       *>   - Minimal use of GO TO

       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRUCTURE-ANALYSIS-EXAMPLE.
       *> AUTHOR.       Development Team
       *> DATE-WRITTEN. 2024-01-15
       *> PURPOSE.      Demonstrates well-structured COBOL program
       *> NOTES.        Example for structure analysis

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT CUSTOMER-FILE ASSIGN TO "CUSTOMER.DAT"
               ORGANIZATION IS SEQUENTIAL
               ACCESS MODE IS SEQUENTIAL
               FILE STATUS IS WS-FILE-STATUS.

       DATA DIVISION.
       FILE SECTION.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-RECORD.
           05  CUST-ID             PIC X(8).
           05  CUST-NAME           PIC X(40).
           05  CUST-BALANCE        PIC S9(7)V99.

       WORKING-STORAGE SECTION.
       *> Control fields with clear naming
       01  WS-CONTROL-FIELDS.
           05  WS-END-OF-FILE      PIC X VALUE 'N'.
               88  WS-EOF          VALUE 'Y'.
           05  WS-RECORD-COUNT     PIC 9(6) VALUE 0.

       *> File status field named in the SELECT clause above
       01  WS-FILE-STATUS          PIC X(2).

       PROCEDURE DIVISION.
       *> Main program flow - clear and logical
           PERFORM 1000-INITIALIZE
           PERFORM 2000-PROCESS-DATA
           PERFORM 3000-FINALIZE
           STOP RUN.

       *> Initialization section
       1000-INITIALIZE.
           DISPLAY "Program initialization"
           OPEN INPUT CUSTOMER-FILE
           IF WS-FILE-STATUS NOT = "00"
               DISPLAY "Error opening file: " WS-FILE-STATUS
               STOP RUN
           END-IF.

       *> Main processing section
       2000-PROCESS-DATA.
           PERFORM UNTIL WS-EOF
               READ CUSTOMER-FILE
                   AT END
                       SET WS-EOF TO TRUE
                   NOT AT END
                       ADD 1 TO WS-RECORD-COUNT
                       PERFORM 2100-PROCESS-RECORD
               END-READ
           END-PERFORM.

       2100-PROCESS-RECORD.
           *> Process individual record
           CONTINUE.

       *> Finalization section
       3000-FINALIZE.
           CLOSE CUSTOMER-FILE
           DISPLAY "Records processed: " WS-RECORD-COUNT.
```
Complexity metrics help assess how difficult a program is to understand, test, and maintain.
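Cyclomatic complexity measures the number of linearly independent paths through a program. A common approximation counts the decision points (IF, EVALUATE, PERFORM UNTIL, and so on) and adds 1. Counting conventions vary: some tools count an EVALUATE once, while others count each WHEN branch separately, so scores from different tools are not always directly comparable.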
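For reference, the counting rule above is the usual shortcut for McCabe's graph-based definition. The following is a sketch in standard notation; the symbols $E$, $N$, $P$, and $D$ are the conventional ones and do not come from this tutorial:

$$
M = E - N + 2P
$$

Here $E$ and $N$ are the numbers of edges and nodes in the control-flow graph, and $P$ is the number of connected components (1 for a single program or paragraph). For structured code in which every decision is binary, this reduces to

$$
M = D + 1
$$

where $D$ is the number of decision points. As a quick check, a paragraph with three IF statements and no loops has $D = 3$, so $M = 4$.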
| Complexity Range | Assessment | Recommendation |
|---|---|---|
| 1-10 | Simple - easy to understand and test | Maintain current structure |
| 11-20 | Moderate - acceptable but watch for growth | Consider breaking into smaller modules |
| 21-50 | Complex - difficult to test and maintain | Refactor into smaller, simpler modules |
| Over 50 | Very complex - high maintenance risk | Major refactoring required |
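The ranges in the table can be applied mechanically once a score has been measured. The program below is a minimal sketch of that mapping, not a definitive implementation: the program name, the field names, and the sample score of 14 are all assumptions made for illustration, and a score of exactly 50 is treated as "Complex".

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COMPLEXITY-RATING.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Hypothetical sample score; replace with a measured value
       01  WS-COMPLEXITY           PIC 9(3) VALUE 14.
       01  WS-ASSESSMENT           PIC X(12) VALUE SPACES.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           PERFORM 1000-RATE-COMPLEXITY
           DISPLAY "Complexity " WS-COMPLEXITY ": " WS-ASSESSMENT
           STOP RUN.

       *> WHEN clauses are tested in order, so each one only needs
       *> an upper bound
       1000-RATE-COMPLEXITY.
           EVALUATE TRUE
               WHEN WS-COMPLEXITY <= 10
                   MOVE "Simple" TO WS-ASSESSMENT
               WHEN WS-COMPLEXITY <= 20
                   MOVE "Moderate" TO WS-ASSESSMENT
               WHEN WS-COMPLEXITY <= 50
                   MOVE "Complex" TO WS-ASSESSMENT
               WHEN OTHER
                   MOVE "Very complex" TO WS-ASSESSMENT
           END-EVALUATE.
```

With the sample score of 14 the program reports "Moderate". The layout keeps division headers and paragraph names in Area A and statements in Area B, so it should compile in either fixed or free source format.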
```cobol
       *> COMPLEXITY ANALYSIS EXAMPLE
       *> This example demonstrates how to analyze program complexity

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           *> Base complexity: 1
           *> +1 for IF
           IF WS-STATUS = 'A'
               PERFORM PROCESS-ACTIVE
           ELSE
               *> +1 for nested IF
               IF WS-STATUS = 'I'
                   PERFORM PROCESS-INACTIVE
               ELSE
                   *> +1 for nested IF
                   IF WS-STATUS = 'S'
                       PERFORM PROCESS-SUSPENDED
                   END-IF
               END-IF
           END-IF
           *> Total complexity so far: 4 (1 base + 3 decision points)

           *> +1 for PERFORM UNTIL
           PERFORM VARYING WS-INDEX FROM 1 BY 1
                   UNTIL WS-INDEX > WS-MAX-COUNT
               *> +1 for IF inside loop
               IF WS-DATA(WS-INDEX) > WS-THRESHOLD
                   PERFORM PROCESS-HIGH-VALUE
               ELSE
                   *> +1 for nested IF
                   IF WS-DATA(WS-INDEX) < WS-MIN-VALUE
                       PERFORM PROCESS-LOW-VALUE
                   END-IF
               END-IF
           END-PERFORM.
           *> Total complexity: 7
           *> Assessment: within the simple range (1-10), but the
           *> nested IFs are harder to read than they need to be

       *> IMPROVED VERSION - Flatter and easier to read
       MAIN-PROCESS-IMPROVED.
           *> Base complexity: 1
           *> +1 for EVALUATE (counting the statement once; tools
           *> that count each WHEN branch would add 3 here)
           EVALUATE WS-STATUS
               WHEN 'A'
                   PERFORM PROCESS-ACTIVE
               WHEN 'I'
                   PERFORM PROCESS-INACTIVE
               WHEN 'S'
                   PERFORM PROCESS-SUSPENDED
           END-EVALUATE
           *> Complexity: 2 (simpler than nested IFs)

           *> A plain PERFORM adds no decision point. The loop and
           *> its IFs move into PROCESS-DATA-ARRAY and are counted
           *> in that paragraph instead.
           PERFORM PROCESS-DATA-ARRAY.
           *> Total complexity of this paragraph: 2
           *> Assessment: Simple - much better maintainability
```
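The improved version performs PROCESS-DATA-ARRAY without showing it. One way the extracted paragraph might look is sketched below as a complete program so it can be compiled on its own. The names WS-INDEX, WS-MAX-COUNT, WS-DATA, WS-THRESHOLD, and WS-MIN-VALUE mirror the example above; the program name, the CLASSIFY-ENTRY paragraph, the counters, and the sample values are assumptions added for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EXTRACTED-LOOP.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Hypothetical limits and sample data
       01  WS-LIMITS.
           05  WS-MAX-COUNT        PIC 9(2) VALUE 5.
           05  WS-THRESHOLD        PIC 9(4) VALUE 1000.
           05  WS-MIN-VALUE        PIC 9(4) VALUE 100.
       *> Five 4-digit entries: 0050 1500 0500 2000 0075
       01  WS-SAMPLE-VALUES        PIC X(20)
                                   VALUE "00501500050020000075".
       01  WS-DATA-TABLE REDEFINES WS-SAMPLE-VALUES.
           05  WS-DATA             PIC 9(4) OCCURS 5 TIMES.
       01  WS-INDEX                PIC 9(2) VALUE 0.
       01  WS-COUNTERS.
           05  WS-HIGH-COUNT       PIC 9(2) VALUE 0.
           05  WS-LOW-COUNT        PIC 9(2) VALUE 0.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           PERFORM PROCESS-DATA-ARRAY
           DISPLAY "High values: " WS-HIGH-COUNT
           DISPLAY "Low values:  " WS-LOW-COUNT
           STOP RUN.

       *> Loop only: one decision point (the UNTIL condition)
       PROCESS-DATA-ARRAY.
           PERFORM VARYING WS-INDEX FROM 1 BY 1
                   UNTIL WS-INDEX > WS-MAX-COUNT
               PERFORM CLASSIFY-ENTRY
           END-PERFORM.

       *> Classification only: two WHEN conditions, no nesting
       CLASSIFY-ENTRY.
           EVALUATE TRUE
               WHEN WS-DATA(WS-INDEX) > WS-THRESHOLD
                   PERFORM PROCESS-HIGH-VALUE
               WHEN WS-DATA(WS-INDEX) < WS-MIN-VALUE
                   PERFORM PROCESS-LOW-VALUE
           END-EVALUATE.

       PROCESS-HIGH-VALUE.
           ADD 1 TO WS-HIGH-COUNT.

       PROCESS-LOW-VALUE.
           ADD 1 TO WS-LOW-COUNT.
```

With the sample data, two entries exceed the threshold and two fall below the minimum. Splitting the loop from the classification does not remove any decisions from the program as a whole; it spreads them across small paragraphs that can each be read and tested in isolation.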
Performance analysis identifies bottlenecks and optimization opportunities in COBOL programs.
```cobol
       *> PERFORMANCE ANALYSIS EXAMPLE
       *> Identifying and fixing performance issues

       *> PROBLEMATIC CODE - Performance issues
       PROCEDURE DIVISION.
       PROCESS-RECORDS.
           *> Issue 1: Reading file multiple times
           PERFORM VARYING WS-INDEX FROM 1 BY 1
                   UNTIL WS-INDEX > 100
               *> Each iteration opens and reads the file -
               *> very inefficient!
               OPEN INPUT CUSTOMER-FILE
               READ CUSTOMER-FILE
                   AT END
                       CONTINUE
                   NOT AT END
                       *> Issue 2: Nested loop with file operations
                       PERFORM VARYING WS-J FROM 1 BY 1
                               UNTIL WS-J > 50
                           *> Issue 3: Reading same file again
                           *> inside nested loop
                           OPEN INPUT PRODUCT-FILE
                           READ PRODUCT-FILE
                           END-READ
                           CLOSE PRODUCT-FILE
                       END-PERFORM
               END-READ
               CLOSE CUSTOMER-FILE
           END-PERFORM.
       *> Problems: Multiple file opens/closes, nested file I/O,
       *> redundant operations

       *> IMPROVED CODE - Better performance
       PROCESS-RECORDS-IMPROVED.
           *> Open files once before processing
           OPEN INPUT CUSTOMER-FILE
           OPEN INPUT PRODUCT-FILE

           *> Read customer file once, process all records
           PERFORM UNTIL WS-EOF-CUSTOMER
               READ CUSTOMER-FILE
                   AT END
                       SET WS-EOF-CUSTOMER TO TRUE
                   NOT AT END
                       *> Process customer record
                       PERFORM PROCESS-CUSTOMER

                       *> If needed, read product file efficiently
                       *> (Only if necessary, not in nested loop)
                       IF PRODUCT-NEEDED
                           PERFORM READ-PRODUCT-ONCE
                       END-IF
               END-READ
           END-PERFORM

           *> Close files once after processing
           CLOSE CUSTOMER-FILE
           CLOSE PRODUCT-FILE.
       *> Improvements: Single file open/close, eliminated nested
       *> I/O, efficient processing

       *> Performance measurement
       *> ACCEPT ... FROM TIME returns HHMMSSss, not a count of
       *> centiseconds, so both readings are converted before they
       *> are subtracted. Assumes WS-START-TIME and WS-END-TIME are
       *> groups of four 2-digit fields (-HH, -MM, -SS, -CC) and
       *> that the run does not cross midnight.
       MEASURE-PERFORMANCE.
           ACCEPT WS-START-TIME FROM TIME
           PERFORM PROCESS-RECORDS-IMPROVED
           ACCEPT WS-END-TIME FROM TIME
           COMPUTE WS-DURATION =
               ((WS-END-HH * 60 + WS-END-MM) * 60 + WS-END-SS) * 100
               + WS-END-CC
               - ((WS-START-HH * 60 + WS-START-MM) * 60
                   + WS-START-SS) * 100
               - WS-START-CC
           DISPLAY "Processing duration: " WS-DURATION
               " centiseconds".
```
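One common way to avoid file I/O inside a loop is to load small reference data into a table once and look it up in memory. The sketch below illustrates that idea with SEARCH ALL. It is an assumption about how a step like READ-PRODUCT-ONCE could be replaced, not the original author's implementation, and the program name, table layout, and product data are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TABLE-LOOKUP.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Hypothetical product data, sorted by ID because
       *> SEARCH ALL performs a binary search. A real program
       *> would load this table from PRODUCT-FILE once, before
       *> the customer loop starts.
       01  WS-PRODUCT-VALUES.
           05  FILLER              PIC X(14) VALUE "P001Widget    ".
           05  FILLER              PIC X(14) VALUE "P002Gadget    ".
           05  FILLER              PIC X(14) VALUE "P003Sprocket  ".
       01  WS-PRODUCT-TABLE REDEFINES WS-PRODUCT-VALUES.
           05  WS-PRODUCT          OCCURS 3 TIMES
                                   ASCENDING KEY IS WS-PROD-ID
                                   INDEXED BY WS-PROD-IDX.
               10  WS-PROD-ID      PIC X(4).
               10  WS-PROD-NAME    PIC X(10).
       *> Hypothetical ID to look up
       01  WS-WANTED-ID            PIC X(4) VALUE "P002".

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           *> In-memory lookup replaces a file read per record
           SEARCH ALL WS-PRODUCT
               AT END
                   DISPLAY "Product not found: " WS-WANTED-ID
               WHEN WS-PROD-ID(WS-PROD-IDX) = WS-WANTED-ID
                   DISPLAY "Found: " WS-PROD-NAME(WS-PROD-IDX)
           END-SEARCH
           STOP RUN.
```

With the sample data the lookup finds the "Gadget" entry. This approach only suits reference data small enough to hold in working storage; for large files, a keyed (indexed) file read is usually the better fit.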
Code quality assessment evaluates how well code follows best practices and standards.
```cobol
       *> CODE REVIEW CHECKLIST
       *>
       *> 1. NAMING CONVENTIONS
       *>    ✓ Descriptive variable names (WS-, FD- prefixes)
       *>    ✓ Consistent naming patterns
       *>    ✓ Meaningful paragraph names
       *>    ✓ Clear file and record names
       *>
       *> 2. ERROR HANDLING
       *>    ✓ All file operations check file status
       *>    ✓ Meaningful error messages
       *>    ✓ Graceful error recovery
       *>    ✓ Proper exception handling
       *>
       *> 3. DOCUMENTATION
       *>    ✓ Comprehensive program header
       *>    ✓ Clear data structure comments
       *>    ✓ Algorithm explanations
       *>    ✓ Maintenance notes
       *>
       *> 4. PROGRAM STRUCTURE
       *>    ✓ Logical organization
       *>    ✓ Clear separation of concerns
       *>    ✓ Modular design
       *>    ✓ Consistent formatting
       *>
       *> 5. PERFORMANCE
       *>    ✓ Efficient file access
       *>    ✓ Optimized loops
       *>    ✓ Minimal data movement
       *>    ✓ Appropriate data types
       *>
       *> 6. MAINTAINABILITY
       *>    ✓ Low complexity
       *>    ✓ Clear logic flow
       *>    ✓ Minimal GO TO usage
       *>    ✓ Easy to understand

       IDENTIFICATION DIVISION.
       PROGRAM-ID. QUALITY-EXAMPLE.
       *> AUTHOR.       Development Team
       *> DATE-WRITTEN. 2024-01-15
       *> PURPOSE.      Example demonstrating code quality standards
       *> CHANGE-HISTORY.
       *>   2024-01-15  Initial version

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Well-named control fields
       01  WS-CONTROL-FIELDS.
           05  WS-END-OF-FILE-FLAG     PIC X VALUE 'N'.
               88  WS-END-OF-FILE      VALUE 'Y'.
               88  WS-NOT-END-OF-FILE  VALUE 'N'.
           05  WS-RECORD-COUNT         PIC 9(6) VALUE 0.
           05  WS-ERROR-COUNT          PIC 9(4) VALUE 0.

       *> File status with proper error handling
       01  WS-FILE-STATUSES.
           05  WS-INPUT-STATUS         PIC X(2) VALUE SPACES.
           05  WS-OUTPUT-STATUS        PIC X(2) VALUE SPACES.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           PERFORM 1000-INITIALIZE
           IF WS-ERROR-COUNT = 0
               PERFORM 2000-PROCESS-DATA
               PERFORM 3000-FINALIZE
           ELSE
               PERFORM 9000-ERROR-HANDLING
           END-IF
           STOP RUN.

       1000-INITIALIZE.
           DISPLAY "Initializing program"
           OPEN INPUT INPUT-FILE
           IF WS-INPUT-STATUS NOT = "00"
               DISPLAY "ERROR: Cannot open input file"
               DISPLAY "File status: " WS-INPUT-STATUS
               ADD 1 TO WS-ERROR-COUNT
           END-IF.
```
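The checklist item "Meaningful error messages" can go one step further than printing the raw status. The following sketch distinguishes a missing file from other OPEN failures, using the standard file status values "00" (success) and "35" (non-optional file not present). The program name, the file name MISSING.DAT, and the paragraph 9100-CHECK-OPEN-STATUS are assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FILE-STATUS-CHECK.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           *> Hypothetical file name, chosen so that the OPEN is
           *> expected to fail when the file does not exist
           SELECT INPUT-FILE ASSIGN TO "MISSING.DAT"
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-INPUT-STATUS.

       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE.
       01  INPUT-RECORD            PIC X(80).

       WORKING-STORAGE SECTION.
       01  WS-INPUT-STATUS         PIC X(2) VALUE SPACES.
       01  WS-ERROR-COUNT          PIC 9(4) VALUE 0.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           OPEN INPUT INPUT-FILE
           PERFORM 9100-CHECK-OPEN-STATUS
           *> Only close the file if the OPEN succeeded
           IF WS-ERROR-COUNT = 0
               CLOSE INPUT-FILE
           END-IF
           STOP RUN.

       *> Translate the status code into a specific message
       9100-CHECK-OPEN-STATUS.
           EVALUATE WS-INPUT-STATUS
               WHEN "00"
                   DISPLAY "Input file opened"
               WHEN "35"
                   DISPLAY "ERROR: Input file not found"
                   ADD 1 TO WS-ERROR-COUNT
               WHEN OTHER
                   DISPLAY "ERROR: Open failed, file status "
                       WS-INPUT-STATUS
                   ADD 1 TO WS-ERROR-COUNT
           END-EVALUATE.
```

Keeping the status check in its own paragraph means every OPEN in the program can reuse the same messages, which also makes the error handling easier to review.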
Static analysis tools automatically examine code without executing it to identify issues, measure metrics, and assess quality.
| Tool Type | Capabilities | Benefits |
|---|---|---|
| Static Analysis Tools | Code quality metrics, complexity analysis, security scanning | Automated issue detection, consistent analysis, comprehensive reports |
| Interactive Debuggers | Step-by-step execution, breakpoints, variable inspection | Detailed runtime analysis, issue identification, testing support |
| Performance Profilers | Execution time measurement, resource usage tracking | Bottleneck identification, optimization guidance |
| Documentation Generators | Automatic documentation, flowcharts, data flow analysis | Code understanding, maintenance support, knowledge transfer |
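To make the idea of static analysis concrete, the sketch below scans a handful of source lines for keywords without executing them. It is a deliberately naive illustration, not a substitute for a real tool: it matches plain text, so keywords inside comments or literals would also be counted, and it only looks for IF, UNTIL, and GO TO. The program name, field names, and the six sample lines are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SOURCE-SCAN.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Hypothetical source lines to scan; a real tool would
       *> read them from the program's source file
       01  WS-SOURCE-VALUES.
           05  FILLER  PIC X(40) VALUE "    IF WS-STATUS = 'A'".
           05  FILLER  PIC X(40) VALUE "        GO TO 9000-EXIT".
           05  FILLER  PIC X(40) VALUE "    END-IF".
           05  FILLER  PIC X(40) VALUE "    PERFORM UNTIL WS-EOF".
           05  FILLER  PIC X(40) VALUE "        IF WS-COUNT > 10".
           05  FILLER  PIC X(40) VALUE "            GO TO 2000-SKIP".
       01  WS-SOURCE-TABLE REDEFINES WS-SOURCE-VALUES.
           05  WS-SOURCE-LINE      PIC X(40) OCCURS 6 TIMES.
       01  WS-LINE-INDEX           PIC 9(2) VALUE 0.
       01  WS-GO-TO-COUNT          PIC 9(3) VALUE 0.
       01  WS-DECISION-COUNT       PIC 9(3) VALUE 0.
       01  WS-COMPLEXITY           PIC 9(3) VALUE 0.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           PERFORM VARYING WS-LINE-INDEX FROM 1 BY 1
                   UNTIL WS-LINE-INDEX > 6
               PERFORM 1000-SCAN-LINE
           END-PERFORM
           *> Decision points plus 1, as in the counting rule
           COMPUTE WS-COMPLEXITY = WS-DECISION-COUNT + 1
           DISPLAY "GO TO statements:     " WS-GO-TO-COUNT
           DISPLAY "Decision points:      " WS-DECISION-COUNT
           DISPLAY "Estimated complexity: " WS-COMPLEXITY
           STOP RUN.

       *> The surrounding spaces stop " IF " from matching END-IF
       1000-SCAN-LINE.
           INSPECT WS-SOURCE-LINE(WS-LINE-INDEX)
               TALLYING WS-GO-TO-COUNT FOR ALL " GO TO "
           INSPECT WS-SOURCE-LINE(WS-LINE-INDEX)
               TALLYING WS-DECISION-COUNT FOR ALL " IF "
           INSPECT WS-SOURCE-LINE(WS-LINE-INDEX)
               TALLYING WS-DECISION-COUNT FOR ALL " UNTIL ".
```

For the sample lines the scan finds two GO TO statements and three decision points (two IFs and one UNTIL), giving an estimated complexity of 4. Real static analysis tools parse the program rather than matching text, which is why their results are more reliable than a keyword count.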
Maintenance risk assessment identifies areas of code that may be difficult to maintain or modify in the future.
```cobol
       *> MAINTENANCE RISK ASSESSMENT
       *>
       *> High Risk Indicators:
       *>   - High cyclomatic complexity (>20)
       *>   - Deep nesting (>5 levels)
       *>   - GO TO statements
       *>   - Tightly coupled code
       *>   - Poor documentation
       *>   - Unclear variable names
       *>   - Missing error handling
       *>   - Duplicate code
       *>
       *> Risk Calculation Example:

       WORKING-STORAGE SECTION.
       01  RISK-ASSESSMENT.
           05  CYCLOMATIC-COMPLEXITY   PIC 9(3) VALUE 0.
           05  NESTING-DEPTH           PIC 9(2) VALUE 0.
           05  GO-TO-COUNT             PIC 9(3) VALUE 0.
           05  DOCUMENTATION-SCORE     PIC 9(2) VALUE 0.
           *> Signed, because a high documentation score can push
           *> the result below zero; 5 digits hold the largest
           *> possible total
           05  MAINTENANCE-RISK-SCORE  PIC S9(5) VALUE 0.

       PROCEDURE DIVISION.
       CALCULATE-MAINTENANCE-RISK.
           *> Calculate risk score based on various factors
           COMPUTE MAINTENANCE-RISK-SCORE =
               (CYCLOMATIC-COMPLEXITY * 2) +
               (NESTING-DEPTH * 5) +
               (GO-TO-COUNT * 10) -
               (DOCUMENTATION-SCORE * 2)

           *> Assess risk level
           EVALUATE TRUE
               WHEN MAINTENANCE-RISK-SCORE < 20
                   DISPLAY "LOW RISK: Code is maintainable"
               WHEN MAINTENANCE-RISK-SCORE < 50
                   DISPLAY "MEDIUM RISK: Some maintenance concerns"
               WHEN MAINTENANCE-RISK-SCORE < 100
                   DISPLAY "HIGH RISK: Significant maintenance "
                       "challenges"
               WHEN OTHER
                   DISPLAY "CRITICAL RISK: Major refactoring "
                       "required"
           END-EVALUATE.
```
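As a worked instance of the formula above, take a hypothetical program with cyclomatic complexity 25, nesting depth 4, three GO TO statements, and a documentation score of 5. These inputs are invented for illustration; only the weights come from the example.

$$
\text{score} = (25 \times 2) + (4 \times 5) + (3 \times 10) - (5 \times 2) = 50 + 20 + 30 - 10 = 90
$$

Since $50 \le 90 < 100$, the EVALUATE reports "HIGH RISK". The three GO TO statements alone contribute a third of the score, which shows how heavily the weights penalize unstructured control flow.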
Before making changes, measure current complexity, performance, and quality metrics to establish a baseline for comparison.
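A baseline is only useful if later measurements are compared against it. The sketch below shows one simple way to do that. The program name, field names, and all four measurements are hypothetical values chosen for illustration, with durations assumed to be in centiseconds.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BASELINE-COMPARE.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Hypothetical measurements taken before a change
       01  WS-BASELINE.
           05  WS-BASE-COMPLEXITY  PIC 9(3) VALUE 24.
           05  WS-BASE-DURATION    PIC 9(6) VALUE 1250.
       *> Hypothetical measurements taken after the change
       01  WS-CURRENT.
           05  WS-CURR-COMPLEXITY  PIC 9(3) VALUE 11.
           05  WS-CURR-DURATION    PIC 9(6) VALUE 1400.
       *> Signed work field and an edited field for display
       01  WS-CHANGE               PIC S9(6) VALUE 0.
       01  WS-CHANGE-DISPLAY       PIC +ZZZZZ9.

       PROCEDURE DIVISION.
       MAIN-PROCESS.
           COMPUTE WS-CHANGE =
               WS-CURR-COMPLEXITY - WS-BASE-COMPLEXITY
           MOVE WS-CHANGE TO WS-CHANGE-DISPLAY
           DISPLAY "Complexity change: " WS-CHANGE-DISPLAY

           COMPUTE WS-CHANGE =
               WS-CURR-DURATION - WS-BASE-DURATION
           MOVE WS-CHANGE TO WS-CHANGE-DISPLAY
           DISPLAY "Duration change:   " WS-CHANGE-DISPLAY

           *> Flag any metric that got worse than the baseline
           IF WS-CURR-COMPLEXITY > WS-BASE-COMPLEXITY
               DISPLAY "WARNING: complexity increased"
           END-IF
           IF WS-CURR-DURATION > WS-BASE-DURATION
               DISPLAY "WARNING: duration increased"
           END-IF
           STOP RUN.
```

With these sample values, complexity drops by 13 while duration rises by 150, so only the duration warning is shown. A refactoring that improves one metric can worsen another, which is the reason to record several baseline metrics rather than one.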
Leverage static analysis tools to automate routine checks and focus manual review on complex logic and business rules.
Prioritize analysis of frequently executed code, critical business logic, and areas with known issues or high complexity.
Document analysis findings, identified issues, recommended improvements, and risk assessments for future reference.
Perform program analysis regularly, not just when problems occur. Regular analysis helps prevent issues from accumulating.
```cobol
       *> PROGRAM ANALYSIS WORKFLOW
       *>
       *> 1. INITIAL ASSESSMENT
       *>    - Review program purpose and documentation
       *>    - Understand business requirements
       *>    - Identify key components
       *>
       *> 2. STRUCTURAL ANALYSIS
       *>    - Examine program organization
       *>    - Review data division structure
       *>    - Analyze procedure division flow
       *>
       *> 3. COMPLEXITY ANALYSIS
       *>    - Calculate cyclomatic complexity
       *>    - Measure nesting depth
       *>    - Count decision points
       *>
       *> 4. QUALITY ASSESSMENT
       *>    - Review naming conventions
       *>    - Check documentation quality
       *>    - Assess error handling
       *>
       *> 5. PERFORMANCE ANALYSIS
       *>    - Identify I/O operations
       *>    - Analyze algorithm efficiency
       *>    - Measure execution time
       *>
       *> 6. RISK ASSESSMENT
       *>    - Calculate maintenance risk
       *>    - Identify refactoring needs
       *>    - Prioritize improvements
       *>
       *> 7. REPORTING
       *>    - Document findings
       *>    - Recommend improvements
       *>    - Plan maintenance actions

       PROCEDURE DIVISION.
       ANALYSIS-WORKFLOW.
           DISPLAY "=== PROGRAM ANALYSIS WORKFLOW ==="

           PERFORM INITIAL-ASSESSMENT
           PERFORM STRUCTURAL-ANALYSIS
           PERFORM COMPLEXITY-ANALYSIS
           PERFORM QUALITY-ASSESSMENT
           PERFORM PERFORMANCE-ANALYSIS
           PERFORM RISK-ASSESSMENT
           PERFORM GENERATE-ANALYSIS-REPORT

           DISPLAY "=== ANALYSIS COMPLETE ===".
```
Imagine you have a big box of toys:
Program analysis is like organizing and checking your toy box. You look at all your toys (the code), see which ones are broken or messy (find problems), figure out which toys are your favorites (important code), and decide how to organize them better (improve the code).
Just like you might count your toys, check if they're working, and decide which ones need fixing, program analysis counts different parts of the code, checks if everything works correctly, and finds things that need to be fixed or improved!
Analyze the following code and calculate its cyclomatic complexity:
```cobol
           IF WS-STATUS = 'A'
               IF WS-BALANCE > 1000
                   PERFORM PROCESS-HIGH-VALUE
               ELSE
                   PERFORM PROCESS-LOW-VALUE
               END-IF
           ELSE
               IF WS-STATUS = 'I'
                   PERFORM PROCESS-INACTIVE
               END-IF
           END-IF

           PERFORM VARYING I FROM 1 BY 1 UNTIL I > 10
               IF WS-DATA(I) > 0
                   ADD 1 TO WS-COUNT
               END-IF
           END-PERFORM

       *> Answer: Base (1) + IF (1) + nested IF (1) + ELSE IF (1)
       *>         + PERFORM UNTIL (1) + IF in loop (1) = 6
```
Identify performance issues in this code:
```cobol
           PERFORM VARYING I FROM 1 BY 1 UNTIL I > 100
               OPEN INPUT CUSTOMER-FILE
               READ CUSTOMER-FILE
               CLOSE CUSTOMER-FILE

               PERFORM VARYING J FROM 1 BY 1 UNTIL J > 50
                   OPEN INPUT PRODUCT-FILE
                   READ PRODUCT-FILE
                   CLOSE PRODUCT-FILE
               END-PERFORM
           END-PERFORM

       *> Issues:
       *> - Opening/closing files inside loops (should open once)
       *> - Nested file I/O operations
       *> - Redundant file operations
```
1. What is the primary purpose of COBOL program analysis?
2. What does cyclomatic complexity measure?
3. What is a common performance issue in COBOL programs?
4. What should you check during a COBOL code review?
5. What indicates high maintenance risk in COBOL code?