---
name: compiler-design
description: >
  Use this skill for designing and implementing compilers, assemblers, and
  language processors that run on or target the Commodore 64 and 128. Covers
  lexical analysis, parsing, code generation for the 6510, cross-compilation,
  and building assemblers in BASIC or ML.
  Sources: Compiler Design and Implementation (64 and 128), COMPUTE!'s
  SpeedScript source.
---

# Compiler Design and Implementation for the C64/128

## Overview

Writing a compiler or assembler on (or for) the Commodore 64 requires special consideration for:

- **Memory constraints**: 64KB total, ~38KB for BASIC programs, ~4KB free upper RAM
- **Speed**: roughly 1 MHz CPU, so interpreter overhead is expensive and native code is essential
- **Target architecture**: 6510 with its register-sparse, accumulator-centric ISA
- **Output format**: C64 `.prg` files (2-byte load address header + raw binary)

---

## Assembler Design on the C64

### Two-Pass Assembly

The standard two-pass technique (used by most period C64 assemblers):

**Pass 1: Symbol collection**

1. Scan source code line by line
2. Record each label and its current address in a symbol table
3. Count instruction bytes to advance the address counter
4. Do NOT produce output

**Pass 2: Code generation**

1. Re-scan source code
2. Look up all labels/symbols in the symbol table
3. Emit object bytes to output buffer or file
4. Report unresolved symbols as errors

### Symbol Table Implementation

For a C64 assembler written in BASIC or ML, the symbol table is typically:

- A sorted array of name/value pairs (use binary search for speed)
- Names limited to 6-8 characters to conserve memory
- Values stored as 16-bit addresses

The BASIC sketch below uses an unsorted array with a linear search for brevity. Variable names are kept to two characters because BASIC V2 only distinguishes the first two and rejects names that contain keywords (for example, `VALUE%` contains `VAL`). Add line numbers when typing it in.

```basic
REM SIMPLE SYMBOL TABLE IN BASIC (ARRAY-BASED)
DIM SN$(100) : REM SYMBOL NAMES
DIM SV(100) : REM VALUES 0-65535 (FLOAT: % INTEGERS STOP AT 32767)
NS=0
REM ADD SYMBOL NM$ WITH VALUE V
SN$(NS)=NM$ : SV(NS)=V : NS=NS+1
REM FIND SYMBOL NM$ (LINEAR SEARCH): F=1 AND V SET IF FOUND
F=0
FOR I=0 TO NS-1
IF SN$(I)=NM$ THEN V=SV(I) : F=1 : I=NS-1
NEXT I
REM F=0 HERE MEANS NOT FOUND
```

### 6510 Instruction Encoding

Each instruction consists of an opcode byte followed by 0, 1, or 2 operand bytes. The assembler must map mnemonic + addressing mode → opcode byte. Many opcodes follow the regular bit pattern below, but branches, implied-mode instructions, and several exceptions do not, so a lookup table is still needed.

```
Encoding pattern for most 6502 instructions (opcode bits: aaabbbcc):
  Bits 7-5 (aaa): instruction select within group
  Bits 4-2 (bbb): addressing mode
  Bits 1-0 (cc):  instruction group

Groups (instructions listed in order of aaa; the cc=00 list starts at aaa=001):
  cc=00: BIT, JMP, JMP(), STY, LDY, CPY, CPX
  cc=01: ORA, AND, EOR, ADC, STA, LDA, CMP, SBC
  cc=10: ASL, ROL, LSR, ROR, STX, LDX, DEC, INC

Addressing mode encoding (bbb, for group cc=01):
  000: (zp,X)   001: zp     010: #imm    011: abs
  100: (zp),Y   101: zp,X   110: abs,Y   111: abs,X
```

### Expression Evaluator

An assembler's expression evaluator handles operands like `LABEL+2`, `$C000+OFFSET`, `>ADDR`:

```basic
REM OPERATORS TO SUPPORT:
REM + - * / (ARITHMETIC)
REM AND OR EOR NOT (BITWISE)
REM < (LOW BYTE), > (HIGH BYTE)
REM PRECEDENCE: NOT > */ > +- > AND > OR/EOR
```

### Forward Reference Handling

In a two-pass assembler, Pass 1 has already collected every label, so Pass 2 resolves forward references directly. A one-pass design has to backpatch instead. When a label is referenced before its definition:

1. Emit placeholder bytes (typically `$00 $00`, which assumes an absolute 16-bit operand)
2. Record the location and label name in a fixup table
3. After the pass completes, apply the fixups using the now-complete symbol table

---

## Lexical Analysis (Tokenizer)

Lexical analysis is the first stage of any compiler: it breaks the source text into tokens.
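
As a host-side illustration of this stage, here is a minimal tokenizer sketch in Python for a single line of assembler source. The token kinds, the tiny `MNEMONICS` subset, and the `tokenize` function are illustrative assumptions and are not taken from any particular assembler.

```python
import re

# Illustrative subset only; a real assembler would list every mnemonic.
MNEMONICS = {"LDA", "STA", "JMP", "RTS"}

# Order matters: earlier patterns win, so "*=" is tried before the "*" operator.
TOKEN_SPEC = [
    ("COMMENT",   r";.*"),
    ("NUMBER",    r"\$[0-9A-Fa-f]+|%[01]+|\d+|'.'"),
    ("STRING",    r'"[^"]*"'),
    ("DIRECTIVE", r"\.[A-Za-z]+|\*="),
    ("LABEL",     r"[A-Za-z_]\w*:"),
    ("IDENT",     r"[A-Za-z_]\w*"),
    ("COMMA",     r","),
    ("OPERATOR",  r"[-+*/<>=()#]"),
    ("SPACE",     r"[ \t]+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(line):
    """Split one source line into (kind, text) pairs, ending with NEWLINE."""
    tokens, pos = [], 0
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at column {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind in ("SPACE", "COMMENT"):
            continue  # whitespace and comments carry no meaning for the parser
        if kind == "IDENT":
            # Identifiers are either known mnemonics or symbol references.
            kind = "OPCODE" if text.upper() in MNEMONICS else "SYMBOL"
        tokens.append((kind, text))
    tokens.append(("NEWLINE", ""))
    return tokens
```

Calling `tokenize("LOOP: LDA #$10,X ; load")` yields a `LABEL`, an `OPCODE`, an `OPERATOR`, a `NUMBER`, a `COMMA`, a `SYMBOL`, and a closing `NEWLINE`. A table-driven regular expression keeps the host tool short; a lexer that runs on the C64 itself would instead classify characters one at a time.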

### Token Types for an Assembler

```
LABEL      identifier followed by ':'
OPCODE     recognized mnemonic (LDA, STA, etc.)
DIRECTIVE  assembler directive (.BYTE, .WORD, .TEXT, *= etc.)
NUMBER     decimal, $nnn hex, %nnn binary, or 'c' char literal
STRING     quoted text "..."
OPERATOR   + - * / < > = ( )
COMMA      ,
NEWLINE    end of logical line
EOF        end of input
```

### Efficient Lexer in ML

```asm
; Simple character classifier for assembler lexer
; Input:  A = character
; Output: A = token class (0=whitespace, 1=alpha, 2=digit, 3=operator, 4=EOL)
CLASSIFY:
    CMP #$20        ; space
    BEQ IS_SPACE
    CMP #$0D        ; CR
    BEQ IS_EOL
    CMP #$30        ; '0'
    BCC IS_OP       ; BCC = unsigned "less than" after CMP
    CMP #$3A        ; past '9'
    BCC IS_DIGIT
    CMP #$41        ; 'A'
    BCC IS_OP
    CMP #$5B        ; past 'Z'
    BCC IS_ALPHA
                    ; default: operator
IS_OP:
    LDA #3
    RTS
IS_SPACE:
    LDA #0
    RTS
IS_EOL:
    LDA #4
    RTS
IS_DIGIT:
    LDA #2
    RTS
IS_ALPHA:
    LDA #1
    RTS
```

---

## Parser Design (Recursive Descent)

A recursive-descent parser is ideal for the C64's memory constraints because:

- Small code size (each rule is a subroutine)
- No separate parse table needed
- Easy to hand-code in ML

### Grammar for a Simple BASIC-like Language

```
program   → statement*
statement → LET var '=' expr NEWLINE
          | PRINT expr NEWLINE
          | IF expr THEN statement
          | GOTO number
          | FOR var '=' expr TO expr [STEP expr]
          | NEXT [var]
          | END
expr      → term (('+' | '-') term)*
term      → factor (('*' | '/') factor)*
factor    → NUMBER | STRING | VAR | '(' expr ')' | '-' factor | NOT factor
```

### Parser Subroutine Template (ML)

```asm
; Parse an expression; result in FAC1 (using BASIC math)
; Returns with carry set on error, clear on success
; PEEK_TOKEN, NEXT_TOKEN, SAVE_FAC1, FADD and FSUB stand for the compiler's
; own helper/wrapper routines; they are not raw ROM entry points
PARSE_EXPR:
    JSR PARSE_TERM  ; parse first term
    BCS PERR
EXPR_LOOP:
    JSR PEEK_TOKEN  ; look at next token
    CMP #TOK_PLUS
    BEQ EXPR_ADD
    CMP #TOK_MINUS
    BEQ EXPR_SUB
    CLC             ; CMP may leave carry set, so clear it to signal success
    RTS             ; done: no more + or -
EXPR_ADD:
    JSR NEXT_TOKEN  ; consume '+'
    JSR SAVE_FAC1   ; save left side
    JSR PARSE_TERM
    BCS PERR
    JSR FADD        ; FAC1 = left + FAC1
    JMP EXPR_LOOP
EXPR_SUB:
    JSR NEXT_TOKEN  ; consume '-'
    JSR SAVE_FAC1   ; save left side
    JSR PARSE_TERM
    BCS PERR
    JSR FSUB        ; FAC1 = left - FAC1
    JMP EXPR_LOOP
PERR:
    SEC
    RTS
```

---

## Code Generation for 6510

### Register Allocation Strategy

The 6510 has only three 8-bit registers (A, X, Y), and none of them is truly general-purpose. Effective code generation strategies:

1. **Accumulator-primary**: Keep the most recent value in A; use X/Y for indices and loop counters
2. **Zero-page variables**: Allocate frequently used compiler temporaries in zero page ($FB-$FE are free)
3. **Stack-based expression evaluation**: For complex expressions, use the hardware stack (PHA/PLA)
4. **Inline vs. subroutine**: For short sequences (≤8 bytes), inline is faster because each JSR/RTS pair costs 12 cycles; longer sequences justify a JSR

### Expression Code Generation Example

For `A + B * C` (where A, B, C are zero-page variables; variable A is named `A_VAR` in the code to avoid confusion with the accumulator):

```asm
; Generated code for: A + B * C
    LDA B           ; load B
    STA TEMP        ; save as first multiplicand
    LDA C           ; load C into the accumulator
    JSR MULTIPLY    ; TEMP * accumulator (needs an ML multiply routine)
                    ; result in A (low byte)
    CLC
    ADC A_VAR       ; add variable A
    STA RESULT
```

### Peephole Optimization

Common optimizations for 6510 code generation:

| Pattern | Optimized |
|---------|-----------|
| `STA $xx; LDA $xx` | `STA $xx` (remove the redundant load, provided no later code relies on the N/Z flags that `LDA` would set) |
| `LDA $xx; CMP #0` | `LDA $xx` (`LDA` already sets N and Z; valid when carry is not tested afterward) |
| `TAX; TXA` | `TAX` (A is unchanged, so `TXA` is redundant; drop `TAX` too only if X is dead) |
| `PHA; PLA` | Remove both when adjacent, provided no later code relies on the flags `PLA` sets |
| Branch over `JMP` (`BEQ L1; JMP L2; L1:`) | Opposite-condition branch (`BNE L2`) when `L2` is within branch range |

---

## Building the C64 `.PRG` File Format

A C64 program file begins with a 2-byte load address (little-endian), followed by raw binary:

```asm
; Assembler output format for a file that loads at $C000:
.BYTE $00, $C0      ; load address: $C000 (low byte first)
.BYTE ...           ; raw binary from $C000 onward
```

To generate from BASIC:

```basic
REM WRITE PRG FILE WITH LOAD ADDRESS $C000 (49152)
OPEN 1,8,1,"MYPRG,P,W" : REM SECONDARY ADDRESS 1 = WRITE
PRINT#1,CHR$(0);CHR$(192); : REM LOAD ADDRESS LOW, HIGH (SEMICOLONS SUPPRESS THE CR)
REM ... WRITE CODE BYTES ...
CLOSE 1
```

---

## SpeedScript Architecture (Real-World Example)

COMPUTE!'s SpeedScript is a complete word processor written in assembly, and a useful model for structured C64 application design:

**Memory layout**:

- `$0200–$02FF`: I/O buffer and workspace
- `$033C–$03FB`: Cassette buffer (used for printer spooling)
- `$C000–$CFFF`: Main application code (4KB upper RAM)
- `$0400–$07FF`: Screen display buffer
- `$D800–$DBFF`: Color RAM (controlled directly)

**Key architectural patterns**:

1. **IRQ for keyboard**: custom IRQ handler for responsive input
2. **Modular subroutines**: each function (cursor move, insert, delete) is a separate JSR
3. **Kernal for I/O**: uses standard Kernal OPEN/CLOSE/BASIN/BSOUT for file operations
4. **Direct VIC-II control**: manipulates screen and color RAM directly for speed
5. **Zero-page workspace**: all frequently used pointers and counters in zero page

---

## Cross-Compilation Considerations

When building a compiler that **targets** the C64 (runs on a modern machine, produces C64 code):

1. **Little-endian 16-bit values** throughout
2. **Memory model**: Assume the program loads at $0801 (BASIC stub) or $C000 (pure ML)
3. **Calling conventions** (for generated subroutines):
   - Parameters: A (1 byte), X/Y (2 bytes combined), or zero page
   - Return values: A (byte), A/Y (16-bit, A=low), FAC1 (float)
4. **Stack depth**: Keep recursion to roughly 20 levels; the hardware stack is only 256 bytes, each JSR uses 2 of them, and pushed temporaries use more
5. **ROM routines**: Generated code can JSR to the Kernal jump table at $FFxx; document these dependencies

### BASIC Line Header for Machine Code Stubs

```
; BASIC stub that does SYS 49152 when RUN
$0801: $0D,$08          ; link to next line = $080D
       $0A,$00          ; line 10
       $9E              ; SYS token
       $20,"49152",$00  ; " 49152" as text, then end-of-line marker
$080D: $00,$00          ; end of BASIC program
; Code follows at $080F or $C000
```

Minimal one-line BASIC stub loader. The line number itself is arbitrary; the SYS target must equal the address where the machine code actually starts:

```basic
2018 SYS 2074
```
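
For a cross-compiler, the stub layout above can be generated on the host rather than typed by hand. The following Python sketch is one way to do it under stated assumptions: the helper names `basic_stub` and `prg_with_stub` are hypothetical, the stub is a single `SYS` line with a leading space before the digits, and the machine code is placed immediately after the stub.

```python
def basic_stub(sys_addr, line_no=10, load_addr=0x0801):
    """Tokenized one-line BASIC program: <line_no> SYS <sys_addr>."""
    # $9E is the SYS token; digits are the same in ASCII and PETSCII.
    body = bytes([0x9E, 0x20]) + str(sys_addr).encode("ascii") + b"\x00"
    next_line = load_addr + 2 + 2 + len(body)  # link word + line number + body
    return (
        next_line.to_bytes(2, "little")   # pointer to the next BASIC line
        + line_no.to_bytes(2, "little")   # line number
        + body
        + b"\x00\x00"                     # end-of-program marker
    )


def prg_with_stub(code, load_addr=0x0801, line_no=10):
    """Return a .prg image: load address, BASIC stub, then machine code."""
    # The stub length depends on how many digits the SYS address has,
    # so iterate until the address printed in the stub matches reality.
    sys_addr = load_addr
    while True:
        stub = basic_stub(sys_addr, line_no, load_addr)
        start = load_addr + len(stub)
        if start == sys_addr:
            break
        sys_addr = start
    return load_addr.to_bytes(2, "little") + stub + code


if __name__ == "__main__":
    image = prg_with_stub(b"\x60")  # a lone RTS as the "program"
    print(image.hex(" "))
```

`basic_stub(49152)` reproduces the byte layout listed above, while `prg_with_stub` covers the case where the code follows the stub in the same file. Computing the link pointer and SYS address from the actual byte counts avoids the off-by-a-few errors that are easy to make when the stub is hand-assembled.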