Math Expression Evaluator — Design Document

Requirements Summary

  • Language: Go
  • Operators: +, -, *, / with parentheses
  • Numbers: floating point (e.g. 3.14, 42, 0.5)
  • Functions: user-defined with f(x) = x + 1 syntax
  • Interface: CLI REPL
  • Error handling: print error message, continue REPL

Function Definition Requirements

  • Syntax: f(x) = x + 1 — name, parenthesized params, =, body expression
  • Multiple params: f(x, y) = x + y
  • Function calls: Allowed anywhere a number can appear; arguments are arbitrary expressions
  • Persistence: Definitions persist across REPL lines
  • Cross-calling: Functions can call other user-defined functions (late binding)
  • Built-ins: None
  • Redefinition: Not allowed (error)
  • Output on define: Print "defined <name>"

Approaches Considered

1. Recursive-Descent with AST (chosen)

  • Lexer → Parser → AST → Evaluator → REPL
  • Clean separation: each stage is independently testable
  • AST is a reusable intermediate representation
  • Easy to extend (new operators, pretty-printing, optimization)
  • Well-suited for 2 precedence levels + parentheses

2. Recursive-Descent with Direct Evaluation

  • Parser evaluates inline — no AST
  • Fewer types, less code
  • Couples parsing and evaluation — harder to test, extend

3. Shunting-Yard Algorithm

  • Converts to RPN then evaluates
  • Good for many precedence levels; overkill here
  • Harder to produce clear error messages

Decision: Approach 1. The AST adds minimal overhead but provides clean boundaries.

Architecture

Input string
    │
    ▼
 ┌───────┐
 │ Lexer │  string → []Token
 └───┬───┘
     │
     ▼
 ┌────────┐
 │ Parser │  []Token → Statement (ExprStmt | FuncDef)
 └───┬────┘
     │
     ▼
 ┌───────────┐
 │ Evaluator │  stateful: function registry + expression evaluation
 └───┬───────┘
     │
     ▼
 ┌──────┐
 │ REPL │  read line → parse → route (define or eval) → print
 └──────┘

Component Interfaces

Token (data type)

package token

type Type int

const (
    Number Type = iota // numeric literal
    Plus               // +
    Minus              // -
    Star               // *
    Slash              // /
    LParen             // (
    RParen             // )
    Ident              // identifier (e.g. f, x, myFunc)
    Comma              // ,
    Equals             // =
    EOF                // end of input
)

type Token struct {
    Type    Type
    Literal string  // raw text, e.g. "3.14", "+", "f"
    Pos     int     // position in input (for error messages)
}

Lexer

package lexer

// Tokenize converts an input string into a slice of tokens.
// Returns an error if the input contains invalid characters.
// Recognizes: numbers, operators, parens, identifiers, comma, equals.
func Tokenize(input string) ([]token.Token, error)
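
A minimal, self-contained sketch of the scan Tokenize could perform. The token kinds are inlined here rather than imported from the token package, and number scanning is deliberately permissive (it would accept "1.2.3"); the real lexer would validate that.

```go
package main

import (
	"fmt"
	"unicode"
)

// Minimal token shapes mirroring the token package above.
type Type int

const (
	Number Type = iota
	Plus
	Minus
	Star
	Slash
	LParen
	RParen
	Ident
	Comma
	Equals
	EOF
)

type Token struct {
	Type    Type
	Literal string
	Pos     int
}

// single maps one-character tokens to their kinds.
var single = map[rune]Type{
	'+': Plus, '-': Minus, '*': Star, '/': Slash,
	'(': LParen, ')': RParen, ',': Comma, '=': Equals,
}

// Tokenize is a single pass over the input: skip whitespace, match
// single-character operators, greedily read numbers and identifiers,
// and reject anything else with a positioned error.
func Tokenize(input string) ([]Token, error) {
	var toks []Token
	runes := []rune(input)
	for i := 0; i < len(runes); {
		c := runes[i]
		if unicode.IsSpace(c) {
			i++
			continue
		}
		if t, ok := single[c]; ok {
			toks = append(toks, Token{t, string(c), i})
			i++
			continue
		}
		if unicode.IsDigit(c) || c == '.' {
			start := i
			for i < len(runes) && (unicode.IsDigit(runes[i]) || runes[i] == '.') {
				i++
			}
			toks = append(toks, Token{Number, string(runes[start:i]), start})
			continue
		}
		if unicode.IsLetter(c) {
			start := i
			for i < len(runes) && (unicode.IsLetter(runes[i]) || unicode.IsDigit(runes[i])) {
				i++
			}
			toks = append(toks, Token{Ident, string(runes[start:i]), start})
			continue
		}
		return nil, fmt.Errorf("invalid character %q at position %d", c, i)
	}
	toks = append(toks, Token{EOF, "", len(runes)})
	return toks, nil
}
```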

AST (data types)

package ast

// Node is the interface all expression AST nodes implement.
type Node interface {
    node() // sealed marker method
}

// NumberLit represents a numeric literal.
type NumberLit struct {
    Value float64
}

// BinaryExpr represents a binary operation (e.g. 1 + 2).
type BinaryExpr struct {
    Op    token.Type  // Plus, Minus, Star, Slash
    Left  Node
    Right Node
}

// Ident represents a variable reference (function parameter).
type Ident struct {
    Name string
}

// FuncCall represents a function call (e.g. f(1+2, 3)).
type FuncCall struct {
    Name string
    Args []Node
}

// Statement is the interface for top-level parsed constructs.
type Statement interface {
    stmt() // sealed marker method
}

// ExprStmt wraps an expression used as a statement.
type ExprStmt struct {
    Expr Node
}

// FuncDef represents a function definition: name(params) = body
type FuncDef struct {
    Name   string
    Params []string
    Body   Node
}

Parser

package parser

// Parse converts a slice of tokens into an expression AST.
// Kept for backward compatibility.
func Parse(tokens []token.Token) (ast.Node, error)

// ParseLine converts a slice of tokens into a Statement.
// Distinguishes function definitions from expressions.
func ParseLine(tokens []token.Token) (ast.Statement, error)

Grammar (extended):

line     → funcdef | expr
funcdef  → IDENT '(' params ')' '=' expr
params   → IDENT (',' IDENT)*
expr     → term (('+' | '-') term)*
term     → factor (('*' | '/') factor)*
factor   → NUMBER | IDENT '(' args ')' | IDENT | '(' expr ')'
args     → expr (',' expr)*
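
The expr/term/factor productions translate one-to-one into mutually recursive parse functions, which is why two precedence levels need no precedence table. A compact sketch, using string tokens for brevity (the real parser consumes []token.Token) and eliding the IDENT and call factors:

```go
package main

import (
	"fmt"
	"strconv"
)

// Compact AST mirroring the ast package above.
type Node interface{}

type NumberLit struct{ Value float64 }

type BinaryExpr struct {
	Op          string
	Left, Right Node
}

// parser holds the token stream and a cursor.
type parser struct {
	toks []string
	pos  int
}

func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

func (p *parser) next() string { t := p.peek(); p.pos++; return t }

// expr → term (('+' | '-') term)*
func (p *parser) expr() (Node, error) {
	left, err := p.term()
	if err != nil {
		return nil, err
	}
	for p.peek() == "+" || p.peek() == "-" {
		op := p.next()
		right, err := p.term()
		if err != nil {
			return nil, err
		}
		left = BinaryExpr{Op: op, Left: left, Right: right}
	}
	return left, nil
}

// term → factor (('*' | '/') factor)*
func (p *parser) term() (Node, error) {
	left, err := p.factor()
	if err != nil {
		return nil, err
	}
	for p.peek() == "*" || p.peek() == "/" {
		op := p.next()
		right, err := p.factor()
		if err != nil {
			return nil, err
		}
		left = BinaryExpr{Op: op, Left: left, Right: right}
	}
	return left, nil
}

// factor → NUMBER | '(' expr ')'   (IDENT and call factors elided)
func (p *parser) factor() (Node, error) {
	tok := p.next()
	if tok == "(" {
		inner, err := p.expr()
		if err != nil {
			return nil, err
		}
		if p.next() != ")" {
			return nil, fmt.Errorf("expected ')'")
		}
		return inner, nil
	}
	v, err := strconv.ParseFloat(tok, 64)
	if err != nil {
		return nil, fmt.Errorf("unexpected token %q", tok)
	}
	return NumberLit{v}, nil
}
```

Because term binds tighter than expr, "1 + 2 * 3" parses as 1 + (2 * 3) with no extra machinery.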

Definition detection: Scan token stream for Equals token. If present → parse as function definition. If absent → parse as expression. This works because = is not valid in expressions.
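
The detection rule reduces to a single loop; token kinds are inlined here for a self-contained sketch:

```go
package main

// Token kinds, abbreviated; only Equals matters for detection.
type Type int

const (
	Number Type = iota
	Equals
	// remaining kinds elided
)

type Token struct{ Type Type }

// isFuncDef implements the rule above: a line is a function
// definition exactly when its token stream contains an Equals token.
func isFuncDef(toks []Token) bool {
	for _, t := range toks {
		if t.Type == Equals {
			return true
		}
	}
	return false
}
```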

Evaluator

package evaluator

// Evaluator holds function definitions and evaluates expressions.
type Evaluator struct {
    funcs map[string]*ast.FuncDef
}

// New creates a new Evaluator with an empty function registry.
func New() *Evaluator

// Define registers a function definition.
// Returns an error if a function with the same name is already defined.
func (e *Evaluator) Define(def *ast.FuncDef) error

// Eval evaluates an expression AST node.
// env provides variable bindings (function parameters).
// Pass nil for top-level evaluation.
func (e *Evaluator) Eval(node ast.Node, env map[string]float64) (float64, error)

Function call evaluation:

  1. Look up function name in registry
  2. Evaluate each argument expression in caller's environment
  3. Check argument count matches parameter count
  4. Create new environment: param[i] → argValue[i]
  5. Evaluate function body in new environment

Late binding: Function body references are resolved at call time, not definition time. This naturally supports cross-function calls as long as the called function is defined before the call is evaluated.
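
The five call steps and the late-binding lookup can be sketched in one Eval method. Types are inlined minimally here (the real code lives in the ast and evaluator packages), and step 1 is where late binding happens: the registry is consulted at call time, not definition time.

```go
package main

import "fmt"

// Minimal inlined shapes mirroring the ast and evaluator packages.
type Node interface{}

type NumberLit struct{ Value float64 }
type Ident struct{ Name string }

type BinaryExpr struct {
	Op          string
	Left, Right Node
}

type FuncCall struct {
	Name string
	Args []Node
}

type FuncDef struct {
	Name   string
	Params []string
	Body   Node
}

type Evaluator struct{ funcs map[string]*FuncDef }

// Eval walks the tree; the FuncCall case follows the five steps above.
func (e *Evaluator) Eval(n Node, env map[string]float64) (float64, error) {
	switch n := n.(type) {
	case NumberLit:
		return n.Value, nil
	case Ident:
		v, ok := env[n.Name]
		if !ok {
			return 0, fmt.Errorf("undefined variable %q", n.Name)
		}
		return v, nil
	case BinaryExpr:
		l, err := e.Eval(n.Left, env)
		if err != nil {
			return 0, err
		}
		r, err := e.Eval(n.Right, env)
		if err != nil {
			return 0, err
		}
		switch n.Op {
		case "+":
			return l + r, nil
		case "-":
			return l - r, nil
		case "*":
			return l * r, nil
		case "/":
			if r == 0 {
				return 0, fmt.Errorf("division by zero")
			}
			return l / r, nil
		}
		return 0, fmt.Errorf("unknown operator %q", n.Op)
	case FuncCall:
		// 1. look up the function at call time (late binding)
		def, ok := e.funcs[n.Name]
		if !ok {
			return 0, fmt.Errorf("undefined function %q", n.Name)
		}
		// 2. evaluate each argument in the caller's environment
		vals := make([]float64, len(n.Args))
		for i, arg := range n.Args {
			v, err := e.Eval(arg, env)
			if err != nil {
				return 0, err
			}
			vals[i] = v
		}
		// 3. arity check
		if len(vals) != len(def.Params) {
			return 0, fmt.Errorf("%s expects %d args, got %d",
				n.Name, len(def.Params), len(vals))
		}
		// 4. bind parameters in a fresh environment
		newEnv := make(map[string]float64, len(def.Params))
		for i, p := range def.Params {
			newEnv[p] = vals[i]
		}
		// 5. evaluate the body in the new environment
		return e.Eval(def.Body, newEnv)
	}
	return 0, fmt.Errorf("unknown node %T", n)
}
```

Note the fresh environment in step 4: the callee never sees the caller's bindings, so parameters cannot leak between calls.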

REPL

package repl

// Run starts the read-eval-print loop, reading from r and writing to w.
// Maintains function registry across lines.
func Run(r io.Reader, w io.Writer)

Line processing flow:

  1. Tokenize line
  2. ParseLine() → Statement
  3. Switch on statement type:
    • *ast.FuncDef → evaluator.Define(def), print "defined <name>"
    • *ast.ExprStmt → evaluator.Eval(expr, nil), print result

Package Layout

matheval/
├── cmd/
│   └── matheval/
│       └── main.go          # entry point, calls repl.Run
├── token/
│   └── token.go             # Token type and constants
├── lexer/
│   ├── lexer.go             # Tokenize function
│   └── lexer_test.go
├── ast/
│   └── ast.go               # AST node types + Statement types
├── parser/
│   ├── parser.go            # Parse + ParseLine functions
│   └── parser_test.go
├── evaluator/
│   ├── evaluator.go         # Evaluator struct with Define + Eval
│   └── evaluator_test.go
├── repl/
│   ├── repl.go              # REPL loop with state
│   └── repl_test.go
├── docs/
│   ├── design.md
│   └── plan.md
├── go.mod
└── README.md

Error Handling

  • Lexer: returns error for invalid characters (e.g. @, #)
  • Parser: returns error for syntax errors (unexpected token, mismatched parens, malformed definitions)
  • Evaluator: returns error for division by zero, undefined function, undefined variable, argument count mismatch, function redefinition
  • REPL: catches any error, prints it, prompts for next input

Key Design Decisions

  1. Statement vs Node separation — Statement interface separates top-level constructs (definitions vs expressions) from expression nodes. This keeps the expression evaluator clean.
  2. Stateful Evaluator struct — replaces the previous stateless Eval() function. Required to hold the function registry. The Eval method still takes an explicit environment for testability.
  3. Late binding — function bodies reference other functions by name; resolved at call time. Simple and correct.
  4. Backward-compatible Parse() — existing Parse() function kept. New ParseLine() added for the REPL.
  5. Sealed AST interface — unexported marker method prevents external implementations, keeping the node set closed.
  6. Position tracking in tokens — enables precise error messages.
  7. REPL takes io.Reader/io.Writer — makes it testable without stdin/stdout.
  8. Definition detection via Equals scan — simple and unambiguous since = cannot appear in expressions.