token: add Ident, Comma, Equals token types; docs: design and plan function definitions
diff --git a/docs/design.md b/docs/design.md
index 9b9ffd5..d538708 100644
--- a/docs/design.md
+++ b/docs/design.md
@@ -4,10 +4,20 @@
- Language: Go
- Operators: `+`, `-`, `*`, `/` with parentheses
- Numbers: floating point (e.g. `3.14`, `42`, `0.5`)
-- Variables: none
+- **Functions: user-defined with `f(x) = x + 1` syntax**
- Interface: CLI REPL
- Error handling: print error message, continue REPL
+### Function Definition Requirements
+- **Syntax:** `f(x) = x + 1` — name, parenthesized params, `=`, body expression
+- **Multiple params:** `f(x, y) = x + y`
+- **Function calls:** Allowed anywhere a number can appear; arguments are arbitrary expressions
+- **Persistence:** Definitions persist across REPL lines
+- **Cross-calling:** Functions can call other user-defined functions (late binding)
+- **Built-ins:** None
+- **Redefinition:** Not allowed (error)
+- **Output on define:** Print "defined <name>"
+
## Approaches Considered
### 1. Recursive-Descent with AST (chosen)
@@ -41,17 +51,17 @@
    │
    ▼
┌────────┐
-│ Parser │ []Token → AST (Node)
+│ Parser │ []Token → Statement (ExprStmt | FuncDef)
└────┬───┘
     │
     ▼
┌───────────┐
-│ Evaluator │ Node → float64
+│ Evaluator │ stateful: function registry + expression evaluation
└─────┬─────┘
      │
      ▼
┌──────┐
-│ REPL │ read line → eval → print result or error
+│ REPL │ read line → parse → route (define or eval) → print
└──────┘
```
@@ -65,19 +75,22 @@
type Type int
const (
- Number Type = iota
- Plus // +
- Minus // -
- Star // *
- Slash // /
- LParen // (
- RParen // )
- EOF
+ Number Type = iota // numeric literal
+ Plus // +
+ Minus // -
+ Star // *
+ Slash // /
+ LParen // (
+ RParen // )
+ Ident // identifier (e.g. f, x, myFunc)
+ Comma // ,
+ Equals // =
+ EOF // end of input
)
type Token struct {
Type Type
- Literal string // raw text, e.g. "3.14", "+"
+ Literal string // raw text, e.g. "3.14", "+", "f"
Pos int // position in input (for error messages)
}
```
@@ -89,6 +102,7 @@
// Tokenize converts an input string into a slice of tokens.
// Returns an error if the input contains invalid characters.
+// Recognizes: numbers, operators, parens, identifiers, comma, equals.
func Tokenize(input string) ([]token.Token, error)
```
@@ -97,7 +111,7 @@
```go
package ast
-// Node is the interface all AST nodes implement.
+// Node is the interface all expression AST nodes implement.
type Node interface {
node() // sealed marker method
}
@@ -113,6 +127,34 @@
Left Node
Right Node
}
+
+// Ident represents a variable reference (function parameter).
+type Ident struct {
+ Name string
+}
+
+// FuncCall represents a function call (e.g. f(1+2, 3)).
+type FuncCall struct {
+ Name string
+ Args []Node
+}
+
+// Statement is the interface for top-level parsed constructs.
+type Statement interface {
+ stmt() // sealed marker method
+}
+
+// ExprStmt wraps an expression used as a statement.
+type ExprStmt struct {
+ Expr Node
+}
+
+// FuncDef represents a function definition: name(params) = body
+type FuncDef struct {
+ Name string
+ Params []string
+ Body Node
+}
```
### Parser
@@ -120,37 +162,77 @@
```go
package parser
-// Parse converts a slice of tokens into an AST.
-// Returns an error for malformed expressions (mismatched parens, etc.).
+// Parse converts a slice of tokens into an expression AST.
+// Returns an error for malformed expressions (mismatched parens, etc.).
+// Kept for backward compatibility.
func Parse(tokens []token.Token) (ast.Node, error)
+
+// ParseLine converts a slice of tokens into a Statement.
+// Distinguishes function definitions from expressions.
+func ParseLine(tokens []token.Token) (ast.Statement, error)
```
-Grammar (recursive-descent):
+Grammar (extended):
```
-expr → term (('+' | '-') term)*
-term → factor (('*' | '/') factor)*
-factor → NUMBER | '(' expr ')'
+line → funcdef | expr
+funcdef → IDENT '(' params ')' '=' expr
+params → IDENT (',' IDENT)*
+expr → term (('+' | '-') term)*
+term → factor (('*' | '/') factor)*
+factor → NUMBER | IDENT '(' args ')' | IDENT | '(' expr ')'
+args → expr (',' expr)*
```
+**Definition detection:** Scan token stream for `Equals` token. If present → parse as function definition. If absent → parse as expression. This works because `=` is not valid in expressions.
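A minimal self-contained sketch of this detection rule (the token types here are simplified stand-ins, not the real `token` package):

```go
package main

import "fmt"

type tokType int

const (
	tokNumber tokType = iota
	tokIdent
	tokEquals
)

type tok struct {
	typ tokType
	lit string
}

// isFuncDef reports whether a tokenized line is a function definition.
// The lexer only ever emits Equals for `=`, and `=` is invalid inside
// expressions, so one linear scan is unambiguous.
func isFuncDef(toks []tok) bool {
	for _, t := range toks {
		if t.typ == tokEquals {
			return true
		}
	}
	return false
}

func main() {
	def := []tok{{tokIdent, "f"}, {tokEquals, "="}, {tokNumber, "1"}}
	expr := []tok{{tokNumber, "1"}, {tokNumber, "2"}}
	fmt.Println(isFuncDef(def), isFuncDef(expr)) // true false
}
```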
+
### Evaluator
```go
package evaluator
-// Eval evaluates an AST node and returns the result.
-// Returns an error on division by zero.
-func Eval(node ast.Node) (float64, error)
+// Evaluator holds function definitions and evaluates expressions.
+type Evaluator struct {
+ funcs map[string]*ast.FuncDef
+}
+
+// New creates a new Evaluator with an empty function registry.
+func New() *Evaluator
+
+// Define registers a function definition.
+// Returns an error if a function with the same name is already defined.
+func (e *Evaluator) Define(def *ast.FuncDef) error
+
+// Eval evaluates an expression AST node.
+// env provides variable bindings (function parameters).
+// Pass nil for top-level evaluation.
+func (e *Evaluator) Eval(node ast.Node, env map[string]float64) (float64, error)
```
+**Function call evaluation:**
+1. Look up function name in registry
+2. Evaluate each argument expression in caller's environment
+3. Check argument count matches parameter count
+4. Create new environment: `param[i] → argValue[i]`
+5. Evaluate function body in new environment
+
+**Late binding:** Function body references are resolved at call time, not definition time. This naturally supports cross-function calls as long as the called function is defined before the call is evaluated.
+
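The five call-evaluation steps and the late-binding rule can be sketched as follows. This is a simplified stand-in for the real `ast`/`evaluator` packages (tiny expression interface, positional error messages elided), not the planned implementation:

```go
package main

import "fmt"

type expr interface {
	eval(ev *evaluator, env map[string]float64) (float64, error)
}

type num float64

func (n num) eval(*evaluator, map[string]float64) (float64, error) { return float64(n), nil }

type ident string

func (id ident) eval(_ *evaluator, env map[string]float64) (float64, error) {
	v, ok := env[string(id)]
	if !ok {
		return 0, fmt.Errorf("undefined variable %q", string(id))
	}
	return v, nil
}

type call struct {
	name string
	args []expr
}

type funcDef struct {
	params []string
	body   expr
}

type evaluator struct {
	funcs map[string]*funcDef
}

func (c call) eval(ev *evaluator, env map[string]float64) (float64, error) {
	def, ok := ev.funcs[c.name] // step 1: look up at call time (late binding)
	if !ok {
		return 0, fmt.Errorf("undefined function %q", c.name)
	}
	vals := make([]float64, len(c.args))
	for i, a := range c.args { // step 2: evaluate args in the caller's env
		v, err := a.eval(ev, env)
		if err != nil {
			return 0, err
		}
		vals[i] = v
	}
	if len(vals) != len(def.params) { // step 3: arity check
		return 0, fmt.Errorf("%s expects %d args, got %d", c.name, len(def.params), len(vals))
	}
	newEnv := make(map[string]float64, len(def.params))
	for i, p := range def.params { // step 4: bind params in a fresh env
		newEnv[p] = vals[i]
	}
	return def.body.eval(ev, newEnv) // step 5: evaluate body in the new env
}

func main() {
	ev := &evaluator{funcs: map[string]*funcDef{
		"id": {params: []string{"x"}, body: ident("x")},
	}}
	v, err := call{name: "id", args: []expr{num(7)}}.eval(ev, nil)
	fmt.Println(v, err) // 7 <nil>
}
```

Because the registry lookup happens inside `call.eval`, a body may reference a function that did not exist when the body was defined.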
### REPL
```go
package repl
// Run starts the read-eval-print loop, reading from r and writing to w.
+// Maintains function registry across lines.
func Run(r io.Reader, w io.Writer)
```
+**Line processing flow:**
+1. Tokenize line
+2. `ParseLine()` → `Statement`
+3. Switch on statement type:
+ - `*ast.FuncDef` → `evaluator.Define(def)`, print "defined <name>"
+ - `*ast.ExprStmt` → `evaluator.Eval(expr, nil)`, print result
+
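The routing step can be sketched with a type switch. The statement types below are simplified stand-ins for `*ast.FuncDef` and `*ast.ExprStmt` (tokenizing, parsing, and evaluation are elided; `route` returns the line the REPL would print):

```go
package main

import "fmt"

type statement interface{ stmt() }

type funcDef struct{ name string }

func (funcDef) stmt() {}

type exprStmt struct{ result float64 }

func (exprStmt) stmt() {}

// route dispatches a parsed statement to the matching REPL action.
func route(s statement) string {
	switch s := s.(type) {
	case funcDef:
		return "defined " + s.name // definition: register, confirm
	case exprStmt:
		return fmt.Sprintf("%g", s.result) // expression: print value
	default:
		return "error: unknown statement"
	}
}

func main() {
	fmt.Println(route(funcDef{name: "f"}))
	fmt.Println(route(exprStmt{result: 3.5}))
}
```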
## Package Layout
```
@@ -164,15 +246,15 @@
│   ├── lexer.go        # Tokenize function
│   └── lexer_test.go
├── ast/
-│   └── ast.go          # AST node types
+│   └── ast.go          # AST node types + Statement types
├── parser/
-│   ├── parser.go       # Parse function
+│   ├── parser.go       # Parse + ParseLine functions
│   └── parser_test.go
├── evaluator/
-│   ├── evaluator.go    # Eval function
+│   ├── evaluator.go    # Evaluator struct with Define + Eval
│   └── evaluator_test.go
├── repl/
-│   ├── repl.go         # REPL loop
+│   ├── repl.go         # REPL loop with state
│   └── repl_test.go
├── docs/
│   └── design.md
@@ -183,12 +265,16 @@
## Error Handling
- Lexer: returns error for invalid characters (e.g. `@`, `#`)
-- Parser: returns error for syntax errors (unexpected token, mismatched parens)
-- Evaluator: returns error for division by zero
+- Parser: returns error for syntax errors (unexpected token, mismatched parens, malformed definitions)
+- Evaluator: returns error for division by zero, undefined function, undefined variable, argument count mismatch, function redefinition
- REPL: catches any error, prints it, prompts for next input
## Key Design Decisions
-1. **Functional API over structs** — `Tokenize()`, `Parse()`, `Eval()` are stateless functions. No need for struct receivers since there's no configuration or state to carry.
-2. **Sealed AST interface** — unexported marker method prevents external implementations, keeping the node set closed.
-3. **Position tracking in tokens** — enables precise error messages ("error at position 5").
-4. **REPL takes io.Reader/io.Writer** — makes it testable without stdin/stdout.
+1. **Statement vs Node separation** — `Statement` interface separates top-level constructs (definitions vs expressions) from expression nodes. This keeps the expression evaluator clean.
+2. **Stateful Evaluator struct** — replaces the previous stateless `Eval()` function. Required to hold the function registry. The `Eval` method still takes an explicit environment for testability.
+3. **Late binding** — function bodies reference other functions by name; resolved at call time. Simple and correct.
+4. **Backward-compatible Parse()** — existing `Parse()` function kept. New `ParseLine()` added for the REPL.
+5. **Sealed AST interface** — unexported marker method prevents external implementations, keeping the node set closed.
+6. **Position tracking in tokens** — enables precise error messages.
+7. **REPL takes io.Reader/io.Writer** — makes it testable without stdin/stdout.
+8. **Definition detection via Equals scan** — simple and unambiguous since `=` cannot appear in expressions.
diff --git a/docs/plan.md b/docs/plan.md
index 112d4bb..dff29d6 100644
--- a/docs/plan.md
+++ b/docs/plan.md
@@ -1,57 +1,73 @@
-# Math Expression Evaluator — Implementation Plan
+# Implementation Plan: Function Definitions
-## Phase: Implement
+## Overview
+Bottom-up implementation through the stack: token → ast → lexer → parser → evaluator → repl → integration tests. Each step maintains backward compatibility and follows TDD.
-Steps are ordered. Each step includes writing the code and its unit tests (TDD).
+## Steps
-### Step 1: Project Skeleton
-- `go mod init matheval`
-- Create directory structure: `cmd/matheval/`, `token/`, `lexer/`, `ast/`, `parser/`, `evaluator/`, `repl/`
-- Create placeholder `main.go`
+### Step 1: Token layer (`token/token.go`)
+- Add `Ident`, `Comma`, `Equals` constants to `Type` enum
+- Update `String()` for new types
+- No tests needed — pure data types
-### Step 2: Token Package
-- Define `Type` enum constants
-- Define `Token` struct
-- Add `String()` method on `Type` for debugging
+### Step 2: AST layer (`ast/ast.go`)
+- Add `Ident` struct: `Name string`; implements `Node`
+- Add `FuncCall` struct: `Name string`, `Args []Node`; implements `Node`
+- Add `Statement` interface with sealed `stmt()` marker
+- Add `ExprStmt` struct: `Expr Node`; implements `Statement`
+- Add `FuncDef` struct: `Name string`, `Params []string`, `Body Node`; implements `Statement`
+- No tests needed — pure data types
-### Step 3: Lexer
-- Implement `Tokenize(input string) ([]Token, error)`
-- Handle: whitespace skipping, number literals (integers and decimals), operators `+-*/`, parentheses `()`, EOF, invalid characters
-- **Tests:** valid expressions, decimal numbers, invalid chars, empty input, whitespace-only
+### Step 3: Lexer (`lexer/lexer.go`)
+- Add `isLetter(ch byte) bool` helper
+- Before the single-char switch, add branch: if `isLetter(ch)`, scan identifier (letter then letters/digits), emit `Ident` token
+- Add `','` → `Comma` and `'='` → `Equals` to single-char switch
+- **Tests:** identifiers (`x`, `foo`, `f1`), comma, equals, full definition `f(x) = x + 1`, call `f(1, 2)`, mixed with numbers
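The identifier branch can be sketched like this. `isLetter` matches the planned helper; `scanIdent` is a hypothetical stand-in for the lexer's internal loop, not the real lexer code:

```go
package main

import "fmt"

// isLetter reports whether ch is an ASCII letter.
func isLetter(ch byte) bool {
	return 'a' <= ch && ch <= 'z' || 'A' <= ch && ch <= 'Z'
}

// isDigit reports whether ch is an ASCII digit.
func isDigit(ch byte) bool { return '0' <= ch && ch <= '9' }

// scanIdent reads an identifier starting at pos: a letter followed by
// any run of letters or digits. It returns the literal and the
// position just past it.
func scanIdent(input string, pos int) (string, int) {
	start := pos
	for pos < len(input) && (isLetter(input[pos]) || isDigit(input[pos])) {
		pos++
	}
	return input[start:pos], pos
}

func main() {
	lit, next := scanIdent("f1(x) = x + 1", 0)
	fmt.Println(lit, next) // f1 2
}
```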
-### Step 4: AST Package
-- Define `Node` interface with sealed marker
-- Define `NumberLit` struct
-- Define `BinaryExpr` struct
+### Step 4: Parser (`parser/parser.go`)
+- Extend `factor()`:
+ - `Ident` followed by `LParen` → parse `FuncCall`: consume `(`, parse args as comma-separated exprs, consume `)`
+ - `Ident` not followed by `LParen` → return `&ast.Ident{Name}`
+- Add `parseFuncDef()`: expects `Ident(` params `) = expr`
+- Add `ParseLine(tokens) (Statement, error)`:
+ - Scan for `Equals` token (not inside parens)
+ - If found → `parseFuncDef()` → `*ast.FuncDef`
+ - If not → `expr()` → `*ast.ExprStmt{Expr}`
+- Keep `Parse()` unchanged for backward compat
+- **Tests:** ParseLine for defs and exprs, factor for ident and func call, error cases
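The one-token lookahead in `factor()` can be sketched as below: an `Ident` begins a `FuncCall` only when the next token is `LParen`. Token types are simplified stand-ins for the `token` package, and `classify` is a hypothetical helper that only reports the decision:

```go
package main

import "fmt"

type tokType int

const (
	tokNumber tokType = iota
	tokIdent
	tokLParen
)

type tok struct {
	typ tokType
	lit string
}

// classify reports how factor() would treat the token at index i.
func classify(toks []tok, i int) string {
	if toks[i].typ != tokIdent {
		return "other"
	}
	if i+1 < len(toks) && toks[i+1].typ == tokLParen {
		return "FuncCall"
	}
	return "Ident"
}

func main() {
	toks := []tok{{tokIdent, "f"}, {tokLParen, "("}, {tokIdent, "x"}}
	fmt.Println(classify(toks, 0), classify(toks, 2)) // FuncCall Ident
}
```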
-### Step 5: Parser
-- Implement recursive-descent parser following grammar:
- - `expr → term (('+' | '-') term)*`
- - `term → factor (('*' | '/') factor)*`
- - `factor → NUMBER | '(' expr ')'`
-- Internal parser struct to track position in token slice
-- Return error on: unexpected token, mismatched parens, trailing tokens
-- **Tests:** single number, simple binary, precedence, parentheses, nested parens, error cases
+### Step 5: Evaluator (`evaluator/evaluator.go`)
+- Add `Evaluator` struct with `funcs map[string]*ast.FuncDef`
+- `New() *Evaluator`
+- `Define(def *ast.FuncDef) error` — error on redefinition
+- `Eval(node ast.Node, env map[string]float64) (float64, error)`:
+ - `*ast.NumberLit` → return value
+ - `*ast.BinaryExpr` → recurse left/right with same env
+ - `*ast.Ident` → lookup in env, error if not found
+ - `*ast.FuncCall` → lookup func, eval args in caller env, bind params, eval body in new env
+- Keep package-level `Eval(node) (float64, error)` as backward-compat wrapper
+- **Tests:** all existing tests still pass, new tests for Ident, FuncCall, Define, errors
-### Step 6: Evaluator
-- Implement `Eval(node ast.Node) (float64, error)`
-- Recursively walk AST
-- Return error on division by zero
-- **Tests:** literals, all 4 operators, nested expressions, division by zero
+### Step 6: REPL (`repl/repl.go`)
+- In `Run()`: create `evaluator.New()` before loop
+- Replace `evalLine()` with inline logic using `ParseLine()`
+- `*ast.FuncDef` → `ev.Define(def)`, print `"defined <name>"`
+- `*ast.ExprStmt` → `ev.Eval(stmt.Expr, nil)`, print result
+- **Tests:** define + call across lines, redefine error, undefined func error
-### Step 7: REPL
-- Implement `Run(r io.Reader, w io.Writer)`
-- Read line, tokenize, parse, evaluate, print result or error
-- Loop until EOF
-- **Tests:** successful expression, error expression, multi-line session
+### Step 7: Integration tests (`integration_test.go`)
+- Update `eval()`/`evalErr()` helpers to use `Evaluator` struct
+- Add tests:
+ - Define and call single-param function
+ - Define and call multi-param function
+ - Cross-function calls
+ - Nested function calls in expressions
+ - Error: undefined function
+ - Error: wrong argument count
+ - Error: function redefinition
+ - Error: undefined variable
-### Step 8: main.go
-- Wire `repl.Run(os.Stdin, os.Stdout)`
-
-### Step 9: Integration Test
-- End-to-end test: feed expression string through all stages, verify result
-- Test edge cases: deeply nested parens, long expressions
-
-### Step 10: Final Commit & README
-- Write README.md with usage instructions
-- Final commit
+## Backward Compatibility
+- `Parse()` remains unchanged — returns `ast.Node`
+- Package-level `Eval()` remains — wraps `New().Eval(node, nil)`
+- Existing tests must continue to pass at every step
diff --git a/token/token.go b/token/token.go
index d50eb36..2c9f5d3 100644
--- a/token/token.go
+++ b/token/token.go
@@ -13,6 +13,9 @@
Slash // /
LParen // (
RParen // )
+ Ident // identifier (e.g. "f", "x")
+ Comma // ,
+ Equals // =
EOF // end of input
)
@@ -33,6 +36,12 @@
return "("
case RParen:
return ")"
+ case Ident:
+ return "Ident"
+ case Comma:
+ return ","
+ case Equals:
+ return "="
case EOF:
return "EOF"
default: