[WIP] Simple Interpreter in Python

22 March 2022

Background

This is a project that I did with my friend, Weng Yan, for our Data Structures and Algorithms class. We were tasked to make a CLI expression interpreter using python. There was a constraint where we could only use libraries that are built-in to the python standard libraries and anaconda. No external libraries were allowed.

Looking around

We were given a simple guide and a simple example to start with. I went on some googling spree and ditched the starter example (almost completely). I wanted to challenge myself and build something with a more solid concept. (It was a bit overengineered again, but it looks extensible.)

I found out that there are several components that I need for this project: Menu, Token, Lexer, Parser, and Interpreter (we called in Evaluator).

Plan

After reading around and watching a few youtube videos, we began to synchronize the terms in our team.

Menu is where the users can pick what they want to do and input their expressions.
Tokens are the type of characters for the lexer to recognize (see below)
Lexer is the component that breaks up the expression into tokens.
Parser is the component that takes the tokens and turns them into a tree.
Interpreter is the component that takes the tree and evaluates it.

Implementation

I cannot share the project link right now because it is a private repo on my friend's github account. I will share it as soon as I get the link.

In the meanwhile, I will post the project structure and some code snippets that I find to be the fundamentals of how this interpreter works.

Project Tree

Token & Token Types

We have 7 token types in this project:

NUMBER = 'NUMBER'
ADDITIVE_OPERATOR = 'ADDITIVE_OPERATOR'
MULTIPLICATIVE_OPERATOR = 'MULTIPLICATIVE_OPERATOR'
EXPONENTIAL_OPERATOR = 'EXPONENTIAL_OPERATOR'
OPEN_PARENTHESIS = 'OPEN_PARENTHESIS'
CLOSING_PARENTHESIS = 'CLOSING_PARENTHESIS'
EOF = 'EOF'

Lexer

Our Lexer Class is comprised of the following elements:

Constructors:
- __expression
- __input
- __current_char
Methods:
- error
- getNextToken
- __hasMoreTokens
- __number
- __isnum
- __isdot
- __isspace
- __isAdditiveOperator
- __isMultiplicativeOperator
- __isExponentialOperator
- __isAtom
- __isOpenParenthesis
- __isClosingParenthesis
- __advance

Parser

WIP

Interpreter

WIP

Testing

I wrote a simple testing script to test the evaluator

from compiler.Evaluator import Evaluator
import os, sys
os.system('color')

# binary expressions
b_exp = ['1+2', '1-2', '1*2', '1/2', '1%2', '1**2', '1//3']

# unary expressions
u_exp = ['1', '-1', '+1', '--1', '-+1']


# fully parenthesized expressions
fpe_exp = ['((23-2)*(342-2))', '((200+(4*3.14))/(2**3))']

# binary unary expressions
bu_exp = ['1+--1', '2----2', '--2 + 1', '-2 + 1']

# invalid expressions
iv_exp = ['1/0', '1 */ 2', '1 /* 2', '1 -* 2', '1 _ 2', '(1-2', '2-1)', 'abc', '-+']

def testing(expressions, is_valid):
    print('------------------------------------')
    for i, exp in enumerate(expressions):
        evaluator = Evaluator()
        result = evaluator.eval_expression(exp)

        if is_valid:
            sys.stdout.write('{} | {} \t\t| {}\t'.format(i, eval(exp), result))
            if eval(exp) == result:
                sys.stdout.write('\033[94m[PASSED]')
            else:
                sys.stdout.write('\033[33m[FAILED]')
        else:
            sys.stdout.write('{} | {} | {}\t'.format(i, exp, result))
            if None == result:
                sys.stdout.write('\033[94m[PASSED]')
            else:
                sys.stdout.write('\033[33m[FAILED]')
        print('\033[m\n')
    print('\n')

valid_test_cases = [b_exp, u_exp, fpe_exp, bu_exp]
invalid_test_cases = [iv_exp]

for case in valid_test_cases:
    testing(case, True)

for case in invalid_test_cases:
    testing(case, False)

Conclusion

WIP

Image source

Photo by Our Life in Pixels on Unsplash