[WIP] Simple Interpreter in Python
22 March 2022
Background
This is a project that I did with my friend, Weng Yan, for our Data Structures and Algorithms class. We were tasked to make a CLI expression interpreter using python. There was a constraint where we could only use libraries that are built-in to the python standard libraries and anaconda. No external libraries were allowed.
Looking around
We were given a simple guide and a simple example to start with. I went on some googling spree and ditched the starter example (almost completely). I wanted to challenge myself and build something with a more solid concept. (It was a bit overengineered again, but it looks extensible.)
I found out that there are several components that I need for this project: Menu, Token, Lexer, Parser, and Interpreter (we called in Evaluator).
Plan
After reading around and watching a few youtube videos, we began to synchronize the terms in our team.
- Menu is where the users can pick what they want to do and input their expressions.
- Tokens are the type of characters for the lexer to recognize (see below)
- Lexer is the component that breaks up the expression into tokens.
- Parser is the component that takes the tokens and turns them into a tree.
- Interpreter is the component that takes the tree and evaluates it.
Implementation
I cannot share the project link right now because it is a private repo on my friend's github account. I will share it as soon as I get the link.
In the meanwhile, I will post the project structure and some code snippets that I find to be the fundamentals of how this interpreter works.
Project Tree
Token & Token Types
We have 7 token types in this project:
NUMBER = 'NUMBER'
ADDITIVE_OPERATOR = 'ADDITIVE_OPERATOR'
MULTIPLICATIVE_OPERATOR = 'MULTIPLICATIVE_OPERATOR'
EXPONENTIAL_OPERATOR = 'EXPONENTIAL_OPERATOR'
OPEN_PARENTHESIS = 'OPEN_PARENTHESIS'
CLOSING_PARENTHESIS = 'CLOSING_PARENTHESIS'
EOF = 'EOF'
Lexer
Our Lexer Class is comprised of the following elements:
Constructors:
- __expression
- __input
- __current_char
Methods:
- error
- getNextToken
- __hasMoreTokens
- __number
- __isnum
- __isdot
- __isspace
- __isAdditiveOperator
- __isMultiplicativeOperator
- __isExponentialOperator
- __isAtom
- __isOpenParenthesis
- __isClosingParenthesis
- __advance
Parser
WIP
Interpreter
WIP
Testing
I wrote a simple testing script to test the evaluator
from compiler.Evaluator import Evaluator
import os, sys
os.system('color')
# binary expressions
b_exp = ['1+2', '1-2', '1*2', '1/2', '1%2', '1**2', '1//3']
# unary expressions
u_exp = ['1', '-1', '+1', '--1', '-+1']
# fully parenthesized expressions
fpe_exp = ['((23-2)*(342-2))', '((200+(4*3.14))/(2**3))']
# binary unary expressions
bu_exp = ['1+--1', '2----2', '--2 + 1', '-2 + 1']
# invalid expressions
iv_exp = ['1/0', '1 */ 2', '1 /* 2', '1 -* 2', '1 _ 2', '(1-2', '2-1)', 'abc', '-+']
def testing(expressions, is_valid):
print('------------------------------------')
for i, exp in enumerate(expressions):
evaluator = Evaluator()
result = evaluator.eval_expression(exp)
if is_valid:
sys.stdout.write('{} | {} \t\t| {}\t'.format(i, eval(exp), result))
if eval(exp) == result:
sys.stdout.write('\033[94m[PASSED]')
else:
sys.stdout.write('\033[33m[FAILED]')
else:
sys.stdout.write('{} | {} | {}\t'.format(i, exp, result))
if None == result:
sys.stdout.write('\033[94m[PASSED]')
else:
sys.stdout.write('\033[33m[FAILED]')
print('\033[m\n')
print('\n')
valid_test_cases = [b_exp, u_exp, fpe_exp, bu_exp]
invalid_test_cases = [iv_exp]
for case in valid_test_cases:
testing(case, True)
for case in invalid_test_cases:
testing(case, False)
Conclusion
WIP
Image source
Photo by Our Life in Pixels on Unsplash