Recursive Rowdy - redferret/rowdy GitHub Wiki
Project planning and problem break-down
Rowdy is a simple language and has a simple implementation. However it does not have a way to detect a recursive function. This could be a good way to warn the developer that a function may recurse forever, creating a sequence trap. Similar to a while true
loop with no break statement. Implement a simple analyzer that should construct a directed graph representation of function calls in a program and determine if that graph is strongly connected or has sub-graphs that are strongly connected.
In the example image below there is a program that enters always in main
but it enters an infinite loop once f1
is called. This program in a non-terminating program. This is of course the halting problem but we aren't trying to really determine if a program can halt or not but just to identify recursion.
The image below shows a program that could terminate but there are areas where it may cause a stack overflow as the program continues to execute. (keep in mind that the edges are showing a function call, that means if f1
calls f2
the sequence could return back to the original caller. Hence why this program may cause a stack overflow.
A new way to recurse
Problem statement
Each time a function is executed it is placed on a call stack. Each function has it's own symbol table and this is an issue as well. The stack itself is just a stack of functions. But this could get too large and memory could be completely eaten after just a few hundred function calls deep. There's no real easy way to analyze a program to determine if the program is self recursive as well. Is it possible that a program could be weakly connected yet still have a strongly connected sub-graph that never terminates and could cause a stack overflow? It's easy to determine that a series of functions are recursing by just analyzing the current call stack.
callstack = [f1, f2, f3, f4, f1, f2, f4, f1]
You can see here f1
has been called three times now and this means there are now three different frames that save the state of each call for f1
. This means there is some form of recursion happening here in the program. There is no way to determine though if this program is going to be guaranteed to cause a stack overflow or even a non-terminating state.
Solution 1
- Remove the symbol table from the
Function
class. This is too heavy-weighted for memory usage. - Instead of an actual stack of functions, create a
CallStack
class that allowsStackFrames
to be pushed on as a linked list. There will still be a representation of a stack but not as an array data structure either. Each frame does track a function and decouple the function from having a table of its own. - Optimize to find tail-recursive functions and execute them differently.
The basic purpose of a call stack is to track where the sequence currently is and to save state to the previous function call. The StackFrame
would track the sequence by caching variables in a single symbol table with a naming scheme.
When the program is booted up the first StackFrame
is created with the main
function and its variables.
Given a variable v1
declared in a function called f1
, when f1
is executed from main
a new StackFrame
is created for f1
, however if for instance f1
recurses without tail-recursion then there won't be another StackFrame
created, instead the StackFrame
would keep track of the call number for f1
and use it to create new variables the are identified with that frame.
This is fine for normal program execution and some mild recursion but it still can blow up when a recursive function happens deeply, many variables with the same name would be created using the call number and still eat up all the memory but it would still be more efficient than the current call stack model. The symbol table also needs a better data structure with both fast insertion and lookup but also expand naturally. A Hash table is a terrible implementation as it stands due to the underlying implementation of it. An AVL Tree or red-black tree could allow for this kind of fast reactive behavior of quick lookup, allocation, and allowing the table to grow and shrink without sacrificing CPU time to expand the table.
Tail-Recursion
The analyzer could be used to ID functions that have the last statement as a recursive call.
func factorial(n) {
if n == 0 { return 1 }
return n * $factorial(n - 1)
}
This is an example of a non-tail-recursive function because there is a state that needs to be remembered in order to do the full calculation.
func ex1(n) {
if n < 0 { return }
print '.'
$ex1(n - 1)
}
This is an example of a tail-recursive function because there is no need to save the previous state of the function call. The analyzer can mark the function as such so that when the function is called the StackFrame
will know how to update the variable n
and instead of recursing it will just be an iterative process and not a recursive one.
The example below the analyzer can just enter an iterative loop that will execute ex1
and the symbol table won't need to recreate the variable n
and will just effectively update n
equivalent to n = n - 1
for each function call. When the function terminates at the return
statement there is no need to cycle through any call stack, as there is no call stack, and just return back to main
.
To fix the above factorial
function to be tail-recursive we need to think about the problem by using the arguments only to calculate and accumulate your values. You can see this is equivalent to doing n = n - 1
and acc = acc * n
each time the function is executed. However notice that the compiler would need to identify which return statement is actually terminating this process. This is simple, combine the two. Any return statement that is not the last statement is the one that actually terminates the sequence and it's return value is the value we want to really return. The last statement in this function being also a return statement is just a mock return statement and the interpreter can just ignore it or even remove it.
func fact_tr(n, acc) {
if n == 0 { return acc } // this is the real return statement that terminates the execution.
return $fact_tr(n - 1, acc * n) // this return statement is a mock
}
a1. Make sure there is a way to terminate a recursive function, if there is no return statement, regardless of a set of conditions, always throw a NonTerminatedSequenceExpection
before the program runs.
func fact_tr(n, acc) {
return $fact_tr(n - 1, acc * n) // this return statement is a mock
}
Above will always throw this exception. There is no real way to escape this sequence.
- Identify the formal parameters and allocate them in the stack frame
- Identify the last statement and use the expressions for each parameter and store those into a list of expressions
- Execute the function's statement list minus the last statement
- Update the parameters (The current formal parameters) with the given expression list.
- Continue to execute the function until a return statement is found.
- if there is a value to be returned, use that ID to pull the current value from the symbol table and return it to the origin of the invocation.
Why do this?
func factorial(n) {
acc = 1
while (n > 0) {
acc *= n
n -= 1
}
return acc
}
Above is an iterative conversion from a recursive one. This is not as elegant as the first one that pretends to be recursive. Plus it would be faster anyways since it is implemented natively and not in Rowdy.
Analyzer notes
// Defined as a sequence trap, there is no way to escape
func recurse() {
$recurse()
}
// This is an infinite sequence and would create a weakly connected graph.
func main() {
$f1()
}
func f1() {
$f2()
}
func f2() {
$f3()
}
func f3() {
$f1()
}
// May recurse infinitely and may cause a stack overflow, there's no real way to actually know especially if it's based on user input.
func recurse(i, to) {
if i < to {
$recurse(1, to)
}
}
// Will not recurse infinitely but will cause a stack overflow because `i` being an integer would eventually wrap around to become positive.
func recurse(i, to) {
if i < to {
$recurse(i - 1, to)
}
}
// Will not recurse infinitely but may cause a stack overflow
func recurse(i, to) {
if i < to {
$recurse(i + 1, to)
}
}
// May recurse for a very long time and may cause a stack overflow
func recurse(p) {
var1 = rand()
if var1 > p {
$recurse()
}
}