This is a post about... writing code to generate BF code that can do any computation. If you're not familiar with BF, this post will explain everything you need. However, I do assume that you're familiar with Python, C++, and have some general programming background.
The main challenge with is that the number of possible BF programs that are as "complex" as BF code that can multiply 2 bytes is a HUGE number... easily more than 20 digits.
First and foremost, its a way for me to avoid saying BrainFuck because I need you to know that I'm a serious professional and adult who doesn't say F*** too much. I'll refer to it as BF from now on.
If you're already familiar with BF, you can safely skip this section.
Secondly, BF is a very simple programming language that consists of just 8 instructions. The BF Wikipedia article. If you're familiar with Turing machines, BF programs get pretty close and, like a Turing machine, can perform any computation a regular computer can... if you're willing to wait long enough and have enough memory.
BF programs are text-based programs. They can't display pictures and buttons; they can read text you type in (one character at a time) and they can print out text (one character at a time).
A byte can represent any whole number between 0 and 255. Some numbers are used to represent characters, for example the numbers 48, 65, and 97 represent '0'
, 'A'
, and 'a'
respectively. As a very nice convenience, the other digits and letters like '1'
, 'B'
, and 'c'
all have numbers that are just 1, 2, and 3 more than '0'
, 'A'
, and 'a'
. So... if you wanted to know what number represents 'h'
, its 97 + 7 = 104.
When a BF program starts, all bytes it has access to are set to 0 and the program starts out only operating on just the 1st byte. Lets say that we wanted to print out "hello". First we have to get our byte from the value 0 to the value 104 so we can print out the "H". We do that with the increment instruction, +
, which just adds 1 to our current byte. To get from 0 to 104, we need to repeat +
104 times.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Before the code ran, the bytes in memory looked like [0, 0, 0, ...] but after, they contain [104, 0, 0, ...]
Now we can output our byte using the .
instruction.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
.
Outputs:
h
Great! Now we just need to output the remaining letters "ello", which are represented with 101, 108, 108, and 111. To go from 104 to 101, we use the decrement instruction, -
, which subtracts 1 from the current byte. Note that the increment and decrement instructions wrap-around, meaning that incrementing a 255 byte sets it to 0 and decrementing a 0 byte sets is to 255.
Note also that the +
s being on the first line and .
on the second line is just to make the program easier to read. Any characters in a BF program that are not an instruction, are simply ignored by the BF program.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
.
---
.
Outputs:
he
Now we can finish the rest of the "hello" program!
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
.
---
.
+++++++
.
.
+++
.
Outputs:
hello
Now, lets say we want to write a program that reads 2 digits for the user, then outputs their sum. Something like this:
Digits to add: 23
2+3=5
First, we print "Digits to add: "
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ .
+++++++++++++++++++++++++++++++++++++ .
-- .
++ .
+++++++++++ .
- .
----------------------------------------------------------------------------------- .
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ .
----- .
------------------------------------------------------------------------------- .
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ .
+++ .
.
------------------------------------------ .
-------------------------- .
Once the code above has run, our memory contains the value of the last character printed and a lot of 0's: [32, 0, 0, ...].
Next, we read the 1st digit using the read instruction, ,
. This instruction will read a single character ant set the current byte to that value. In our example, the user enters 2
so the byte will have the value 50 = 48 + 2. Our memory looks like [50, 0, 0, ...]
After that, we need to read the next digit... into a different byte. For that, we use the >
instruction to go one byte to the right, then read into that byte via ,
. Once we've done that, the memory contains [50, 51, 0, 0, 0, ...]
Excellent! Now we can print out the 2+3=
one character at a time
<.
>> +++++++++++++++++++++++++++++++++++++++++++ .
<.
> ++++++++++++++++++ .
Our next step is to calculate the digit we want to print. We're so close! Right now our first 2 bytes in memory have the ASCII value for the digits we want to add and what we need to compute is the ASCII value of the resulting digit. (Note: We're not going to worry about the digits adding up to more than 9
for now...)
For that we'll need the loop instructions, [
which starts a loop and ]
which ends a loop. [
will skip to the corresponding ]
instruction if the current byte is 0, otherwise the program will just run whatever code is between the [
and ]
. Let's take some time to understand loops before we use them: The ]
will jump backwards to the corresponding [
. The simplest loop you can write is just []
which will either loop infinitely or do nothing. Perhaps the next simplest loop you can write is [-]
which will decrement the current byte until its 0 or maybe the next simplest loop is [>]
which will go to the next byte in memory that is 0.
Ok, now coming back to our task, the first 2 bytes in memory contain the ASCII values of the digits we want to add, e.g. 50 and 51. We want to calculate the ASCII value for the sum of those digits, e.g. 50-48 + 51-48 + 48 = 53. First, lets subtract 48 from both bytes in memory.
<< ------------------------------------------------
> ------------------------------------------------
Now memory looks like [2, 3, 0, 0, 0, ...].
Lastly, we can loop on the 2nd byte, which we'll call memory[1]
and decrement by 1 then increment the first byte, memory[0]
, by 1. Since the loop will execute that code until memory[1]
is 0, it will run 3 times (or more or less if the user entered a different digit).
[-<+>]
Now the memory looks like [5, 0, 0, 0, 0, ...].
Finally, we add 48 back to memory[0]
and print it!
< ++++++++++++++++++++++++++++++++++++++++++++++++ .
Combining that all together, here's our full BF program:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ .
+++++++++++++++++++++++++++++++++++++ .
-- .
++ .
+++++++++++ .
- .
----------------------------------------------------------------------------------- .
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ .
----- .
------------------------------------------------------------------------------- .
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ .
+++ .
.
------------------------------------------ .
-------------------------- .
,
>,
<.
>> +++++++++++++++++++++++++++++++++++++++++++ .
<.
> ++++++++++++++++++ .
<< ------------------------------------------------
> ------------------------------------------------
[-<+>]
< ++++++++++++++++++++++++++++++++++++++++++++++++ .
If the user enters "23", then the program outputs:
Digits to add: 23
2+3=5
If the user enters "63", then the program outputs:
Digits to add: 63
6+3=9
If the user enters "99", then the program outputs:
Digits to add: 99
9+9=B
Perfection.
Writing BF directly is quite hard, so to make it a little easier, I wanted to write code that finds/generates snippets of BF code to do an arbitrary computation.
For example, that we wanted to update our calculator app so that it prints "9+9=18" instead of "9+9=B". To do that, we need to be able to compute things like 'is x > 10?', 'x / 10', and 'x modulo 10' where x is one of the bytes in memory
.
At this point, I already knew the BF code [->>+<<]>>[-<[-<+>>>+<<]>>[-<<+>>]<]<[-]
replaces memory[0]
with memory[0] * memory[1]
and sets memory[1]
to 0. And that program is 40 instructions total, so to generate BF code that does something more complicated than multiplication, it probably needs to be able to find/generate code snippets that are around 40 instructions or more.
Since we're just performing computations without taking performing input or output, there are 6 instructions available. Each [
must be paired with a ]
instruction so the number of BF programs with exactly 40 instructions is less than 5^40 = 9,094,947,017,729,282,379,150,390,625... which would only take about 5^40 / (365.25 * 24 * 60 * 60 * 4,000,000,000 * 1,000,000,000) ~= 72 years if we can check each program in 1 clock cycle on a billion 4GHz CPU cores. If you include checking all the programs that are less than 40 instructions, it only adds about 18 more years, but if you check up to 41 instruction, it takes centuries.
While there may be many BF programs, most of them are probably performing the same calculation. All the following snippets just set the current byte to 1 without modifying any other bytes: [[-]]+
, [--+]+
, [-]--<>+++
, [+]++-
.
We also just want to compute simple things like memory[0] * memory[1]
or memory[0] / memory[1]
or memory[0] <= memory[1]
. All of these calculations take 2 bytes as input and have 1 byte as output. We might want to have more inputs or outputs but lets just stick with 2 inputs and 1 output for now.
At this point, there are many possible programs but also many duplicate programs.
We can simplify our problem a bit by removing the >
and <
instructions and adding in an "index" or "register" that each instruction applies to. For example, instead of writing [-]
, we could write [0 -0 ]0
to say that the loop-start, decrement, and loop-end instructions all happen on memory[0]
. This can help us skip bits of code that contains ><
or <>
and helps us just focus on BF snippets that access a limited amount of memory (which will be useful for other reasons later).
Rewriting BF [->+<]
to RBF (Register BF), we have [0 -0 +1 ]0
. Similarly, rewriting our BF for multiplication [->>+<<]>>[-<[-<+>>>+<<]>>[-<<+>>]<]<[-]
, we get [0 -0 +2 ]0 [2 -2 [1 -1 +0 +3 ]1 [3 -3 +1 ]3 ]2 [1 -1 ]1
. Note that when we use R registers, then the number of instructions is 4 * R.
Now the total number of programs of length 40 instructions is dependent on the number of registers (i.e. bytes in memory) that we use. For multiplication, we needed 4 registers and we used 19 RBF instructions. To simplify again, we can pretend there's only 1 type of loop instruction to get a lower bound on the number of programs that are "similarly complex" to our multiplication program: (3 * 4) ^ 19 = 319,479,999,370,622,926,848. That's already 2.8 billion times less than our BF estimate, though it's still quite large.
So how many "unique" BF programs can there be that take 2 bytes as input and output 1 byte? There are 256^2 = 65,536 possible inputs and each of those inputs could have 256 different outputs xor the program can infinite loop. That's (256 + 1)^(256^2) = ... a rather large number with log(257, 10) * 256^2 ~= 157,938 digits.
Well... ok.
Do we really need 256 values when we're looking for instructions? There's nothing about [->+<]
that really depends on the number of values each byte or "cell" can hold. The same thing is true for the BF multiplication code. If we ran it on a BF machine where each cell could only go up to 10 instead of 256, the code would work essentially the same as it does now.
So how many unique BF programs can there be that take 2 cells as input and output 1 cell? If each cell can represent any whole number from 0 to M - 1: There are M^2 possible inputs and each of those inputs could have M different outputs xor infinite loop. That's (M + 1)^(M^2) total unique simple programs.
M | (M + 1)^(M^2) |
---|---|
2 | 81 |
3 | 262,144 |
4 | 152,587,890,625 |
5 | 2.84303E+19 |
6 | 2.65173E+30 |
7 | 1.78406E+44 |
8 | 1.17902E+61 |
9 | 1E+81 |
10 | 1.37806E+104 |
11 | 3.81005E+130 |
12 | 2.55766E+160 |
13 | 4.96179E+193 |
14 | 3.26503E+230 |
15 | 8.45271E+270 |
Now we're talking! There's only 1 potential issue: We might need more than just 2 registers. The RBF multiplication code used 4 registers though... it assumed registers 2 and 3 started at 0 and when the code was done running, registers 2 and 3 were back to 0. We'll come back to this later...
First, let's make sure we can detect when a program does what we want...
import dataclasses
import itertools
from typing import Literal
NUM_CELL_VALUES = 256
NUM_REGISTERS = 2
NUM_INPUT_REGISTERS = 2
NUM_OUTPUT_REGISTERS = 1
class RBFException(Exception):
pass
class MismatchedBraces(RBFException):
pass
class InvalidProgram(RBFException):
pass
Instruction = tuple[Literal['+', '-', '[', ']'], int]
def get_matching_braces_dict(instructions: list[Instruction]) -> dict[int, int]:
matches = {}
left_braces = []
for ip, (op_code, register) in enumerate(instructions):
if op_code == '[':
left_braces.append((ip, register))
elif op_code == ']':
if not left_braces:
raise MismatchedBraces(f'Missing left brace for right brace at {ip}: {instructions}')
left_ip, left_register = left_braces.pop()
if left_register != register:
raise MismatchedBraces(
f'Left brace register {left_register} at {left_ip} does not match right brace register {register} '
f'at {ip}: {instructions}')
matches[ip] = left_ip
matches[left_ip] = ip
if left_braces:
raise MismatchedBraces(f'Missing right brace for left brace(s) at {left_braces}: {instructions}')
return matches
@dataclasses.dataclass
class RBFProgram:
instructions: list[Instruction]
matching_braces: dict[int, int]
def __init__(self, instructions: list[Instruction], matching_braces: dict[int, int]=None):
if matching_braces is None:
matching_braces = get_matching_braces_dict(instructions)
for op_code, register in instructions:
if op_code not in '+-[]' or register < 0:
raise InvalidProgram(str(instructions))
self.instructions = instructions
self.matching_braces = matching_braces
def run(self, memory) -> None:
'''Run this RBF program on the memory. This WILL modify the memory.'''
ip = 0 # instruction pointer
while ip < len(self.instructions): # Once we've reached the end of our instructions, we're done!
op_code, register = self.instructions[ip]
if op_code == '+':
memory[register] = (memory[register] + 1) % NUM_CELL_VALUES
ip += 1
elif op_code == '-':
memory[register] = (memory[register] - 1) % NUM_CELL_VALUES
ip += 1
elif op_code == '[':
if memory[register]:
ip += 1
else:
ip = self.matching_braces[ip] + 1
elif op_code == ']':
ip = self.matching_braces[ip]
@staticmethod
def from_rbf_str(rbf_str: str) -> 'RBFProgram':
rbf_str = ''.join(c for c in rbf_str if c in '+-[]1234567890')
# For now, we only support up to 10 registers
instructions = [(op_code, int(register)) for op_code, register in zip(rbf_str[::2], rbf_str[1::2])]
return RBFProgram(instructions)
# For now, we're trying to write a program that can find something "as complex" as multiplication, so we just check
# that the program can multiply 2 cells.
def program_works(program: RBFProgram) -> bool:
for a, b in itertools.product(range(NUM_CELL_VALUES), repeat=NUM_INPUT_REGISTERS):
memory = [a, b, 0, 0]
program.run(memory)
if memory != [(a * b) % NUM_CELL_VALUES, 0, 0, 0]:
return False
return True
def main():
multiplier = RBFProgram.from_rbf_str('[0 -0 +2 ]0 [2 -2 [1 -1 +0 +3 ]1 [3 -3 +1 ]3 ]2 [1 -1 ]1')
assert program_works(multiplier)
adder = RBFProgram.from_rbf_str('[1 -1 +0 ]1')
assert not program_works(adder)
if __name__ == '__main__':
main()
This works. However, it takes over a minute to run on my laptop.
I find RBF code harder to read. Let's focus take a look at the (BF version) of our multiplication program to see what can be sped up.
class RBFProgram:
...
def as_bf_str(self, starting_register: int=0) -> str:
bf: list[str] = []
cur_register = starting_register
for op_code, next_register in self.instructions:
bf.append((next_register - cur_register) * '>')
bf.append((cur_register - next_register) * '<')
bf.append(op_code)
cur_register = next_register
return ''.join(bf)
multiplier = RBFProgram.from_rbf_str('[0 -0 +2 ]0 [2 -2 [1 -1 +0 +3 ]1 [3 -3 +1 ]3 ]2 [1 -1 ]1')
print(multiplier.as_bf_str(2))
# Outputs:
# <<[->>+<<]>>[-<[-<+>>>+<<]>>[-<<+>>]<]<[-]
There are a few patterns, we can recognize:
- [-]
Sets memory[0]
to 0.
- [->>+<<]
Adds memory[0]
to memory[2]
and sets memory[0]
to 0
- [-<+>>>+<<]
Destructively adds one cell to two other cells
If we generalize these a bit, we consider BF like [->-->+++<<]
(and its equivalent RBF) that don't just add one cell to another but instead add one cell times some constant to another; memory[1] -= 2 * memory[0]
, memory[2] += 3 * memory[0]
, memory[0] = 0
.
Instead of each RBFProgram
being a single Python object where the instructions control everything, let's make RBFProgram
that are specialized and delegate to each other. This will allow us to swap out pieces of the RBF program for more optimized versions.
Here's the new version of RBFProgram
before we apply any optimizations:
# Note that no longer need get_matching_braces_dict
class RBFProgram:
@abstractmethod
def run(self, memory: list[int]) -> None:
raise NotImplementedError()
@abstractmethod
def instructions(self) -> Iterator[Instruction]:
raise NotImplementedError()
def as_bf_str(self, starting_register: int=0) -> str:
bf: list[str] = []
cur_register = starting_register
for op_code, next_register in self.instructions():
bf.append((next_register - cur_register) * '>')
bf.append((cur_register - next_register) * '<')
bf.append(op_code)
cur_register = next_register
return ''.join(bf)
@staticmethod
def from_rbf_str(rbf_str: str) -> 'RBFProgram':
rbf_str = ''.join(c for c in rbf_str if c in '+-[]1234567890')
instructions = [(op_code, int(register)) for op_code, register in zip(rbf_str[::2], rbf_str[1::2])]
programs: list[list['RBFProgram']] = [[]]
for ip, (op_code, register) in enumerate(instructions):
if op_code == '-':
programs[-1].append(Increment(register, -1))
elif op_code == '+':
programs[-1].append(Increment(register, 1))
elif op_code == '[':
programs.append([])
elif op_code == ']':
if len(programs) <= 1:
raise MismatchedBraces(f'Missing left brace for right brace at {ip}: {instructions}')
inner_code = Progn.or_single_program(programs.pop())
programs[-1].append(Loop(register, inner_code))
if len(programs) != 1:
raise MismatchedBraces(f'Missing right brace for left brace(s): {instructions}')
return Progn.or_single_program(programs[0])
class Increment(RBFProgram):
"""Increments a cell by ANY amount."""
def __init__(self, register: int, amount: int):
self.register = register
self.amount = amount
def run(self, memory: list[int]) -> None:
memory[self.register] = (memory[self.register] + self.amount) % NUM_CELL_VALUES
def instructions(self) -> Iterator[Instruction]:
for _ in range(self.amount):
yield '+', self.register
for _ in range(-self.amount):
yield '-', self.register
def __repr__(self):
return f'Increment({self.register!r}, {self.amount!r})'
class Loop(RBFProgram):
"""Loops until memory[self.register] != 0."""
def __init__(self, register: int, inner_code: RBFProgram):
self.register = register
self.inner_code = inner_code
def run(self, memory: list[int]) -> None:
while memory[self.register]:
self.inner_code.run(memory)
def instructions(self) -> Iterator[Instruction]:
yield '[', self.register
yield from self.inner_code.instructions()
yield ']', self.register
def __repr__(self):
return f'Loop({self.register!r}, {self.inner_code!r})'
class Progn(RBFProgram):
"""A series of RBFPrograms run in sequence."""
def __init__(self, programs: list[RBFProgram]):
self.programs = programs
def run(self, memory: list[int]) -> None:
for p in self.programs:
p.run(memory)
def instructions(self) -> Iterator[Instruction]:
for p in self.programs:
yield from p.instructions()
@staticmethod
def or_single_program(programs: list[RBFProgram]) -> RBFProgram:
if len(programs) == 1:
return programs[0]
return Progn(programs)
def __repr__(self):
return f'Progn({self.programs!r})'
...
def main():
adder = RBFProgram.from_rbf_str('[1 -1 +0 ]1')
print(adder.as_bf_str(), '-->', repr(adder))
print()
multiplier = RBFProgram.from_rbf_str('[0 -0 +2 ]0 [2 -2 [1 -1 +0 +3 ]1 [3 -3 +1 ]3 ]2 [1 -1 ]1')
print(multiplier.as_bf_str(2), '-->', repr(multiplier))
print()
'''
Outputs:
>[-<+>] --> Loop(1, Progn([Increment(1, -1), Increment(0, 1)]))
<<[->>+<<]>>[-<[-<+>>>+<<]>>[-<<+>>]<]<[-] --> Progn([Loop(0, Progn([Increment(0, -1), Increment(2, 1)])), Loop(2, Progn([Increment(2, -1), Loop(1, Progn([Increment(1, -1), Increment(0, 1), Increment(3, 1)])), Loop(3, Progn([Increment(3, -1), Increment(1, 1)]))])), Loop(1, Increment(1, -1))])
'''
Now, we can detect Loop
s that only contain Increment
s and replace them with a new DestructiveAdd class:
class DestructiveAdd(RBFProgram):
"""An optimized version of [-], [->+<], [+>++>---<<], etc."""
def __init__(self, source_register: int, targets_and_multipliers: list[tuple[int, int]]=None):
if targets_and_multipliers is None:
targets_and_multipliers = []
self.source_register = source_register
self.targets_and_multipliers = targets_and_multipliers
def run(self, memory: list[int]) -> None:
source_amount = memory[self.source_register]
for register, multiplier in self.targets_and_multipliers:
memory[register] = (memory[register] + multiplier * source_amount) % NUM_CELL_VALUES
memory[self.source_register] = 0
def instructions(self) -> Iterator[Instruction]:
yield '[', self.source_register
yield '-', self.source_register
for register, multiplier in self.targets_and_multipliers:
for _ in range(multiplier):
yield '+', register
for _ in range(-multiplier):
yield '-', register
yield ']', self.source_register
def __repr__(self):
return f'DestructiveAdd({self.source_register!r}, {self.targets_and_multipliers!r})'
def simplify_to_destructive_add(original_loop: Loop, increments: list[Increment]) -> RBFProgram:
source_increments = [i for i in increments if i.register == original_loop.register]
source_amount = sum(i.amount for i in source_increments)
if source_amount % NUM_CELL_VALUES == 0:
raise InfiniteLoop()
if math.gcd(source_amount, NUM_CELL_VALUES) != 1:
# Anytime memory[source_register] is not a multiple of source_amount, this is an infinite loop!
# We can't optimize this... easily
return original_loop
if source_amount != -1 and source_amount != 1:
# TODO: Calculate the multiplicative inverse of source_amount modulo NUM_CELL_VALUES so we optimize!
return original_loop
multiplicative_inverse = -source_amount # Only true if source_amount == -1 or 1
targets_and_amounts = collections.defaultdict(int)
for increment in increments:
if increment.register != original_loop.register:
targets_and_amounts[increment.register] += multiplicative_inverse * increment.amount
if targets_and_amounts[increment.register] == 0:
del targets_and_amounts[increment.register]
return DestructiveAdd(original_loop.register, sorted(targets_and_amounts.items()))
def simplified_program(program: RBFProgram) -> RBFProgram:
if isinstance(program, Progn):
inner_programs = [simplified_program(p) for p in program.programs]
if len(inner_programs) == 1:
return inner_programs[0]
if all(simplified is original for simplified, original in zip(inner_programs, program.programs)):
return program
else:
return Progn.or_single_program(inner_programs)
elif isinstance(program, Loop):
inner_code = simplified_program(program.inner_code)
if isinstance(inner_code, Increment):
return simplify_to_destructive_add(program, [inner_code])
elif isinstance(inner_code, Progn) and all(isinstance(p, Increment) for p in inner_code.programs):
return simplify_to_destructive_add(program, inner_code.programs)
elif inner_code is not program.inner_code:
return Loop(program.register, inner_code)
# Note: We can perform a lot more optimizations, but we'll start here for now.
return program
...
def main():
multiplier = RBFProgram.from_rbf_str('[0 -0 +2 ]0 [2 -2 [1 -1 +0 +3 ]1 [3 -3 +1 ]3 ]2 [1 -1 ]1')
print('Original multiplier:')
print(multiplier.as_bf_str(2))
print(repr(multiplier))
print()
print('Simplified multiplier:')
multiplier = simplified_program(multiplier)
print(multiplier.as_bf_str(2))
print(repr(multiplier))
print()
assert program_works(multiplier)
adder = RBFProgram.from_rbf_str('[1 -1 +0 ]1')
print('Original adder:')
print(adder.as_bf_str())
print(repr(adder))
print()
adder = simplified_program(adder)
print('Simplified adder:')
print(adder.as_bf_str())
print(repr(adder))
print()
assert not program_works(adder)
'''
Outputs:
Original multiplier:
<<[->>+<<]>>[-<[-<+>>>+<<]>>[-<<+>>]<]<[-]
Progn([Loop(0, Progn([Increment(0, -1), Increment(2, 1)])), Loop(2, Progn([Increment(2, -1), Loop(1, Progn([Increment(1, -1), Increment(0, 1), Increment(3, 1)])), Loop(3, Progn([Increment(3, -1), Increment(1, 1)]))])), Loop(1, Increment(1, -1))])
Simplified multiplier:
<<[->>+<<]>>[-<[-<+>>>+<<]>>[-<<+>>]<]<[-]
Progn([DestructiveAdd(0, [(2, 1)]), Loop(2, Progn([Increment(2, -1), DestructiveAdd(1, [(0, 1), (3, 1)]), DestructiveAdd(3, [(1, 1)])])), DestructiveAdd(1, [])])
Original adder:
>[-<+>]
Loop(1, Progn([Increment(1, -1), Increment(0, 1)]))
Simplified adder:
>[-<+>]
DestructiveAdd(1, [(0, 1)])
'''
Now the code runs much faster.
We know that when we generate code to evaluate, we will inevitably generate code loops infinitely. While some loops can be detected with static analysis, like the raise InfiniteLoop()
in simplify_to_destructive_add
, there will be infinite loops that we can only detect when we start running the program.
Let's walk through 3 strategies.
Since we know the maximum number of possible values that the memory can take on, We can let each loop of the program run for up to that many steps. Any loop that runs beyond that many steps must be looping infinitely.
For example: If NUM_REGISTERS == 2
and NUM_CELL_VALUES == 10
, then we know that any loop can loop at most NUM_CELL_VALUES
^NUM_REGISTERS
= 100 times unless its an infinite loop!
class Loop(RBFProgram):
...
def run(self, memory: list[int]) -> None:
max_iterations = NUM_CELL_VALUES**NUM_REGISTERS
num_iterations = 0
while memory[self.register]:
self.inner_code.run(memory)
num_iterations += 1
if num_iterations > max_iterations:
raise InfiniteLoop()
Unfortunately this doesn't work when you have nested loops. If each loop is allowed to run 100
times and you have K nested loops, you quickly get to a billion steps for just one program.
Another reason this approach doesn't work is that we need to detect infinite loops faster so we don't spend a lot of time running RBF with infinite loops in it.
The idea here is very similar to '1. Stop the program after N steps'. Instead of keeping track of how many times each loop has run, each loop keeps track of which "memories" its seen before.
class Loop(RBFProgram):
...
def run(self, memory: list[int]) -> None:
past_memories = set()
while memory[self.register]:
self.inner_code.run(memory)
cur_memory = tuple(memory)
if cur_memory in past_memories:
raise InfiniteLoop()
past_memories.add(cur_memory)
This will only run the loop until the memory ends up in a state we've seen before. The good news is we will detect infinite loops much sooner. The bad news is that it will use more memory and will run slower since it has to take snapshots of the memory and constantly check if we've seen that memory state before.
If you've ever heard of the "fast and slow pointers" solution to detecting loops and linked lists, then this approach will look very familiar to you.
I think it's easiest to imagine 2 runners racing on a trail through a forest. The runners don't know if the trail loops back on itself. One runner is twice as fast as the other runner. The runners start running at the same time from the same spot. If there's no infinite loop in the trail, then the runners will never see each other... but if there is an infinite loop, then the fast runner will eventually see the slow runner again.
class Loop(RBFProgram):
...
def run(self, memory: list[int]) -> None:
fast_memory = memory
slow_memory = list(memory) # Copy the memory
while fast_memory[self.register]:
self.inner_code.run(fast_memory)
if fast_memory == slow_memory:
raise InfiniteLoop()
if not fast_memory[self.register]:
break
self.inner_code.run(fast_memory)
# Note: Since fast_memory[self.register] has always been non-zero,
# then slow_memory[self.register] has always been non-zero.
self.inner_code.run(slow_memory)
if fast_memory == slow_memory:
raise InfiniteLoop()
Probably option 3 but... it depends.
Let's say that K is the number of (nested) loops in the RBF and L is the length of the infinite loop before it starts repeating, then the memory and time big-O tradeoffs probably look something like this:
Algorithm | Memory | Runtime |
---|---|---|
Don't detect infinite loops | O(1) | O(infinity) |
Run N steps | O(1) | O((NUM_REGISTERS^NUM_CELL_VALUES)^K) |
Keep track of past memory states | O(K * L) | O(L^K) |
Fast and Slow | O(K) | O(L^K) |
Big-O is not the only thing that matters, so it's a good idea to try out different strategies, especially since options 2 and 3 are harder to distinguish!
The simplest strategy would be to try building the programs up one instruction at a time. Something like this:
...
class MismatchedBraces(RBFException):
pass
class MissingLeftBrace(MismatchedBraces):
pass
class MissingRightBrace(MismatchedBraces):
pass
...
class RBFProgram:
...
@staticmethod
def from_instructions(instructions: list[Instruction]) -> 'RBFProgram':
programs: list[list['RBFProgram']] = [[]]
for ip, (op_code, register) in enumerate(instructions):
if op_code == '-':
programs[-1].append(Increment(register, -1))
elif op_code == '+':
programs[-1].append(Increment(register, 1))
elif op_code == '[':
programs.append([])
elif op_code == ']':
if len(programs) <= 1:
raise MissingLeftBrace(f'Missing left brace for right brace at {ip}: {instructions}')
inner_code = Progn.or_single_program(programs.pop())
programs[-1].append(Loop(register, inner_code))
if len(programs) != 1:
raise MissingRightBrace(f'Missing right brace for left brace(s): {instructions}')
return Progn.or_single_program(programs[0])
@staticmethod
def from_rbf_str(rbf_str: str) -> 'RBFProgram':
rbf_str = ''.join(c for c in rbf_str if c in '+-[]1234567890')
instructions = [(op_code, int(register)) for op_code, register in zip(rbf_str[::2], rbf_str[1::2])]
return RBFProgram.from_instructions(instructions)
...
class Loop(RBFProgram):
...
def run(self, memory: list[int]) -> None:
fast_memory = memory
slow_memory = list(memory) # Copy the memory
while fast_memory[self.register]:
self.inner_code.run(fast_memory)
if fast_memory == slow_memory:
raise InfiniteLoop()
if not fast_memory[self.register]:
break
self.inner_code.run(fast_memory)
# Note: Since fast_memory[self.register] has always been non-zero,
# then slow_memory[self.register] has always been non-zero.
self.inner_code.run(slow_memory)
if fast_memory == slow_memory:
raise InfiniteLoop()
...
# For now, we're trying to write a program that can find something "as complex" as multiplication, so we just check
# that the program can multiply 2 cells.
def program_works(program: RBFProgram) -> bool:
for a, b in itertools.product(range(NUM_CELL_VALUES), repeat=NUM_INPUT_REGISTERS):
memory = [a, b, 0, 0]
try:
program.run(memory)
except InfiniteLoop:
return False
if memory != [(a * b) % NUM_CELL_VALUES, 0, 0, 0]:
return False
return True
def main():
instructions_that_do_nothing = []
q = collections.deque([instructions_that_do_nothing])
num_infinite_programs = 0
num_malformed_programs = 0
for i in itertools.count(1):
instructions = q.popleft()
if i & (i - 1) == 0: # if i is a power of 2.
print(f'Iteration/Number of programs processed: {i}')
print(f'Length of q (programs waiting to be processed): {len(q)}')
print(f'Number of instructions in current/longest program: {len(instructions)}')
print(f'Percent of programs that infinite loop: {100 * num_infinite_programs / (i + len(q)):.2f}%')
print(f'Percent of programs missing left brace: {100 * num_malformed_programs / (i + len(q)):.2f}%')
print()
try:
program = simplified_program(RBFProgram.from_instructions(instructions))
except MissingRightBrace:
pass # We will eventually output programs with the matching right-brace
else:
if program_works(program):
print('Program found!: ')
print(program.as_bf_str())
print(repr(program))
return
for op_code in '-+[]':
for register in range(NUM_REGISTERS):
next_instructions = copy.copy(instructions)
next_instructions.append((op_code, register))
# Don't use space in the q if the program cannot ever work
try:
simplified_program(RBFProgram.from_instructions(next_instructions))
except InfiniteLoop:
num_infinite_programs += 1
continue
except MissingLeftBrace:
num_malformed_programs += 1
continue
except MissingRightBrace:
pass
q.append(next_instructions)
On my (old) laptop, this code used about 7.4GiB in 7 minutes and got about this far:
...
Iteration/Number of programs processed: 2097152
Length of q (programs waiting to be processed): 13364941
Number of instructions in current/longest program: 8
Percent of programs that infinite loop: 5.36%
Percent of programs missing left brace: 3.15%
Iteration/Number of programs processed: 4194304
Length of q (programs waiting to be processed): 26948531
Number of instructions in current/longest program: 8
Percent of programs that infinite loop: 5.04%
Percent of programs missing left brace: 2.70%
For my setup, memory is more constrained than compute time but before optimizing the memory usage directly, let's try to avoid processing duplicate programs. We know from earlier that lots of programs will modify memory
in exactly the same way like [[-]]+
, [--+]+
, and [+]++-
.
One other thing... At this point, we have successfully written code that can find RBF/BF code that performs some operation. See what happens when we change program_works
to go from looking for multiplication to looking for addition (by changing *
to +
in if memory != [(a * b) % NUM_CELL_VALUES, 0, 0, 0]:
).
...
Iteration/Number of programs processed: 1024
Length of q (programs waiting to be processed): 5883
Number of instructions in current/longest program: 4
Percent of programs that infinite loop: 8.74%
Percent of programs missing left brace: 9.76%
Program found!:
>[-<+>]
DestructiveAdd(1, [(0, 1)])
Unfortunately, there's no great way to detect all duplicates just by analyzing the RBF code directly but we can eliminate some duplicates by using simplified_program
to canonical our programs and check for duplicates.
def main():
instructions_that_do_nothing = []
seen_programs = set()
q = collections.deque([instructions_that_do_nothing])
num_infinite_programs = 0
num_malformed_programs = 0
num_duplicate_programs = 0
for i in itertools.count(1):
instructions = q.popleft()
if i & (i - 1) == 0: # if i is a power of 2.
print(f'Iteration/Number of programs processed: {i}')
print(f'Length of q (programs waiting to be processed): {len(q)}')
print(f'Number of instructions in current/longest program: {len(instructions)}')
print(f'Percent of programs that infinite loop: {100 * num_infinite_programs / (i + len(q)):.2f}%')
print(f'Percent of programs missing left brace: {100 * num_malformed_programs / (i + len(q)):.2f}%')
print(f'Percent of programs that are duplicate: {100 * num_duplicate_programs / (i + len(q)):.2f}%')
print()
try:
program = simplified_program(RBFProgram.from_instructions(instructions))
except MissingRightBrace:
pass # We will eventually output programs with the matching right-brace
else:
# Since program is already a simplified_program, certain "duplicate" patterns will already be removed.
# For example, "--+" and "-" will both result in the same instructions.
program_key = tuple(program.instructions())
if program_key in seen_programs:
num_duplicate_programs += 1
continue
seen_programs.add(program_key)
if program_works(program):
print('Program found!: ')
print(program.as_bf_str())
print(repr(program))
return
for op_code in '-+[]':
for register in range(NUM_REGISTERS):
next_instructions = copy.copy(instructions)
next_instructions.append((op_code, register))
# Don't use space in the q if the program cannot ever work
try:
simplified_program(RBFProgram.from_instructions(next_instructions))
except InfiniteLoop:
num_infinite_programs += 1
continue
except MissingLeftBrace:
num_malformed_programs += 1
continue
except MissingRightBrace:
pass
q.append(next_instructions)
Six minutes and 8.7GiB of memory later, we see similar progress but at higher memory usage!
...
Iteration/Number of programs processed: 2097152
Length of q (programs waiting to be processed): 13170653
Number of instructions in current/longest program: 8
Percent of programs that infinite loop: 5.09%
Percent of programs missing left brace: 1.43%
Percent of programs that are duplicate: 0.42%
Iteration/Number of programs processed: 4194304
Length of q (programs waiting to be processed): 26685683
Number of instructions in current/longest program: 8
Percent of programs that infinite loop: 4.67%
Percent of programs missing left brace: 1.01%
Percent of programs that are duplicate: 0.37%
Perhaps the way that we are detecting duplicates is not thorough enough. Let's compare them directly!
class RBFProgram:
...
def __hash__(self) -> int:
some_results = []
# Note: We use % NUM_CELL_VALUES in case NUM_CELL_VALUES is very small
some_cell_values = [x % NUM_CELL_VALUES for x in range(-2, 3)]
for memory in itertools.product(some_cell_values, repeat=NUM_REGISTERS):
memory = list(memory)
try:
self.run(memory)
except InfiniteLoop:
memory = (None,)
some_results.append(tuple(memory))
return hash(tuple(some_results))
def __eq__(self, other: 'RBFProgram') -> bool:
for memory in itertools.product(range(NUM_CELL_VALUES), repeat=NUM_REGISTERS):
memory = list(memory)
other_memory = list(memory)
try:
self.run(memory)
except InfiniteLoop:
memory = None
try:
other.run(other_memory)
except InfiniteLoop:
other_memory = None
if memory != other_memory:
return False
return True
...
def main():
...
for i in itertools.count(1):
...
try:
program = simplified_program(RBFProgram.from_instructions(instructions))
except MissingRightBrace:
pass # We will eventually output programs with the matching right-brace
else:
# Since program is already a simplified_program, certain "duplicate" patterns will already be removed.
# For example, "--+" and "-" will both result in the same instructions.
if program in seen_programs:
num_duplicate_programs += 1
continue
seen_programs.add(program)
if program_works(program):
print('Program found!: ')
print(program.as_bf_str())
print(repr(program))
return
...
After 6.5 minutes, it detected more duplicates and only used 1.8GiB of memory... because it was so incredibly slow that it didn't have the chance to do much useful work:
...
Iteration/Number of programs processed: 2048
Length of q (programs waiting to be processed): 11666
Number of instructions in current/longest program: 5
Percent of programs that infinite loop: 7.07%
Percent of programs missing left brace: 1.09%
Percent of programs that are duplicate: 1.41%
Iteration/Number of programs processed: 4096
Length of q (programs waiting to be processed): 23836
Number of instructions in current/longest program: 5
Percent of programs that infinite loop: 6.63%
Percent of programs missing left brace: 0.72%
Percent of programs that are duplicate: 1.24%
What if we set NUM_CELL_VALUES = 5
? After 5 minutes and 7.7GiB:
...
Iteration/Number of programs processed: 2097152
Length of q (programs waiting to be processed): 13456769
Number of instructions in current/longest program: 8
Percent of programs that infinite loop: 4.28%
Percent of programs missing left brace: 0.01%
Percent of programs that are duplicate: 0.45%
Iteration/Number of programs processed: 4194304
Length of q (programs waiting to be processed): 27088903
Number of instructions in current/longest program: 9
Percent of programs that infinite loop: 4.11%
Percent of programs missing left brace: 0.01%
Percent of programs that are duplicate: 0.39%
That's progress, although it's not enough progress. Something fundamentally has to change.
Even though, we greatly limited ourselves to just 2 registers and only 5 possible cell values, the amount of memory it takes to get up to RBF programs with just 9 instructions is... too much. We need to get to 19!
Writing this post has taken more of my time than I'd like so I'm going to stop building up to the end goal and just JUMP to there!
To really make this work, I tried everything above but ultimately needed to make these changes (in no particular order):
All of these changes together look like this:
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <string_view>
#include <unordered_set>
#include <vector>
using namespace std;
constexpr int powi(int b, int p) {
int result = 1;
for (int i = 0; i < p; ++i) {
result *= b;
}
return result;
}
// Note: Removing programs that infinite loop (under at least 1 input), usually make the search faster *but*
// it can make it take longer and/or not find the *shortest* RBF program that works.
constexpr bool REMOVE_PROGRAMS_THAT_EVER_INFINITE_LOOP = true;
constexpr bool IGNORE_IRRELEVANT_START_PROGRAM_STATES = true;
constexpr bool IGNORE_IRRELEVANT_END_PROGRAM_STATES = true;
constexpr int NUM_CELL_VALUES = 5;
constexpr int NUM_REGISTERS = 3; // Does not include register rewrites :)
constexpr int NUM_TOTAL_STATES = powi(NUM_CELL_VALUES, NUM_REGISTERS) + 1;
constexpr int NUM_INPUT_REGISTERS = 2;
constexpr int NUM_OUTPUT_REGISTERS = 1;
typedef uint8_t state_t; // Big enough to hold 0 to NUM_TOTAL_STATES (inclusive)
typedef uint8_t output_state_id_t; // big enough to hold 0 to the number of unique output states (exclusive)
typedef uint8_t cell_t; // Big enough to hold 0 to NUM_CELL_VALUES (inclusive)
int memory_to_state(const cell_t* memory) {
state_t state = 0;
for (int i = 0; i < NUM_REGISTERS; ++i) {
state *= NUM_CELL_VALUES;
state += memory[i];
}
return state + 1;
}
inline bool state_is_infinite_loop(const state_t state) {
return !state;
}
// Fill in memory and return true iff state is valid/not an infinite loop state.
bool state_to_memory(state_t state, cell_t* memory) {
if (state_is_infinite_loop(state)) {
return false;
}
--state;
for (int i = NUM_REGISTERS - 1; i >= 0; --i) {
memory[i] = state % NUM_CELL_VALUES;
state /= NUM_CELL_VALUES;
}
return true;
}
struct StateToMemory {
cell_t memories[NUM_REGISTERS * NUM_TOTAL_STATES];
StateToMemory() {
for (state_t state = 0; state < NUM_TOTAL_STATES; ++state) {
state_to_memory(state, &memories[state * NUM_REGISTERS]);
}
}
cell_t const* get(state_t state) {
return &memories[state * NUM_REGISTERS];
}
cell_t get(state_t state, int index) {
return memories[state * NUM_REGISTERS + index];
}
} state_to_memory_cache;
struct StateToDesiredOutputs {
cell_t desired_outputs[NUM_OUTPUT_REGISTERS * NUM_TOTAL_STATES];
output_state_id_t output_state_ids[NUM_TOTAL_STATES];
int num_unique_outputs;
StateToDesiredOutputs() {
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
const auto x = state_to_memory_cache.get(state, 0);
const auto y = state_to_memory_cache.get(state, 1);
// const auto z = state_to_memory_cache.get(state, 2);
// Below are examples.
// DON'T FORGET TO UPDATE NUM_INPUT_REGISTERS and NUM_OUTPUT_REGISTERS!
desired_outputs[state * NUM_OUTPUT_REGISTERS] = (x * y) % NUM_CELL_VALUES;
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = (x * y) / NUM_CELL_VALUES % NUM_CELL_VALUES;
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = (x + y) / NUM_CELL_VALUES % NUM_CELL_VALUES;
// '<[-<->>+<]>[-<->]<<<[>[->-<]<[-]]>>[-<<->>]<[-]'
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = x ? y : z;
// '<[-<[>>+<<[->>-<<]]>>[-<<->>]<]>+<+>[-<<[->>-<<]>[-<+>]>]<'
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = (x <= y) % NUM_CELL_VALUES;
// So far, have not found this one!
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = (cell_t)sqrtl(x) % NUM_CELL_VALUES;
// '<[->+<]->[[>-<---[>-<[->-<]]]>[+<+>[+[->-<]]>[-<->]<]<<+>]'
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = (cell_t)log2l(x) % NUM_CELL_VALUES;
// '<[>-<---[>-<[->-<]]]>[+<+>[+[->-<]]>[-<->]<]'
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = x / 2 % NUM_CELL_VALUES;
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = x / 3 % NUM_CELL_VALUES;
// int result = x * NUM_CELL_VALUES + y + NUM_CELL_VALUES*NUM_CELL_VALUES - 1;
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = result / NUM_CELL_VALUES % NUM_CELL_VALUES;
// desired_outputs[state * NUM_OUTPUT_REGISTERS + 1] = result % NUM_CELL_VALUES;
// desired_outputs[state * NUM_OUTPUT_REGISTERS] = x;
// desired_outputs[state * NUM_OUTPUT_REGISTERS + 1] = y;
// desired_outputs[state * NUM_OUTPUT_REGISTERS + 2] = (x || y) ? 1 : 0;
}
output_state_id_t canonical_output_state_ids[NUM_TOTAL_STATES];
num_unique_outputs = 0;
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
canonical_output_state_ids[state] = -1;
}
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
cell_t memory[NUM_REGISTERS];
set_memory(state, memory);
state_t canonical_state = memory_to_state(memory);
if (canonical_output_state_ids[canonical_state] == (output_state_id_t)-1) {
canonical_output_state_ids[canonical_state] = num_unique_outputs;
++num_unique_outputs;
}
output_state_ids[state] = canonical_output_state_ids[canonical_state];
}
}
output_state_id_t get_id(state_t state) const {
return output_state_ids[state];
}
cell_t get(state_t state, int index) const {
return desired_outputs[state * NUM_OUTPUT_REGISTERS + index];
}
void set_memory(state_t state, cell_t* memory) const {
for (int i = 0; i < NUM_OUTPUT_REGISTERS; ++i) {
memory[i] = get(state, i);
}
for (int i = NUM_OUTPUT_REGISTERS; i < NUM_REGISTERS; ++i) {
memory[i] = 0;
}
}
bool matches(state_t state, const cell_t* memory) const {
for (int i = 0; i < NUM_OUTPUT_REGISTERS; ++i) {
if (memory[i] != get(state, i)) {
return false;
}
}
return true;
}
} state_to_desired_output_cache;
struct Program {
string rbf;
state_t states[NUM_TOTAL_STATES]; // 0 means infinite loop
Program(const string rbf) : rbf(rbf) {
memset(states, 0, sizeof(states));
}
Program() {
memset(states, 0, sizeof(states));
}
Program(const Program& other) = default;
Program(Program&& other) = default;
string as_bf_str(int starting_register=0) const {
string bf;
int cur_register = starting_register;
for (int i = 0; i < rbf.size() - 1; i += 2) {
int next_register = rbf[i + 1] - '0';
if (next_register > cur_register) {
bf.resize(bf.size() + next_register - cur_register, '>');
} else {
bf.resize(bf.size() + cur_register - next_register, '<');
}
bf.push_back(rbf[i]);
cur_register = next_register;
}
return bf;
}
// Return true if there's at least 1 input which does not infinite loop *AND* updates the state
bool is_nontrivial_program() const {
if (REMOVE_PROGRAMS_THAT_EVER_INFINITE_LOOP) {
// Note: The loop below filter programs that infinite loop on any input (not just all inputs).
// This filtering may be too aggressive
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
if (states[state] == 0) {
return false;
}
}
}
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
if (states[state] && states[state] != state) {
return true;
}
}
return false;
}
int next_register_available_ge(int minimum_register) const {
set<int> registers_used;
for (int i = 1; i < rbf.size(); i += 2) {
int r = rbf[i] - '0';
registers_used.insert(r);
}
for (int r = minimum_register; ; ++r) {
if (registers_used.find(r) == registers_used.end()) {
return r;
}
}
}
int next_out_of_bounds_register_available() const {
return next_register_available_ge(NUM_REGISTERS);
}
static Program noop() {
Program result;
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
result.states[state] = state;
}
return result;
}
static Program increment(int index) {
string rbf = "+0";
rbf[1] += index;
Program result(rbf);
// Programs result;
cell_t memory[NUM_REGISTERS];
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
state_to_memory(state, memory);
++memory[index];
memory[index] %= NUM_CELL_VALUES;
result.states[state] = memory_to_state(memory);
}
return result;
}
static Program decrement(int index) {
string rbf = "-0";
rbf[1] += index;
Program result(rbf);
cell_t memory[NUM_REGISTERS];
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
state_to_memory(state, memory);
memory[index] += NUM_CELL_VALUES - 1; // Try to avoid negative modulo below. Oh Python, how I miss you.
memory[index] %= NUM_CELL_VALUES;
result.states[state] = memory_to_state(memory);
}
return result;
}
static Program loop(int index, const Program& inner_code) {
string rbf;
rbf.reserve(inner_code.rbf.size() + 4);
rbf.append("[0");
rbf[1] += index;
rbf.append(inner_code.rbf);
rbf.append("]0");
rbf[rbf.size() - 1] += index;
Program result(rbf);
bitset<NUM_TOTAL_STATES> seen_states;
// Phase 1: Set result.states to be 1 iteration of the loop (or 0 if the loop does not run)
for (state_t start_state = 1; start_state < NUM_TOTAL_STATES; ++start_state) {
result.states[start_state] = state_to_memory_cache.get(start_state, index) == 0
? start_state
: inner_code.states[start_state];
}
for (state_t start_state = 1; start_state < NUM_TOTAL_STATES; ++start_state) {
state_t seen_array[NUM_TOTAL_STATES];
int num_seen = 0;
seen_states.reset();
state_t state = result.states[start_state];
while(state) {
if (seen_states[state]) {
state = 0; // infinite loop detected
break;
}
if (state_to_memory_cache.get(state, index) == 0) {
break; // We finished the loop!
}
seen_states[state] = true;
seen_array[num_seen] = state;
++num_seen;
state = result.states[state];
}
result.states[start_state] = state;
for (int i = 0; i < num_seen; ++i) {
result.states[seen_array[i]] = state;
}
}
return result;
}
static Program concat(const Program& a, const Program& b) {
Program result;
result.rbf.reserve(a.rbf.size() + b.rbf.size());
result.rbf.append(a.rbf);
result.rbf.append(b.rbf);
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
result.states[state] = b.states[a.states[state]];
}
return result;
}
// Return true iff every state where memory[index] == 0 ends with memory[index] == 0.
// This is useful because it means this program can take advantage of a known zeroed out cell and leave that cell
// in the same state.
bool cell_ends_at_0_when_starting_at_0(int index) const {
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
if (state_to_memory_cache.get(state, index) == 0) { // TODO: Precompute which states start with memory[index] == 0
state_t end_state = states[state];
if (end_state == 0 || state_to_memory_cache.get(end_state, index) != 0) {
return false;
}
}
}
return true;
}
// Assuming p.cell_starts_and_ends_at_0(old_index), rewrite p so that old_index is now an out-of-bounds new_index, allowing
// old_index to be used for other computations
static Program with_0_register_rewritten(const Program& p, int old_index, int new_index) {
assert(p.cell_ends_at_0_when_starting_at_0(old_index));
assert(new_index >= NUM_REGISTERS);
string rbf(p.rbf);
for (int i = 1; i < rbf.length(); i += 2) {
if (rbf[i] - '0' == old_index) {
rbf[i] = '0' + new_index;
}
}
Program result(rbf);
cell_t memory[NUM_REGISTERS];
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
state_to_memory(state, memory);
memory[old_index] = 0;
state_t effective_state = memory_to_state(memory);
state_t end_state = p.states[effective_state];
state_to_memory(end_state, memory);
memory[old_index] = state_to_memory_cache.get(state, old_index);
result.states[state] = memory_to_state(memory);
}
return result;
}
bool operator!=(const Program& other) const {
return memcmp(states, other.states, sizeof(states));
}
bool operator==(const Program& other) const {
return memcmp(states, other.states, sizeof(states)) == 0;
}
bool operator<(const Program& other) const {
return memcmp(states, other.states, sizeof(states)) < 0;
}
};
template<>
struct std::hash<Program>
{
std::size_t operator()(const Program& s) const noexcept
{
return std::hash<string_view>{}(
string_view((char*)&s.states, sizeof(s.states)));
}
};
typedef unordered_set<Program> SetOfPrograms;
vector<SetOfPrograms> UNIQ_PROGRAM_CACHE;
bool program_in_unique_program_cache(const Program& p) {
for (const SetOfPrograms& programs : UNIQ_PROGRAM_CACHE) {
if (programs.find(p) != programs.end()) {
return true;
}
}
return false;
}
const SetOfPrograms& unique_programs_of_length(int n) {
if (UNIQ_PROGRAM_CACHE.size() > n) {
return UNIQ_PROGRAM_CACHE[n];
}
unique_programs_of_length(n - 1); // prefill cache of smaller programs
UNIQ_PROGRAM_CACHE.resize(n + 1);
SetOfPrograms& result = UNIQ_PROGRAM_CACHE[n];
// Try all combinations of smaller programs
for (int a_size = 1; a_size < n; ++a_size) {
int b_size = n - a_size;
for (const Program& a : UNIQ_PROGRAM_CACHE[a_size]) {
for (const Program& b : UNIQ_PROGRAM_CACHE[b_size]) {
const Program new_program = Program::concat(a, b);
if (new_program.is_nontrivial_program() && !program_in_unique_program_cache(new_program)) {
result.insert(move(new_program));
}
}
}
}
// Try all loops of size n-1 programs
for (const Program& inner_code : UNIQ_PROGRAM_CACHE[n - 1]) {
for (int r = 0; r < NUM_REGISTERS; ++r) {
Program new_program = Program::loop(r, inner_code);
if (new_program.is_nontrivial_program() && !program_in_unique_program_cache(new_program)) {
result.insert(move(new_program));
}
}
}
// Try rewrites when possible! BF Programs like [->+>+<<]>>[-<<+>>] essentially
// just use register 2 as temporary storage in order to copy register 0 to register 1.
for (const Program& p : UNIQ_PROGRAM_CACHE[n - 1]) {
for (int r = 0; r < NUM_REGISTERS; ++r) {
if (p.cell_ends_at_0_when_starting_at_0(r)) {
int new_register = p.next_out_of_bounds_register_available();
Program new_program = Program::with_0_register_rewritten(p, r, new_register);
if (new_program.is_nontrivial_program() && !program_in_unique_program_cache(new_program)) {
result.insert(move(new_program));
}
}
}
}
return result;
}
vector<state_t> get_valid_start_states() {
vector<state_t> valid_start_states;
for (state_t state = 1; state < NUM_TOTAL_STATES; ++state) {
const cell_t* memory = state_to_memory_cache.get(state);
bool valid_state = true;
for (int i = NUM_INPUT_REGISTERS; i < NUM_REGISTERS; ++i) {
if (memory[i] != 0) {
valid_state = false;
break;
}
}
if (valid_state) {
valid_start_states.push_back(state);
}
}
return valid_start_states;
}
map<output_state_id_t, vector<state_t>> get_output_state_ids(const vector<state_t>& valid_start_states) {
map<output_state_id_t, vector<state_t>> output_state_ids;
for (state_t state : valid_start_states) {
output_state_ids[state_to_desired_output_cache.get_id(state)].push_back(state);
}
return output_state_ids;
}
SetOfPrograms get_start_program_candidates(
const vector<state_t>& valid_start_states,
const map<output_state_id_t, vector<state_t>>& output_state_ids) {
SetOfPrograms start_programs;
for (const SetOfPrograms& some_programs : UNIQ_PROGRAM_CACHE) {
for (const Program& a : some_programs) {
// Skip processing "a" programs that cannot possibly give the right answer
bool a_is_valid = true;
bitset<NUM_TOTAL_STATES> all_intermediate_states;
for (const auto& output_group : output_state_ids) {
bitset<NUM_TOTAL_STATES> group_intermidiate_states;
for (const state_t start_state : output_group.second) {
state_t state = a.states[start_state];
if (all_intermediate_states[state] || state == 0) {
// We've detected that 2 different starting states that should have different outputs,
// have the same intermediate state (so it is not possible for them to have different outputs)
a_is_valid = false;
break;
}
group_intermidiate_states[state] = 1;
}
if (!a_is_valid) {
break;
}
all_intermediate_states |= group_intermidiate_states;
}
if (a_is_valid) {
assert(all_intermediate_states.count() >= output_state_ids.size());
if (IGNORE_IRRELEVANT_START_PROGRAM_STATES) {
Program rewritten_a(a);
for (state_t s = 1; s < NUM_TOTAL_STATES; ++s) {
if (find(valid_start_states.begin(), valid_start_states.end(), s) == valid_start_states.end()) {
rewritten_a.states[s] = 0;
}
}
start_programs.insert(rewritten_a);
} else {
start_programs.insert(a);
}
}
}
}
return start_programs;
}
SetOfPrograms get_end_program_candidates(
const vector<state_t>& valid_start_states,
const map<output_state_id_t, vector<state_t>>& output_state_ids) {
SetOfPrograms end_program_candidates;
bitset<NUM_CELL_VALUES * NUM_OUTPUT_REGISTERS> necessary_values;
for (const auto& kv : output_state_ids) {
necessary_values[kv.first] = 1;
}
for (const SetOfPrograms& some_programs : UNIQ_PROGRAM_CACHE) {
for (const Program& b : some_programs) {
bitset<NUM_CELL_VALUES * NUM_OUTPUT_REGISTERS> missing_values(necessary_values);
// Check if b is a (non strict) superset of end states needed
for (state_t start_state = 1; start_state < NUM_TOTAL_STATES; ++start_state) {
missing_values[state_to_desired_output_cache.get_id(b.states[start_state])] = 0;
}
if (missing_values.none()) {
if (IGNORE_IRRELEVANT_END_PROGRAM_STATES) {
Program rewritten_b(b);
for (state_t s = 1; s < NUM_TOTAL_STATES; ++s) {
state_t es = rewritten_b.states[s];
cell_t memory[NUM_REGISTERS];
state_to_memory(es, memory);
for (int i = NUM_OUTPUT_REGISTERS; i < NUM_REGISTERS; ++i) {
memory[i] = 0;
}
rewritten_b.states[s] = memory_to_state(memory);
}
end_program_candidates.insert(rewritten_b);
} else {
end_program_candidates.insert(b);
}
}
}
}
return end_program_candidates;
}
size_t uniq_program_cache_size() {
size_t result = 0;
for (SetOfPrograms ps : UNIQ_PROGRAM_CACHE) {
result += ps.size();
}
return result;
}
// Checks all 2-grams of programs in UNIQ_PROGRAM_CACHE to see if any of them are equivalent to doing a multiplication!
void check_programs() {
vector<state_t> valid_start_states = get_valid_start_states();
map<output_state_id_t, vector<state_t>> output_state_ids = get_output_state_ids(valid_start_states);
SetOfPrograms start_programs = get_start_program_candidates(valid_start_states, output_state_ids);
SetOfPrograms end_programs = get_end_program_candidates(valid_start_states, output_state_ids);
cout << start_programs.size() << " Programs are valid start Programs.\n";
cout << end_programs.size() << " Programs are valid end Programs.\n";
size_t checking_combos = start_programs.size() * end_programs.size();
double total_combos = (double)uniq_program_cache_size() * uniq_program_cache_size();
cout << "Checking " << checking_combos << " combinations ("
<< 100.0 * checking_combos / total_combos << "% of " << total_combos << " possible combinations).\n";
// Checks every concatenation of a start_program_candidate with a end_program_candidate
for (const Program& a : start_programs) {
for (const Program& b : end_programs) {
const Program p = Program::concat(a, b);
bool program_works = true;
for (state_t start_state : valid_start_states) {
state_t state = p.states[start_state];
const cell_t* memory = state_to_memory_cache.get(state);
if (state == 0 || !state_to_desired_output_cache.matches(start_state, memory)) {
program_works = false;
break;
}
}
if (program_works) {
for (state_t start_state : valid_start_states) {
const cell_t* start_memory = state_to_memory_cache.get(start_state);
state_t state = p.states[start_state];
const cell_t* memory = state_to_memory_cache.get(state);
for (int i = 0; i < NUM_REGISTERS; ++i) {
cout << (int)start_memory[i] << ' ';
}
cout << " --> ";
if (IGNORE_IRRELEVANT_END_PROGRAM_STATES) {
for (int i = 0; i < NUM_OUTPUT_REGISTERS; ++i) {
cout << (int)memory[i] << ' ';
}
for (int i = NUM_OUTPUT_REGISTERS; i < NUM_REGISTERS; ++i) {
cout << "? ";
}
} else {
for (int i = 0; i < NUM_REGISTERS; ++i) {
cout << (int)memory[i] << ' ';
}
}
cout << endl;
}
cout
<< "RBF: " << p.rbf << endl
<< "BF: " << p.as_bf_str() << endl
<< "BF (starting at register " << NUM_INPUT_REGISTERS << "): " << p.as_bf_str(NUM_INPUT_REGISTERS) << endl;
exit(0);
}
}
}
cout << "Could not find desired program... (yet).\n";
}
int main(int argc, const char* argv[]) {
assert(1L << (8 * sizeof(state_t)) >= NUM_TOTAL_STATES);
// Fill first 2 layers of UNIQ_PROGRAM_CACHE
assert(UNIQ_PROGRAM_CACHE.empty());
UNIQ_PROGRAM_CACHE.resize(2);
UNIQ_PROGRAM_CACHE[0].insert(Program::noop());
for (int i = 0; i < NUM_REGISTERS; ++i) {
UNIQ_PROGRAM_CACHE[1].insert(Program::decrement(i));
UNIQ_PROGRAM_CACHE[1].insert(Program::increment(i));
}
for (int i = 0; ; ++i) {
const auto& programs = unique_programs_of_length(i);
cout << "Found " << programs.size() << " unique RBF programs with " << i << " loops/increments." << endl;
if (i >= 10) {
check_programs();
}
}
return 0;
}
After ~31 seconds and <1GiB, outputs:
Found 1 unique RBF programs with 0 loops/increments.
Found 6 unique RBF programs with 1 loops/increments.
Found 21 unique RBF programs with 2 loops/increments.
Found 62 unique RBF programs with 3 loops/increments.
Found 207 unique RBF programs with 4 loops/increments.
Found 870 unique RBF programs with 5 loops/increments.
Found 3774 unique RBF programs with 6 loops/increments.
Found 16431 unique RBF programs with 7 loops/increments.
Found 73824 unique RBF programs with 8 loops/increments.
Found 344618 unique RBF programs with 9 loops/increments.
Found 1641549 unique RBF programs with 10 loops/increments.
55543 Programs are valid start Programs.
76522 Programs are valid end Programs.
Checking 4250261446 combinations (0.0981115% of 4.33207e+12 possible combinations).
0 0 0 --> 0 ? ?
0 1 0 --> 0 ? ?
0 2 0 --> 0 ? ?
0 3 0 --> 0 ? ?
0 4 0 --> 0 ? ?
1 0 0 --> 0 ? ?
1 1 0 --> 1 ? ?
1 2 0 --> 2 ? ?
1 3 0 --> 3 ? ?
1 4 0 --> 4 ? ?
2 0 0 --> 0 ? ?
2 1 0 --> 2 ? ?
2 2 0 --> 4 ? ?
2 3 0 --> 1 ? ?
2 4 0 --> 3 ? ?
3 0 0 --> 0 ? ?
3 1 0 --> 3 ? ?
3 2 0 --> 1 ? ?
3 3 0 --> 4 ? ?
3 4 0 --> 2 ? ?
4 0 0 --> 0 ? ?
4 1 0 --> 4 ? ?
4 2 0 --> 3 ? ?
4 3 0 --> 2 ? ?
4 4 0 --> 1 ? ?
RBF: [0[1-0[1+2+1]1]1[0-1-0]0]0-1[2-2[1-3+1-0]1[3-3+1]3]2
BF: [>[<->[>+<+]]<[>-<-]]>->[-<[>>-<<+<->]>>[-<<+>>]<]
BF (starting at register 2): <<[>[<->[>+<+]]<[>-<-]]>->[-<[>>-<<+<->]>>[-<<+>>]<]
Woohoo! It works. The original goal was to be able to search up through programs that were at least as "complex" as the multiplication BF from the beginning. We determined that's roughly 40 BF instructions, 19 RBF instructions, or 4 loops and 11 increments. On my 16GiB laptop, this program can find all the RBF programs with up to 10 loops/increments and then search through all 2-gram concatenations of those programs within a minute and all RBF programs up to 11 loops/increments before it runs out of memory! From unimaginably large numbers of possible BF programs to search through to trillions of RBF programs all the way done to the millions that a single laptop can handle.
If you look at the StateToDesiredOutputs
class above, you can see lots of examples of other functions that we can find RBF (and BF) code for as well as some that I couldn't.
Checkout ASCII if you want to learn a little more about what numbers represent what (English) characters or Unicode and UTF-8 if you want to learn a lot more, including how to output emojis and non-English characters.
You might also enjoy learning about Turing Machines, compilers, or Common LISP.
While I wouldn't want to write a complicated program in BF, I did/do enjoy writing a program (in Python) that takes non-BF code in and outputs BF code.
The following Python-like program can be compiled to BF.
def print_number(x):
if x >= 100:
print(x / 100 + '0')
if x >= 10:
print(x / 10 % 10 + '0')
print(x % 10 + '0')
def is_digit(c):
'0' <= c and c <= '9'
def read_num():
c = read()
result = 0
while is_digit(c):
result = result * 10 + (c - '0')
c = read()
result
def start():
print("Enter equation to solve (e.g. '123 / 45'): ")
a = read_num()
op_code = read()
read()
b = read_num()
print("\n")
if op_code == '+':
result = a + b
elif op_code == '-':
result = a - b
elif op_code == '*':
result = a * b
elif op_code == '/':
result = a / b
else:
print("Sorry! Only +, -, *, and / are supported operations but we got '", op_code, "'.\n")
exit()
print_number(a)
print(" ", op_code, " ")
print_number(b)
print(" = ")
print_number(result)
print("\n")
If you run that BF program and type in something like "6 * 7", it outputs:
Enter equation to solve (e.g. '123 / 45'): 16 * 7
16 * 7 = 112
I think it'd take too long to do a proper write up about the compiler and the compiler's code needs to be cleaned up, tested, and documented but it works!