--- title: "Variables, Constants, and Arrays" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Variables, Constants, and Arrays} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) library(froth, quietly = TRUE) ``` Forth-like languages are nice because they don't require a lot of variables. However, sometimes there's a problem that requires saving a particular value for later usage that can't be on the stack. This is where variables, constants, and arrays come in. ## Variables Let's say we want to store a value somewhere outside of the stack. We can initialize a variable using the `VARIABLE` word. ``` fr> VARIABLE testvar ok. ``` Nothing happened! That's because this word only allocates space for a variable without doing any other actions. We can now call the name of the variable at any time to push its address onto the stack. In Forth, this would be an actual memory address. `froth`, however, uses an abstracted memory map. ``` fr> test .s [[1]] [1] "addr " [[2]] NULL ok. ``` We can see the address of the variable at the top of the stack. We'll come back to the `cell 0` bit later. To store a value in a variable, we use `!`. We can copy the value in an address to the stick using `@`. ``` fr> variable newvar ok. fr> 20 newvar ! ok. fr> newvar @ . 20 ok. ``` Notice that for `!` we push the value first, then the address. It's also important to note that the variable retains its value even after calling `@`. ``` fr> variable newvar ok. fr> 20 newvar ! ok. fr> newvar dup @ . dup @ . @ . 20 20 20 ok. ``` `@ .` is a pretty common call to inspect the value of a variable. In fact, it's so common that there's a shortcut word to perform both, `?`. ``` fr> variable newvar 20 newvar ! ok. fr> newvar ? 20 ok. ``` Variables are often useful as counters, using the `+!` command. `+!` is similar to `!`, except that it adds to the variable rather than replacing it. ``` fr> variable newvar ok. fr> 0 newvar ! ok. fr> 5 0 do 1 newvar +! loop newvar ? 5 ok. ``` ## Constants Sometimes we just need a name for a value. Variables that will never change values are called constants, and can be fittingly be defined with the `CONSTANT` word. ``` fr> 100 CONSTANT hundred ok. fr> hundred . 100 ok. ``` Constants avoid the hassle of dealing with addresses, and can be accessed slightly faster than variables. Calling the name of the variable copies its value directly onto the stack. ## Arrays Remember when I said we'd come back to that whole "`cell 0`" thing? By default, initializing a variable creates space for one "thing" at an address. Unlike Forth, `froth` creates this space at an abstracted memory location, and that space is big enough for any arbitrary R object. However, sometimes it's nice to have a variable that can store more than one value, like in the case of vectors or lists in R. Let's start by initializing an array. The definition proceeds like in normal variables, we just have to further specify that we want the variable to be larger than a single value: ``` fr> VARIABLE myarray 5 CELLS ALLOT ok. ``` This initializes a variable with 5 slots for things. Internally, this is the same as a length 5 list. The word `ALLOT` allots memory for the preceding variable, and we'll come back to `CELLS` in a moment. Previously, we saw this behavior: ``` fr> myarray .s [[1]] [1] "addr " [[2]] NULL ok. ``` `cell 0` means that this address is pointing to the first value of the array `myarray`. Cells are 0-indexed in `froth`, so this is the first entry in the array. Now as before, we can store a value at the first cell. I'm assuming the address is still on the stack, so remember we need to call `swap` to ensure the top value is the address. ``` fr> 1 swap ! ok. fr> myarray ? 1 ok. ``` Ok, now we can look at the word `CELLS`. Let's start by inspecting what we get by pushing `CELLS` to the stack: ``` fr> 4 CELLS .s clear [[1]] [1] "addr <, cell 4>" [[2]] NULL ok. ``` It's the fourth cell of...nothing? `CELLS` creates a unique kind of address--it stores cell number, but it doesn't point to any variable. Words like `ALLOT` expect one of these objects to tell it how much space to allocate. While `CELLS` calls don't point to anything, we can add them to variable addresses to get to other values in the array. ``` fr> myarray ok. fr> 4 CELLS + .s [[1]] [1] "addr " [[2]] NULL ok. ``` Notice how we've now gotten an address to `myarray` that's offset by 4 cells. Since the size of `myarray` is 5, this is the final entry in the array (indexing starts at 0, so the 5 entries are 0-4). This address works like the examples in the Variables section, so we can use `!`, `@`, `+!`, and `?` on it: ``` fr> VARIABLE myarray 5 CELLS ALLOT ok. fr> 0 BEGIN dup 5 < WHILE dup dup myarray swap CELLS + ! 1 + REPEAT drop ok. ``` Here we're putting it all together. See if you can guess what the contents of `myarray` now look like before reading the next section. We start by initializing an array `myarray` of size 5. Then, we put `0` on the stack and `begin` a loop. At each iteration of the loop we: 1. Dup the top of the stack and check if less than 5; if not, go to (8) 2. Dup the top of the stack twice. 3. Load the address of `myarray`, swap it with the top duplicate 4. Create a cell offset equal to the top duplicate, add to the address of `myarray` 5. Store the second duplicate (from 2) at the offset location 6. Add 1 to the value on the stack 7. loop 8. finish loop So what happens? When the stack is 0, we store 0 at the 0th position of `myarray`, then 1 at the 1st position, 2 at the second, etc. Let's print out the entire array: ``` fr> 5 0 do myarray i cells + ? loop 0 1 2 3 4 ok. ``` It's also possible to initialize an array to a specific value using the `CREATE` and `,` words. *Pay very close attention to this syntax!* ``` fr> CREATE myarray 0 , 1 , 2 , 3 , 4 , ok. fr> 5 0 do myarray i cells + ? loop 0 1 2 3 4 ok. ``` Did you notice how we used `,`? Unlike English, `,` is a `froth` word, meaning that it needs to be space separated and it acts upon the stack. This means that we **must** include a comma after the last element to ensure it's added to the array. ## Modifying Arrays It's often useful to be able to quickly change the values or the size of allocated arrays. To change the values in an array, we can use the `FILL` word. `FILL` expects three items on the stack: a variable address, a `CELLS` offset `n`, and a value `x`. It then fills the first `n` cells with `x`. ``` fr> variable filltest 3 cells allot ok. fr> 3 0 do i filltest i cells + ! loop ok. fr> 3 0 do filltest i cells + ? loop 0 1 2 ok. fr> filltest 3 cells 10 fill ok. fr> 3 0 do filltest i cells + ? loop 10 10 10 ok. ``` There's also the shortcut word `ERASE`, which is equivalent to `0 FILL`. ``` fr> filltest 3 cells erase ok. fr> 3 0 do filltest i cells + ? loop 0 0 0 ok. ``` That changes values, but what if we end up with the wrong size array altogether? For that, we have two options: `EXTEND` and `REALLOT`. Both of these words have the same form; they expect the stack to be comprised of a variable address `v` and a value `n`. `EXTEND` will make the array at `v` larger by `n` cells, and `REALLOT` will change `v` to have exactly `n` cells (copying the contents of the first `n` cells, if any). ``` fr> variable myarray 3 cells allot ok. fr> myarray 5 cells + ? Error: array accessed out of bounds! fr> myarray 2 cells extend ok. fr> myarray 5 cells + ? 0 ok. fr> 10 myarray 1 cells + ! ok. fr> myarray 3 cells reallot ok. fr> myarray 5 cells + ? Error: array accessed out of bounds! fr> myarray 1 cells + ? 10 ok. ``` If you're ever unsure about how large your array is, use the `LENGTH` word to copy the length of an address to the stack, and `LENGTH?` to print it out. ``` fr> variable myarray 10 cells allot ok. fr> myarray LENGTH . 10 ok. fr> myarray 20 cells reallot ok. fr> myarray LENGTH? 20 ok. ``` ## Execution Tokens Variables storing values is nice, but sometimes it's useful to be able to store a function inside a variable. This is a common practice in R (e.g., any of the `*apply` functions), and also exists in C in the form of function pointers. Forth also incorporates a way to do this, using a unique word to put an "execution token" on the stack. An execution token is essentially just the location of a particular type of code. When you want to execute that token, Forth runs the code at the location referenced by the execution token. This capability is implemented in `froth`, but it does have a few differences to classic Forths due to how the memory is abstracted. Let's start with a simple example: ``` fr> : GREET ." Hello!" ; ok. fr> GREET Hello! ok. fr> ' GREET . executable token < greet > ok. fr> ' GREET EXECUTE Hello! ok. ``` We have two new words here: `'` and `EXECUTE`. The first of these, `'` (pronounced "tick"), creates an execution token for the word that follows it, then pushes it to the stack. If we print this value out, we get `executable token < greet >`. Forth languages would traditionally return memory location, but `froth` just shows the word that the execution token corresponds to. The second word, `EXECUTE`, does just that: it executes the execution token at the top of the stack. In a way, `' GREET EXECUTE` is just a really roundabout way of accomplishing the same thing as just calling `GREET` directly. Naturally, one might wonder what the purpose of this is. We can use `'` to store functions in variables so we can *indirectly* execute them later on. Let's see how to do this with an example: ``` fr> : HELLO ." Hello" ; ok. fr> : GOODBYE ." Goodbye" ; ok. fr> VARIABLE 'aloha ' HELLO 'aloha ! ok. fr> : ALOHA 'aloha @ EXECUTE ; ok. ``` I'll show what happened in a minute, but I'd encourage readers to first think about this code. What's going on here? What would running `'aloha` do? Read through the code, then continue on to the explanation below. The first two lines should be simple by now. We've defined two functions, `HELLO` and `GOODBYE`, that simply print out `Hello` and `Goodbye` (respectively). Then, we initialize a variable called `'aloha`, and store an execution token for `HELLO` inside that variable. Finally, we define a function `ALOHA` that executes the execution token stored in `'aloha`. This implies two things. First, if we run `ALOHA`, we should expect it to call `HELLO`: ``` fr> ALOHA Hello ok. ``` And indeed it does. Secondly, if we store a different function pointer in `'aloha` and then call `ALOHA`, we should expect to see a different result: ``` fr> ' GOODBYE 'aloha ! ok. fr> ALOHA Goodbye ok. ``` Here lies the power of execution tokens. Using a single variable, we can call multiple functions! I'll also note that we called our variable `'aloha` ("tick-aloha"). This convention of preceding the variable name with a `'` is used in Forth to denote variables that store execution pointers. ### For Forth programmers There is a one major difference here between Forth and `froth`. `froth` is completely interpreted, and thus does not distinguish between variable definition lines and non-definition lines. What this means for Forth programmers is that the `froth` word `'` is equivalent to the Forth word `[']` when used in a definition context. `froth` cannot look at the input stream, and will instead always look at the next word on the execution stack. I'm thinking of changing this to be more consistent with Forth in the future, but I haven't gotten to it yet. This implications of this are demonstrated by the following: ``` fr> : SAY ' 'aloha ! ; ok. fr> SAY HELLO Error: can't find 'aloha fr> : COMING ' HELLO 'aloha ! ; ok. fr> : GOING ' GOODBYE 'aloha ! ; ok. fr> COMING ALOHA GOING ALOHA Hello Goodbye ok. ``` In traditional Forth, the `GOING` definition would be `: GOING ['] GOODBYE 'aloha ! ;`, and `SAY HELLO` would not produce an error. However, this isn't the case here. `[']` is also provided as an alias for `'`. Again, I plan to change this eventually. ## Words in this chapter - `VARIABLE xxx`: creates a variable named `xxx` - `! ( n addr -- )`: stores the value `n` at address `addr` - `@ ( addr -- n )`: copies the value at `addr` to the stack - `? ( addr -- )`: prints the value of `addr` - `+! ( n addr -- )`: adds the value `n` to the value at `addr` - `CONSTANT xxx (n -- )`: creates a constant called `xxx` that stores `n`; `xxx` returns `n` when called - `ALLOT ( addr ncells -- )`: allocates `ncells` cells at `addr` - `CREATE xxx y1 , y2 , ... yn ,`: creates an array `xxx` with values `y1, y2, ... yn` - `CELLS ( n -- )`: creates a memory address offset for arrays - `FILL ( addr ncells val -- )`: fills `ncells` cells of memory beginning at `addr` with `val` - `ERASE ( addr ncells -- )`: fills `ncells` cells of memory beginning at `addr` with 0 - `REALLOT ( addr ncells -- )`: reallots array at `addr` to have size `ncells`. - `EXTEND ( addr ncells -- )`: extends the array at `addr` by `ncells` cells - `LENGTH ( addr -- len )`: pushes the length of the array at `addr` onto the stack - `LENGTH? ( addr -- )`: prints the length of the array at `addr` - `' xxx ( -- addr )`: attempts to find `xxx` in the dictionary, and pushes an execution token for `xxx` to the stack if found - `EXECUTE ( xt -- )`: executes an execution token on top of the stack - `['] xxx ( -- addr )`: currently equivalent to `'` for `froth`