The MPS language
Introduction
MPS is an abbreviation of the Swedish for "My Programming Language". It dates
back to about 1974, when I built a computer from loose
integrated circuits like shift registers, counters, adders
and the like. This was the time when you could buy the
very first microprocessor, the Intel 4004. I discovered that
it took a full page of code to add two 16-bit numbers, so I
decided to build my own computer instead. For it, I made
a language, that worked like assembler, but which could
be written so that it looked like higher level programming.
My computer, which was called Dator 108 (Computer 108),
was an accumulator machine, where the computational results
accumulated in an accumulator register.
Later I became acquainted with the principles of stack
machines, so I developed a new language called MNPS.
However, there was never any machine for it.
When I came across the Propeller computer, I used that
language for it, under the name of Myra, from
the Swedish word for ant. But the Propeller isn't
actually a stack machine, so it cost code size and
computation time to adapt the stack language to a
non-stack machine.
So I decided that the MPS language would make a comeback
after 40 years, but now as a language for the Propeller
computer. I have also thought of adapting MPS to the
ARM architecture, e.g. a Raspberry Pi, but we will see
how that goes.
Virtual architecture
The language is designed for a machine that has
an accumulator register. If we write
+op
it means
accu + op ->accu
where accu is the value of the accumulator register
accu
If we write
x +u ->x
what happens is this:
- The value of the variable x (in memory) is loaded
into accu
- The sum of the value of accu and the variable
u (in memory) is written into accu
- The value of accu is written back to the
memory to the variable x
In fact, accu is also a cell in the memory. Indeed,
all this is a waste of resources in the Propeller computer,
where all memory cells are equal, and data can be transferred
from any cell to any other cell. But it is a way of standardizing
memory use, to make it easier to read and write programs.
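As a rough illustration, the load, add and store steps above can be modelled in a few lines of Python. This is only a sketch and not part of MPS: the function name run and the dictionary used as memory are assumptions made for the example, and only the three instruction forms used in x +u ->x are handled.

```python
# Minimal model of the MPS virtual machine: an accumulator plus named memory cells.
# Only the forms 'name', '+name' and '->name' are understood by this sketch.

def run(code, memory):
    """Execute space-separated MPS-like instructions and return the final accu."""
    accu = 0
    for instr in code.split():
        if instr.startswith("->"):        # store: accu -> variable
            memory[instr[2:]] = accu
        elif instr.startswith("+"):       # add: accu + op -> accu (32-bit wrap)
            accu = (accu + memory[instr[1:]]) & 0xFFFFFFFF
        else:                             # bare name: load the variable into accu
            accu = memory[instr]
    return accu

memory = {"x": 5, "u": 7}
run("x +u ->x", memory)
print(memory["x"])  # 12
```

Splitting the code on whitespace is what makes the placement of spaces significant: '+u' is one token, while '+ u' would be two.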
Besides accu, the virtual machine has a program counter,
though in the Propeller the program counter is a
specialized register. It also has an index register, index,
which allows for relative addressing into arrays and the
like. There is also a register that holds the current
memory address, but this is only temporary in character.
The code
x +u ->x
is typical. The code is written "chronologically", i.e.
you have to compute the value of x before you can store it.
Hence we use a '->' symbol, rather than a '=' symbol or
a ':=' symbol. Instructions are separated by spaces
(and also by linefeeds if you write code on more than
one line). '+u' is one instruction. Hence it doesn't
contain a space character. So the placement of spaces
is crucial.
All this is about a portion of the code, called the
executable part (what the program actually does). We will
come back to this soon, but first something about the
code as a whole.
General program structure
An MPS application is written into one single file,
though that file may refer to other, so called object
files. That file has the following parts:
- Optionally a crystal message, if one uses
a crystal with some other frequency than
5 MHz. The format is
10MHz chrystal
10 MHz and 20 MHz are supported. The Spinstamp
hybrid circuit, which seems to have been discontinued,
had a 10 MHz crystal.
- The system name keyword followed by a
system name. This name is used for naming all files
that are generated during compilation. It is
usually practical to let this be the same name
as the name of the file.
- The global keyword followed by declarations
of global variables. These variables reside in
the Propeller's global RAM, and are used
for communicating between the different parallel
processes. The variables are declared each on
one line. They can't be initialized (i.e. they
are initialized to zero). MPS doesn't recognize
any type concept, so all variables are seen as
32 bit numbers. Hence, the only thing one needs
to mention on a declaration line, is the variable
name.
- The globaldata keyword followed by
declarations of global data blocks. These data
blocks are initialized from files. Each block
is declared on one line as in this example:
block @ablock from(blockfile.mpd)
ablock is a 'handle' to the block, so that it can
be found. The variable 'ablock' has to be declared
as a global variable. The variables in the block
reside in the Propeller's global RAM. The actual
data come from the file 'blockfile.mpd'.
- The exec keyword followed on the next
line by an execution pattern. Items separated
by spaces in this pattern, represent processes
that are run in parallel, each one in its own
cog. Such an item may be further divided into
subitems separated by '/'-characters. The first
of these subitems is a number indicating a
semaphore. The next subitem is the process executing
when the semaphore is 0, the one after that the process
executing when the semaphore is 1, etc. The semaphores
are the topmost global variables. Number 0 cannot
be used. The semaphores can have any names you wish,
but in the execution pattern, they are represented
by their numbers, and not by their names.
These "semaphores" are not semaphores in the
Propeller system's sense; they are just global
variables controlling execution in the way described
here.
- The process codes, as described below,
each starting with the keyword process,
and ending with a '\'-sign (backslash).
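To make the structure of an execution pattern concrete, here is a small Python sketch that splits a pattern into cogs, semaphore numbers and process lists. It is not the compiler's code. The function name parse_exec_pattern and the sample pattern with the process names main, blink and fade are invented for the example.

```python
def parse_exec_pattern(pattern):
    """Split an execution pattern into one entry per cog.

    Each entry holds the semaphore number (or None when the item has no
    '/'-separated subitems) and the list of processes selected by the
    semaphore values 0, 1, 2, ...
    """
    cogs = []
    for item in pattern.split():
        parts = item.split("/")
        if len(parts) == 1:
            cogs.append({"semaphore": None, "processes": parts})
        else:
            cogs.append({"semaphore": int(parts[0]), "processes": parts[1:]})
    return cogs

# Hypothetical pattern: one cog always runs 'main'; a second cog runs
# 'blink' while semaphore 1 is 0 and 'fade' while it is 1.
cogs = parse_exec_pattern("main 1/blink/fade")
```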
A line starting with a '--'-sign is a comment. Anything
that follows a '--'-sign on a line is also a comment.
Process code
As mentioned, processes start with the process
keyword and end with a backslash. In between, there
are three segments of code. The first is a description of
the process. The process may then call functions. These
can appear in the second segment. But they can also
be loaded from external files. For this purpose, there
is the third segment, which is a load segment.
Let's return then to the first segment; the description
of the process proper.
It starts with the process keyword followed by
the process name. This is the name that is referred to in
the execution pattern. Then follow the
variable declarations. These contain the variable name
and optionally a '='-sign and an initial value. (These
"local" variables which reside in each cog's memory can
be initialized.) Initial values may be given with
- Normal numbers, representing decimal values.
- A hexadecimal representation, preceded by a '$'-sign.
Up to 8 hexadecimal digits may be given, but if
they are fewer, they represent the least significant
part of the word.
- A binary representation, preceded by a '%'-sign.
- A letter or other sign, surrounded by quotation
marks ("). This represents a character, but
"physically" the characters are represented by their
ASCII codes.
- If the variable name is the name of an array,
the values of the respective elements are separated
by commas.
- A string may then be represented as a series
of characters surrounded by quotation marks,
separated by commas. But the string can also
be written as a contiguous line of characters,
like "Hello". This is a null terminated string,
i.e. the system adds a 0 at the end, which is the
ASCII code of the null character.
- An array can also be declared without being
given initial values. In this case, the size
of the array (in longwords) is given between
brackets, like buffer[64].
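The initial-value notations listed above can be summarized in a short Python sketch that turns an initializer into a list of 32-bit words. This is an illustration only and not the compiler's code: the function names are made up, and the simple split on commas means that a quoted comma character would not be handled.

```python
def parse_value(token):
    """Translate one initial-value token into a list of 32-bit words."""
    if token.startswith("$"):                       # hexadecimal
        return [int(token[1:], 16) & 0xFFFFFFFF]
    if token.startswith("%"):                       # binary
        return [int(token[1:], 2) & 0xFFFFFFFF]
    if token.startswith('"') and token.endswith('"'):
        chars = token[1:-1]
        words = [ord(c) for c in chars]             # ASCII codes
        if len(chars) > 1:                          # contiguous string: add the 0
            words.append(0)
        return words
    return [int(token) & 0xFFFFFFFF]                # decimal

def parse_initializer(text):
    """Split a comma-separated initializer and flatten the resulting words."""
    words = []
    for token in text.split(","):
        words.extend(parse_value(token.strip()))
    return words

print(parse_initializer('"Hello"'))  # [72, 101, 108, 108, 111, 0]
```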
The variables are followed by the begin keyword,
followed by the executable code.
A process doesn't have a 'return' statement. It may
be an infinite loop, if you wish, with an initial part.
You can end the process with an infinite empty loop,
which you can describe as A:here (this means "Jump Always
here"). Alternatively you can use a stop instruction.
The difference between the stop and the empty loop, is that
the stop instruction saves power, but it also disables
all outputs.
Another type of end of a process is that a semaphore
is changed, which may mean that the process is overwritten
with new code. This may take a little time, though, so
it may be wise to put a stop or A:here after it. Otherwise
execution may continue out in the empty space, and you
never know what will happen.
Then we come to the second segment, which contains
function code. We come to that in the next section.
Here, then, is an example of the third segment, i.e.
the load segment:
load from (ram.mpo)
*ram.init
*ram.write
After load from comes the filename.
The rest is a 'point list' of functions to load.
Functions may be loaded from several object files,
which requires several constructs like the one above.
We will return to object files in a while.
Function code
Functions which follow the process code are
written almost like the process code.
The code starts with the function keyword
followed by the function name, and optionally a so
called suffix which is a hyphen (-) followed
by a string. More about the suffix later. After
that follows variable declarations as in the process
code, the begin keyword, executable code, and
last there is one and only one return
instruction.
Object files
Object files are collections of functions to be used
by processes, and sometimes by other object files.
Object files should contain such collections, which
deal with some sort of "thing" in the external
world, like a sensor, or an external memory. But
they could also deal with more abstract things, like
a communication interface standard, or a mathematical
number type. It's a good convention to mark all these
functions with an object prefix, followed by a dot.
So as an example a function that gets the reading of
a compass could be called compass.get, and if
some initialization of the compass is required, there
could be a function compass.init.
Besides the functions, an object file can contain
variables, called field variables. The variables are
declared on one line each. The list is preceded by
the keyword fields and is terminated by the
keyword endfields. The field variables represent
the state of the object. A common type of field variable
is the pin number to which a device is connected. These
pin numbers are usually set in an init function.
A function declaration starts with the function keyword,
and ends when the keyword function appears again. Last
in the function declaration, there may be a load section,
to load other functions called. There has to be such a
load section, also when a function calls a function within
its own object file. The load section has the same syntax
as the load section in process code.
Object oriented language?
MPS is not fully an object oriented language. But it
has the most essential aspect of object orientation.
This is that a program should be built with components,
that correspond to things, that belong to the
problem to be solved.
In "real" object oriented languages, the objects are
instances of classes. Instances of a class all have
the same functions (called methods in OOL-jargon) but
the instances have individual field variables. In MPS
we can arrange this by letting the field variables be
arrays. In a true OOL, we could write:
Dog fido = new Dog();
fido.give(food);
After this, the variable fido.hungry would be false.
In MPS we could do this:
-- fido is dog number 3 (this is just a comment)
( 3 , food , )give
Now [3] hungry is false. (Here again, everything is
written in "chronological order"; we will come back
to this in a while.)
The OOL solution is more elegant, but the MPS solution
works. It's just that you shouldn't forget to put
an instance number first in your function calls. (If
I am not wrong, the "MPS solution" is the one used
for the Ada language, which wasn't originally an OOL)
One thing that is definitely missing in MPS is inheritance,
i.e. that an object can be a specialization of another,
the way a student is a special kind of human.
Variable types and scopes
We distinguish between the following types of
variables:
- Global variables reside in the Propeller's
global RAM. They are the primary means of communication
between parallel processes. They are declared
in the global section before all the different
process codes. These variables are accessed in a
special way through the readlong and writelong
instructions.
- Global data blocks represent a way to
give global variables initial values. These are
taken from files. Typically a data block can represent
a font, or a list of initial parameter values for some
device. These data are mostly intended to be
constants, i.e. you don't write to them. But in
the context here, they behave as global variables.
- Field variables are declared between
fields and endfields in object files.
They can be seen as state variables for an object.
- Public variables are variables which
are accessible throughout a process. Normally
they are declared in the process code.
- Private variables are variables which
are accessible only within a function.
Everything we write in MPS is converted to Propeller
assembler (and some SPIN). Propeller assembler doesn't
have very much notion of scope, though. If ever you declare
some variable twice with the same name, you will get
an assembler warning: "Symbol is already declared". If
you declare it only once, it is still accessible throughout
a process. Hence, to make a private variable inaccessible
outside a function, requires a special effort.
Here are the principles:
- Global variables must have their own names.
- Nonglobal variables are not accessible outside
the process where they are declared.
- Field variables could be mixed up with like-named
variables in a process that uses the object.
Therefore, field variables should have names
marked with the object file they pertain to. It
seems to be easier to read the code if this
marking is put at the end of the name, e.g.
inpin_compass
- To keep private variables private, we use
a suffix mechanism. The suffix for a function
is written after a hyphen (-) after the
function name. All labels and all variables
declared in that function are suffixed with
the suffix before compilation. Hence, if the
suffixes are unique, the private variable names
will be unique, and thus private. But all these
suffixes don't disturb you when reading the
MPS code.
- A variable declared in the process code, or in
an unsuffixed function, is public throughout the
process.
- If the same name is used for a private variable
and a public variable, the compiler will prefer
to see it as a private variable. So the suffix
will be added to the variable.
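The suffix rule, including the preference for the private name when a name is both private and public, can be sketched in Python as follows. The function name resolve and the underscore used to join name and suffix are assumptions; the text does not say how the compiler actually joins them.

```python
def resolve(name, private_names, suffix, sep="_"):
    """Return the assembler-level name for a variable used inside a function.

    Names declared in the (suffixed) function are private and get the suffix;
    all other names are left alone and therefore stay public.
    """
    if suffix and name in private_names:
        return name + sep + suffix      # joining with '_' is an assumption
    return name

private = {"count", "tmp"}              # declared in a function with suffix 'cg'
print(resolve("count", private, "cg"))  # count_cg
print(resolve("total", private, "cg"))  # total
```

With an empty suffix (an unsuffixed function) every name is returned unchanged, which corresponds to the variables being public throughout the process.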
Executable code
So now, at last, we come to the executable code, i.e.
the instructions. The principle is simple. If you
chop up the code after begin along spaces and line feeds,
you get the individual instructions, which are executed
in that order.
There are a number of categories of instructions.
Binary instructions
These are "mathematical" operations on data, and two
data are involved (hence the name binary). One of them
is accu, and the other one, which we call
op, is referred to in the instruction. op
can be
- the name of a variable. In that case the
value of that variable is used.
- something identifiable as a number. In that
case that number is used. Numbers can be
given in hexadecimal form if they are preceded
by a $-sign, or they can be given in binary
form, if they are preceded by a %-sign
- a single character enclosed in quotation marks
(e.g. "a"). In that case, the ASCII-code for
the character is used.
- a variable name preceded by a "#"-sign.
In that case, the address of the variable is
used.
Then, quite in general, if ¤ is an operation, ¤op means:
perform the operation between accu and op,
and write the result into accu.
The operations are
- load. This operation is written "anonymously" as 'op'.
The operation creates the value of op
regardless of what accu is. So the instruction
'op' will load op into accu
- +, addition
- -, subtraction (accu-op)
- *, multiplication (this is done with a multiplication
routine, that you have to load from the object file
'std.mpo')
- (There is no division instruction, but there are
division routines at 'std.mpo')
- &, logical and
- |, logical or
- X, exclusive or (this makes the capital X a
reserved name)
- <<, left shift. accu is shifted left
op steps. For the normal data representation,
this means that accu is multiplied by 2 op
times.
- >>, right shift. accu is shifted right
op steps. This is an arithmetic shift, which
means that the sign bit (the leftmost bit) is
shifted in from the left. For the normal data
representation, this means that accu is
divided by two op times.
- N>>, logical right shift. Here zeroes are shifted
in from the left.
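The difference between the two right shifts is easiest to see on a negative number. The Python sketch below models the three shift operations on 32-bit words; the function names are invented for the example, and Python integers are masked to 32 bits to imitate the register width.

```python
MASK = 0xFFFFFFFF

def to_signed(value):
    """Interpret a 32-bit word as a two's complement number."""
    value &= MASK
    return value - (1 << 32) if value & 0x80000000 else value

def shift_left(accu, op):               # models <<
    return (accu << op) & MASK

def arithmetic_shift_right(accu, op):   # models >> (sign bit shifted in)
    return (to_signed(accu) >> op) & MASK

def logical_shift_right(accu, op):      # models N>> (zeroes shifted in)
    return (accu & MASK) >> op

minus8 = -8 & MASK
print(to_signed(arithmetic_shift_right(minus8, 1)))  # -4
print(hex(logical_shift_right(minus8, 1)))           # 0x7ffffffc
```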
Unary instructions
These instructions only involve one piece of data,
namely the value of accu. So they are mappings
from accu to accu. Some of them don't affect accu
at all.
- - ,negation. accu is multiplied
by -1. (for the standard binary 2's complement
data representation)
- C, complementation. Every 0 in
accu is replaced by 1, and every 1 by 0.
(As a consequence, capital C is a reserved symbol.
You can't declare a variable C, and then load
it)
- Z, sets accu to boolean true,
if accu is exactly zero. Otherwise
boolean false. A boolean true, is a 1 in the
least significant bit. All other bits are zero.
A boolean false is all bits zero. There are
no reserved symbols for true and false.
If accu already has a boolean value,
Z acts as a logical negation (true becomes false,
and false becomes true).
- G, sets accu to boolean true,
if accu is strictly greater than zero.
- now, loads the value of the Propeller's
real time counter cnt into accu.
- wait, halts execution for accu
Propeller ticks. With a 5 MHz crystal and a
PLL setting of 16, one tick corresponds to
1/80'000'000 seconds (12.5 nanoseconds).
Waiting less than approximately 120
ticks would cause trouble: when the check is
made with cnt for the first time, cnt will
already have passed the desired value, so the
program would halt until the 32-bit cnt has
wrapped around (almost a minute at 80 MHz). The
wait function prevents this by setting a
floor to the value.
- waituntil, halts execution until
cnt reaches the value of accu.
For this instruction there is no safety measure
like for wait.
- stop stops the cog in which it is
executed. The cog goes into a low power mode,
but this also means that all outputs are
disabled.
- ( is a compilation directive to
reset a mechanism for sending arguments to
a function. We will return to this later.
- ,. This is an instruction for sending
of arguments to a function. We will return to
this later.
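The four purely arithmetic unary instructions in the list above can be modelled in Python as shown below. This is a sketch with invented function names; it assumes that G compares accu with zero as a signed (two's complement) number, which the text implies but does not state outright.

```python
MASK = 0xFFFFFFFF

def negate(accu):
    """Models the unary '-': two's complement negation."""
    return (-accu) & MASK

def complement(accu):
    """Models C: every bit is inverted."""
    return ~accu & MASK

def is_zero(accu):
    """Models Z: boolean true (1) if accu is exactly zero, else false (0)."""
    return 1 if (accu & MASK) == 0 else 0

def is_positive(accu):
    """Models G: true if accu, read as a signed number, is greater than zero."""
    value = accu & MASK
    return 1 if 0 < value < 0x80000000 else 0

print(is_zero(is_zero(5)))       # 1: applying Z to a boolean negates it
print(is_positive(negate(1)))    # 0: -1 is not greater than zero
```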
Function call and return
- )fun is a call to the function fun.
It can be combined with ( and , to
send arguments to a function
- return is an instruction to return from
a function. The mechanism here is the one used
in the Propeller system. The address to return
to is written into the return instruction of the
function at the moment of calling. This is a
nice solution, but it prevents a function from
calling itself.
Labels
An "instruction" (i.e. a code fragment surrounded
by spaces or linefeeds) which starts with a colon
(:) is a label. It is not an instruction to be
executed, but it marks a place in the sequence of
instructions to jump to.
In the assembler code, there is an effort to place
the labels at existing instructions, but this fails,
if there is reason to put more than one label on
one instruction. In that case, a nop-instruction is
inserted to carry one of the labels.
Jumps
In Propeller assembler (and ARM by the way)
all instructions can be conditional. In Myra I used
this, but in MPS I don't. The reason is that there
is a "security risk". Whether a condition is satisfied
or not is measured by means of two status registers.
These are updated when you want them to be. Now, you can
have a chain of conditioned instructions. While you
execute them, you don't change the status registers.
But what if you have a function call? Are you sure
that the function doesn't contain similar chains
of conditioned instructions? If it does, you would change
the status registers in there. This seems to be
dangerous to me. I have thought of different kinds of
stack mechanisms, where the status before the function
call could be restored. But it seems complicated.
So my decision for MPS was to have only conditional
jumps. Conversely, all jumps are conditional, but there is
an always condition A.
The condition is a relation, R, between accu and a
reference value ref. If the relation is satisfied,
the jump is done. The syntax for the jump is now:
Rref:goal
If accu R ref (read as: accu related with R to ref)
then the jump is made. These instructions are
thus characterized by the presence of a colon, but the
colon is not the first character. The relations are:
- =, for accu is equal to ref
- #, for accu is not equal to ref
- >, for accu is greater than ref
- <, for accu is less than ref
- >=, for accu is greater than or equal to ref
- <=, for accu is less than or equal to ref
If you let true be a variable equal to the 'true' value, i.e.
with the LSB equal to 1, you can write =true:goal, but
there is a simpler notation, T:goal. Analogously
there is an instruction F:goal.
Finally, the always condition deviates from the standard
syntax in that there is no reference value to compare to.
The syntax is simply:
A:goal
The ref value can be given in the same ways
as the value of op, but global variables are not
allowed, and the notation #var for the address of var
is not supported.
Here's just an example:
x >30:much
Execution goes to the label :much if x is greater than 30.
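The jump syntax Rref:goal can be taken apart mechanically. The Python sketch below parses a jump instruction and decides whether the jump is taken for a given accu. The function names are invented, ref is limited to variable names and decimal numbers, and T and F are assumed to test for the values 1 and 0 respectively.

```python
import operator

RELATIONS = {"=": operator.eq, "#": operator.ne, ">": operator.gt,
             "<": operator.lt, ">=": operator.ge, "<=": operator.le}

def parse_jump(instr):
    """Split 'Rref:goal' into (relation, ref, goal); ref is None for A, T, F."""
    cond, goal = instr.split(":", 1)
    if cond in ("A", "T", "F"):
        return cond, None, goal
    for rel in (">=", "<=", "=", "#", ">", "<"):   # two-character relations first
        if cond.startswith(rel):
            return rel, cond[len(rel):], goal
    raise ValueError("not a jump instruction: " + instr)

def jump_taken(instr, accu, variables):
    """Return True if the jump would be made for this accu."""
    rel, ref, _goal = parse_jump(instr)
    if rel == "A":
        return True
    if rel == "T":
        return accu == 1        # assumption: true is the value 1
    if rel == "F":
        return accu == 0        # assumption: false is the value 0
    value = variables[ref] if ref in variables else int(ref)
    return RELATIONS[rel](accu, value)

print(jump_taken(">30:much", 31, {}))  # True
```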
Preprocessing. Extended MPS
There are certain instructions, that are replaced by
others by the compiler in a preprocessing step.
This preprocessing allows you to write more readable code.
We call the "instructions" processed in this way
Extended MPS elements.
The preprocessing right now deals with conditional code
(if-then-statements).
It allows you to write something like this:
if(x >y){ x +1 ->x } else
{ x -1 ->x }
with a hopefully obvious meaning. It's a system that lets
x follow y. The extended MPS elements are if(,
>y){, }, else, { and a second
} . The
principle is that all these elements are instructions in
themselves, so they should be surrounded by spaces. A maybe
unfortunate exception is the if( element,
which doesn't need to be followed by a space (I thought
that would be ugly).
To explain this, let's start with the simpler case without
an else branch:
if(x >y){ x +1 ->x }
First of all, the word "if" may be important to you. It
signals that there is a condition evaluation down the line.
But the computer doesn't care. So the compiler removes
if( completely. Then the conditional job is done by
skipping over the code within { and },
if the condition is not satisfied. We do
so by means of a skip label, that we create for the
purpose. So the above code is modified to:
x <=y:s11 x +1 ->x :s11
(We use the possibility to put a label anywhere on a
line in MPS).
Before I show the complete translation scheme, I'll mention
a variant of the conditional jump. A conditional jump
looks like for example #y:target. Now, if you have
introduced the if( element, it would be nice to
be able to have a balancing right parenthesis. So, just put it
there!: #y):target. The compiler will remove the
parenthesis for you.
Here now, the translations that the compiler does:
- if( is translated into nothing
- ): is translated into :. However
this fragment is found also in the NEXT
instruction (see below).
The compiler detects this case, and
avoids removing the ).
- Ry){ is replaced by R'y:Sj,
where R' represents the opposite condition, and
where Sj is a label, that the compiler invents
for the purpose
- If there is no else element, then
} is translated as the label :Sj
- If } is followed by else, then
a few things happen. Firstly, there has to be
a jump past the else branch. For this purpose
the compiler invents an else label Ej
Then, the first } is translated into
two instructions: A:Ej :Sj.
The second } is translated as the label
:Ej
- else, after having been used this way,
is removed.
- A single { can only appear in the beginning
of an else-branch, and serves no purpose, so it
is removed.
So now we can see how this example
if(x >y){ x +1 ->x } else
{ x -1 ->x }
is translated:
x <=y:S1 x +1 ->x A:E1 :S1
x -1 ->x :E1
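The translation rules above are simple enough to be sketched as a small Python preprocessor. This is not the actual compiler code, only an illustration written from the rules as stated: the function names are invented, labels are numbered S1, E1, S2, ... and the created labels are kept on a stack.

```python
INVERSE = {"=": "#", "#": "=", ">": "<=", "<=": ">", "<": ">=", ">=": "<"}

def split_relation(cond):
    """Split e.g. '>y' into ('>', 'y'); two-character relations are tried first."""
    for rel in (">=", "<=", "=", "#", ">", "<"):
        if cond.startswith(rel):
            return rel, cond[len(rel):]
    raise ValueError("no relation in " + cond)

def preprocess(source):
    """Translate extended MPS elements into standard MPS instructions."""
    tokens = source.split()
    out, stack, counter = [], [], 0
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("if("):               # if( is translated into nothing
            tok = tok[3:]
            if not tok:
                i += 1
                continue
        if tok.endswith("){"):                  # Ry){ becomes R'y:Sj
            rel, ref = split_relation(tok[:-2])
            counter += 1
            stack.append(("S", counter))
            out.append(INVERSE[rel] + ref + ":S%d" % counter)
        elif tok == "}":
            kind, n = stack.pop()
            if kind == "S" and i + 1 < len(tokens) and tokens[i + 1] == "else":
                out += ["A:E%d" % n, ":S%d" % n]
                stack.append(("E", n))
                i += 1                          # else has been used; drop it
            else:
                out.append(":%s%d" % (kind, n))
        elif tok == "{":                        # a single { is removed
            pass
        elif "):" in tok and not tok.startswith("NEXT("):
            out.append(tok.replace("):", ":"))  # balancing parenthesis removed
        else:
            out.append(tok)
        i += 1
    return " ".join(out)

print(preprocess("if(x >y){ x +1 ->x } else { x -1 ->x }"))
# x <=y:S1 x +1 ->x A:E1 :S1 x -1 ->x :E1
```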
I have tried to write the compiler in such a way, that
these conditional statements can be nested into each other,
but I haven't tested it yet. (The trick is to put
the created labels on a stack).
One may wonder how these extended MPS elements differ from
standard MPS instructions. Well, standard MPS is what we
call a context free language. Every instruction in
MPS is independent of its context. The only way, that the
instructions communicate, is with accu, index
and the program counter (and possibly through the outer world
through I/O instructions). For the extended MPS elements, it
is not so. The >0){ has to know where the label to
jump to goes. This is handled by the compiler, which has to
have some overview of the code.
The NEXT instruction
The NEXT instruction is in a sense a binary instruction,
but none of the operands is accu. Here's an example:
NEXT(i):loop
i is a variable, and it is decremented by one. If i is
then not equal to zero, execution goes to the label :loop.
This instruction is directly based on the djnz
instruction in Propeller assembler (decrement and jump
if nonzero), which is popular in many other assembler
languages too. It represents a simple way to make a
loop with a defined number of turns. Note though that
the variable i, if you want to use it in the loop,
evolves backwards, and it never reaches the value zero.
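To show what "evolves backwards" means in practice, the Python sketch below lists the values of i that the loop body sees when a loop of the form ':loop ... NEXT(i):loop' is entered with i = n. The function name is invented, and the sketch only accepts n of at least 1.

```python
def next_loop_values(n):
    """Values of i seen by the body of ':loop ... NEXT(i):loop' when i starts at n."""
    if n < 1:
        raise ValueError("this sketch only models a start value of at least 1")
    seen = []
    i = n
    while True:
        seen.append(i)      # the loop body runs with the current value of i
        i -= 1              # NEXT(i): decrement i ...
        if i == 0:          # ... and fall through when it has reached zero
            break
    return seen

print(next_loop_values(4))  # [4, 3, 2, 1]
```

The body runs n times, with i taking the values n down to 1.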
Index instructions and arrays
The following is an index instruction
[op]
It loads the value of op into the system's index register
index. Normally, op is the name of
a variable, whose value is loaded into the index register.
But op can also be something identifiable as a
number. However names of global variables are not allowed.
If the bracket is empty as
[]
then it is equivalent to [accu].
So, index is loaded. More importantly, the
instruction that immediately follows is modified, so that
its address is augmented with index. In the normal
situation, when the operand of that instruction is a
variable, we don't load that variable, but the variable
whose address is index steps higher up in memory. The
'step' here is a step in the longword sense, i.e. we move
up 32 bits, or equivalently 4 bytes in the memory.
So, in this way, we can move up in arrays of data. But where
do the arrays come from?
Well, there are two ways of declaring arrays.
- We give more than one initial value to a variable.
We separate the values by commas. So
alpha = 4,5,72,3
creates an array of four variables, with the
values 4, 5, 72, 3. Now what happens with the
following code?
[2] alpha ->x
Well, x will be 72, because that is the value of
element number 2 of the array (the value 4 is the
value of element number 0). With the same principle
we could write:
text = "H","e","l","l","o"
But we could write this simpler as:
text = "Hello"
but now the system will add a null character after
"Hello". So, now
[1] text
will load the ASCII code of "e" into accu.
- We can also just reserve space for an array
by writing
beta[128]
as a declaration. This will be an array with
128 elements (numbered from 0 to 127), with no
defined initial values. I guess the values are zero.
So beta is a space to write to before you read from
it.
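The effect of an index instruction on the instruction that follows it can be modelled in Python as shown below. This is a sketch only: memory is a list of longwords, symbols maps variable names to positions in that list (both are assumptions of the example), and only loads and stores are handled.

```python
def run(code, memory, symbols):
    """Execute loads, stores and index instructions; return the final accu."""
    accu, index = 0, 0
    for instr in code.split():
        if instr.startswith("[") and instr.endswith("]"):
            inner = instr[1:-1]
            if inner == "":                     # [] is equivalent to [accu]
                index = accu
            elif inner in symbols:              # value of a variable
                index = memory[symbols[inner]]
            else:                               # something identifiable as a number
                index = int(inner)
            continue                            # the offset applies to the next instruction
        if instr.startswith("->"):
            memory[symbols[instr[2:]] + index] = accu
        else:
            accu = memory[symbols[instr] + index]
        index = 0                               # modelling choice: offset used only once
    return accu

memory = [4, 5, 72, 3, 0]                       # alpha = 4,5,72,3 followed by x
symbols = {"alpha": 0, "x": 4}
run("[2] alpha ->x", memory, symbols)
print(memory[symbols["x"]])                     # 72
```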
Writeback instructions
Writeback instructions deviate from the pattern,
that all computations go through accu. These
instructions are for saving space and time, because they
allow you to do an operation in a single assembler
instruction. Here's an example:
+1,x<
1 and x are added, and then the result is written back
to x, which is marked by the "backwards arrow" '<'. You
can do this for any of the binary operations. Particularly
you can do it for the load instruction:
320,x<
will load the value 320 into x.
These instructions should be used when needed for speed
and small program size, as they are not terribly readable.
Note also, that accu is never affected, so you can't
go on and imagine that the result of the computation is
available in accu.
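A writeback instruction can be modelled in Python as an operation directly on the target variable, leaving accu out of the picture. The sketch below handles only addition and the anonymous load, since these are the two forms shown in the text; the function names are invented, and operands are limited to variable names and decimal numbers.

```python
MASK = 0xFFFFFFFF

def operand(text, memory):
    """Value of an operand: a variable name or a decimal number."""
    return memory[text] if text in memory else int(text)

def writeback(instr, memory):
    """Apply one writeback instruction such as '+1,x<' or '320,x<' to memory."""
    expr, target = instr[:-1].split(",")        # drop the trailing '<'
    if expr.startswith("+"):                    # addition, result written back
        memory[target] = (memory[target] + operand(expr[1:], memory)) & MASK
    else:                                       # anonymous load
        memory[target] = operand(expr, memory) & MASK

memory = {"x": 41}
writeback("+1,x<", memory)      # x is now 42
writeback("320,x<", memory)     # x is now 320
```

Note that the function neither takes nor returns accu, mirroring the fact that accu is never affected.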
There is one more type of one-assembler-instruction
instruction, the n>x instruction, that we will encounter
in the next section.
Function calling
If we want to call a function fun with two arguments
x and y, and then want to deposit the result in z, we
write this as:
( x , y , )fun ->z
In mathematics we would write
z = f(x,y)
Here, again, we have the "chronological order" in MPS.
We have to collect the arguments to the function before
we can call it, and we have to call the function before
we can store or use the result.
The spaces in the MPS code are important, because both
'(' and ',' are actually instructions. The principle is
that we pass arguments to functions over a standardized
set of variables called arg0, arg1, ...argi,... arg5.
The '(' resets a kind of index i to 0 in the compiler.
',' stores the current value of accu into the
variable argi, and increments i. Most of this just happens
in the compiler. In real time, ',' is just a store
instruction, and '(' is nothing at all. Finally, when
all the necessary arguments have been brought into the
argi variables, we can call the function, which is the
')fun'-instruction, which is a subroutine call to fun.
This mechanism means that you can have expressions in
the function call. The system just grabs the result in
accu and transfers it as an argument. But you
can't have function calls in these expressions. Then
you would have two '('-instructions before the first
')'-instruction, and the system makes no attempt to
find out what that would mean.
The expression can also be empty for the first argument
in the function, if you know that accu contains the
right value.
If you want to pass only one argument to a function,
you can just as well use accu to pass that argument.
Then the call is simply:
( x )fun
The '('-instruction doesn't do anything here, but it
doesn't do any harm either. Some functions take no
argument at all, and in that case, I have used the
convention to leave out the '(' and just write:
)fun
')' here simply symbolizes a jump.
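The compile-time bookkeeping behind '(' and ',' can be sketched in Python as a rewrite of the call into explicit stores to the argi variables. The function name compile_call is invented, and the rewritten form is only a way of showing what the instructions amount to, not the assembler code the compiler produces.

```python
def compile_call(code):
    """Rewrite '(' and ',' into explicit stores to arg0 ... arg5."""
    out, i = [], 0
    for instr in code.split():
        if instr == "(":
            i = 0                       # compile-time reset, emits nothing
        elif instr == ",":
            if i > 5:
                raise ValueError("only arg0 ... arg5 are available")
            out.append("->arg%d" % i)   # at run time ',' is just a store
            i += 1
        else:
            out.append(instr)
    return " ".join(out)

print(compile_call("( x , y , )fun ->z"))
# x ->arg0 y ->arg1 )fun ->z
```

Because i is reset by every '(', a nested call inside an argument expression would restart the numbering, which is one way to see why such nesting is not supported.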
A word of warning now. All functions use the same variables
arg0 ... arg5. So if you call a function in a function,
the new call will overwrite your arguments. So if you need
them after that second call, you have to rescue them
before the second call.
The variables argi can be used directly, but if you
want to move their values over to other variables, there
is a fast instruction for this:
3>y
is equivalent to
arg3 ->y
but it is realized as a single assembler instruction.
As in most languages, a function returns a single result.
This is done through accu, which is loaded with
the result just before the return instruction. If you
wish to return more results, you can do so through the
argi variables, and in that case, the n>x instructions
may be useful.
In a language like Java, the standard method for returning
more than one result, is to pack the result into an instance
of a class, and then return a pointer to that instance
(though in Java one has tried to make the concept of pointer
invisible. One returns the 'name' of the instance, though
technically this name symbol may be a pointer).
Another popular method of returning results is to let
the user of a function indicate where he wants his result
deposited. If we for example want a result as a string,
we can declare that string, say as txt. Then we
send the address of txt to the function as #txt.
Now, the function knows where to put the result. Another
question is how to do it. The next section is about that,
and that section is about arrays again.
The symbol L0
The symbol L0 (Local zero) represents an array which
fills out the local memory of a cog. (If we load a process
proc into a cog, the name of the first word in
the cog, must be proc, because after a cog has
been loaded with instructions, the only thing the system
can do is to jump to the first word. Hence the address
of the array L0 in this case is the same as the address
of proc). Now assuming that we know the address
adr = #var, we can read the value of var through
[adr] L0
or we can write to it through
[adr] ->L0
This gives us a mechanism to deposit our result in the
right place.
In line assembler code
A line starting with a semicolon is imported directly
into the assembler code, as it is. Hence, this is supposed
to be a line written in Propeller assembler. Such lines
are used in cases, where something can't be expressed
in MPS. Writing assembler lines like this may require an
extra effort. One case is suffixing. If you use a local
variable in MPS, the suffix is added automatically
when you use the variable, but for the in line assembler
code, you have to add the suffix manually.
The std.mpo object file
The object file std.mpo is a little like the Java
package Java.lang. It contains standard technical and
mathematical concepts, that can be used by any program.
The functions in std.mpo are written in MPS except for
a few lines, that are assembler in line code.
My thinking has been that MPS is efficient
enough that I wouldn't need to go to assembler.
The functions in std.mpo fall in a few categories, that
we will deal with in different sections.
Pin manipulation functions
The Propeller and similar processors communicate with
the external world through 'pins', which are pins on
the processor chip. The Propeller has 32 such pins, and
they are exclusively digital pins. They can be set
to be input pins, which measure the input voltage
and interpret it as a 0 or as a 1. In this mode
they have high impedance. Or they can be output pins,
in "totem-pole" configuration, so that they can deliver
either a high voltage (1), i.e. 3.3 volts, or zero voltage
(0), in both cases with low impedance. As there are
32 pins, they correspond directly to the
bits of a 32 bit number.
For this, there are the following functions:
- ( x )setin, sets pin number x to be an
input pin.
- ( x )setout, sets pin number x to be an
output pin.
- ( mask )setdir. Any pin which corresponds
to a bit in mask, which is 1, is set to be an
output pin. Other pins are unaffected.
- ( x )set0, sets pin number x to be 0
(zero volts), if the pin has been set to be an
output pin.
- ( x )set1, sets pin number x to be 1
(3.3 volts), if the pin has been set to be an
output pin.
- ( i )inpin, reads pin number i, and
returns the result in the least significant
bit of accu. Hence, accu is zero
(boolean false), if pin nr i has low voltage.
If the pin has a higher voltage than 1.65 volts,
then accu contains the number 1
(boolean true).
- ( data , mask , )outpins, affects only
those pins for which the corresponding bits of
mask are 1. These pins
are given values from the corresponding bits
of data.
- ( mask )waitfor0, halts until
a certain bit pattern appears at the input.
That pattern is that all the bits corresponding
to 1:s in mask are 0.
- ( mask )waitfor1, halts until
at least one of these same bits is 1.
If you want to read all the inputs simultaneously,
there is a variable ina which contains the
whole bit pattern as a 32 bit word.
In the same way, there is a variable outa,
where you can set all 32 output pins, by writing
a 32 bit value to outa. In most cases, this
is not very practical. You want to leave some bits
unaffected, when you send out information to the
pins. This is handled readily with the outpins
function.
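The masking that outpins performs can be sketched in Python. This is only a model of the behaviour described above, not the code in std.mpo; the name outpins_model and the explicit outa argument are assumptions made for the illustration.

```python
MASK32 = 0xFFFFFFFF  # the Propeller has 32 pins, one per bit


def outpins_model(outa, data, mask):
    """Return the new outa value: bits selected by mask are taken
    from data, all other bits keep their old value."""
    return ((outa & ~mask) | (data & mask)) & MASK32


# Pins 0..2 are rewritten; pin 5 is already high and must stay high.
outa = 0b100000
outa = outpins_model(outa, data=0b011, mask=0b111)
print(bin(outa))  # 0b100011
```

Writing the same data directly to outa would have cleared pin 5, which is the situation outpins is meant to avoid.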
There is also a variable dira which controls
which pins are output pins. setout, setin
and setdir write to that variable.
ina, outa and dira are in reality
physical registers, but they are mapped into the address
space of each cog's local memory. There are other such
registers, like the real time counter, cnt.
As these registers are mapped into each cog's address space,
you can set them in each cog, and you
can set different values in different cogs. The rules then are
- If some cog thinks that a pin should be an output
pin, then it is an output pin.
- If some cog thinks that an output pin should be
1, then it is 1.
ina, cnt and others are passive input
registers, and you can read their values from any cog.
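The two rules can be modelled as a bitwise OR over all cogs. The sketch below is in Python and assumes, in addition to what is stated above, that a cog only drives the pins that it has itself set as outputs; the function name pin_states is hypothetical.

```python
def pin_states(cogs):
    """cogs is a list of (dira, outa) pairs, one pair per cog.
    Returns (direction, level) as 32 bit words for the whole chip."""
    direction = 0
    level = 0
    for dira, outa in cogs:
        direction |= dira      # output pin if any cog says output
        level |= outa & dira   # 1 if any cog driving the pin says 1
    return direction, level


# Cog A drives pins 0 and 1, cog B drives pins 1 and 2.
direction, level = pin_states([(0b0011, 0b0001), (0b0110, 0b0100)])
print(bin(direction), bin(level))  # 0b111 0b101
```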
Mathematical functions
These functions are:
- ( i , #var , )bit, returns in accu
the value of the i-th bit of the
variable var (which is presented to the
function by its address). The bit value is
delivered in the least significant bit of accu.
If i is greater than 32, the function will
continue into the next word. In this way, it would
be possible to scan the bits of the whole cog memory.
var must be a local variable (not global).
- ( a , b , )mult, multiplies a and b
and returns the result in accu. This function is
used when a multiplication instruction is written
in MPS (*op), and the mult function has to be
mentioned in a load statement from std.mpo. The
function assumes the standard binary representation
of numbers with a sign bit.
The purely binary algorithm used was known already
in ancient Egypt, and the Greeks and Romans borrowed
it as the only feasible way to multiply numbers
written with Roman numerals.
- ( n , d , )div, divides n by d, and
delivers the result in accu and the remainder
in arg0. John von Neumann's name is associated
with a division algorithm directly adapted to binary
numbers. I have used another algorithm, though, which
goes back to Euclid. The idea is this: Trying to
divide n by d, you make a guess,
q0.
Then you can compute the error as r1
= n - q0*d. To try to improve the result you
compute
r1/d with a similar guess q1
and so on. If your way of guessing is such that
you get a smaller remainder most of the time, you
can attain a good result. The result is the sum of
all the qi:s, and the remainder is the
last ri. The "guess" for q that I make is
to replace the division with an approximating
shift instruction. So I compute q as n >> k, where
k is the smallest integer that keeps q from exceeding
the true value n/d. I call this k the binary log of d.
As this approximation of the division can be
pretty rough, convergence can take some time, but
I still believe that this algorithm can be competitive
with von Neumann's algorithm. The function itself is
named pdiv, which only manages positive values for
n and d, but it is wrapped into a function div, which
corrects the signs.
- ( d )blog, is the abovementioned binary
logarithm of d, so it has nothing to do with blogging.
Its main use is presumably as a subfunction to
div.
- ( x )div10, is a function specifically designed
for dividing by 10. Such a function can be used for
making decimal expansions of numbers, e.g. for printing
numbers. That application requires high precision in
the division. The algorithm is essentially the same
as for div, but the approximation of x/10 is
made as x*103/1024, where the division by 1024 is made
as a right shift with 10 steps. This approximation is
of course pretty good, so convergence can be expected
to be fast.
- ( v )sin, computes the sine of the angle
v. v shall be given in a format known as BAM, Binary
Angle Measure. This format overflows to -180, when the
angle passes 180 degrees. In this format, the second
bit is worth 90 degrees. At 180 degrees, 1 will appear
in the first bit. By convention, this is a sign bit,
so the angle now is seen as negative. This means that
angles are represented between -180 degrees and 180
degrees. The function uses the
Propeller ROM-table for the sine function. This table
gives sine for angles in the first quadrant, so the
MPS program has the task to identify quadrant, and
produce the correct result for all quadrants.
- ( v )cos, computes the cosine of v, which
should be an angle in the BAM format.
- ( x , min , max , )lim, is a limiter,
which limits x between min and max, i.e. it
delivers min if x is smaller than
min, and max if x is greater
than max and otherwise x.
- ( x , min , max , )between, is a boolean
function which delivers a boolean true if x falls
in the interval between min and max.
- itsqrt is there to allow the computation
of the square root of a number using the Newton-Raphson
method. If x is a candidate for being the square root
of y, then ( x , y , )itsqrt gives a better
candidate. If y is a slowly varying quantity, and
you need the square root of y in a cyclic process, then
it may suffice to call this iteration once in each
cycle with ( x , y , )itsqrt ->x. x will then be
a state variable, which chases the square root of
y.
- ( u , x , m , )lp implements a first order
low pass filter. With the cyclic iteration of
( u , x , m , )lp ->x, x will vary slowly, but
ultimately reach u. m sets the
time constant. If the iteration is done cyclically
with a cycle of Ts seconds, then the
"physical" time constant of the filter is
Ts*2^m seconds. You can write
a low pass filter with any time constant if you use
division, but this is a division free way of making
a low pass filter.
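The shift-based division described for div above can be sketched in Python. This is not the std.mpo code, only an illustration under stated assumptions: the guess is taken as the current remainder shifted right by k, k is the smallest integer with 2^k >= d, a single final correction step is added for the case where the shift yields 0 although the remainder is still at least d, and the remainder is given the sign of n. The names blog, pdiv and div are borrowed from the text; their bodies are assumed.

```python
def blog(d):
    """Assumed 'binary log': the smallest k with 2**k >= d."""
    k = 0
    while (1 << k) < d:
        k += 1
    return k


def pdiv(n, d):
    """Divide n >= 0 by d > 0 using only shifts, adds and subtracts
    of multiples of d. Returns (quotient, remainder)."""
    if n < 0 or d <= 0:
        raise ValueError("pdiv needs n >= 0 and d > 0")
    k = blog(d)
    q, r = 0, n
    while True:
        guess = r >> k          # never larger than the true r/d
        if guess == 0:
            break
        q += guess
        r -= guess * d          # the new, smaller remainder
    if r >= d:                  # here r < 2**k < 2*d, one step is enough
        q += 1
        r -= d
    return q, r


def div(n, d):
    """Signed wrapper around pdiv (sign handling is an assumption)."""
    q, r = pdiv(abs(n), abs(d))
    if (n < 0) != (d < 0):
        q = -q
    if n < 0:
        r = -r
    return q, r


print(pdiv(100, 7))   # (14, 2)
print(div(-7, 3))     # (-2, -1)
```

When d is a power of two the first guess is exact; otherwise each round removes at least half of the quotient that is still missing, which is the convergence behaviour discussed above.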
Serial communication
Serial communication here follows a standard
called RS232, which officially has changed its name
to EIA232. It is a serial communication without a clock
signal. Hence, communication can take place on a
single line + ground. The line is high when nothing
is going on. When the line goes low, this is
the beginning of a start pulse. It is then followed
by a number of data pulses, usually 8, which come at
fixed time intervals. Finally there are one or two
stop pulses, which take the line high again. The
fixed time interval is an important parameter in the
protocol. It is usually given by its inverse, which
is the number of pulses per second. There is a long
list of standard values for this so called baud rate.
The second parameter is the computer pin on which data
come in or go out.
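The framing described above can be sketched in Python as the sequence of line levels for one character. The sketch assumes that the data bits are sent least significant bit first, which is the usual convention for this kind of serial line but is not stated here, and it uses an 80 MHz system clock as an example value. The function names are hypothetical.

```python
def frame_bits(value, data_bits=8, stop_bits=1):
    """Line levels for one character, one list entry per bit time:
    a start pulse (0), the data bits, then the stop pulses (1)."""
    bits = [0]
    for i in range(data_bits):
        bits.append((value >> i) & 1)   # assumed order: LSB first
    bits.extend([1] * stop_bits)
    return bits


def ticks_per_bit(clock_hz, baud):
    """Length of the fixed time interval, in system clock ticks."""
    return clock_hz // baud


print(frame_bits(ord("A")))             # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
print(ticks_per_bit(80_000_000, 9600))  # 8333
```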
The standard is, as mentioned, a format of 8 bits (one byte),
but std.mpo also has functions for 9 bits and 32 bits.
The 9 bit programs are intended for some cases, when one
wants to transmit both data and commands. The data words
may contain 8 bits, and in that case, the ninth bit is
needed to distinguish between commands and data.
When using this protocol, it is important that the
receive function is called before the data are expected.
Otherwise, data are lost. If called early enough, the
receive function will wait for the data. To make sure
that the receive function is called early enough, it
often has to reside in its own process. When the data
are received, they are quickly deposited somewhere where
other processes can read them, and then the process
returns to the receive function, to be ready for new
data.
Here are the RS232 functions:
- ( data , pin , baud , )send, sends
the data over pin pin with baud rate
baud. The data bits are coded in the
8 least significant bits of the word data,
and the rest of the bits are masked by the function.
- ( pin , baud , )receive, receives a
word on pin pin assuming a baud rate
of baud. The result is delivered as the
8 least significant bits in accu.
- ( data , pin , baud , )send9 and ...
- ( pin , baud , )receive9 are the
same functions but with 9 bits.
- ( data , pin , )send32 and ...
- ( pin , )receive32 are the
same functions but with 32 bits and a fixed
baud rate of 500,000 bits per second.
The 8 bit functions are further supported by
buffering functions.
They may be used when the user of the data temporarily
uses the data at a slower pace than they are delivered.
When the buffering functions are used, the
receiving process deposits data in a buffer with a
call )wbuff with the received data in accu.
Then some other
process can get the data in the right order and without
loss by making a call )rbuff. The buffer is of
limited size, however, so on average the using process
can't be slower than the receiving process. To use these
functions, one has to declare global variables:
iw
ir
buff[128]
The buffer size is 128 words, which means that the
buffer holds 128 bytes. The size can be changed to
another power of 2, but this requires a slight change
of wbuff and rbuff.
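A buffer of this kind is usually a ring buffer, where the power-of-two size lets the indices wrap around with a bitwise and. The Python sketch below illustrates that idea with the variable names iw, ir and buff from above. The function bodies are assumptions, not the std.mpo code; in particular, returning None from an empty buffer and the absence of an overflow check are choices made for the sketch.

```python
SIZE = 128          # must be a power of two
MASK = SIZE - 1     # wrapping an index is then a single bitwise and

buff = [0] * SIZE
iw = 0              # write index, advanced only by the receiving process
ir = 0              # read index, advanced only by the using process


def wbuff(value):
    """Deposit one received value in the buffer."""
    global iw
    buff[iw] = value
    iw = (iw + 1) & MASK


def rbuff():
    """Return the oldest unread value, or None if there is none."""
    global ir
    if ir == iw:
        return None
    value = buff[ir]
    ir = (ir + 1) & MASK
    return value


for ch in b"abc":
    wbuff(ch)
print(rbuff(), rbuff(), rbuff(), rbuff())  # 97 98 99 None
```

Changing SIZE to another power of two changes MASK with it, which is the kind of small adjustment mentioned above.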
Analog input
These functions work for a specific A/D converter,
Microchip's MCP3208, which is connected to the Propeller
chip at given pins. The analog data are brought to
the computer in serial form according to a protocol
called SPI. The protocol is also used for ordering
conversions. The converter has 8 channels, and data
are converted to 12 bit numbers. Two functions handle
this:
- )ianalog, initializes the A/D converter
and the connection to it.
- ( ch )analog, brings the value of the
ch:th analog channel to accu.
Number conversions
These two functions construct strings for typing
the value of numbers. The numbers are always
viewed as 32 bit integers, and they are converted
to decimal or hexadecimal form. For the decimal
version, negative numbers are preceded by a
minus sign.
- ( x , nfig , #str , )dec creates
the decimal expansion of the number x, and
deposits the result in str, which is
presented to the function by its address.
nfig is the desired number of figures.
There is no guarantee what happens if nfig
is too small to contain the number.
- ( x , #str , )hex, like dec, but
it makes a hexadecimal expansion. The number
of figures is always 8, which is what is
required for a 32 bit number.
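The two conversions can be sketched in Python. The sketch only shows the principle (repeated division by 10, and 4 bits per hexadecimal figure); padding with spaces to nfig characters and upper case hexadecimal figures are assumptions, and the function names dec_model and hex_model are hypothetical.

```python
def dec_model(x, nfig):
    """Decimal expansion of x, right-aligned in nfig characters."""
    negative = x < 0
    x = abs(x)
    figures = []
    while True:
        x, r = divmod(x, 10)            # the job div10 is designed for
        figures.append(chr(ord("0") + r))
        if x == 0:
            break
    if negative:
        figures.append("-")
    return "".join(reversed(figures)).rjust(nfig)


def hex_model(x):
    """Eight hexadecimal figures for a 32 bit number."""
    x &= 0xFFFFFFFF                     # view x as a 32 bit pattern
    return "".join("0123456789ABCDEF"[(x >> shift) & 0xF]
                   for shift in range(28, -4, -4))


print(repr(dec_model(-42, 6)))  # '   -42'
print(hex_model(255))           # 000000FF
```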
===========================================================
The Myra language is a stack based, higher order
language, that can be compiled into Propeller (TM) assembler
code. Propeller is a trademark of the Parallax company,
that has developed and manufactures the Propeller
processors.
As a language it is fairly universal, but some of its
features make use of features in the
Propeller assembler language.
Above all, the Propeller computer is a parallel processing
computer with 8 separate processors, called cogs,
communicating via a common memory.
The language is named after the Swedish word 'myra', which
is 'ant' in English, as it is a stack based language.
(Ants build stacks in Swedish and heaps in English, but
these words are used more or less interchangeably in
computer science)
Compilation
A Myra program called 'example.myr' can be
compiled using
java Myra example (no need to write '.myr')
This generates a number of .spin files, one for each
cog used in the system. The filenames are derived from
a system name, given on a system-line in the Myra
program. The files are then named
systemname1.spin
systemname2.spin
systemname3.spin
etc.
They are assembled and loaded from the Propeller environment,
by opening the first of these files. The Propellent
environment can also be used.
The generated code is slower than manually made
assembler code, as there is some overhead for managing
the stack, but the code is much faster than spin code.
Stack based programs
A stack is a heap or stack of values stacked on top
of each other. You can only enter data on the top of the
stack, and only the value currently on the top of the stack
is visible. Normally when you use a value, it is
simultaneously lifted off the stack, making the value
under it visible. The normal operations are
- pushing data on the stack from a source
- popping data off the stack and moving them to a sink
- computing functions, which first pop arguments off
the stack, then compute the result, and finally
push the result on the stack
The functions can be normal operations like + or *, or
they can be any function with a name like sin.
This is the elegance with stack based programs:
operations and functions look the same. This also
means that there is a standardized way of supplying
arguments to functions, and of retrieving the result. Unlike
most programming languages, a function is free to deliver
more than one value as a result.
The language has four special stack operations
- popoff, pops a value from the stack without
using it for anything
- dup, duplicates the top of the stack. Good if
you want to store a result into two places.
- flip, swaps the two top elements on the stack.
Good if you have asymmetric operations, like -.
You can compute both a-b and b-a easily
- unpop, restores a value previously popped
off the stack.
Stack machines inherently work with Łukasiewicz notation,
also called reverse Polish notation, or postfix notation.
This means that all arithmetic expressions can be written
without parentheses, and these expressions can be treated
without parsing. This simplifies the compiler. The presence
of a parenthesis in normal code, implies that there is a
need for a temporary variable. In a stack based system
you can avoid temporary variables, and use the stack instead.
You can develop it into an art to write stack based
programs with as few variables as possible, but beyond
some limit, this can be harmful to the readability of
the code.
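The stack operations can be illustrated with a small model in Python. It keeps the stack in a list of cells with a stack pointer. The way unpop is modelled here (the pointer is simply moved back, since the popped cell still holds its value) is an assumption about the mechanism, and the method names are chosen for the sketch.

```python
class Stack:
    """Model of a stack kept in memory cells with a stack pointer."""

    def __init__(self, size=16):
        self.cells = [0] * size
        self.sp = 0                     # number of values on the stack

    def push(self, value):
        self.cells[self.sp] = value
        self.sp += 1

    def pop(self):
        self.sp -= 1
        return self.cells[self.sp]      # the cell still holds the value

    def popoff(self):
        self.pop()

    def dup(self):
        top = self.pop()
        self.push(top)
        self.push(top)

    def flip(self):
        a, b = self.pop(), self.pop()
        self.push(a)
        self.push(b)

    def unpop(self):
        self.sp += 1                    # assumed: only the pointer moves

    def sub(self):                      # the operation written as -
        b, a = self.pop(), self.pop()
        self.push(a - b)


s = Stack()
s.push(7); s.push(3); s.sub()            # 7 3 -
print(s.pop())                           # 4
s.push(7); s.push(3); s.flip(); s.sub()  # 7 3 flip -
print(s.pop())                           # -4
```

No parentheses and no temporary variables are needed in either expression; the stack holds the intermediate values.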
Example program
As an example to look back to, when reading the
following paragraph, here's a piece of code:
system colors
global
x
y
exec
mix an
process mix
red = 3
green = 6
blue = 5
black = 7
colormask = 7
tred
tgreen
tblue
trg
ttot = 40960
begin
colormask setdir
:loop
y ttot * 12 >> ->tblue
ttot tblue - ->trg
x trg * 12 >> ->tred
trg tred - ->tgreen
red tred [show]
green tgreen [show]
blue tblue [show]
black 0 [show]
go(loop)
function show
tshow
begin
->tshow
colormask outpins
tshow wait
return
\
process an
chx = 0
chy = 4
begin
init_analog
:loop
chx analog 4 >> ->x
chy analog 4 >> ->y
go(loop)
\
The program sends out different colors on an RGB LED.
The LED is connected to pins 0, 1 and 2 on the Propeller,
and it is connected so that a diode is on when the
corresponding Propeller output is zero. Hence 7 means
that all three diodes are off, so the color is "black".
The color is controlled by two potentiometers. Their
values are read in the second process. The light is
controlled in the function show, which sets up
a "pure" color, and then lets that be on for a specified time.
The times are set in the process mix, which
apportions a total time of 40960 ticks to each of
the colors, depending on the potentiometer settings.
General program structure
A Myra program has the following separate parts:
- Potentially a "10 MHz"-tag if the program
is to be run in a Spin Stamp module (which has
a 10 MHz crystal; Normally 5 MHz crystals are used)
- A system name part consisting of the
keyword system followed by a space
and the system name. This name is used
for naming the compiled spin-files
- Global variable declarations following
a global-keyword. Each variable
is declared on a separate line. The variables
cannot be assigned initial values. Space can
also be reserved with a bracket notation.
a[256] means that 256 long words, corresponding to
1024 bytes, are reserved. The first of these long
words can be referenced with the name 'a'. The
usefulness of this is shown later.
Global variables are of the long type
and are placed in the Propeller's common
memory.
- An execution part following an exec keyword.
The processes named on the following line and
separated by white space are executed in parallel,
each in one cog. There is an extension of this format
to allow serial execution. This is described
here.
- Up to seven processes, for running in each of
the Propeller's cogs. (Cog number 0 runs the process
that loads the other cogs.)
Structure of a process
A process starts with a process keyword, followed
by the process name. This name should appear also in
the exec-part of the program as mentioned above.
(if it does not, this process will never be executed,
and it will not even be compiled.)
After this, variable declarations follow, each on one
line with a variable name optionally followed by an
'=' sign and a value. These variables are of type
long and are stored in the cog's local memory.
There are also provisions for declaring arrays. We will
come back to that later. Values are written with
the same standard as in Propeller assembler, i.e.
a '$' sign means hexadecimal notation, '%' means
binary notation, and "a" means a character value,
i.e. the ASCII-code for the character 'a' is stored.
Symbols like '|<20', which means a 1 in the 20th
position, can also be used. Also, expressions involving
constants, like 80000000/9600, can be used.
After the begin keyword follows the executable
code. This code can call functions, and it can
contain macros. These can be
declared locally in the process itself, or externally
on separate files. In the former case the functions
follow the executable code of the process. The scope
of these declarations is limited to within the process.
In the latter case, the functions have to be referred
to with a 'load from'-section.
A local function declaration starts with a
function keyword, followed by the function
name. The rest of the function declaration is similar
to the process declaration, except that the function
must finish with a return statement, while
a process must not contain a return statement.
(Processes are either infinite loops, or just terminate.)
A 'load from'-section starts with a load from
keyword followed by a filename within parentheses.
The functions to be loaded are then listed, each on
one line starting with an asterisk (*). If
functions are loaded from several files, each file
requires its 'load from'-section.
It is recommended that these external functions are
named in object oriented style with a dot, like
display.write. As Propeller assembler is intolerant
to these dots, they are replaced with underscores
in the assembler code.
A process declaration ends with a \ (backslash). The
backslash of the last process ends the whole program.
These backslash signs are crucial for correct compilation.
Empty lines are ignored. Lines starting with '--'
are comments, and are ignored by the compiler. If a
line contains '--', the rest of the line after that
is ignored as a comment.
External function files
External functions are written on external function
files, preferably with a '.myo'-extension. These files
don't contain any processes, and are thus not
executable. They merely contain declarations of functions.
It is recommended that functions are collected into
.myo-files in such a way, that each file represents some
kind of an object, like a sensor, a display, something
simulated etc.
Apart from functions (or methods in object oriented
terminology) such an object file can also contain
variables, that represent the state of the object.
Such variables are called fields. They are declared
in the beginning of the file, between a fields-
line and a endfields-line.
These variables can be assigned initial values
like local variables in functions. They are stored in
cog memory, and are reachable from all functions of an
object. But they are not reachable from one process to
another. Fields should not be referenced from outside the
object functions. Instead they should be reached through
so called 'put-' and 'get-' functions. The natural way for
these functions to communicate with the outside world is
to use the stack.
The functions on a myo-file may further refer to other
functions on other external files, or on their own file. If this
is done, the function body should be followed by one
or more 'load from'-sections, as in the main program.
It is recommended that an object oriented style dot
notation is used for the functions of an object. An object
representing a display could have functions called
display.init, display.writetext etc.
The dots in these names will be replaced with underscores
in the assembler code.
Instructions
The instructions are found in the executable part of
processes and functions. Instructions are separated
with white spaces, or with line feeds if instructions
are written on several lines. Except for what is mentioned
about conditional statements later, the programmer is free to
divide his code into lines as he wishes.
Instructions are interpreted in the following way by
the compiler:
- If the code is recognizable as a number,
that number is pushed on the stack. The number
should not exceed 511, as that is a limit in
Propeller assembler
- If the code is recognizable as a variable name,
the value of
that variable is pushed on the stack. The variable
may be local to the cog, or global. The mechanism
for the loading is different for local and global
variables, but this is hidden from the programmer.
There is still the consideration that loading
a global variable takes a longer time.
- If the code starts with a #-sign,
and continues with a recognizable variable name,
the address of that variable is loaded on the
stack.
- If the code is recognizable as an assembler
routine name, that assembler routine is called.
The assembler routines are stored on an
assembler resource file called Assembler.spin.
More information about this later.
- Some assembler routines have short alias names
like the following
- + for addition
- - for subtraction
- * for multiplication
- / for division
- & for bitwise and
- << for left shift
- >> for right shift (arithmetic)
- <<<< for a long left shift
- || for taking the absolute value
- An instruction inscribed in brackets, like
[sin] causes a call to a Myra function with
the name 'sin'. If no such function is declared,
an Unrec: sin error message is typed.
- An instruction starting with a '->' arrow will
store the top element of the stack to the variable
that follows (without white space; ->x). This
value is also popped off the stack.
- The instruction '?' will set the status
registers of the cog to reflect the status of
the top element of the stack at that moment.
The top element is then
popped off. The Z-register of the cog will be set to true
if the top element was 0. The C-register of the cog
will be set to true if the top element had its MSB set to
1. The usefulness of this will be described later.
- The special stack instructions are as follows:
- dup for duplicating the stack top element
- \ for flipping the two top stack elements
- popoff for popping the top element off.
- ' the "unpop" instruction, for restoring
a value on the stack, that recently has been popped
off
- The >label statement will move code execution
to the label referred to after '>'. The code
go(label) has the same effect.
- An instruction starting with a ':' (colon) is
a label whose name is what follows the colon sign.
As an instruction the instruction is an assembler
nop, which consumes two clock cycles.
- The return instruction is a return instruction
that makes execution return to the caller of a function.
Return statements may only be used in functions, not
in processes, and they must be the last instructions
in a function.
The "unpop" instruction maybe deserves the following
comment. A write instruction
->x, saves the value on
the top of the stack, but removes it from the stack.
The combination ->x ' can then be seen as a
modified write instruction, that maintains the stored value
on the stack for future use.
Macros
A macro differs from a function, in that it is never
called, i.e. there are no jumps to a macro. Instead,
the macro code is substituted for the call of the macro
directly in the code before compilation. This gives faster
code, as the jumps to and back from the macro are avoided.
But there is a penalty in memory usage, if the same macro
is used more than once. If the macro is used n times,
the macro code will appear in the compiled program
n times.
A macro is defined in a macro definition. On its first
line is the keyword macro, followed by the macro name.
On the second line is the macro code. Hence the macro
code cannot be longer than what fits on one line.
Macros are supposed to be small.
The 'call' of a macro (which isn't a call anyway) looks
like the function call, i.e. it is the macro name placed
between brackets. The compiler will substitute the macro
name and the two brackets with the macro code.
Macro definitions can be placed in the main program file
(the '.myr'-file), or in object files ('.myo'-files). There,
they can be used either by the main program or by the
object functions. However, the main program can use the
macros only if the object file is referred to at all,
through the loading of some object function. For macros
no 'load from'-operation is necessary; the macros will be
found, once the system has had reason to open the
object file.
There is also a special macro file, for universally
useful macros, called macro.myo. The macros in
this file are always available.
Name uniqueness is as important for macros as it is for
functions. Hence it is recommended that macros in
object files are named with a 'dot'-notation like the
functions.
A macro can use a macro, but currently this is limited
to two levels, i.e. a macro can use a macro, but that
macro cannot use a macro.
Technically macros don't add anything to the system
functionality. There is no difference between
using the macro concept, and substituting the macro
code yourself. Macros are there to enhance the readability
of the code. You substitute a piece of technical code
with a name, which reflects what the code is good for.
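The substitution step can be sketched in Python as plain text replacement. This is not how the Myra compiler is written, only a model of the behaviour described above; the macro names double and quad are hypothetical, and the function expand is made up for the illustration.

```python
import re


def expand(code, macros, levels=2):
    """Replace every [name] whose name is a macro by the macro code.
    Bracketed names that are not macros (function calls) are kept."""
    def substitute(match):
        return macros.get(match.group(1), match.group(0))

    for _ in range(levels):             # two levels of macro use
        code = re.sub(r"\[([^\[\]\s]+)\]", substitute, code)
    return code


macros = {
    "double": "dup +",                  # hypothetical macro
    "quad": "[double] [double]",        # a macro using a macro
}
print(expand("x [quad] [show] ->y", macros))
# x dup + dup + [show] ->y
```

The expanded line contains the code of double twice, which is the memory penalty mentioned above, while [show] is left as an ordinary function call.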
Conditional statements
Instructions inscribed between curly brackets {...}
are executed only if the condition preceding the left
bracket is satisfied. The condition is based on the
value on the top of the stack, at the time when the
last ?-instruction was executed.
The available conditions are as follows:
- > if the stack element was positive
- < if the stack element was negative
- = if the stack element was exactly zero
- # if the stack element was non-zero
- >= if the stack element was positive or zero
- p (for 'positive'), equivalent to >
- n (for 'negative'), equivalent to <
- f (for 'false'), equivalent to = for
boolean variables
- t (for 'true'), equivalent to # for
boolean variables
If the condition is not satisfied, the instructions are
executed as nop-statements, as this is the way
Propeller assembler works.
A line can only contain one curly-bracket-pair, but that pair may
contain as many instructions as one wishes. If there isn't
space for all the instructions on one line, one can continue
on the next line with a new curly-bracket-pair, but the condition
has to be repeated.
The principle for conditional statements here mimics what
happens in Assembler and in the computer itself, but it is
also in a way quite elegant. You can write something that
works as if-then-else constructs, without letting the
compiler construct lots of jumps. But there is a very
important pitfall. Instructions inside curly brackets
can change the status registers. The ?-instruction does
that, but you probably learn pretty soon to avoid
?-instructions inside curly brackets. But the problem is
with functions and assembler functions. Remember that
arithmetic instructions are executed as assembler functions.
Most of them don't change the status registers, but
multiplication and division do. And many other assembler
functions do.
If the status registers are changed anywhere between a
?-instruction and any condition that is supposed to use
it, things don't work the way they should.
If it happens inside
a curly bracket pair, the rest of the instructions in there
will not be executed. If there is a curly bracket pair
on the next line with the opposite condition (an
'else branch') then the code in there may be executed, even
though it shouldn't.
As a remedy to this, there is a version of ? called
?s, which stores away the stack content that
set the status registers, in a fixed place. We call
this the status variable. Then, you
can at any time restore the status register, which is
done with a ! instruction. Place this instruction
after any instruction that may have changed the status
register. Here's an example:
x ?s
={a b c * ! + ->z}
If x is zero, z is computed as a+bc. '!' protects against
the multiplication, which might have changed the
status registers.
All this is OK, unless a function that you call
also uses the ?s-instruction. That will destroy
the status variable. As a remedy to this,
we have two other instructions, called S and R. S
saves the status variable, and R restores it.
As a matter of fact, they push and pop the values into
a small stack. It only has a height of two now, but that
should be sufficient.
Now the rule is: If a function uses a ?s instruction,
it should start with an S instruction and end with an
R instruction. Assembler functions give no problems; they
don't use the ?s-instruction.
The codes for ?s, !, S and R are quite small. In fact
?s is no bigger than the normal ?.
(It would of course be tempting to use a more consistent
stack concept for all this. The problem is that there
can be several !-instructions for each ?s-instruction,
and it is difficult for the compiler to know which
?s- and !-instructions belong together).
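The interplay between ?s, !, S and R can be illustrated with a small Python model. This is only a sketch of the behaviour described above, not the Propeller code. The class name StatusModel, its method names, and the picture of the status registers as a zero flag and a sign flag are assumptions made for the illustration. The text does not say what happens if S is used more than twice in a row, so the model simply raises an error.

```python
class StatusModel:
    """Toy model of the status registers, the status variable and S/R."""

    SAVE_DEPTH = 2  # the text says the save stack has a height of two

    def __init__(self):
        self.flags = (False, False)  # (zero, negative), assumed layout
        self.status_var = 0          # the fixed place written by ?s
        self.saved = []              # small stack used by S and R

    def _set_flags(self, value):
        self.flags = (value == 0, value < 0)

    def test(self, value):           # models ?
        self._set_flags(value)

    def test_s(self, value):         # models ?s
        self.status_var = value
        self._set_flags(value)

    def restore(self):               # models !
        self._set_flags(self.status_var)

    def save(self):                  # models S
        if len(self.saved) >= self.SAVE_DEPTH:
            raise OverflowError("S/R stack is only two deep")
        self.saved.append(self.status_var)

    def recall(self):                # models R (restores the status variable)
        self.status_var = self.saved.pop()

    def clobber(self, value):        # models e.g. a multiplication
        self._set_flags(value)


m = StatusModel()
m.test_s(0)      # x ?s with x equal to zero
m.clobber(6)     # b c * leaves a non-zero result in the flags
m.restore()      # ! brings the zero condition back
print(m.flags)
```

In the model, a function that uses test_s would begin with save and end with recall, which mirrors the rule about S and R given above.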
Booleans
The mechanism with conditional statements, as described
above, is both elegant and powerful. But it makes it
difficult to combine several conditions with boolean
operators. To help with this, the notion of boolean
variables is introduced. Boolean variables are either
true or false. These values are represented as the
integer 1 (a 1 only in the least significant bit), and
0 (all bits zero). With this representation, the operators
&, or and xor act on booleans as one would expect.
There are two instructions that generate boolean values
on the stack:
- Z, which replaces the stack top with true,
if the current value on the stack was exactly zero
- G, which replaces the stack top with true
if the current value on the stack was greater than zero
Note that Z also serves as a complement function,
which replaces true with false, and false with true.
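The two instructions can be mimicked in Python to see how conditions combine. This is a sketch only: the functions Z and G follow the description above, while in_range is a hypothetical example of mine, and the model ignores 32 bit overflow in the subtractions.

```python
def Z(x):
    """True (1) if x is exactly zero, otherwise false (0)."""
    return 1 if x == 0 else 0


def G(x):
    """True (1) if x is greater than zero, otherwise false (0)."""
    return 1 if x > 0 else 0


def in_range(x, lo, hi):
    """lo <= x <= hi, built from G, Z and & only."""
    below = G(lo - x)          # 1 when x is smaller than lo
    above = G(x - hi)          # 1 when x is bigger than hi
    return Z(below) & Z(above)  # Z works as "not" on booleans


print(in_range(5, 1, 10), in_range(0, 1, 10))
```

Since true is exactly 1 and false exactly 0, the bitwise operators give the same answers as logical ones would.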
Arrays
An array is defined simply by writing several values
separated by commas after the equals sign:
M = 1,2,3,1,2,3,1,2,3
declares an array M with 9 elements.
A string after the equals sign, like
message = "Hello world"
is interpreted as
message = "H","e","l","l","o"," ","w","o","r","l","d",0
"H" means the ASCII-code for the character H. The final
0 can be seen as the ASCII-code for the null-character.
Hence we represent a so called null-terminated string.
Arrays can also be declared with a bracket-notation.
area[256]
reserves 256 long words, i.e. 1024 bytes for the
array area.
Arrays can be addressed in two ways:
i M[]
will push the i:th element of the array M on the stack.
The enumeration starts with zero. Hence
4 message[]
will push the ASCII-code for 'o' on the stack.
The other alternative is used to address an arbitrary
array. It uses the function @.
adress i @
loads the i:th element of the array starting at
address adress onto the stack. If we want to
load the first character in message, we do the
following:
#message 0 @
This loads the "H" character on the stack.
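The array rules can be tried out in a small Python model. It is a sketch under assumptions: the dictionary memory, the helper names declare_string, place and fetch, and the base address 100 are all hypothetical, and addresses are counted in long words.

```python
memory = {}  # hypothetical flat memory: address -> long word


def declare_string(text):
    """Expand a string into character codes plus a terminating 0."""
    return [ord(ch) for ch in text] + [0]


def place(base, values):
    """Store the values in consecutive cells, return the base address."""
    for offset, value in enumerate(values):
        memory[base + offset] = value
    return base


def fetch(address, i):
    """Model of 'address i @': element i of the array at address."""
    return memory[address + i]


message = declare_string("Hello world")
base = place(100, message)      # plays the role of #message
print(chr(message[4]))          # 4 message[]
print(chr(fetch(base, 0)))      # #message 0 @
```

The last cell of the array holds 0, so a routine walking through the string knows where to stop without being told the length.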
Serial execution
The following code after the exec keyword:
exec
proc1 proc2 sema/proc3a/proc3b proc4
makes the processes proc3a and proc3b alternate in one
and the same cog. They are controlled by the variable
sema, which should also be declared as a global
variable. When sema switches to the value 0, which it
always does initially, proc3a is executed in cog nr 3.
If sema switches to 1, the cog is instead loaded with
proc3b. The variable sema can then switch back to 0
or to higher values, and then other processes are
executed, if they are listed after further "/" separators.
When sema switches values, the current process is interrupted
abruptly, so it may be an advantage to let each process
interrupt itself in a controlled way. Thus, it would be
safer only to allow the processes controlled by sema
to control sema.
After change of process the exiting process is completely
wiped out, so the only way it can communicate with the
world into the future, is by writing to global variables,
or output pins.
The motivation for this whole concept is that the limited
size of the cog memories (512 long words) is perhaps the
strongest limitation on what you can do with a Propeller.
As long as you can divide your computations into independent
chunks, this concept allows you to do as big computations
as you like, up to the limit set by the size of the global
memory (which is 8k long words). Naturally the reloading
of a process into a cog takes some time. A natural use,
is when a process requires much code for initialization.
Then you let one process (init) initialize, and another
process (run) execute. Then you write
exec
... s/init/run ...
When init has done its work, it sets s to 1.
This is the case when the processes actually execute in
series, but the concept allows you to let a process
tree branch out, depending on the results of the computations.
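The init/run case can be sketched as a Python simulation. This is only a model under assumptions: the function run_slot and the dictionary of globals are hypothetical, each process is allowed to run to its end before the switch is made (the "controlled" case), and the model simply stops when a process ends without changing the semaphore or when no process is listed for its value. The text does not say what the real system does in those two situations.

```python
def run_slot(processes, global_vars, sema_name, max_loads=10):
    """Load, one after another, the process selected by the semaphore."""
    history = []
    for _ in range(max_loads):
        index = global_vars[sema_name]
        if index >= len(processes):
            break                     # no process listed for this value
        process = processes[index]
        history.append(process.__name__)
        process(global_vars)          # a freshly loaded cog: no old locals
        if global_vars[sema_name] == index:
            break                     # the process did not hand over
    return history


def init(g):
    g["table"] = [i * i for i in range(4)]  # setup result kept in a global
    g["s"] = 1                              # hand over to the next process


def run(g):
    g["total"] = sum(g["table"])            # uses only what init left behind


shared = {"s": 0}
print(run_slot([init, run], shared, "s"), shared["total"])
```

The only thing init passes on to run is what it wrote into the shared dictionary, which corresponds to the global variables in the real system.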
A high level construct
Macros are treated by a preprocessor, which substitutes
the macro name with the macro code. The same preprocessing
can be used to handle high level constructs. I have made
one, which mimics a standard for loop. It is made
in stack processing style. The idea is, that if you load
two numbers, k and n, on the stack, we can let that
represent the interval
between k and n, i.e. all the integers between k and n.
Then we have a function [all:i] which produces all
the integers between k and n. These values are produced
consecutively in time in the variable i. These values are
used in a number of statements enclosed in standard
parentheses (()). This means that the code within () is
repeated for each value of i. Here's an example:
1 ->nfact
1 n [all:i]
(nfact i * ->nfact
)
With this code, nfact is the factorial of n (called n!).
Here's another example:
1 ->pn
1 n [all:i]
(pn p * ->pn
)
With this code, pn is the n:th power of p. Note that
the variable i is not at all mentioned between
( and ).
As the system is now, the loop variable i has to be
declared separately.
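For comparison, here is what the two loops compute, written as a Python sketch. The generator all_between is a hypothetical stand-in for [all:i], and I assume that both ends of the interval are included, which is what the factorial example requires.

```python
def all_between(k, n):
    """Yield k, k+1, ..., n (both ends assumed to be included)."""
    i = k
    while i <= n:
        yield i
        i += 1


def factorial(n):
    nfact = 1                        # 1 ->nfact
    for i in all_between(1, n):      # 1 n [all:i]
        nfact = nfact * i            # (nfact i * ->nfact)
    return nfact


def power(p, n):
    pn = 1                           # 1 ->pn
    for _i in all_between(1, n):     # the loop variable is not used
        pn = pn * p                  # (pn p * ->pn)
    return pn


print(factorial(5), power(2, 10))
```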
Assembler functions
The backbone of the system is the assembler resource
file Assembler.spin. This file contains a number of
useful functions that can be called directly
from Myra. How to add functions to this file is
described later. Here's what it contains now.
A number of functions handle the stack and
implement simple arithmetic operations. Multiplication
is worth mentioning specially, as Propeller assembler
doesn't have any multiplication instruction. The same
is true for the division instruction, but it is
mentioned further down. The non-arithmetic
instructions are logical and (&), logical or (or),
exclusive or (xor),
right shift (>>), and left shift (<<). The right shift
is arithmetic, i.e. it preserves the sign of the number.
- / divides the number one step down on
the stack by the number on the top of the stack.
The result is a 64 bit number placed in the two
top elements on the stack. The top contains the
most significant result of the division, i.e. the
integer part of the result. If a/b is computed,
and a is smaller than b, the integer part will
be zero. The fraction part of the result is
pushed one step down on the stack.
- <<<< is a left shift instruction specially
designed for use together with the division instruction.
It regards the two top elements on the stack as one
64 bit word, and shifts them to the left together.
After the shift, the most significant part of the
word is pushed on the stack, while the least significant
part is omitted.
- bit. Called as ibit adress bit,
this function returns the ibit:th bit of
the word at address adress. The bits are
counted from the most significant bit and down.
If ibit is greater than 32, the function proceeds
into the next word, and the next word after that
if ibit is greater than 64 and so on. The result
is returned in the least significant bit on the
stack top.
- next. If you use this function as
ix iy next ->ix ->iy repeatedly, you will
cycle through a rectangular area in the (ix,iy)-plane.
ix will move fastest, and will return to 0
when it equals a value xmax (see the next
instruction). At this moment, iy will be
incremented. There is no ymax. You must handle
the y-boundary of the rectangle outside this
assembler function. Here is a template of how to
use the function:
ix
iy
nx = 100
ny = 100
begin
nx initnext
0 ->ix ' ->iy
:loop
-- use ix and iy
ix iy next ->ix ->iy
iy ny - ?
#{>loop}
-- end
- initnext. Called as nx initnext
it sets the internal variable xmax for the
next function, as described above.
- now loads the current value of the computer's
time counter on the stack
- wait waits a specified time. The time is given
on the top of the stack in computer ticks. In normal
use with a 5 MHz crystal and PLL multiplication of
16, the tick is 1/80,000,000 seconds long.
- waituntil waits for a specified value on the
computer's time counter. A way of using this is as follows:
now deltat + waituntill
This would be equivalent to deltat wait. The
construction with waituntill consumes some time, however,
so deltat can't be too small. If it is, the
time set up for waituntill may already have passed,
when the assembler wait instruction starts executing. In
that case the computer will wait until the 32 bit time counter
has gone a full turn, which takes almost a minute at 80 MHz.
The best use of waituntill is, when one wants to produce
a process with a well defined frequency.
- waitfor0 is used as follows:
jpin waitfor0
The computer waits until input pin nr jpin becomes 0.
- waitfor1 waits for the input pin to be 1 instead.
- setdir sets a selected number of I/O pins to
direction out. The stack should contain a long word
with 1:s at the places where one wants the corresponding
I/O pin to be directed out. This function can be called
several times, and the 1:s will be or:ed to what is already
set.
- outpins handles the setting of output pins, once
they are set to be output pins by setdir. It is
called like this:
data mask outpin
Only those pins which correspond to 1:s in mask
are affected, and they are set according to the bit
pattern in data
- inpin loads the status of a single input pin.
k inpin
pushes 1 on the stack if the k:th input pin is one,
otherwise 0.
- ina loads the whole input pin pattern as a long
word on the stack.
- send sends one byte serially according to the
RS232 protocol (TTL-level 3.3 V, 1 start bit, 1 stop bit,
no parity, no handshaking). The use is:
data pinnr pulsewidth send
pinnr defines on which computer pin transmission takes
place. pulsewidth can be computed with the baud
function, which follows. If the computer clock frequency
is 80 MHz (the normal value), and the baud rate is 9600 baud
(bits/s), then the pulsewidth is 80,000,000/9600, i.e. about 8333 ticks.
- baud makes this computation, assuming a computer
frequency of 80 MHz. If the variable k9600 has the value
9600, we can write
data pinnr k9600 baud send
- receive receives a byte according to the RS232
protocol (with parameters as above). It is a blocking
instruction, i.e. the cog will hang there, until
a byte arrives to the computer. The byte will be
pushed on the stack, as the 8 LSB's of the long word
on the top of the stack. We write:
pinnr k9600 baud receive ->data
- send32 sends 32 bits of data at a time in
essentially the same format as send. The transmission
is made with a fixed frequency of 1 Mbit/s. Data
has to be received with a receive32 instruction
on another propeller chip. The call is
data pinnr send32
- receive32 receives 32 bits of data. It is
blocking just like receive, and it is called as
pinnr receive32
- spiinit, spisend are functions for
using the SPI protocol (Serial Peripheral Interface).
- iicinit, iicstart, iicstop, iicack, iicnoack, iicwack,
iicwrite, iicread are functions for handling
the Philips I2C bus (Inter-Integrated Circuit).
Commercial use of the I2C bus is subject
to paying license to Philips.
- kbrec receives a so called scan code from
an IBM PS2 keyboard
- kbsend sends a keyboard command to an
IBM PS2 keyboard
- analog is a function designed to handle a
specific A/D converter (MCP3208
from Microchip). It is an 8 channel converter with
12 bits output.
One controls a chip select pin,
a clock pin and a data in pin to send a command to
the converter. The command tells which analog channel
to convert. Then the converted data can be clocked
in through a data out pin. These four pins should
be connected to the Propeller chip. The Assembler
function has to be modified if other converters are
used. (There is a function analoga which fits
to Analog Devices AD1202).
The function is used in the following
way:
ch0 analog ->x0
The analog signal at channel ch0 is converted and stored
into x0. ch0 is the code sent to the data in pin of the
A/D converter.
Before using this, one has to initialize the
system with:
- init_analog. This assumes a certain
(but fairly natural) wiring between the Propeller
and the A/D converter.
- init_analogp is a similar initialization
which admits parameters describing how the
pins are wired.
- analog2 is a function fitted for a configuration
with two parallel AD 1202 A/D converters as above (for the user
who needs more than 8 channels). The two converters are
set up simultaneously, and convert their signals simultaneously.
The two results are stacked on top of each other.
- coginit is a Myra encapsulation of the
coginit assembler instruction. Call as
icog adress par coginit, where icog is the
number of the cog to start, adress is the
Propeller RAM address of the beginning of the code
to load, and par is the desired content of
the par-register.
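Some of the functions in the list above are easier to grasp with numbers in hand, so here is a Python model of four of them: /, <<<<, next and baud. It is a sketch under assumptions, not a translation of Assembler.spin. The Python names are mine, the operands of the division are assumed to be unsigned 32 bit numbers, the fraction is assumed to be a 32 bit binary fraction, the argument order of shift_left_64 is a guess, and baud is assumed to truncate its result.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
CLOCK_HZ = 80_000_000  # 5 MHz crystal times PLL factor 16


def divide(a, b):
    """Model of / : return (integer part, fraction part) of a/b."""
    quotient = ((a & MASK32) << 32) // b
    return (quotient >> 32) & MASK32, quotient & MASK32


def shift_left_64(high, low, count):
    """Model of <<<< : shift high:low left, keep only the upper word."""
    word = (((high & MASK32) << 32) | (low & MASK32)) << count
    return ((word & MASK64) >> 32) & MASK32


def next_xy(ix, iy, xmax):
    """Model of next: step ix, wrap it at xmax and then step iy."""
    ix += 1
    if ix == xmax:
        ix = 0
        iy += 1
    return ix, iy


def baud(rate):
    """Model of baud: pulse width in clock ticks."""
    return CLOCK_HZ // rate


integer, fraction = divide(1, 4)              # 0 and a quarter of 2**32
print(integer, hex(fraction))
print(shift_left_64(integer, fraction, 2))    # 1/4 scaled up by 4
print(baud(9600))

# Scan a rectangle 3 wide and 2 high; the y-boundary is checked outside.
ix, iy, points = 0, 0, []
while iy < 2:
    points.append((ix, iy))
    ix, iy = next_xy(ix, iy, 3)
print(points)
```

The division followed by the 64 bit shift shows why the two instructions belong together: the quotient 1/4 has integer part zero, but after scaling it up two binary steps the integer part becomes 1.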
Adding assembler functions
The user can write his own Myra functions, as he likes.
To speed up the programs, he can also add his own assembler
functions to the file Assembler.spin.
These functions should have the following properties:
- They should be correct assembler code, of course.
This is validated when assembling the complete program
with the Propeller environment. The whole
Assembler.spin file is written so that it could be
assembled on its own, but it is now too big for that;
it contains more than 512 instructions. The assembler
functions are there to be called, and hence they need
a properly labeled return statement. If the function
is called fun, then the return statement should
be labeled fun_ret.
- Each assembler function must have a header.
From the assembler language point of view, this is
a comment with the following format:
'> name aliasname
The name should be the same as the label on the
first line of the actual assembler function (the
"name" of the assembler function). The aliasname,
can be the same as the name, or something shorter,
down to a simple operator symbol like '+'.
The system uses these headers to load the used
assembler functions into the code. It does so by
matching the call in the Myra code with the aliasname.
- The assembler functions should have a stack type
interface with the rest of the system, i.e. arguments
should be popped from the stack, and the results
should be pushed on the stack.
- Functions should not unnecessarily have side effects,
i.e. their only effect should be the result they
deliver. However, when we use assembler functions
to send things out to the output pins, or to control timing
(by waiting for instance), these are of course
unavoidable side effects
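How headers of this kind could be used to pick out the needed functions can be sketched in Python. This is an illustration, not the code in the Myra compiler: the function names read_headers and functions_to_load are hypothetical, and so are the labels add and wait in the sample text. I also assume that name and aliasname are separated by white space.

```python
def read_headers(source):
    """Map each alias to the assembler label named in a '> header."""
    aliases = {}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped.startswith("'>"):
            continue
        fields = stripped[2:].split()
        if len(fields) != 2:
            continue              # not a well-formed header in this model
        name, alias = fields
        aliases[alias] = name
    return aliases


def functions_to_load(aliases, calls):
    """Labels of the assembler functions that the program really uses."""
    return sorted({aliases[c] for c in calls if c in aliases})


sample = """
'> add +
add       ' body of the function left out
add_ret   ret
'> wait wait
wait      ' body of the function left out
wait_ret  ret
"""

table = read_headers(sample)
print(table)
print(functions_to_load(table, ["+", "x", "+"]))
```

Only functions whose alias actually occurs in the program are selected, which matters when the cog memory is as small as 512 long words.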
The stack handling is done using self-modifying code,
i.e. using the movs (move source) and movd
(move destination) instructions. Globally there is an
array, whose first element is called stack, and there
is a stack pointer called sa ("stack address"). Then
the following code loads the item on the top of the stack
(which is at the address sa) to a local variable arg1:
movs instr1,sa
nop
instr1 mov arg1,stack
sub sa,#1
...
arg1 long 0
The instruction instr1 is modified, so that data are fetched
from the address sa. This overwrites the address 'stack',
so we could write whatever we like there, but 'stack' is a
little bit informative. The final instruction moves the
stack pointer down, so that the loaded value is no longer
reachable. (The value is "popped" from the stack). The
nop instruction is necessary, because the Propeller
uses pipelining. Without it, the instruction instr1 would
be fetched before it was modified.
You can look into the file Assembler.spin to find
examples to learn from.
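What the pop sequence achieves can be shown with a Python model of the stack array and the pointer sa. It is a sketch with assumptions: the class DataStack is hypothetical, sa is an index into the array instead of a cog address, an empty stack is marked with sa equal to -1, and the push is assumed to be the mirror image of the pop (the text above only shows the pop).

```python
class DataStack:
    """Toy model of the global stack array and the pointer sa."""

    def __init__(self, size=16):
        self.cells = [0] * size  # the array that starts at 'stack'
        self.sa = -1             # index of the top item, -1 when empty

    def push(self, value):
        self.sa += 1                  # assumed mirror image of the pop
        self.cells[self.sa] = value

    def pop(self):
        arg1 = self.cells[self.sa]    # mov arg1,<cell at sa>
        self.sa -= 1                  # sub sa,#1
        return arg1


stack = DataStack()
stack.push(3)
stack.push(4)
print(stack.pop() + stack.pop())  # what a '+' function would compute
```

Note that the popped value is not erased; it just becomes unreachable when sa moves down, exactly as in the assembler code.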
Variable and symbol scopes
The scope of the global variables is of course global,
i.e. the variable names can be used throughout the system.
Processes can not be "called" by each other; they can
only be called on the line after the exec keyword.
The scope of the rest of the symbols (variables, labels,
function names) is the containing process. For some
tastes, this might seem a bit too wide. For instance one
function in a process could use the local variable of
another. This is because the assembler language doesn't
have much of a notion of scope. Nevertheless a check
against this misuse could be made at compile time,
but this hasn't been implemented so far. A consequence
of this is that variables in functions must have unique
names. Otherwise the Propeller assembler will complain
("symbol already defined"). I think it is
acceptable to let the variables defined in the process
be accessible also to the functions, but the programmer
should avoid borrowing variables between the functions.
Note that if the same function shall be used by more
than one process, the function has to be repeated in
each of the processes.
Suffixing
As a remedy to the scope problem (that the scope of
variables and labels is too wide), there is a mechanism
of suffixing. If the function statement is followed
by at least one space, and then the tag "-?tag?", all
local variables, and all labels (goals for jumps) are
suffixed with _?tag?. Hence if one writes -cos, then
the variable x will be renamed to x_cos,
and the label loop will be renamed to loop_cos.
These suffixed names will appear in the assembler code,
but in the Myra code, the names will of course remain
unsuffixed. If all functions are given unique tags,
then there is no problem with variable scopes, unless
of course one deliberately creates names like x_cos
somewhere.
It may happen that the unsuffixed name of a local variable
coincides with the name of a variable in the process or
a field variable in an object file. In that case, the
compiler prefers to interpret the variable as a local
variable, i.e. it gets a suffix. (For those who study
the compiler source code, Myra.java, this is the reason
for the notion of a PVariable (a public variable), so there
is a function 'recognizeableAsPVariable(...)').
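The renaming rule can be written down as a few lines of Python. This is a sketch of the rule only, not the code in Myra.java: the function suffix_names and the sample names are hypothetical.

```python
def suffix_names(names, local_names, tag):
    """Append _tag to every name that is a local variable or a label."""
    return [f"{n}_{tag}" if n in local_names else n for n in names]


process_variables = {"x", "gain"}   # declared in the process
local_names = {"x", "loop"}         # declared in the function tagged -cos

used = ["x", "gain", "loop"]
print(suffix_names(used, local_names, "cos"))
```

The name x exists both in the process and in the function. Since the local set is consulted first, x is treated as local and gets the suffix, while gain is left alone.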
Recursive calls
The Propeller doesn't have any subroutine call stack,
and there is no attempt to construct any call stack
in Myra. Hence recursive calls are not possible.
If an assembler or Myra function tries to call itself,
it will get lost.
Motivations for the language
The Propeller computers are programmed with
the Propeller assembler language, and the Spin language.
Assembler is very fast, but not always easy to write and
to read. Spin is an interpreted language, and thus
relatively slow. One can see this for the Spin instruction
wait(581+cnt)
This instruction works. It will wait until 581 clock
cycles have elapsed from now. But if we wrote 580 instead,
more than 580 clock cycles would have elapsed before
the actual waiting started, so then the waiting would go on
until the clock counter had wrapped around a full cycle.
But 581 clock cycles is quite a lot.
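The arithmetic behind this can be checked with a few lines of Python. It is a sketch under assumptions: the time counter is taken to be 32 bits wide and to run at 80 MHz, and the overhead of 581 ticks is a hypothetical figure chosen to agree with the numbers above.

```python
COUNTER_BITS = 32
CLOCK_HZ = 80_000_000


def ticks_waited(delta, overhead):
    """Ticks left to wait when the waiting starts 'overhead' ticks late."""
    return (delta - overhead) % (1 << COUNTER_BITS)


def seconds(ticks):
    return ticks / CLOCK_HZ


print(ticks_waited(600, 581))            # a short wait, as intended
print(ticks_waited(580, 581))            # the target has already passed
print(seconds(ticks_waited(580, 581)))   # almost a whole turn of the counter
```

Missing the target by a single tick thus costs nearly a full turn of the counter, which is a little less than a minute at this clock frequency.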
In that situation, one would like to mix Spin and assembler
code, and the ideal way would be to write fast assembler
routines, and call them from Spin.
But the Spin code doesn't really call assembler code;
it loads assembler code into cogs, and lets it run there.
Execution doesn't return from the assembler code, unless
the assembler code halts the whole cog. Also the interface
to the assembler code is quite thin; it is only a single
long variable, which preferably would contain an address,
through which the assembler code can interface.
In Myra, everything is assembler (except a few coginit
statements and some code for handling serial execution).
This opens up for using pre-written assembler
routines directly and uniformly. The stack-based architecture
gives a standardized interface to these routines. Likewise
one can use pre-written Myra routines, and they are
fast, as they are compiled into assembler code.
As compared to standard languages like C and Forth, a
specialized language makes it easy to use features of
the Propellers, like reading and controlling time,
reading and writing to
I/O pins, and the language/instruction feature that
each instruction can be run conditionally.
Myra also has a nice concept for making object oriented
code. Except for object-code files, the whole system is
kept together in a single file, that completely defines
the system.
As for all stack based languages, one has to get used
to the stack architecture, and till then, it may seem
like "write only programs", but once one is used to it,
programs are actually quite elegant.