MyCA - The Cross Assembler for MyCPU
About Cross Assemblers
You may ask what a cross assembler is. Maybe you have heard of assemblers. "Assembly" or
"Assembler" is not only a synonym for the machine language of microprocessors; it also stands for
a program or tool that translates readable programs (text files) into machine code that can be
understood by microprocessors. These tools often run on the machine you are writing programs for. For example,
you can write an assembly program for your PC and compile (or better: assemble) it on your PC. That would be
the easy way. But what if your target machine does not have enough memory for the tool or, even worse, has
neither a keyboard nor a display? Then you have to use a "cross assembler", that is,
an assembler that runs on one machine but produces code for another machine. This is how I started
working on MyCPU.
The Old Cross Assembler for MyCPU
When I started the MyCPU project in 2001, I needed a simple cross assembler to compile my first test programs
for MyCPU. At that time I had already written programs for the 8051 microprocessor, and for that purpose I used
a table-oriented macro assembler called "hasm". Table-oriented? Yes, that means that the assembler could
be easily adapted to many target platforms simply by writing a translation table ("replace text command
X by binary code Y" and so on). But I quickly hit the limits of that tool; the limited
space for labels was especially annoying. (The program was an MS-DOS program from the 1990s and used only the lower 640 kB
of system memory.) Because of this limitation I was forced to split MyCPU's kernel ROM into two halves, meaning two
separate software projects that had to be managed. In the end it was no longer possible to add even one more
source code line to the ROM, because the assembler immediately raised an out-of-memory error.
Even worse, the old hasm executable no longer runs in the command shell of modern 64-bit operating systems.
And because of its architecture, hasm.exe executes very slowly in DOS-box emulators, which makes it nearly
unusable on Windows 7 and later. Because I changed my host operating system from Windows XP to Windows 7 in
April 2014, I was ultimately forced to write a new cross assembler for MyCPU. A neat side effect is
that the new cross assembler is open source and is also available for the Linux platform!
MyCA was born
Over the Christmas holidays of 2014 I had some time to start writing the new cross assembler. I wrote
"myca" completely from scratch, which took about 100 hours. My goal was for "myca"
to be 100% compatible with hasm.exe, so that both programs generate identical .hex files,
which made it easier for me to track down errors in myca. However, myca is still not a full replacement for
hasm.exe, because it supports only those functions of hasm.exe that I used in the MyCPU source code.
Features of MyCA
- MyCA is a macro cross assembler that runs on a Windows- or Linux-based host.
- MyCA has a built-in C/C++ style preprocessor. All well-known preprocessor commands are supported,
  such as #ifdef, #ifndef, #if, #else, #elif, #endif, #include, #error and #warning. However, the preprocessor
  does not support macros with a parameter list like #define MIN(x, y) ((x)<(y)?(x):(y)). This is because
  myca itself is a macro assembler and provides the MACRO command for this purpose.
- MyCA supports user-defined memory segments for code and data. If segments are not bound by the
  user to a specific base address, myca places them at the best-suited start address
  at compile time.
- MyCA supports macros with local labels. I am using lots of macros in the source code of
  the MyCPU kernel ROM.
- MyCA has no hard-coded limit on the number of labels, variables and defines. Because
  it is still a 32-bit program, the heap cannot grow beyond 4 GB, but you should never hit
  this limit (certainly not when assembling code for MyCPU).
- MyCA is a multi-pass assembler. Code is built incrementally, which makes the generated program
  as small as possible. For example, if a memory cell in the zero page is addressed, the
  opcodes for 16-bit addresses are not needed; the shorter opcodes that exist
  specifically for zero page addressing can be used instead. The assembler needs several passes
  through the code to decide which opcode is best.
- MyCA can be used to replace the myasm assembler on MyCPU. If you have written programs
  on the MyCPU itself, you can now also cross-assemble these programs by using myca.
  MyCA will auto-detect if your program was originally written for myasm.
- MyCA can generate binary files, Intel hex files and listfiles.
- MyCA is really fast: it is at least 20 times faster than hasm.exe.
- You can use MyCA for your own 8-bit homebrew computer project. You only need to rewrite the
  two source files assembler.c and opcodes.h to support your own instruction set.
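The multi-pass idea from the feature list can be illustrated with a small Python sketch. This is not MyCA's actual code (MyCA is written in C): the function select_sizes, the program representation and the instruction sizes (1 byte without operand, 2 bytes with a zero page operand, 3 bytes with a 16-bit operand) are assumptions made up for illustration. The sketch starts with the long form everywhere, computes the label addresses, switches to the short form wherever the operand resolves to a zero page address, and repeats until nothing changes.

```python
def select_sizes(program, origin, symbols=None):
    """Pick instruction sizes over several passes.

    program: list of (label_or_None, operand_symbol_or_None) tuples.
    symbols: addresses known in advance, e.g. variables in a data segment.
    The sizes 1/2/3 are hypothetical and not taken from the MyCPU opcode table.
    """
    known = dict(symbols or {})
    # Pessimistic start: every operand is assumed to need a 16-bit address.
    sizes = [1 if operand is None else 3 for _, operand in program]
    passes = 0
    while True:
        passes += 1
        # Assign addresses to all code labels using the current sizes.
        addr = origin
        for (label, _), size in zip(program, sizes):
            if label is not None:
                known[label] = addr
            addr += size
        # Re-select the size of every instruction from the resolved addresses.
        new_sizes = [
            1 if operand is None else (2 if known[operand] < 0x100 else 3)
            for _, operand in program
        ]
        if new_sizes == sizes:
            return known, sizes, passes
        sizes = new_sizes


if __name__ == "__main__":
    demo = [("start", "ptr1"), (None, "buffer"), (None, "start"), ("end", None)]
    labels, sizes, passes = select_sizes(demo, 0x8045, {"ptr1": 0x00, "buffer": 0x8019})
    print(sizes, hex(labels["end"]), passes)
```

Starting from the long form means that addresses can only shrink from pass to pass, so an operand that has been found to lie in the zero page stays there and the loop terminates.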
New Features since 2016:
MyCA Command Line Arguments
When you run myca without any arguments, it prints this help text:
[MyCA] Macro Cross Assembler V1.07 for MyCPU, (C) 2023 by Dennis Kuschel
Command line arguments:
  myca source.asm [-o binfile][-h[hexfile]][-l[listfile]][-r binrange][-m][-n]
Flags:
  -o    Set name for output binary file
  -h    Generate a hex file
  -l    Generate a list file
  -lnv  Do not show vCPU2 instructions in list file
  -r    Set binary output address range, e.g. -r 0x0000-0xFFFF for full 64kb
  -m    Make myca behave like the myasm program on MyCPU
  -n    Forbit myca to behave like the myasm program on MyCPU
  -s    Be silent and print only error messages
  -D    Set preprocessor define: '-D string' or '-D string=value'
  -I    Set include path: '-I /path/to/include/files'
  -O    Set optimization level (only valid for vCPU2): -O0 =none, -O1 =optimize
        -O2 =optimize more, -Os =optimize for size, -Ofast =optimize for speed
  -t    Set target. Allowed targets are: mycpu, vcpu, vcpu2. Example: -t vcpu2
Examples:
  myca rom.asm -o rom.bin -r 0x0000-0xFFFF
    -> Builds the ROM memory image 'rom.bin' with the address range 0x0000-0xFFFF
  myca program.asm -o programname
    -> Builds a user program (the parameter -o is optional), works like myasm
Usually you would call MyCA with at least one parameter: the source file
you want to compile. If MyCA gets only this one parameter, it automatically
generates a binary output file that has the same name as the source file but
with the ending ".o". This is the same behaviour as myasm on MyCPU.
As with myasm, you can tell myca the name of the output file by using the
parameter "-o". Please note that it does not matter whether you write
myca sourcefile.asm -o testprogram
or
myca sourcefile.asm -otestprogram
The result is the same: MyCA will generate the binary called "testprogram".
You can also instruct MyCA to generate a hexfile instead of a binary. This is done
by setting the "-h" flag:
myca testprogram.asm -h
This generates the hexfile testprogram.hex.
Note that, similar to the -o parameter, you can set the name of the hexfile by
appending it to the flag (the form -h[hexfile] in the usage text above).
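For readers who have not worked with hexfiles before, the following Python sketch shows how a single record of the standard Intel HEX format is built: a colon, the byte count, a 16-bit address, a record type, the data bytes and a checksum (the two's complement of the sum of all preceding bytes). The function name hex_record is made up for this illustration, and how many data bytes MyCA puts into one record is not stated here, so the sketch says nothing about MyCA's exact output.

```python
def hex_record(address, data, record_type=0x00):
    """Build one Intel HEX record (type 0x00 = data, 0x01 = end of file)."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type])
    body += bytes(data)
    # The checksum makes the sum of all record bytes equal to zero modulo 256.
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()


if __name__ == "__main__":
    # The eight header bytes at address 8000h from the example listfile.
    header = bytes.fromhex("02804D8000004580")
    print(hex_record(0x8000, header))
    print(hex_record(0x0000, b"", record_type=0x01))
```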
A very useful feature is the listfile that is generated when you append the
flag "-l" to the command. The listfile shows all source code
lines and the generated program code bytes.
In the first column you will find the target address, the second column contains
up to three code bytes, and the last column contains the source code line that was
translated into these bytes. At the bottom of the listfile you will find a table
that lists all memory segments used by your program, including the base
addresses and sizes of these segments. Below the table is a status line that
tells you the overall compilation status (whether it was successful or not).
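If you want to post-process a listfile with a script, the three-column layout described above can be picked apart with a regular expression. The following Python sketch is a heuristic under stated assumptions: it treats a line as a code line if it starts with four uppercase hex digits followed by one to three hex byte pairs, and it ignores the exact column positions of the real file. The name parse_listing_line is made up for this illustration.

```python
import re

# Address (4 hex digits), then 1..3 code bytes, then the optional source text.
LINE_RE = re.compile(r"^([0-9A-F]{4})\s+((?:[0-9A-F]{2}){1,3})(?:\s+(.*))?$")


def parse_listing_line(line):
    """Return (address, code_bytes, source) or None for lines without code."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    address = int(match.group(1), 16)
    code = bytes.fromhex(match.group(2))
    source = match.group(3) or ""
    return address, code, source


if __name__ == "__main__":
    for text in ("8050 6F00 SPT ptr1", "800B 532049", "ret0 MACRO"):
        print(parse_listing_line(text))
```

Lines that only continue the code bytes of a long DB statement yield an empty source string, and pure source lines such as macro definitions yield None.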
The special flags "-m" and "-n"
tell MyCA whether or not to behave like myasm. Usually MyCA detects
the file format of your source file automatically, but sometimes the detection may fail.
If you observe such a problem, try setting one of these flags. The main difference
between the formats is the default string format: myasm
automatically translates ASCII strings into the PETSCII code that MyCPU uses
to display text. By default, MyCA does not do this translation, in order to stay compatible
with the old hasm.exe program. Note that you can also define the string format within
your source file: use either ".mode ascii" or
".mode petscii" to switch between the two formats.
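To give an impression of what the PETSCII translation does to a string, here is a small Python sketch. The mapping is inferred only from the two strings in the example listfile further down this page (uppercase letters get bit 7 set, lowercase letters move down by 20h, everything else stays as it is). It is therefore an assumption, not MyCA's full translation table, and the name ascii_to_petscii is made up for this illustration.

```python
def ascii_to_petscii(text):
    """Translate an ASCII string to PETSCII bytes (letters only, see note above)."""
    out = bytearray()
    for ch in text.encode("ascii"):
        if 0x41 <= ch <= 0x5A:       # 'A'..'Z' -> C1h..DAh
            out.append(ch + 0x80)
        elif 0x61 <= ch <= 0x7A:     # 'a'..'z' -> 41h..5Ah
            out.append(ch - 0x20)
        else:                        # digits, space, punctuation: assumed unchanged
            out.append(ch)
    return bytes(out)


if __name__ == "__main__":
    print(ascii_to_petscii("This is a text.\r").hex().upper())
```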
Some words about the source file encoding: If you started coding your
assembly program on MyCPU with the text editor "edit",
you may have noticed that your program is unreadable in an external
text editor. This is because MyCPU stores all text files in PETSCII format by
default. MyCA detects the PETSCII format automatically and translates it
on the fly into ASCII. You can also convert your source code files
into ASCII by using the copy command on MyCPU with the
parameter "/a":
copy source-pet.asm source-asc.asm /a
Assembler Programming with MyCA
Some short facts:
- The usual preprocessor directives are supported
- Comments are started with a semi-colon (;)
- Labels always start in the first column of a line. They may end with a colon (:)
- Mnemonics must not start in the first column of a line. At least one leading space is required.
- MyCA understands only a few commands:
  - ORG : set the origin of a program or data segment to a fixed memory address
  - SEGMENT : open a new data or code segment, or switch to an existing one
  - SET and EQU : assign a value to a variable
  - DB and DW : insert bytes or words into the program. DB can also be used to put text strings into a program.
  - DS : reserve some bytes within a segment
  - MACRO and ENDMACRO : define a macro
  - MODE : switch between ASCII and PETSCII mode for characters and strings
Some words about segments:
Usually when you are programming in assembly, you would use only one memory segment,
the one in which all your program code is stored. But using multiple segments
can make things easier. For example, when your program uses the paged data memory area
$4000-$7FFF, you can define a data segment within this memory. All variables and
buffers you are using will be placed in this segment. You only need to tell MyCA the
size of each variable and buffer with the DS command, and MyCA will automatically
assign memory addresses to them.
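The address assignment for DS declarations boils down to a running counter. The following Python sketch shows the principle; the function name allocate and the data layout are assumptions for illustration, not MyCA internals.

```python
def allocate(base, declarations):
    """Assign consecutive addresses to (name, size) declarations.

    Returns the address table and the first free address behind the last item.
    """
    addresses = {}
    addr = base
    for name, size in declarations:
        addresses[name] = addr
        addr += size
    return addresses, addr


if __name__ == "__main__":
    # A data segment in the paged memory area, with hypothetical variables.
    table, end = allocate(0x4000, [("counter", 1), ("linebuf", 80), ("ptr", 2)])
    print({name: hex(a) for name, a in table.items()}, hex(end))
```

With the zero page declarations from the example program further down (ptr1 DS 2 and ptr2 DS 2 at ORG 00h), this yields the addresses 0 and 2, matching the operands shown in the listfile.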
Another advantage of segments is that you can switch between segments at compile
time. For example, the source files of the MyCPU kernel ROM generally use two segments:
a program code segment and an initialization code segment. All initialization code
(such as memory and register initialization) goes into the init code segment, and all
other code goes into the program code segment. When assembling these sources, MyCA
first collects the initialization code snippets of all source files and puts them
together; all other program code is placed after that init code segment.
So the user does not need to explicitly call the initialization routines
of the modules.
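The collecting step can be pictured as grouping code fragments by segment name while keeping their order of appearance. A minimal Python sketch, with made-up segment and fragment names (this is an illustration of the idea, not MyCA's data structure):

```python
def collect_segments(fragments):
    """Group (segment_name, lines) fragments by segment, in source order."""
    segments = {}
    for name, lines in fragments:
        # Re-opening a segment appends to what was collected for it before.
        segments.setdefault(name, []).extend(lines)
    return segments


if __name__ == "__main__":
    # Hypothetical fragments as they might appear across two source files.
    source_order = [
        ("initcode", ["init_uart"]),
        ("progcode", ["uart_send", "uart_recv"]),
        ("initcode", ["init_timer"]),
        ("progcode", ["timer_irq"]),
    ]
    for name, lines in collect_segments(source_order).items():
        print(name, lines)
```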
Segments can be bound to a memory start address or they can be left floating. A segment
is bound with the ORG command: when ORG immediately follows the first SEGMENT command
for a segment, that segment is bound to the memory address given by ORG.
A segment that is not bound remains floating, which means that MyCA will try to
assign a memory base address to it automatically. This is only
possible if the program contains at least one bound segment of the same type.
Floating segments are then placed behind the already bound segments. Segments can
be of type code or type data.
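The placement rule can be sketched in a few lines of Python. The sketch reads "placed behind the already bound segments" as "placed at the highest end address of the bound segments of the same type"; this reading, the function name place_segments and the dictionary layout are assumptions, and MyCA's real strategy may be more elaborate.

```python
def place_segments(segments):
    """Place segments given as dicts with name, type, size and org (int or None).

    Returns {name: (start, end)}, where end is the first address behind the segment.
    """
    placed = {}
    next_free = {}  # per segment type: first address behind the placed segments
    for seg in segments:
        if seg["org"] is not None:
            end = seg["org"] + seg["size"]
            placed[seg["name"]] = (seg["org"], end)
            next_free[seg["type"]] = max(next_free.get(seg["type"], 0), end)
    for seg in segments:
        if seg["org"] is None:
            if seg["type"] not in next_free:
                raise ValueError("no bound %s segment to attach '%s' to"
                                 % (seg["type"], seg["name"]))
            start = next_free[seg["type"]]
            placed[seg["name"]] = (start, start + seg["size"])
            next_free[seg["type"]] = start + seg["size"]
    return placed


if __name__ == "__main__":
    # Segment names and sizes taken from the example program on this page.
    example = [
        {"name": "programheader", "type": "CODE", "size": 0x08, "org": 0x8000},
        {"name": "zeropage", "type": "DATA", "size": 0x04, "org": 0x0000},
        {"name": "initdata", "type": "CODE", "size": 0x3D, "org": 0x8008},
        {"name": "programcode", "type": "CODE", "size": 0x33, "org": None},
    ]
    for name, (start, end) in place_segments(example).items():
        print("%-14s %04X %04X" % (name, start, end))
```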
Example Program
This short example program demonstrates the main features and functions of MyCA.
Below the program you will find the listfile that was generated by MyCA and the
hexdump of the program. The command was "myca example.asm -l".
Program example.asm:
; Set mode to PETSCII. Strings are now automatically converted to PETSCII code.
        .mode petscii
; Assign a value to a variable. This can be done with EQU or SET.
KERN_GETCH      EQU 023Ch
KERN_PRINTSTR   EQU 0244h
; Define a macro that does not take any parameter.
ret0    MACRO
        CLA
        RTS
        ENDMACRO
; Define a macro that takes one parameter, that is str
print   MACRO str
        LPT # str
        JSR (KERN_PRINTSTR)
        ENDMACRO
; Define a macro that takes three parameters
bcopy   MACRO srcptr, dstptr, cnt
        LDX # cnt
        CLY
cploop  LDA ( srcptr ),Y
        STA ( dstptr ),Y
        INY
        DXJP cploop
        ENDMACRO
; Build the program header that is required for all MyCPU programs
programheader SEGMENT CODE
        ORG 8000h
        DW staddr       ; ptr to memory start address of the program
staddr  DW main         ; program entry point (ptr to main function)
        DW 0            ; optional: main program termination function
        DW codestart    ; this pointer points behind the initdata segment
; Define a data segment in the zero page.
; You are allowed to use the addresses $00-$0F.
zeropage SEGMENT DATA
        ORG 00h
; Declare two 16-bit pointers within the zero-page
ptr1    DS 2
ptr2    DS 2
; Open a new segment that contains initialized data, like strings and tables
; Also some small amount of buffer memory can be defined here.
initdata SEGMENT CODE
        ORG 8008h
text1   DB "This is a text.\r",0
buffer  DS 20           ; reserve 20 bytes buffer memory for later use
; Now follows the code segment. Since it is not bound to any start address
; the initdata segment can still grow. See the example at the end of this file.
programcode SEGMENT CODE
codestart:
        ;Four dummy instructions: They are required directly after "codestart"
        ;to reference all zeropage addresses we use. This is required for the
        ;dynamic program loader and the automatic memory allocation routine.
        FLG ptr1
        FLG ptr1+1
        FLG ptr2
        FLG ptr2+1
main:   ;program entry point: main program (must be somewhere behind codestart)
        ;example: use a macro to copy some bytes
        LPT #text1
        SPT ptr1
        LPT #buffer
        SPT ptr2
        bcopy ptr1, ptr2, 17
        ;print the string that was copied to the buffer
        LPT #buffer
        JSR (KERN_PRINTSTR)
        ;print a string using a macro
        print text2
        ;this demonstrates how a label works
waitkey: JSR (KERN_GETCH)
        CMP #0
        JPZ waitkey
        ;return with accu=0 (exit code), use macro ret0
        ret0
; Switch back to the initdata segment and add one further text line.
; MyCA will place this data to the correct place between
; the program header and the start of the program code.
initdata SEGMENT CODE
text2   DB "Press any key to quit.\r",0
Listfile example.lst:
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
;~~ [MyCA] Macro Cross Assembler V1.0 for MyCPU, (c) 2015 by Dennis Kuschel ~~
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
;[File: example.asm]
; Set mode to PETSCII. Strings are now automatically converted to PETSCII code.
.mode petscii
; Assign a value to a variable. This can be done with EQU or SET.
KERN_GETCH EQU 023Ch
KERN_PRINTSTR EQU 0244h
; Define a macro that does not take any parameter.
ret0 MACRO
CLA
RTS
ENDMACRO
; Define a macro that takes one parameter, that is str
print MACRO str
LPT # str
JSR (KERN_PRINTSTR)
ENDMACRO
; Define a macro that takes three parameters
bcopy MACRO srcptr, dstptr, cnt
LDX # cnt
CLY
cploop LDA ( srcptr ),Y
STA ( dstptr ),Y
INY
DXJP cploop
ENDMACRO
; Build the program header that is required for all MyCPU programs
programheader SEGMENT CODE
ORG 8000h
8000 0280 DW staddr ; ptr to memory start address of the program
8002 4D80 staddr DW main ; program entry point (ptr to main function)
8004 0000 DW 0 ; optional: main program termination function
8006 4580 DW codestart ; this pointer points behind the initdata segment
; Define a data segment in the zero page.
; You are allowed to use the addresses $00-$0F.
zeropage SEGMENT DATA
ORG 00h
; Declare two 16-bit pointers within the zero-page
ptr1 DS 2
ptr2 DS 2
; Open a new segment that contains initialized data, like strings and tables
; Also some small amount of buffer memory can be defined here.
initdata SEGMENT CODE
ORG 8008h
8008 D44849 text1 DB "This is a text.\r",0
800B 532049
800E 532041
8011 205445
8014 58542E
8017 0D00
buffer DS 20 ; reserve 20 bytes buffer memory for later use
; Now follows the code segment. Since it is not bound to any start address
; the initdata segment can still grow. See the example at the end of this file.
programcode SEGMENT CODE
codestart:
;Four dummy instructions: They are required directly after "codestart"
;to reference all zeropage addresses we use. This is required for the
;dynamic program loader and the automatic memory allocation routine.
8045 3C00 FLG ptr1
8047 3C01 FLG ptr1+1
8049 3C02 FLG ptr2
804B 3C03 FLG ptr2+1
main: ;program entry point: main program (must be somewhere behind codestart)
;example: use a macro to copy some bytes
804D 6C0880 LPT #text1
8050 6F00 SPT ptr1
8052 6C1980 LPT #buffer
8055 6F02 SPT ptr2
bcopy ptr1, ptr2, 17
8057 5011 LDX # 17
8059 2E CLY
805A 3400 __m1_cploop LDA ( ptr1 ),Y
805C 4402 STA ( ptr2 ),Y
805E 8B INY
805F 495A80 DXJP __m1_cploop
;print the string that was copied to the buffer
8062 6C1980 LPT #buffer
8065 1B4402 JSR (KERN_PRINTSTR)
;print a string using a macro
print text2
8068 6C2D80 LPT # text2
806B 1B4402 JSR (KERN_PRINTSTR)
;this demonstrates how a label works
806E 1B3C02 waitkey: JSR (KERN_GETCH)
8071 7000 CMP #0
8073 196E80 JPZ waitkey
;return with accu=0 (exit code), use macro ret0
ret0
8076 2C CLA
8077 1F RTS
; Switch back to the initdata segment and add one further text line.
; MyCA will place this data to the correct place between
; the program header and the start of the program code.
initdata SEGMENT CODE
802D D05245 text2 DB "Press any key to quit.\r",0
8030 535320
8033 414E59
8036 204B45
8039 592054
803C 4F2051
803F 554954
8042 2E0D00
Segment Table:
**************
Segment Name        Startaddr  Endaddr  Size  Type
=========================================================================
programheader       8000       8008     8     CODE  fixed
initdata            8008       8045     3D    CODE  fixed
programcode         8045       8078     33    CODE  floating
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
zeropage            0          4        4     DATA  fixed
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
No errors found.
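One detail in the listfile above is worth a closer look: the label cploop inside the bcopy macro appears as __m1_cploop in the expanded code, so a macro with a label can be used more than once without label clashes. The following Python sketch imitates this kind of expansion. It assumes that the number in the prefix is a counter that increases with every expansion; the listfile shows only a single case, so this is a guess, and the function name expand_macro is made up for this illustration.

```python
import re


def expand_macro(body, params, args, expansion_no):
    """Expand macro body lines: substitute parameters, rename local labels."""
    # A label is the first word of a line that starts in the first column.
    local_labels = {
        line.split()[0].rstrip(":")
        for line in body
        if line and not line[0].isspace()
    }
    mapping = dict(zip(params, args))
    for label in local_labels:
        mapping[label] = "__m%d_%s" % (expansion_no, label)
    if not mapping:
        return list(body)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return [pattern.sub(lambda m: mapping[m.group(1)], line) for line in body]


if __name__ == "__main__":
    bcopy_body = [
        "        LDX # cnt",
        "        CLY",
        "cploop  LDA ( srcptr ),Y",
        "        STA ( dstptr ),Y",
        "        INY",
        "        DXJP cploop",
    ]
    params = ["srcptr", "dstptr", "cnt"]
    for line in expand_macro(bcopy_body, params, ["ptr1", "ptr2", "17"], 1):
        print(line)
```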
Hexdump: