(Quetzalcoatl Home Page) (kdef.com) (VIC-20 Home) (Geek Home)
[History] [Things you to Do] [Future] [Design] [Development Guidelines] [Contact]

Quetzalcoatl
A Compiler Hackers' Guide
This Document: http://www.kdef.com/geek/vic/quetz/develop/chg.html Version 2.1.0 BETA - October 11, 2006
Author: Brendan Jones

October 8, 2006: Released Quetzalcoatl 2.1.0 BETA under GNU Public License (GPL). Never had time to fully comment the body, so this guide and the headers will have to do for now.

Legals

Before viewing, using, modifying or copying the Quetzalcoatl Source Code be sure to read the legal note in legal.txt. If this is missing, contact me. You can reach me through the kdef.com web site. I can also be reached through the VIC Denial Forums. I shan't include the e-mail address here because of spam. Alternately contact the Free Software Foundation for the GPL Legal Notice.

What is Quetzalcoatl?

The goal is for Quetzalcoatl to be a ruthless, squeeze-every-byte and waste-no-cycle compiler;
Something that would give you the absolute efficiency of assembler.
Letting you cut your code in C, without sweating on lost bytes or cycles.
Queztalcoatl - The Only Compiler with a Vision Statement.
(If Al Dunlap was a Compiler, he would be Quetzalcoatl)

Queztalcoatl is a Cross Compiler for 6502 Processors. Queztalcoatl runs under Windows, Linux, Solaris, DOS and MacOS. It can compile and link programs written in a subset of ANSI C, Assembler, the 1983 UPL language or any combination thereof. It is suitable for producing programs for the Commodore VIC-20, Commodore 64, Rockwell AIM-65 (Hey! isn't that a missile?) and Atari 7800. (You can run executables directly on emulators such as VICE, Free-64 or PCVIC). With some minor modifications Queztalcoatl will also produce code for the Apple ][, old 6502 game consoles (such as the Nintendo Entertainment System) and 6502 embedded systems.

The Quetzalcoatl Award goes to Harry Dodgson who took Quetzalcoatl in 2005 and ran with it, testing and swatting heaps of bugs, improving the code, adding new runtime targets and porting the compiler to the latest version of GCC and Solaris; Quetzalcoatl is Harry's program too.

This version of the Compiler Hacker's Guide has been expanded for the 2.1.0 BETA release. It includes a new section on optimisation which will hopefully explain much. While I never got the comments to the level I'd wanted to, with this guide it'll hopefully be enough for would-be compiler hackers to find their way around. One thing to keep in mind is Quetzalcoatl is a classic LL(1) compiler using a runtime based on the ideal stack machine. You should be able to find a discussion of this sort of compiler in any elementary compiler text.

I hope you and Queztalcoatl have fun riding together. J

History

Quetzalcoatl's roots go back to 1983, when I wrote the UPL Compiler for the VIC-20 just for fun. In 1983 Development tools were in short supply, so I ended up writing UPL in BASIC. This meant some compromises, such as restricting UPL to byte variables and no advanced features, such as local storage, pointers or structures. Despite this, UPL did run and with inline assembler and the mem[] array you write programs that were almost useful. When the VIC-20 finally lost its crown to the Commodore 64 I put my UPL and my other VIC-20 programs into storage, and chalked up writing UPL as a good educational experience. (You can download the original UPL Compiler by clicking here).

In 1998, having purchased an old second-hand VIC-20 at a garage sale, and having discovered VIC-20 emulators on the Internet, my interest in the VIC-20 was renewed. I wondered, as a dare, if I could could write a 6502 Cross Compiler equivalent to UPL in a night? It ended up taking a few nights, but I soon had something that compiled UPL. Of course UPL was kind of useless. Would it be possible to modify the compiler to read Tiny C programs as well as UPL programs? Indeed it was. In a few nights I had that running too. But the Tiny C/UPL hybird wasn't particularly useful either, so I started adding features like subroutine parameters, arrays and pointers. I'd also spent some time adding optimisation so Quetzalcoatl (as the hybrid was now named) started producing some impressively tight code. Unable to find a decent 6502 assembler for the runtime library, I wrote one of those too and integrated it into the compiler.

I had originally developed Quetzalcoatl to satisfy my own curiosity. At Linus Åkerlund's suggestion I placed its binary up on the web site and have been amazed ever since at the number of people that have downloaded it. (I mean, it's a compiler for a museum-piece CPU! Like, why would anyone care!? J)

In September 1998 I was in the middle of adding proper C data types (pointers, structures and composite types) and improving the optimisation when the real-world beckoned. (Oh to find a Landlord who thinks Retrocompilers are cool! J) I figured I'd return to Quetzalcoatl and finish it when time permitted, but as of time of writing it's been 12 months and I'm still flat out. (Heh. Make that 96 months!)

I had been thinking about open sourcing Quetzalcoatl for some time, but the source lacked all but the most basic comments explaining how everything is strung together. The Netscape experience had taught the world it's not good enough just to slap an open source label on a piece of software, and expect the world to rush in to fix it up for you. It has to be maintainable. Plus I take pride in my code; I don't want a half-assed, second-rate product with my name on it floating around out there, periodically coming back to haunt me as an example of why someone shouldn't hire me. (For my commercial quality work I'm expected to use ISO.)

Rather than let Quetzalcoatl sit around and gather bit-rot, I've been documenting it when I get the chance. As of 15 September 1999 it finally reached the stage where the basic structure is almost understandable. So Quetzalcoatl got a limited public release to people who had contacted me over the last year and expressed either a curiosity or an interest. Quetzalcoatl 2.0a5 was a limited source release to those people. Over the years various people did request it, play around with it and occasionally tweak it. Of these, Harry Dodgson wins the "Quetzalcoatl Prize" for a concerted effort testing, debugging, improving and porting the Big-Q in 2005. This lead to the GPL release in 2006

Over the years, one of the unexpected things was learning the number of Commodore 64 programmers who used Quetzalcoatl, not for the C, but for the Assembler. I had originally planned to use a third-party assembler for the runtime, but what was out there wasn't very good so I ended up writing my own. That turned out to be nice, because Quezalcoatl can compile its own runtime! The interesting thing is your program can sometimes find uses you'd never thought of. If you're lucky, you'll hear about it.

Future

As of October 8, 2006, Quetzalcoatl 2.1.0 BETA is now licensed under the GNU Public License (GPL). I've precious little time on my hands these days, so if the right person comes along I'd be happy to hand it over to someone else with the time and the interest. That may be how community projects evolve; You take something, run hard with it for as long as you can, and exhausted, you hand the batton to someone else. (Or put it on the grass and hope its glint catches someone's eye J).

I think Quetzalcoatl needs to reach a certain level of ANSI C functionality for the C-part of it to be useful. Unless it does, you'll find it kept out of the realms of "real" programs. Given how far it has already come, it'd be a shame not to go than final mile.

But that said, I don't think it needs to be 100% ANSI C compliant. For one, Quetzalcoatl uses an LL parser, where as C is a LR grammar. For two, it isn't really necessary. Very few C programs need 100% compliance, and if you must compile one with bells-and-whistles C there are already perfectly good 6502 C Cross Compilers out there. But as an LL compiler Quetzalcoatl offers some interesting opportunities, because you can really get in there and fine-tune the mechanics. For example, you can add custom optimisers to squeeze every byte and cycle out of the machine. This fits well with Quetzalcoatl's orginal goal; to produce 6502 code for a VIC-20; a machine with a scant 3.5Kb of RAM(!) In such a constrained environment, you have to make every little bit count.

To Do

Goals can be divided into four broad categories:
  1. Arrays/Pointers.
  2. Improvements that bring Quetzalcoatl closer to ANSI C, so that it can compile "real" programs.
  3. Optimisation.
  4. Versatility.

Here are some ideas for Quetzalcoatl development:

Rationale

Here are some changes I don't see as necessary, and why.

Design

LL vs LR

There are two broad families of compilers; LL and LR. Quetzalcoatl is an LL compiler. These are easy to write and easy to understand. The catch is they can only parse certain types of grammars. (A grammar defines the rules of a language.) C is actually an LR grammar, so I had to bend it somewhat to get Quetzalcoatal, an LL compiler, to be able to be able to parse it. (The official response is that Queztalcoatl isn't C, even if it happens to look a lot like it.)

LR compilers are internally quite complicated. Compiler writers need to use a Compiler Compiler such as YACC or Bison to build them. They do the job beautifully, but internally they're extremely boring. Quetzalcoatl was written for fun, education and as a testbed for optimisation, which is why I chose to write it as an LL compiler. If 100% ANSI compliance was my goal, LR would have won, hands down.

Architecture

Here are the major objects within Queztalcoatl:

The upl.h file is a good place to look for more detail on many of these objects.

While Statement Example

The actual upl_Compiler parsing routines are contained in the file UPL_COMP.CPP. Here is a fragment from while_statement() illustrating the conceptual simplicity of an LL parser. For clarity, the address patching code is not shown in this example.


void upl_Compiler::while_statement(Flex& L, upl_Context& C)
{
  // ... patching declarations not shown

  upl_Expr_result       clause;
  upl_Expr		expr;

  L.check("while");
  if (L.matches("("))
    {
    expr(L, C, clause, 0);
    L.check(")");
    L.matches("do");
    }
  else
    {
    expr(L, C, clause, 0);
    L.check("do");
    }

  // ... statement parsing and patching code not shown
}

Also see the runtime library source; UPLRTIME.ASM. It's quite well commented, and has a good explanation of how subroutine calls work.

Optimisation Techniques

We will give an example here of optimising a piece of code.

Quetzalcoatl is a classic stack-based LL(1) compiler. Refer to any elementary compiler text to learn about these. Basically, it computes expressions by pushing intermediate values on a stack, and execution operators (e.g. multiply) by popping the arguments off the stack, doing the operation, then pushishing the result back. Compiler texts often refer to this as an "ideal stack machine"

For example:


weather = (wind + rain)
becomes

push wind
push rain
multiply		// Pops wind and rain off, multiplies, and pushes single result back
pop weather
But that's for an "ideal stack machine." We have to generate that in 6502 code, which is, shall we say, not quite so elegant? :-) While the 6502 does have a processor stack, it's very small and it already contains subroutine calls and is used by the operating system. One little slip up there, and we'll crash the machine. So the UPLRTIME.ASM defines a separate 256 byte area. We call this the datastack. We define our own routines; RUNTIME_PUSH_B to push a byte and RUNTIME_POP_B to pop it. With both of these we use the 6502 Accumulator to transfer our data. The accumulator is pushed with RUNTIME_PUSH_B, and popped with RUNTIME_POP_B.

LDA [wind]		// push wind
jsr RUNTIME_PUSH_B
LDA [rain]		// push rain
jsr RUNTIME_PUSH_B
jsr RUNTIME_MUL_BB	// multiply
jsr RUNTIME_POP_B
STA [weather]		// pop weather
The routine is equivalent to the ideal stack machine multiply operator. It does't (appear) to use the accumulator at all; it only changes what is on the datastack.

Now, actually this isn't quite right. If you multiply an 8 bit number by an 8 bit number, the result can be upto 16 bits long. This is called a "word" in 6502 parlance. Unfortunately the RUNTIME_MUL_BB routine only multiplies bytes, so we must use RUNTIME_MUL_WW instead (WW means multiply a word by a word). To do this we have to cast (convert) the byte elements on the stack into words. (The example here is straight from the .LST file you can get with the -a option. If you do this, use -f to force recomplilation since it won't generate the .LST file if the module has already been compiled.)


lda [wind]
jsr push_b	// This is RUNTIME_PUSH_B, as it appears in the LST style
lda [rain]
jsr push_b
jsr cast_b2w	// Convert the single byte on the top of the stack into a two byte word.
jsr pop_w_c	// We need to get the one underneath; 
		// temporarily store the word in the C pseudoregister (See UPLRTIME.ASM).
jsr cast_b2w	// Convert the single byte underneath into a word
jsr push_w_c	// Now push the word we'd stored in C back on the stack.
jsr umul_ww	// Unsigned multiply the two words
jsr pop_w	// Pop the result into the A and X registers.
sta [weather]	// A goes in the LSB (low-byte) of weather
stx [weather+1]	// X goes in the MSB (high-byte) of weather
How did we know the result of RUNTIME_MUL_BB was a word rather than a byte? How do we know there are two bytes pushed on the stack, and not one? Inside UPL_COMP.CPP there is a routine called op_result(). It takes two argument types (in this case, byte and byte) and an operator (in this case multiply), works out what they should be. The normalise_binary() and normalise_unary() routines call op_result() to find out what the result should be, then cast/convert their arguments respectively. (C-folks call conversion between types "casting").

That looks long for what it does... How could we optimise this. One way is to convert the multiply to inline code, but the 6502 lacks a multiply opcode so that wouldn't be very efficient. For adds and subtracts though, that can work well. See Peephole Optimsation below.

One thing that does look wasteful with that code are all those casting calls. How could we speed this up, or given our small memory footprint, make it more compact? Often you'll have to trade off speed and memory given the limited resources of the 6502. In this case I'd suggest putting that casting code in the UPL Runtime, and saving 9 bytes for every byte by byte multiply in our program; not bad!!! We could add a routine to the Runtime called say cast_bb2ww:


                // PROCEDURE cast_bb2ww.
                //
                // Pop two bytes, cast them into words, then push them back.
                //
cast_bb2ww:
		jsr cast_b2w	// Convert the single byte on the top of the stack into a two byte word.
		jsr pop_w_c	// We need to get the one underneath; 
				// temporarily store the word in 
		jsr cast_b2w	// Convert the single byte underneath into a word
		jmp push_w_c	// Now push the word we'd stored in C back on the stack.
				// A one-way jump, so we don't need to rts.
We could then generate the above code:

lda [wind]
jsr push_b	// This is RUNTIME_PUSH_B, as it appears in the LST style
lda [rain]
jsr push_b
jsr cast_bb2ww	// Cast two bytes on stack into two word. #OUR NEW ROUTINE IS HERE#
jsr umul_ww	// Unsigned multiply the two words
jsr pop_w	// Pop the result into the A and X registers.
sta [weather]	// A goes in the LSB (low-byte) of weather
stx [weather+1]	// X goes in the MSB (high-byte) of weather
The file UPL_RUNT.H contains the runtime routine names. We need to add our new symbol so we can call it:

#define	RUNTIME_PEEK_W_W        74

#define	RUNTIME_POKE_W_W	75



// Pop two bytes, cast them into words, then push them back.
// [bj 9oct2006]
//
#define	RUNTIME_CAST_BB2WW	76
We also need to add it to UPL_RUNT.CPP. This must matche the name of the label in UPLRTIME.ASM:


char const *runtime[] =
{
...,

"cast_bb2ww"
}
Remember to comment what you add, both here and in the MODIFICATIONS section.

Now, here is the existing code in UPL_COMP.CPP normalise_binary():


  // Find out what the result of this operation will be,
  // and what the  and  value types are.
  // If these are different from the existing  and ,
  // then we must cast/convert them.
  //
  result = op_result(Op, Underneath, Top, New_underneath, New_top, Unsigned);


  // Convert the top operand; we can do this as an unary (one argument) cast.
  //
  if (Top != New_top)
    {
    normalise_unary(L, C, Top, New_top);

    // Normalise_unary() returns true if it generates code for the conversion,
    // or false if it generates no code (because none is necessary).
    // In either case, we don't neeed to check this. All that is imporrant 
    // is that we have the desired binary value  on the top of the stack.
    }


  // The underneath operand is harder, because we already have the top argument in our way.
  // Solution is to generate code that pops the top register, stores it in 
  // (a pseudoregister defined in the runtime library), call normalise_unary()
  // to convert the underneath argument which is temporarily on the top,
  // then push the value in  back on top of the stack.
  //
  if (Underneath != New_underneath)
    {
    C.mark(state);  // Remember in case we want to rollback.

    // Write code to pop the top value into the <C> pseudoregister.
    //
    C.code.out(ASM_JSR);
    C.code.out_word_patch(
      value_bytes(New_top) == 1 ? RUNTIME_POP_B_C : RUNTIME_POP_W_C);

    // Now the <Underneath> value is temporarily on the top of the stack!
    // Call normalise_unary() to convert it.
    // It will return true if it generated code to convert it.
    // If will return false if the conversion turned out to be unnecessary.
    //
    if (normalise_unary(L, C, Underneath, New_underneath))
      {
      // Code was generated to do the conversion.
      // Now all we need do is push <C> back on the stack.
      //
      C.code.out(ASM_JSR);
      C.code.out_word_patch(
	value_bytes(New_top) == 1 ? RUNTIME_PUSH_B_C : RUNTIME_PUSH_W_C);
      }
    else
      //
      // Normalise unary did not generate any code.
      // This happens if the conversion is unnecessary.
      // For example, converting a ushort to a short.
      // If it didn't generate any code, then we can skip this 
      // part of the conversion.
      //
      C.rollback(state);
    }

First note the Rollback Optimisation. At the start of the if (Underneath != New_underneath) statement block, we take a watermark with C.mark(state);. If we later decide the code we have generated after that is unnecessary, we can rollback. We do this by calling C.rollback(state);. Read the above example to see how we do this.

Returning to our changes, we want to replace that entire block of code with a single call to our new RUNTIME_CAST_BB2WW routine. There are quite a few ways you could do this. We'll arbitrarily choose this one:


  // If parameters are both bytes, and we need to convert them both to words,
  // then call our RUNTIME_CAST_BB2WW routine. [bj 09oct2006]
  //
  if (value_bytes(Top)		  == 1
  &&  value_bytes(Underneath)	  == 1
  &&  value_bytes(New_underneath) == 2
  &&  value_bytes(New_top)	  == 2)
    {
    C.code.out(ASM_JSR);
    C.code.out_word_patch(RUNTIME_CAST_BB2WW);
    }
  else
    {
    ...
This would work, but it's not future-safe. What if we add new types in the future that can't be cast in this way? It's safer to say explicitly what types we want to convert. This is better. It's a bit tedious type spell out everything, but this way you can be sure you don't accidentally "optimise" the wrong thing in the future:

  if ((Top == upl_byte		  || Top == upl_char		|| Top == upl_boolean)
  &&  (Underneath == upl_byte	  || Underneath == upl_char	|| Underneath == upl_boolean)
  &&  (New_top == upl_ushort	  || New_top == upl_short)
  &&  (New_underneath == upl_ushort || New_underneath == upl_short))
    {
    C.code.out(ASM_JSR);
    C.code.out_word_patch(RUNTIME_CAST_BB2WW);
    }
  else
    {
    ...
If our specific optimisation case doesn't work, we run the existing code.

In this example, you changed the runtime library. The code you generate will only work with your modified runtime library. When you make changes to the runtime library, you need to consider how useful the changes will be. If your changes won't be used by the majority of programs, it's adding bloat to the runtime library. A proposed extension to Quetzalcoatl would be to make runtime library linking granular, so that parts of the runtime which aren't used can be skipped at link time.

The code you've seen here hasn't at time of writing been added to UPL_COMP.CPP. Rather, this is an example of one situation you might optimise.

Linking

Linking isn't opimisation, but this is a timely juncture to discuss it.

Quetzalcoatl has a full-blown linker. A linker takes each object module (compiled separately), and combines them into a single executable. When each object module is generated, the addresses are all relative. This is because we don't know what the final executable will look like. The linker takes each object module, decides where to place it in memory, reallocates all its relative addresses and patches its calls to the runtime library.

Note how we said C.code.out_word_patch(RUNTIME_CAST_BB2WW);. The patch says that the address of the RUNTIME_CAST_BB2WW routine is not yet known, and will need to be patched at linktime (when we finally tie the runtime library together with the code).

In other places where you generate code you will see calls like this:


  C.code.out_word(test_label, upl_code_word);
The upl_code_word enumeration says that what we are outputting here is a word (2 bytes) of code. The linker will then know to reallocate it to the code segment at link time; The code segment being where the code lives. The data segment is where the data lives.

For example, we may assign the variable weather the relative address of 0x0003 in our module's data segment. Later, the linker may decide to relocate this modules data segment to the absolute address 0x2410. Thus weather's address needs to be corrected to 0x2413. For example this code:


jsr pop_w
sta [0003]      ; Relocate data word
stx [0004]      ; Relocate data word 
would be relocated thus:

jsr pop_w
sta [2413]
stx [2414]

Instead of writing words, you may write a high-byte or low-byte. For example, upl_data_byte_hi says that the byte we are outputing should be relocated to the data segment, adjusted with the high byte. This is actually much easier on modern CPUs, but on the 6502 we're forced to adjust the LSB and MSB individually. To print a string, we need to load the address into the A and X registers. Since the address is split, we can't do a word relocation. We must do a byte relocation instead. This example prints the string begining at byte 76 (hex) in the data segment.


lda #76       ; Relocate data low byte
ldx #00       ; Relocate data high byte
jsr print_string_ax
Relocated to 0x2410 this would become, that string would begin at 0x2486. Thus:

lda #86
ldx #24
jsr print_string_ax

Transient (Register) Optimisation

Here is a piece of code that loads a character into the variable pressed:


pressed = getch();
This compiles to:

jsr get_ch_a
jsr push_b
jsr pop_b
sta [pressed]
Note we push the value and immediately pop it. Those two steps are redundant. It is better if we generate this:

jsr get_ch_a
sta [pressed]
The trouble is, at the time we generate getch() we don't know if they'll be manipulating that value before they store it. Quetzalcoatl optimises this sort of code by combining rollbacks we saw easier with what we call transients. A transient declares something like "at the moment, the value you want is a character stored in the register A". This code from UPL_COMP.CPP does precisely that:

      C.code.out(ASM_JSR);
      C.code.out_word_patch(RUNTIME_GET_CH_A);

      // Remember the state we are in now, where the character is in <A>.
      //
      upl_Context_state	key_state;
      C.mark(key_state);

      // Remember that we have stored the character in the accumulator 'a'.
      // We call this <*transient_a>. (Later we may decide to rollback
      // to this point...)
      //
      Result.set_value_type(upl_char, 0, upl_transient_a, &key_state);

      // For now though, we push it on the datastack.
      //
      C.code.out(ASM_JSR);
      C.code.out_word_patch(RUNTIME_PUSH_B);
After we've parsed the getch() expression, the assign_statement() parser indirectly calls load_reg() which makes sure the expression parsed is loaded into registers. The first thing does it is look to see if a transient is available. If it is, it rolls back to the code (in this case to C.mark(key_state)). By doing this, it knows the expression result (the character returned by getch()), is sitting in the A register.

void upl_Compiler::load_reg(
		upl_Context&	  C,
	const 	upl_Expr_result&  Clause)
{
  Select(Clause.is_transient() and Clause.transient != upl_transient_znc)
      C.rollback(Clause.transient_state);

      switch (Clause.transient)
	{
	case upl_transient_a:
	case upl_transient_ax:
	  break;

	case upl_transient_x:
	  C.code.out(ASM_TXA);
	  break;

	case upl_transient_y:
	  C.code.out(ASM_TYA);
	  break;

	case upl_transient_none:
	default:
	  abend(WHERE0, "Bad case");
	}

	...
assign_statement() then calls reg_var(), which takes the register values and stores them in the variable. The result is the optimised code we wanted:

jsr get_ch_a
sta [pressed]

Transients can include register values A, A/X, X, Y and also the condition flags ZNC. This lets us test branches by similarly avoiding a redundant push and pop.

Constant Optimisation

When expr() parses an expression, it not only generates the code but it also tries to evaluate the expression at compile time. If the expression includes function calls or looks up variable values, then obviously it can't do this. But if the expression is made of numbers and constants, then it can. If it finds the expression has a constant value, it rollsback the expression code and uses the final value instead. So this code:


rain = 12 + 6 * 2
compiles to this:

lda #24
sta [rain]

Peephole Optimisation

UPL_COPT.CPP implements a peephole optimiser. It looks at code as it is generated, and optimises it on the fly. For example, if we spot them multiplying or dividing by a power of 2 (2, 4, 8, etc.) then we can use a much faster bitshift:


  // If  is a non-zero constant with a single bit (e.g. 2, 4, 8, 16, etc.),
  // then we can perform fast multiplication or division using bit shifting.
  // This is much faster than calling the runtime libraries routines.
  //
  if (Op == upl_mul || Op == upl_div)
    if (c->value > 0)
      if (c->is_constant() && upl_Utility::count_bits(c->value) == 1)
	fast_mul_div = true;
and then below:

	case upl_mul:
	case upl_div:
	  {
	  // Assertion: expression  now sits in the accumulator.
	  //

	  // How many times we have to shift the accumulator depends
	  // on the value held in the constant <c>.
	  //
	  int shift_bits = upl_Utility::log2(c->value);


	  // Generate a shift instruction for each bit we have to shift the accumulator.
	  // If we're multiplying, shift left.  If we're dividing, shift right.
	  //
	  for (int i=0; i<shift_bits; i++)
	    C.code.out(Op == upl_mul ? ASM_ASL_A : ASM_LSR_A);

	  // The result of the addition is held in the accumulator.
	  //
	  push_byte_a = true;
	  }
	  break;
See UPL_COPT.CPP for details. This source file is well commented, and this type of optimisation is the easiest/most modular. It's also the module that most benefits from human eyes; UPL_COPT.CPP optimises the same way that people dom effectively looking at the last few lines of code and saying "Hey! I know a better way!"

Enabling Peephole Optimisation

By default, peephole optimisation is disabled. Turn it on using the -Op option. For example: quetzal -Op test.c

Once sufficiently tested we should enable peephole optimisation by default.

Flow Optimisation

A Flow Optimiser would take the entire linked executable, and optimise it in its entirity. This is sometimes called "Whole-of-program Optimisation." Currently Flow Optimisation is not supported. It is a suggested future extension.

Development Guidelines

Until a suitable volunteer appears, I'm happy to act as a central point-of-contact for people wanting to modify or otherwise extend the code. I'll keep a list of active developers and what they are doing on the Queztalcoatl web site.

Source Guidelines

If you change the source code, and want to integrate those changes back into the Queztalcoatl primary distribution, please:

  • Comment your code properly.

  • Keep to the naming, commenting and indentation conventions used in the source.

  • Let others know what you are doing. This'll save two or more people all doing the same thing at the same time.

  • EVERY SOURCE FILE HAS A MODIFICATION SECTION. PLEASE KEEP THIS UP TO DATE! This is located at the start of each file. When you make a change, make an entry in this section. Assign the modification a sequence number (the previous modification number plus 1, or 1 if there are no prior changes). Enter this number, your initials, the date and a brief description of the change. At the point where you make the change enter a comment describing it, along with the following tag: [initials date MODn]. eg. [bj 15sep1999 MOD6]. If your changes are long include comments denoting where they start and finish. Sticking with this format will make it easier for multiple people to make changes to a module without tripping over each other.

  • When appropriate, prefix a comment with the keyword: CAUTION, NOTE, TODO or when you're unsure of something, UNSURE. This'll help other programmers quick identify the relevance of your comment to their current situation.

Contact

If you're thinking of changing the code and want to check in, or you're trying to understand what a particular piece of code does or why it was implemented that way, then please contact me (See Legals above for contact details).

If you're using Quetzalcoatl for a 6502 programming project, I'm also interested to hear from you. To date there have been over (lost count) downloads of Quetzalcoatl and I'd love to know what on Earth it is that people are doing with it! J

Links

  • The Queztalcoatl web site. Contains the Quetzalcoatl binaries, news and links to the development pages.
  • The Geek VIC-20 web site. My collection of software, links and information on the VIC-20.
  • The Kdef.com web site. Where I Work.


© Brendan Jones, 1999-2006. All Rights Reserved. Quetzalcoatl and sachiel.com are trademarks of Brendan Jones. Kdef.com is a trademark of Kestrel Defence. Pussiesgalore.com.au is a trademark of Katina Balson. VIC-20 and C-64 are probably registered trademarks of someone.

This file is part of The Quetzalcoatl Compiler.

The Quetzalcoatl Compiler is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

The Quetzalcoatl Compiler is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with The Quetzalcoatl Compiler; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA