Last updated: January 24, 2007 13:03
Some information in this document may be wrong, so take it as is :P
(Pencil art courtesy of PreyingDantis - DeviantArt)

Setting up
You can download a binary version of fpc4gba, or you can try to rebuild your own copy by hand. I suggest the second way, because you will get the very latest compiler and library (you know, many bugs fixed and so on =) )

Binary install:
Download the latest fpc4gba snapshot here:

or here:

Unpack it in a folder of your choice, set the path and edit fpc.cfg accordingly. Refer to readme.txt for more information.
Also download the new gbalib for fpc, which you can find *here*, and copy all its files into your freepascal\units\arm-gba directory.

Source install:
Download a freepascal 2.1.x binary release for your platform (e.g. i386-win32) that you can get here:

Download the binutils for gba here:

Get the latest fpc sources here:

Get the FPC4GBA Batch Builder.

Install the freepascal binaries, then unzip the binutils in a directory of your choice and the fpc sources in another directory. Edit the batch builder accordingly, by providing a folder where you want to install your fpc4gba copy. Save and launch the batch file. If you are lucky, you will get a congratulations message stating that everything went right. The last step is to add your fpc4gba bin/arm-gba directory to your search PATH.
If you can't get this batch file working, feel free to contact me. In any case you can follow the step-by-step guide you will find here.

- A text editor of your choice. I use PSPad, but any other editor will do.
- A graphic file editor/converter. I suggest Usenti, which can solve all your problems.
- A tilemap editor. You can try TileStudio.

GBA Hardware
ARM7TDMI @ 16.78 MHz. This CPU has two different instruction sets: ARM (32 bit) and THUMB (16 bit). An ARM instruction can do more work in a single command, so ARM code should run faster when it is fetched over a 32 bit bus such as IWRAM. A THUMB instruction is reduced to 16 bit, so it can be fetched from ROM in a single access, because the bus to ROM is 16 bit wide too. Unfortunately, at the moment I'm writing this tutorial, fpc can handle the ARM instruction set only.

The Gameboy Advance has several memory types: 96kb of video memory, 32kb of fast internal ram and 256kb of external ram. The address space is subdivided as follows:

EWRAM: (external working ram) starts at $2000000 and is 256kb wide. It can be accessed in read/write mode at 8, 16 and 32 bit. Its bus is 16 bit wide and has wait states, so 8 and 16 bit accesses take 3 cycles and 32 bit accesses take 6.
IWRAM: (internal working ram) starts at $3000000 and is 32kb wide. It can be accessed in read/write mode in a single cycle (so IWRAM is faster than EWRAM), at 8, 16 and 32 bit; normally all your variables and the stack should go here.
IORAM: (IO registers) starts at $4000000 and is 1kb wide. You can control all GBA hardware by setting or clearing bits in this section.
PALRAM: (palette memory) starts at $5000000 and is 1kb wide. It contains two color palettes of 256 entries of 15 bit each. The first palette is for the backgrounds, the second one is for the sprites.
VRAM: (video ram) starts at $6000000 and is 96kb wide. Depending on the video mode, this memory is divided in a different way between background memory and sprite memory. The video bus is 16 bit, so it is a good idea to write video memory 16 or 32 bit at a time. If you try to write 8 bit at a time, the byte is duplicated into both halves of the 16 bit location it belongs to, so the same data is written twice instead of once. And this is BAD!
OAM: (object attribute memory) starts at $7000000 and is 1kb wide. In this section you can control sprite attributes.
ROM: (read only memory) starts at $8000000 and is max 32MB wide. As its name says, it can be accessed in read mode only.
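
If you want to play with the memory map above on your PC, here is a minimal sketch in plain Free Pascal (no GBA units needed). The record type, the table and the RegionOf helper are names I made up for illustration; they are not part of gbalib. The start addresses and sizes are simply the ones from the list.

```pascal
program memmap;
{$mode objfpc}

type
  TRegion = record
    Name: string;
    Start: LongWord;  // first address of the region
    Size: LongWord;   // size in bytes
  end;

const
  // Start addresses and sizes as given in the list above
  Regions: array[0..6] of TRegion = (
    (Name: 'EWRAM';  Start: $2000000; Size: 256 * 1024),
    (Name: 'IWRAM';  Start: $3000000; Size: 32 * 1024),
    (Name: 'IORAM';  Start: $4000000; Size: 1024),
    (Name: 'PALRAM'; Start: $5000000; Size: 1024),
    (Name: 'VRAM';   Start: $6000000; Size: 96 * 1024),
    (Name: 'OAM';    Start: $7000000; Size: 1024),
    (Name: 'ROM';    Start: $8000000; Size: 32 * 1024 * 1024)
  );

// Hypothetical helper: returns the name of the region that holds Addr,
// or 'unmapped' if the address falls outside every region in the table
function RegionOf(Addr: LongWord): string;
var
  i: Integer;
begin
  Result := 'unmapped';
  for i := Low(Regions) to High(Regions) do
    if (Addr >= Regions[i].Start) and
       (Addr - Regions[i].Start < Regions[i].Size) then
      Exit(Regions[i].Name);
end;

begin
  WriteLn(RegionOf($6000000));  // first byte of VRAM
  WriteLn(RegionOf($3007FFF));  // last byte of IWRAM
  WriteLn(RegionOf($3008000));  // just past the end of IWRAM
end.
```

A lookup like this is handy when you read a linker map and want to check where a variable really ended up.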

The GBA has 6 different video modes: 3 allow single pixel access and 3 handle tiles via hardware. In the next few chapters we will see how to use all these modes.

Bitmap Modes 
Bitmap modes let you access every single pixel of the screen. You can assign to a pixel either a 16 bit color value or an index that selects the color you want from the palette. Let's look at the bitmap modes a bit deeper.

Mode 3:
This is the easiest mode you can use to put something on the screen. In this mode the screen is made of 240*160 pixels of 15 bit color each. You can think of it as an array of 240*160 values, so in order to access a pixel you access the (width * y + x)th element of the video array, where width is the screen width and x and y are the position on the screen. But let's see an example: we will try to put a red pixel in the middle of the screen. Look at this code:


program main;

uses
  gba_regs, gba_video;

begin
  SetMode(MODE_3 or BG2_ENABLE);  // Set Mode 3 and activate BG2
  VideoBuffer[240 * 80 + 120] := RGB(31,0,0); // Put a red pixel
end.

That's all! Pretty simple, don't you think? Now let's try to fill the whole screen with a blue color:


program main;

uses
  gba_regs, gba_types, gba_video;

var
  i: u32;

begin
  SetMode(MODE_3 or BG2_ENABLE);
  for i := 0 to 240*160 - 1 do  // loop through all pixels
    VideoBuffer[i] := RGB(0,0,31);
end.

Nice, but slooowww! Don't worry, you will use this mode for static screens only. You should have noticed that it is possible to see the screen rows being updated, and that looks bad... A trick could be to set Mode 3 *after* you have filled the video memory. Note that I am using a u32 for the loop variable: the gba is a 32 bit machine, so it works faster with 32 bit variables.
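
The two ingredients of the Mode 3 examples, the pixel index and the 15 bit color value, can be tried out on a PC too. This is only a sketch: RGB15 and PixelIndex are hypothetical names, and I am assuming that the library's RGB packs the components in the hardware layout (red in the low 5 bits, then green, then blue). The Screen array is just a stand-in for video memory.

```pascal
program mode3calc;
{$mode objfpc}

const
  SCREEN_WIDTH  = 240;
  SCREEN_HEIGHT = 160;

// Packs three 5 bit components (0..31) into one 15 bit color:
// red in bits 0-4, green in bits 5-9, blue in bits 10-14
function RGB15(r, g, b: LongWord): Word;
begin
  Result := Word((r and 31) or ((g and 31) shl 5) or ((b and 31) shl 10));
end;

// Index of pixel (x, y) in a linear array of 240*160 values
function PixelIndex(x, y: LongWord): LongWord;
begin
  Result := SCREEN_WIDTH * y + x;
end;

var
  // Stand-in for video memory, one 16 bit value per pixel
  Screen: array[0..SCREEN_WIDTH * SCREEN_HEIGHT - 1] of Word;

begin
  // Same pixel as in the first Mode 3 example: the middle of the screen
  Screen[PixelIndex(120, 80)] := RGB15(31, 0, 0);
  WriteLn('index = ', PixelIndex(120, 80));
  WriteLn('red   = $', HexStr(RGB15(31, 0, 0), 4));
  WriteLn('blue  = $', HexStr(RGB15(0, 0, 31), 4));
end.
```

Note that PixelIndex(120, 80) is exactly the 240 * 80 + 120 used in the red pixel example.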

As a last example, we'll try to use Mode 3 for what it was made for: displaying pictures. We need a tool to translate images into a form that our compiler can understand: for this purpose we'll use Usenti. Usenti comes with exhaustive help, so if you are in trouble, RTFM! :P So let's start: open Usenti and load an image you want to show on the gba (240*160 of course), then export it (image->export...), choose a name (something like 'mode3bmp') then click 'Save'. A new form should pop up with some settings to fill in:

  1. leave only 'Image' checked
  2. select 'bitmap', bpp='16', cprs='none'
  3. file type='GAS (*.s)'
  4. check the 'u16' button
  5. press 'Ok'.


(Screenshot: Usenti exporter)

Now you should have a new file (mode3bmp.s) that you need to pass to the assembler, in order to get an object file to link into our rom. If you want (and I suggest you do), use the template that you can find in the 'examples' directory and modify the makefile as follows:

OBJECTS = prt0.o mode3bmp.o main.o

In this way, the makefile will do the job for you :) Let's look at the code:

program main;
{$L mode3bmp.o}  // Linking the external object file

uses
  gba_regs, gba_types, gba_video;

var
  // Declaring our external image
  mode3bmpBitmap: array [0..240*160-1] of u16; cvar; external;
  i: u32;

begin
  SetMode(MODE_3 or BG2_ENABLE);
  for i := 0 to 240*160-1 do  // Loop through all pixels
    VideoBuffer[i] := u32(mode3bmpBitmap[i]);  // Note u32!
end.

Each element of the array is one 16 bit pixel, so this loop still copies one pixel per iteration; the u32 cast only widens the value to the CPU's native 32 bit size before it is stored. In this example the screen is refreshed very slowly too, but don't worry! As I have already said, we will introduce a method that drastically decreases memory read and write times.

Mode 4:
If Mode 3 is good for showing static images, with Mode 4 you can put some smooth animations on the screen. In fact, this mode is palettized and there is enough space in video memory for two screens (a front buffer and a back buffer), so you can use page flipping and double buffering. Let's look at a first example, where we'll put a static image on the screen. We need a tool that can produce images with a separate optimized palette of 256 colors. Usenti is smart enough to do this: in the 'export' window check 'Pal', start=0, num=256, check 'image', select 'bitmap', bpp=8, cprs=none, type=gas, select 'u32' then press 'Ok'. You should get a new .s file that you can use as seen for Mode 3. Here is the source code:

program main;
{$L mode4.o}

uses
  gba_regs, gba_types, gba_video;

var
  x: u32;
  // From mode4.s, made by usenti
  mode4Pal: array [0..0] of u16; cvar; external;
  mode4Bitmap: array [0..0] of u16; cvar; external;

begin
  SetMode(MODE_4 or BG2_ENABLE);
  for x := 0 to 255 do      // Filling the palette
    BG_COLORS[x] := u32(mode4Pal[x]);
  for x := 0 to 19199 do    // Filling the screen
    VideoBuffer[x] := u32(mode4Bitmap[x]);
end.

Because VRAM needs to be accessed at least 2 bytes at a time, we copy the image as 16 bit values, each holding two 8 bit pixels; that is why the loop runs over 19200 elements instead of 38400. In the next example we'll try to make a simple animation, by using a page flipping system. Get the frames of an animation you want to show and open the images frame by frame with Usenti. Compared to the previous example, you should also check 'append'. In this way, you will get a single file with the palette array and one big array with all the images.

program main;
{$L walk.o}

uses
  gba_core, gba_regs, gba_types, gba_video;

var
  // From walker.s, made by usenti
  palettePal: array [0 .. 0] of u16; cvar; external;
  bmpwalker: array [0 .. 0] of u16; cvar; external;
  x, n: u32;

begin
  SetMode(MODE_4 or BG2_ENABLE);
  for x := 0 to 255 do
    BG_COLORS[x] := u32(palettePal[x]);
  n := 0;  // start from the first frame
  while true do  // animation stuff
  begin
    for x := 0 to 19199 do
      VideoBuffer[x] := u32(bmpWalker[x + (n * 19200)]);
    Flip();  // Flip backbuffer to frontbuffer
    n := (n + 1) mod 17;  // We have 17 frames in our movie
  end;
end.

Aaaargh! Awful. It looks like a slow motion movie... Well, I think that we should do something to improve speed. Let's look at the following example:


program main;
{$L walk.o}

uses
  gba_core, gba_regs, gba_types, gba_video;

var
  // From walker.s, made by usenti
  palettePal: array [0 .. 0] of u16; cvar; external;
  bmpwalker: array [0 .. 0] of u32; cvar; external;

  x, n: u32;

begin
  SetMode(MODE_4 or BG2_ENABLE);

  for x := 0 to 255 do
    BGPal[x] := u32(palettePal[x]);

  n := 0;  // start from the first frame
  while true do
  begin
    // copy one frame: 9600 values of 32 bit = 38400 bytes
    memcpy32(VideoBuffer, @bmpWalker[(n * 9600)], 9600);
    n := (n + 1) mod 17;
  end;
end.

Note that I have added a call to the Wait() function in order to slow down the execution speed :). We are using an asm optimized routine (memcpy32) that copies our 32 bit data to the right memory location (for 16 bit data, there is memcpy16). However, for small amounts of data you can use a loop, as you can see in the palette filling example. This isn't the only way to move data chunks around: the GBA hardware also offers DMA copy, which is faster than memcpy, but the CPU is stopped while it runs. This can cause problems with interrupts, so pay attention when you are using it.
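
To see how the frame offsets in the animation work, here is a small PC-side sketch. CopyWords is a plain Pascal stand-in that I wrote for illustration; it is not the library's memcpy32 and it is not optimized. Movie and Page are ordinary arrays standing in for the image data and the video page, and each fake frame is filled with its own number so you can check which frame was copied.

```pascal
program framecopy;
{$mode objfpc}

const
  WORDS_PER_FRAME = (240 * 160) div 4;  // 38400 bytes = 9600 values of 32 bit
  FRAME_COUNT     = 17;

var
  Movie: array[0..WORDS_PER_FRAME * FRAME_COUNT - 1] of LongWord; // all frames
  Page: array[0..WORDS_PER_FRAME - 1] of LongWord;                // one screen
  i, n: LongWord;

// Plain stand-in for a 32 bit copy routine: copies Count values of 32 bit
procedure CopyWords(Dst, Src: PLongWord; Count: LongWord);
var
  k: LongWord;
begin
  for k := 1 to Count do
  begin
    Dst^ := Src^;
    Inc(Dst);  // typed pointers advance by one 32 bit value
    Inc(Src);
  end;
end;

// Index of the first 32 bit value of frame number Frame
function FrameStart(Frame: LongWord): LongWord;
begin
  Result := Frame * WORDS_PER_FRAME;
end;

begin
  // Fill every fake frame with its own frame number
  for i := 0 to High(Movie) do
    Movie[i] := i div WORDS_PER_FRAME;

  n := 0;
  for i := 1 to 20 do  // 20 steps, so the counter wraps past the last frame
  begin
    CopyWords(@Page[0], @Movie[FrameStart(n)], WORDS_PER_FRAME);
    n := (n + 1) mod FRAME_COUNT;
  end;
  WriteLn('last frame shown: ', Page[0], ', next frame: ', n);
end.
```

The important bit is that the count is given in 32 bit units, so a whole 8 bit screen is 9600 of them, not 38400.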

Mode 5:
I won't go too deep into this mode. If you have understood Mode 3 and 4, then you know all you need for Mode 5. Here you have a full color screen like in Mode 3, but you have room for a back buffer too. This is possible because in Mode 5 we have only a reduced screen of 160*128 pixels. This screen mode is useful when you have to show a movie in full color: you need more than the 256 colors of Mode 4 and you also need a back buffer. You are too smart to want an example here :P
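
Still, the arithmetic behind "room for a back buffer" is easy to check on a PC. This is a rough sketch with hypothetical helper names; it simply compares two pages against the whole 96kb of VRAM and ignores the part of VRAM that sprites need, so take the result as an upper bound.

```pascal
program fbsize;
{$mode objfpc}

const
  VRAM_SIZE = 96 * 1024;  // whole video memory, in bytes

// Bytes needed by one full screen page
function PageBytes(Width, Height, BytesPerPixel: LongWord): LongWord;
begin
  Result := Width * Height * BytesPerPixel;
end;

// True if two pages (front buffer and back buffer) fit in VRAM
function FitsTwoPages(OnePage: LongWord): Boolean;
begin
  Result := 2 * OnePage <= VRAM_SIZE;
end;

begin
  WriteLn('Mode 3: ', PageBytes(240, 160, 2), ' bytes, back buffer: ',
          FitsTwoPages(PageBytes(240, 160, 2)));
  WriteLn('Mode 4: ', PageBytes(240, 160, 1), ' bytes, back buffer: ',
          FitsTwoPages(PageBytes(240, 160, 1)));
  WriteLn('Mode 5: ', PageBytes(160, 128, 2), ' bytes, back buffer: ',
          FitsTwoPages(PageBytes(160, 128, 2)));
end.
```

Mode 3 alone eats most of VRAM, which is why it has no second page, while the smaller Mode 5 screen leaves room for one.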

Tile Modes
Stop joking now and let's start the serious work! Indeed, if you want to make a game, you need to use a tile mode. Unlike in bitmap modes, in tile modes you can access several layers (or backgrounds), which you can draw, manipulate, scale and rotate, according to the mode you are using. The following table shows the features of the layers:

                       Standard Mode              Advanced Mode
Amount of tiles        1024*16 colors;            256*256 colors
                       256*256 colors
Screen size (tiles)    32*32; 32*64;              16*16; 32*32;
                       64*32; 64*64               64*64; 128*128
Screen size (pixels)   256*256; 256*512;          128*128; 256*256;
                       512*256; 512*512           512*512; 1024*1024
Features               Scrolling; flipping;       fading; alpha blending;
                       fading; alpha blending     rotation; scaling

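In each mode, each background can be accessed in standard mode or in advanced mode, according to this table:

          BG0        BG1        BG2        BG3
Mode 0    Standard   Standard   Standard   Standard
Mode 1    Standard   Standard   Advanced   n/a
Mode 2    n/a        n/a        Advanced   Advanced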

In each tile mode you need to load both the tileset and the map in memory; both are stored in VRAM, starting at $6000000. The first 64kb of VRAM can be subdivided into 4 Char Blocks (which store the tiles) or into 32 Screen Blocks (which store the maps). The following table helps to understand the VRAM layout better:

Char Blocks (16kb)   Memory Address (2kb)   Screen Blocks
Char Block 0         $6000000               Screen Block 00
                     $6000800               Screen Block 01
                     $6001000               Screen Block 02
                     $6001800               Screen Block 03
                     $6002000               Screen Block 04
                     $6002800               Screen Block 05
                     $6003000               Screen Block 06
                     $6003800               Screen Block 07
Char Block 1         $6004000               Screen Block 08
                     $6004800               Screen Block 09
                     $6005000               Screen Block 10
                     $6005800               Screen Block 11
                     $6006000               Screen Block 12
                     $6006800               Screen Block 13
                     $6007000               Screen Block 14
                     $6007800               Screen Block 15
Char Block 2         $6008000               Screen Block 16
                     $6008800               Screen Block 17
                     $6009000               Screen Block 18
                     $6009800               Screen Block 19
                     $600A000               Screen Block 20
                     $600A800               Screen Block 21
                     $600B000               Screen Block 22
                     $600B800               Screen Block 23
Char Block 3         $600C000               Screen Block 24
                     $600C800               Screen Block 25
                     $600D000               Screen Block 26
                     $600D800               Screen Block 27
                     $600E000               Screen Block 28
                     $600E800               Screen Block 29
                     $600F000               Screen Block 30
                     $600F800               Screen Block 31
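
The addresses of the blocks follow directly from the block sizes, as this small PC-side sketch shows. The function names are hypothetical and only for illustration; I am not claiming that the library computes its block pointers this way.

```pascal
program blocks;
{$mode objfpc}

const
  VRAM_BASE         = $6000000;
  SCREEN_BLOCK_SIZE = $800;   // 2kb
  CHAR_BLOCK_SIZE   = $4000;  // 16kb = 8 screen blocks

// Address of screen block n (0..31)
function ScreenBlockAddr(n: LongWord): LongWord;
begin
  Result := VRAM_BASE + n * SCREEN_BLOCK_SIZE;
end;

// Address of char block n (0..3)
function CharBlockAddr(n: LongWord): LongWord;
begin
  Result := VRAM_BASE + n * CHAR_BLOCK_SIZE;
end;

// Number of the char block that shares its memory with screen block n
function CharBlockOfScreenBlock(n: LongWord): LongWord;
begin
  Result := n div 8;
end;

begin
  WriteLn('Screen block 24 at $', HexStr(ScreenBlockAddr(24), 7));
  WriteLn('Char block 3 at    $', HexStr(CharBlockAddr(3), 7));
  WriteLn('Screen block 24 lies in char block ', CharBlockOfScreenBlock(24));
end.
```

Screen block 24 and char block 3 start at the same address, so if your map lives in screen block 24 you should keep your tiles out of char block 3.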

Each screen block is 2kb wide ($800 = 2048 bytes); each char block is 8 screen blocks wide (8 screen blocks * 2kb = 16kb). Ok, now we know how much memory we have... but how many tiles can we use? Easy to calculate: first of all, a tile is 8*8 pixels; in order to specify a palette index for a 16 color pixel we need values from 0 to 15, so 4 bits are enough. Let's do some calculations:

a 16 colors tile = 8*8 pixels = 64 * 4 bits = 256 bits = 32 bytes 

(Phew) So: each 16 color tile takes 32 bytes. What about 256 color tiles (which you need for advanced mode)? In this case, we need 8 bits in order to specify an index into the 256 entry palette.

256 colors tile = 8*8 pixels = 64 * 8 bits = 512 bits = 64 bytes 

The table below could be useful as reference:

Block type (Memory)     16 colors            256 colors
1 Screen Block (2kb)    a map of 64 tiles    a map of 32 tiles
1 Char Block (16kb)     512 tiles            256 tiles
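
If you don't trust my calculations (and maybe you shouldn't :P), here is a tiny PC-side sketch that redoes them. TileBytes and TilesPerBlock are hypothetical helper names made up for this example.

```pascal
program tilecalc;
{$mode objfpc}

const
  TILE_PIXELS = 8 * 8;  // a tile is 8*8 pixels

// Bytes taken by one tile at the given color depth (4 or 8 bits per pixel)
function TileBytes(BitsPerPixel: LongWord): LongWord;
begin
  Result := TILE_PIXELS * BitsPerPixel div 8;
end;

// How many whole tiles fit in a block of BlockBytes bytes
function TilesPerBlock(BlockBytes, BitsPerPixel: LongWord): LongWord;
begin
  Result := BlockBytes div TileBytes(BitsPerPixel);
end;

begin
  WriteLn('16 color tile:  ', TileBytes(4), ' bytes');
  WriteLn('256 color tile: ', TileBytes(8), ' bytes');
  WriteLn('2kb block:  ', TilesPerBlock(2 * 1024, 4), ' / ',
          TilesPerBlock(2 * 1024, 8), ' tiles');
  WriteLn('16kb block: ', TilesPerBlock(16 * 1024, 4), ' / ',
          TilesPerBlock(16 * 1024, 8), ' tiles');
end.
```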

Now you should have an idea of how tiles are laid out in memory. Of course you can't store tiles and maps in the same memory location, so be careful: overlapping is a bad thing! The memory taken by a tilemap varies according to the screen size and to whether you are using standard or advanced mode.

(Table: memory taken by a tilemap for each screen size, in standard and in advanced mode)
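
The memory taken by a tilemap can be estimated with a few lines of Pascal on a PC. Beware, this sketch rests on an assumption that the text above does not state: that a standard map entry takes 2 bytes and an advanced map entry takes 1 byte. The helper names are hypothetical as well.

```pascal
program mapsize;
{$mode objfpc}

const
  SCREEN_BLOCK_SIZE = 2048;  // 2kb
  STANDARD_ENTRY    = 2;     // assumed bytes per map entry, standard mode
  ADVANCED_ENTRY    = 1;     // assumed bytes per map entry, advanced mode

// Bytes used by a map of WidthTiles * HeightTiles entries
function MapBytes(WidthTiles, HeightTiles, BytesPerEntry: LongWord): LongWord;
begin
  Result := WidthTiles * HeightTiles * BytesPerEntry;
end;

// Number of screen blocks touched by a map of Bytes bytes, rounded up
function ScreenBlocksUsed(Bytes: LongWord): LongWord;
begin
  Result := (Bytes + SCREEN_BLOCK_SIZE - 1) div SCREEN_BLOCK_SIZE;
end;

begin
  WriteLn('standard 32*32:   ', MapBytes(32, 32, STANDARD_ENTRY), ' bytes');
  WriteLn('standard 64*64:   ', MapBytes(64, 64, STANDARD_ENTRY), ' bytes');
  WriteLn('advanced 16*16:   ', MapBytes(16, 16, ADVANCED_ENTRY), ' bytes');
  WriteLn('advanced 128*128: ', MapBytes(128, 128, ADVANCED_ENTRY), ' bytes');
  WriteLn('screen blocks for standard 64*64: ',
          ScreenBlocksUsed(MapBytes(64, 64, STANDARD_ENTRY)));
end.
```

Under that assumption a standard 32*32 map fills exactly one screen block, which matches the 2kb block size seen before.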

How much blabbing... Well, now we can try to go deeper into tile modes, showing some small examples. Basically, the three tile modes are pretty similar, so we will analyse only the differences between standard and advanced backgrounds.

Standard background
Tiled modes aren't so difficult to understand. In addition to the code we have already seen for bitmap modes, you should set some attributes for the layer you want to use. I think that a line of code explains better than a thousand words, so let's show a basic example:

program main;

uses
  gba_regs, gba_types, gba_video, gba_bg;

var
  i: u32;
  pal: pu16;
  tiles: pu16;
  map: pu16;

begin
  pal := MEM_BG_PAL;        // pointer to the palette
  tiles := MEM_BG_CHAR(0);  // pointer to char block
  map := MEM_BG_MAP(24);    // pointer to screen block (map)
  SetMode(MODE_0 or BG0_ENABLE);

  // BG0 settings:
  REG_BG0CNT^ := (BG_SIZEA_256_256 or // screen size is 256*256 pixels
                  BG_COLOR_256 or     // we want a 256 colors palette
                  BG_CHARBASE(0) or   // we are using char block nr.0
                  BG_MAPBASE(24));    // we are using screen block nr.24
  pal[0] := RGB(0, 0, 0);          // stores black color in the palette
  pal[1] := RGB(0, 0, 31);         // stores blue color in the palette
  pal[2] := RGB(0, 31, 0);         // stores green color in the palette
  // drawing 3 colored tiles in char block 0
  // remember that in VRAM we are writing 2 pixels at a time, so each
  // pair of pixels is n + (n shl 8), where n is the palette index
  for i := 0 to 31 do
    tiles[i] := 0 + (0 shl 8);
  for i := 32 to 63 do
    tiles[i] := 1 + (1 shl 8);
  for i := 64 to 95 do
    tiles[i] := 2 + (2 shl 8);

  map[32 * 9 + 15] := 1;   // drawing the tile stored in pos. 1
  map[32 * 10 + 15] := 2;  // drawing the tile stored in pos. 2
end.
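
The magic numbers in the example above (32 values per tile, n + (n shl 8), 32 * row + column) can be checked on a PC with this little sketch. All the names are hypothetical helpers made up for illustration; nothing here touches the GBA library.

```pascal
program tilefill;
{$mode objfpc}

const
  MAP_WIDTH      = 32;  // a 256*256 pixel screen is 32*32 tiles
  WORDS_PER_TILE = 32;  // 64 bytes per 256 color tile = 32 values of 16 bit

// Two 8 bit pixels with the same palette index n, packed in 16 bit
function PixelPair(n: LongWord): Word;
begin
  Result := Word((n and $FF) or ((n and $FF) shl 8));
end;

// Index of the map entry at (Col, Row) in a 32 tiles wide map
function MapIndex(Col, Row: LongWord): LongWord;
begin
  Result := MAP_WIDTH * Row + Col;
end;

// Index of the first 16 bit value of tile number t in the char block
function TileStart(t: LongWord): LongWord;
begin
  Result := t * WORDS_PER_TILE;
end;

begin
  WriteLn('pixel pair for index 2: $', HexStr(PixelPair(2), 4));
  WriteLn('tile 2 starts at value  ', TileStart(2));
  WriteLn('map entry (15, 9) is    ', MapIndex(15, 9));
end.
```

So the three fill loops cover tiles 0, 1 and 2, and the two map writes place tiles 1 and 2 in column 15 of rows 9 and 10.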