Thursday, March 26, 2009

In this project I have written more macros than ever before. The main reason is the similarity between opcodes: the way they read their operands is nearly identical. By using macros, the opcode implementations can be kept down to 2-3 lines of code instead of 6-7, which makes them much easier to maintain. Macros also generally run faster than real functions, since the code is expanded in place instead of being called.

Below you can see the macros that help with data retrieval. The code has a few extra line breaks here, just to fit the stupid fixed blog width.

// Macros for addressing modes
#define READ_BYTE_IMM() read_byte( PC++ )

// Read addresses
#define READ_ADDR_ZP() (READ_BYTE_IMM())
#define READ_ADDR_ZP_X() ((READ_BYTE_IMM() + X) & 0xff)
#define READ_ADDR_ZP_Y() ((READ_BYTE_IMM() + Y) & 0xff)

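// Low byte first, then high byte. The comma fixes the order, which
// the operands of | alone do not guarantee (b1 is a scratch byte).
#define READ_ADDR_ABS() (b1 = READ_BYTE_IMM(), \
b1 | READ_BYTE_IMM()<<8)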
#define READ_ADDR_ABS_X() (READ_ADDR_ABS() + X)
#define READ_ADDR_ABS_Y() (READ_ADDR_ABS() + Y)

#define READ_ADDR_IND_X() (read_word_zp(\
READ_BYTE_IMM() + X))
#define READ_ADDR_IND_Y() (read_word_zp(\
READ_BYTE_IMM()) + Y)

#define READ_JUMP_ADDR() (b1 = READ_BYTE_IMM(), b1 & \
0x80 ? (PC - ((b1 ^ 0xff)+1)) : (PC + b1))

// Read data
#define READ_BYTE_ZP() read_byte_zp(READ_ADDR_ZP())
#define READ_BYTE_ZP_X() read_byte_zp(READ_ADDR_ZP_X())
#define READ_BYTE_ZP_Y() read_byte_zp(READ_ADDR_ZP_Y())

#define READ_BYTE_ABS() read_byte(READ_ADDR_ABS())
#define READ_BYTE_ABS_X() read_byte(READ_ADDR_ABS_X())
#define READ_BYTE_ABS_Y() read_byte(READ_ADDR_ABS_Y())

#define READ_BYTE_IND_X() read_byte(READ_ADDR_IND_X())
#define READ_BYTE_IND_Y() read_byte(READ_ADDR_IND_Y())

#define PUSH_BYTE_STACK(b) (memory->mem[ (SP--) \
| STACK_BOTTOM ] = (b))
#define POP_BYTE_STACK(b) memory->mem[ (++SP) \
| STACK_BOTTOM ]

// Macros for flag handling
#define SET_FLAG_NZ(B) (N = Z = (B))

#define IS_ZERO (!Z)
#define IS_NEGATIVE (!!(N & 0x80))


This allows opcode implementations like:

 // And accumulator with memory
case AND_IMM:
SET_FLAG_NZ(A &= READ_BYTE_IMM());
break;
case AND_ZP:
SET_FLAG_NZ(A &= READ_BYTE_ZP());
break;
case AND_ZP_X:
SET_FLAG_NZ(A &= READ_BYTE_ZP_X());
break;
case AND_ABS:
SET_FLAG_NZ(A &= READ_BYTE_ABS());
break;
case AND_ABS_X:
SET_FLAG_NZ(A &= READ_BYTE_ABS_X());
break;
case AND_ABS_Y:
SET_FLAG_NZ(A &= READ_BYTE_ABS_Y());
break;
case AND_IND_X:
SET_FLAG_NZ(A &= READ_BYTE_IND_X());
break;
case AND_IND_Y:
SET_FLAG_NZ(A &= READ_BYTE_IND_Y());
break;


and

 case JMP_ABS:
PC = READ_ADDR_ABS();
break;
case JMP_IND:
PC = read_word(READ_ADDR_ABS());
break;

case JSR:
// Push the address of the last byte of the JSR, high byte first
PUSH_BYTE_STACK( (PC+1) >> 8 );
PUSH_BYTE_STACK( PC+1 );
PC = READ_ADDR_ABS();
break;



Nice!

Booting Basic!

About a week ago I tried to load the Basic ROM from an Oric 1 and run it with my CPU emulator. At first it behaved a bit strangely: it didn't return from subroutines as expected. After some debugging I found that I had forgotten to add the necessary one to the PC pulled from the stack when doing RTS. After fixing that, it seems to run! I need to read some more about the Oric ROMs to actually know that it does what is expected, but sweet progress after all!
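
The RTS fix described above can be illustrated with a small sketch. It is not the emulator's actual code: the `Cpu` struct and the `push`, `pop`, `jsr` and `rts` helpers are names made up for this example, and `pc` is assumed to point at the JSR opcode when `jsr` is called. JSR pushes the address of its own last byte, and RTS adds one after pulling it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical minimal CPU state, for illustration only. */
typedef struct {
    uint16_t pc;            /* address of the opcode being executed */
    uint8_t  sp;            /* stack pointer, offset into page 1 */
    uint8_t  mem[0x10000];
} Cpu;

static void push(Cpu *c, uint8_t b) { c->mem[0x0100 | c->sp--] = b; }
static uint8_t pop(Cpu *c) { return c->mem[0x0100 | ++c->sp]; }

/* JSR: push the address of the last byte of the 3-byte instruction
   (opcode address + 2), high byte first, then jump. */
static void jsr(Cpu *c)
{
    uint16_t lo = c->mem[(uint16_t)(c->pc + 1)];
    uint16_t hi = c->mem[(uint16_t)(c->pc + 2)];
    uint16_t ret = (uint16_t)(c->pc + 2);
    push(c, (uint8_t)(ret >> 8));
    push(c, (uint8_t)(ret & 0xff));
    c->pc = (uint16_t)(lo | (hi << 8));
}

/* RTS: pull low byte, then high byte, and add the one JSR left out. */
static void rts(Cpu *c)
{
    uint16_t lo = pop(c);
    uint16_t hi = pop(c);
    c->pc = (uint16_t)(((hi << 8) | lo) + 1);
}

/* Usage: JSR $1234 placed at $1000, followed by an RTS. */
static void jsr_rts_demo(void)
{
    static Cpu c;                       /* zero-initialised */
    c.pc = 0x1000;
    c.sp = 0xff;
    c.mem[0x1000] = 0x20;               /* JSR opcode */
    c.mem[0x1001] = 0x34;               /* target low byte */
    c.mem[0x1002] = 0x12;               /* target high byte */
    jsr(&c);
    assert(c.pc == 0x1234);
    assert(c.mem[0x01ff] == 0x10 && c.mem[0x01fe] == 0x02);
    rts(&c);
    assert(c.pc == 0x1003);             /* instruction after the JSR */
}
```

Without the `+ 1` in `rts`, execution would resume at $1002, in the middle of the JSR instruction, which matches the strange behaviour described.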

Friday, March 13, 2009

Total recall

First-time-ever syndrome... When completing the flag handling and the decimal arithmetic mode, I realized that my opcode implementations were inefficient. I had already used macros for things like memory address decoding and memory retrieval, but every opcode was still around 4-6 lines of code, code that had to be correct and maintained. I have now rewritten it with better macros, and the opcodes are now around 2-3 lines of code each. The execution speed seems to be better as well, mainly thanks to a lazier flag approach. Yesterday I implemented the decimal versions of the ADC and SBC functions, allowing addition and subtraction in decimal. I borrowed some ideas, mainly for the binary arithmetic, from Frodo.
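
To give an idea of what a decimal-mode ADC involves, here is a minimal sketch. It is neither Frodo's code nor the emulator's: `adc_decimal` is a hypothetical helper that only handles valid BCD operands (each nibble 0-9) and only models the result and the carry, leaving N, V and Z out.

```c
#include <assert.h>
#include <stdint.h>

/* Decimal-mode add for valid BCD operands. *carry is the carry in
   (0 or 1) on entry and the carry out on return. */
static uint8_t adc_decimal(uint8_t a, uint8_t m, int *carry)
{
    int lo = (a & 0x0f) + (m & 0x0f) + *carry;
    if (lo > 9)
        lo += 6;                        /* skip the unused codes A-F */
    int hi = (a >> 4) + (m >> 4) + (lo > 0x0f);
    if (hi > 9)
        hi += 6;
    *carry = hi > 0x0f;
    return (uint8_t)(((hi & 0x0f) << 4) | (lo & 0x0f));
}

/* Usage: operands and results are written as hex that reads as decimal. */
static void adc_decimal_demo(void)
{
    int carry = 0;
    assert(adc_decimal(0x19, 0x28, &carry) == 0x47 && carry == 0);
    assert(adc_decimal(0x99, 0x01, &carry) == 0x00 && carry == 1);
    /* The carry from the previous add is included: 58 + 46 + 1 = 105 */
    assert(adc_decimal(0x58, 0x46, &carry) == 0x05 && carry == 1);
}
```

Adding 6 whenever a digit exceeds 9 is what turns a binary nibble sum into a decimal digit plus a carry into the next digit.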

The plan now is to add a new thread to my emulator that allows access to memory and registers during execution. I need something like that to test and debug my processor.

After that I will implement some initial binary loading for my memory class. I should at least add a function that loads binaries to specified memory locations. There seem to be binary formats for this that I probably want to support as well.
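
A loader of that kind could look roughly like the sketch below. The name `load_binary`, its signature and the flat 64 KB array are assumptions made for this example, not the memory class's real interface; it copies a raw image to a given address and refuses images that would run past the end of the address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 0x10000u   /* 64 KB address space */

/* Copy len bytes of a raw binary image to mem[addr...].
   Returns 0 on success, -1 if the image does not fit. */
static int load_binary(uint8_t *mem, uint16_t addr,
                       const uint8_t *data, size_t len)
{
    if (len > MEM_SIZE - addr)
        return -1;
    memcpy(mem + addr, data, len);
    return 0;
}

/* Usage: place a three-byte image at $C000. */
static void load_binary_demo(void)
{
    static uint8_t mem[MEM_SIZE];
    const uint8_t image[] = { 0xa9, 0x50, 0x00 };

    assert(load_binary(mem, 0xc000, image, sizeof image) == 0);
    assert(mem[0xc000] == 0xa9 && mem[0xc002] == 0x00);
    /* Three bytes do not fit in the last two addresses. */
    assert(load_binary(mem, 0xfffe, image, sizeof image) == -1);
}
```

Support for a real file format would then mostly be a matter of parsing the load address from its header before calling such a function.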

The good news is that my CPU seems to be fully implemented! Now I need to investigate the rest!

Wednesday, March 11, 2009

Waving flags

Yesterday I read a bit about performance tuning for emulator programming. I realized that I did at least one thing in a suboptimal way. The N and Z flags, which tell whether the last operation gave a negative or zero result, should be handled lazily. My opcodes had macros calculating those flags for every instruction where it was applicable. That means I evaluated "Z = !result" and "N = !!(result & 0x80)" a lot of times. Those flags are only read by branch instructions and in a few other cases. Now the flags internally hold the value of the latest operation that can affect them. Then, when needed, I decode them as shown above.
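
The lazy approach can be sketched as follows. The `Regs` struct and the function names are invented for this example and are not the emulator's own; the point is only that an instruction stores its raw result, and the flag is decoded when something actually asks for it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file. n and z hold the last result that
   affected each flag, not a decoded 0/1 value. They are kept apart
   because BIT sets N and Z from different values. */
typedef struct {
    uint8_t a;
    uint8_t n;
    uint8_t z;
} Regs;

/* Cheap: just remember the result. */
static void set_nz(Regs *r, uint8_t result) { r->n = r->z = result; }

/* Decoding happens only when a flag is actually read. */
static int is_zero(const Regs *r) { return r->z == 0; }
static int is_negative(const Regs *r) { return (r->n & 0x80) != 0; }

/* AND #imm: two lines, no flag arithmetic. */
static void and_imm(Regs *r, uint8_t operand)
{
    r->a &= operand;
    set_nz(r, r->a);
}

/* BNE is taken when Z is clear. */
static int bne_taken(const Regs *r) { return !is_zero(r); }

static void lazy_flags_demo(void)
{
    Regs r = { 0xf0, 0, 0 };

    and_imm(&r, 0x80);                  /* A = $80 */
    assert(is_negative(&r) && !is_zero(&r) && bne_taken(&r));

    and_imm(&r, 0x0f);                  /* A = $00 */
    assert(is_zero(&r) && !is_negative(&r) && !bne_taken(&r));
}
```

The saving comes from the fact that most instructions set N and Z while only a few read them, so the decoding work moves to the rare side.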

I need to refactor some of the memory decoding handling. It has turned out not to be optimal once flag handling is taken into account. I still need to add handling of the C and V flags, but after reading up on it, it should be no big problem. Another thing yet to fix is decimal mode. The 6502 can be put in a decimal mode where ADC and SBC treat their operands as binary-coded decimal, so each nibble holds a base 10 digit instead of a base 16 one.

Finally a nice Oric 1 motherboard:


Tuesday, March 10, 2009

Interrupts!

Time to quit for the night. I now have interrupts working as expected. The following small program runs fine:

.ORG $1000
; Set the initial IRQ/BRK interrupt vector to handler 1
LDA #$00
STA $FFFE
LDA #$04
STA $FFFF

; Nested loop, 16*16
LDY #$10
LDX #$10
DEX
BNE #-3
BRK
NOP
DEY
BNE #-10

; Set interrupt vector to handler 2
LDA #$00
STA $FFFE
LDA #$05
STA $FFFF

BRK

.ORG $0400
; interrupt handler 1
NOP
NOP
NOP
RTI

.ORG $0500
; interrupt handler 2 - loop forever
JMP $0500

I really need an assembler for my small programs. It is pretty painful to assemble them by hand. I also need to add more testing of my opcodes. I guess this might be the time for me to learn unit testing.
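
One of the fiddly parts of assembling by hand is the relative branch operand. The sketch below shows how the two `BNE` offsets in the listing above translate to bytes, assuming the program is laid out from $1000 exactly as written. `encode_branch` and `branch_target` are hypothetical helper names, not part of the emulator.

```c
#include <assert.h>
#include <stdint.h>

/* Offset byte for a 2-byte branch at branch_addr jumping to target.
   The offset is relative to the address after the branch. */
static uint8_t encode_branch(uint16_t branch_addr, uint16_t target)
{
    int diff = (int)target - ((int)branch_addr + 2);
    assert(diff >= -128 && diff <= 127);    /* must fit in one byte */
    return (uint8_t)diff;
}

/* The inverse, as the CPU sees it: pc already points past the operand. */
static uint16_t branch_target(uint16_t pc_after_operand, uint8_t offset)
{
    if (offset & 0x80)                      /* negative offset */
        return (uint16_t)(pc_after_operand - (0x100 - offset));
    return (uint16_t)(pc_after_operand + offset);
}

static void branch_demo(void)
{
    /* Inner loop: BNE at $100F back to DEX at $100E, i.e. -3 */
    assert(encode_branch(0x100f, 0x100e) == 0xfd);
    assert(branch_target(0x1011, 0xfd) == 0x100e);

    /* Outer loop: BNE at $1014 back to LDX at $100C, i.e. -10 */
    assert(encode_branch(0x1014, 0x100c) == 0xf6);
    assert(branch_target(0x1016, 0xf6) == 0x100c);
}
```

Small checks like these are also a natural starting point for the unit tests mentioned above.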

Monday, March 9, 2009

Macros

Good progress so far! All the opcodes are now implemented. I have written more C macros than I ever thought I would, but with 151 opcodes with similar contents it really helps! I still need to go through them all to make sure they act correctly on flags, but I am able to run more and more advanced programs. It was very nice to see a nested loop with software BRK interrupts work properly!

Some things still puzzle me. Why should the JSR instruction store only PC+2 (JSR uses 3 bytes) on the stack and RTS add that one byte back again? Especially since interrupts store the correct address of the next instruction, so that RTI can read it directly into PC. If someone reads this and knows, please tell!

Another 6502-based beauty, the Apple II.

Instructions

Some days ago I started my work on this project. I will write it in C++ under Linux, my favorite platform. I began by trying to write it in KDevelop, the integrated development environment used by the KDE project. But after some hours I realized that I missed my Emacs editor so much that I switched to my normal setup with Emacs and make. I am pretty sure that KDevelop is a great environment, but I love Emacs and the productivity I get there.

As a first step I created classes for a MOS6502 processor and for Memory. I then copied a list of opcodes from a 6502 FAQ and created defines mapping each opcode name to its hex code, so I can use the names instead of hex codes in my emulator. I then began the big job of implementing the opcodes. I have now implemented LDA, LDX, LDY, STA, STX, STY and ADC with all the available addressing modes. I have also implemented some parts of the Memory class. Together with registers A, X and Y as well as the program counter, this was enough to run a small assembler program today. The first program ever to run in my emulator was:

LDA #80
STA $1000
BRK
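
For a feel of how little is needed to run that program, here is a stripped-down sketch of a fetch-and-dispatch loop. It is not the emulator's code: the `Machine` struct, the `run` function and the load address $0200 are assumptions for this example, and BRK is simply treated as "stop". The opcode values are the standard 6502 encodings for the three instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Opcode defines, in the spirit of the list described above. */
#define LDA_IMM 0xA9
#define STA_ABS 0x8D
#define BRK     0x00

typedef struct {
    uint8_t  a;             /* accumulator */
    uint16_t pc;            /* program counter */
    uint8_t  mem[0x10000];
} Machine;

/* Execute until BRK. Returns the number of instructions executed,
   or -1 on an opcode this sketch does not know. */
static int run(Machine *m)
{
    int count = 0;
    for (;;) {
        uint8_t op = m->mem[m->pc++];
        count++;
        switch (op) {
        case LDA_IMM:
            m->a = m->mem[m->pc++];
            break;
        case STA_ABS: {
            uint16_t lo = m->mem[m->pc++];  /* low byte comes first */
            uint16_t hi = m->mem[m->pc++];
            m->mem[(uint16_t)(lo | (hi << 8))] = m->a;
            break;
        }
        case BRK:
            return count;
        default:
            return -1;
        }
    }
}

static void first_program_demo(void)
{
    /* LDA #80 / STA $1000 / BRK, assembled by hand (80 = $50) */
    const uint8_t program[] = { 0xA9, 0x50, 0x8D, 0x00, 0x10, 0x00 };
    static Machine m;                       /* zero-initialised */

    memcpy(&m.mem[0x0200], program, sizeof program);
    m.pc = 0x0200;
    assert(run(&m) == 3);
    assert(m.a == 80 && m.mem[0x1000] == 80);
}
```

The real emulator replaces the bodies of the `case` labels with the addressing-mode macros, which is what keeps each opcode down to a couple of lines.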