Sieve of Eratosthenes ‐ a prime number generator - retrotruestory/M1DEV GitHub Wiki

5/8/2004 by Bill Buzbee

Lost the use of the kitchen table again. Monica finished the costume for Maia's kindergarten production, but accidentally ruined it with a too-hot iron - so she's having to do it all over again. So, I've been poking around with the tool chain. Back in the early 80's, Byte Magazine published a set of small CPU benchmarks. The benchmarks themselves are pretty worthless using today's standards, but I thought it would be fun to compare Magic-1 to some of the early 80's machines. It's not a fair comparison - I'm using vastly faster memory than was typically available then - but what the hell....

Anyway, the most famous of the Byte bencharks was the Sieve of Eratosthenes - a prime number generator. I found a copy and stripped out all of the timing, error-checking and target dependent code. Here's what's left:

    #define SIZE	8191
    #define M	(SIZE+1)
    #define N	10
    #define TRUE	1
    #define FALSE	0

    int
    main() {
        int i,prime,count,k;
        int iter;
        static char flags[M];

        for (iter=1; iter <= N; iter++) {
            count = 0; 
            for (i=0; i <= SIZE; i++) {
                flags[i] = TRUE;
            }
            for (i=0; i <= SIZE; i++) {
                if (flags[i]) {
                    count++;
    		prime = i + i + 3;
                    // remove all multiples of prime
                    for (k=i+prime; k <= SIZE; k+=prime) {
                        flags[k] = FALSE;
                    }
                }
            }
        }
        //printf("Count: %d, 0x%x\n", count,count);
        return(count);
    }

I'm having main "return(count)" to ensure the result winds up in register A - my integer return register, and the benchmark is set up to do 10 iterations - each of which should find 1899 primes.

Given that I already have a C compiler, I thought this would be easy - just compile and go. However, the exercise led me to find and fix several microcode bugs, two C compiler problems and do some enhancements to my hack assembler. However, it works now (at least in simulation - have to wait for kitchen table space to try it on real hardware).

Following is the assembler listing, with embedded source code. I've highlighed the original C code in red. Note that loop code is embedded at the beginning and end of loops, so you'll see some duplication. Also, this is a one-pass assembler which generates the listing as it goes, so forward references show up as "----" in the code listing. You can find the final values in the fixup section, which is generated at the end when forward references are backpatched.

Overall, the code quality isn't too bad considering that I haven't done much tuning and lcc really isn't an optimizing compiler. Until I get a linker up and running, I'm also having lcc inject some startup code, including a placeholder for the interrupt vector. Based on short runs on my simulator, I expect to complete 10 iterations in about 4.7 seconds on M-1 running at 3.68 MHz. Also, for this workload I'm averaging one M-1 instruction every 5 clocks.

In real terms, that means I'll be running somewhere between .5 and .75 million instructions per second - though that metric has little to do with the [in]famous Vax MIPs rating. Eventually, I'll run dhrystone on this beast. Just for kicks, I tried compiling dhrystone, but failed about 90% through the main file. Another C compiler retargeting bug to track down. Anyway, here's sieve:

         :            | ;	Magic-1 assembly file, generated by lcc 4.2
         :            | _start:
    0000 : 83 ----    | 	br	over_ivec
    0003 : ----       | 	defw	unhandled_exception	; Interrupt vector 1
    0005 : ----       | 	defw	unhandled_exception	; Interrupt vector 2
    0007 : ----       | 	defw	unhandled_exception	; Interrupt vector 3
    0009 : ----       | 	defw	unhandled_exception	; Interrupt vector 4
    000b : ----       | 	defw	unhandled_exception	; Interrupt vector 5
    000d : ----       | 	defw	unhandled_exception	; Interrupt vector 6
    000f : ----       | 	defw	unhandled_exception	; Interrupt vector 7
    0011 : ----       | 	defw	unhandled_exception	; Interrupt vector 8
    0013 : ----       | 	defw	unhandled_exception	; Interrupt vector 9
    0015 : ----       | 	defw	unhandled_exception	; Interrupt vector a
    0017 : ----       | 	defw	unhandled_exception	; Interrupt vector b
    0019 : ----       | 	defw	unhandled_exception	; Interrupt vector c
    001b : ----       | 	defw	unhandled_exception	; Interrupt vector d
    001d : ----       | 	defw	unhandled_exception	; Interrupt vector e
    001f : ----       | 	defw	unhandled_exception	; Interrupt vector f
         :            | unhandled_exception:
    0021 : 00         | 	halt
         :            | over_ivec:
    0022 : 7c 8000    | 	ld.16	a,0x8000
    0025 : cb         | 	copy	sp,a
    0026 : 7c 4000    | 	ld.16	a,0x4000
    0029 : b6         | 	copy	dp,a
    002a : 80 ----    | 	call	_main
    002d : 00         | 	halt
         :            | 	bss
         :            | L2:
         :		  | 	defs 8192
         :            | 	global _main
         :            | 	cseg
         :            | _main:
    002e : e5 f4      | 	enter	12
         :            | ;main() {
         :            | ;    for (iter=1; iter <= N; iter++) {
    0030 : 7a 01      | 	ld.16	a,1
    0032 : d9 04      | 	st.16	-10+14(sp),a
         :            | L3:
         :            | ;        count = 0; 
    0034 : 7a 00      | 	ld.16	a,0
    0036 : d9 06      | 	st.16	-8+14(sp),a
         :            | ;        for (i=0; i <= SIZE; i++) {
    0038 : 7a 00      | 	ld.16	a,0
    003a : d9 0a      | 	st.16	-4+14(sp),a
         :            | L7:
         :            | ;            flags[i] = TRUE;
    003c : 81 0a      | 	ld.16	a,-4+14(sp)
    003e : 74 002e    | 	lea	b,L2(dp)
    0041 : 3f         | 	add.16	a,b
    0042 : 79 01      | 	ld.8	b,1
    0044 : d6 00      | 	st.8	0(a),b
         :            | ;        }
         :            | L8:
         :            | ;        for (i=0; i <= SIZE; i++) {
    0046 : 81 0a      | 	ld.16	a,-4+14(sp)
    0048 : 3d 01      | 	add.16	a,1
    004a : d9 0a      | 	st.16	-4+14(sp),a
    004c : 81 0a      | 	ld.16	a,-4+14(sp)
    004e : bc 1fff ea | 	cmpb.le.16	a,8191,L7
         :            | ;        for (i=0; i <= SIZE; i++) {
    0052 : 7a 00      | 	ld.16	a,0
    0054 : d9 0a      | 	st.16	-4+14(sp),a
         :            | L11:
         :            | ;            if (flags[i]) {
    0056 : 81 0a      | 	ld.16	a,-4+14(sp)
    0058 : 74 002e    | 	lea	b,L2(dp)
    005b : 3f         | 	add.16	a,b
    005c : 12 00      | 	ld.8	a,0(a)
    005e : 95  --     | 	cmpb.eq.8	a,0,L15
         :            | ;                count++;
    0060 : 81 06      | 	ld.16	a,-8+14(sp)
    0062 : 3d 01      | 	add.16	a,1
    0064 : d9 06      | 	st.16	-8+14(sp),a
         :            | ;		prime = i + i + 3;
    0066 : 81 0a      | 	ld.16	a,-4+14(sp)
    0068 : d9 02      | 	st.16	-12+14(sp),a
    006a : 85 02      | 	ld.16	b,-12+14(sp)
    006c : 3f         | 	add.16	a,b
    006d : 3d 03      | 	add.16	a,3
    006f : d9 08      | 	st.16	-6+14(sp),a
         :            | ;                for (k=i+prime; k <= SIZE; k+=prime) {
    0071 : 81 0a      | 	ld.16	a,-4+14(sp)
    0073 : 39 08      | 	add.16	a,-6+14(sp)
    0075 : d9 0c      | 	st.16	-2+14(sp),a
    0077 : 83 ----    | 	br	L20
         :            | L17:
         :            | ;                    flags[k] = FALSE;
    007a : 81 0c      | 	ld.16	a,-2+14(sp)
    007c : 74 002e    | 	lea	b,L2(dp)
    007f : 3f         | 	add.16	a,b
    0080 : 79 00      | 	ld.8	b,0
    0082 : d6 00      | 	st.8	0(a),b
         :            | ;                }
         :            | L18:
         :            | ;                for (k=i+prime; k <= SIZE; k+=prime) {
    0084 : 81 0c      | 	ld.16	a,-2+14(sp)
    0086 : 39 08      | 	add.16	a,-6+14(sp)
    0088 : d9 0c      | 	st.16	-2+14(sp),a
         :            | L20:
    008a : 81 0c      | 	ld.16	a,-2+14(sp)
    008c : bc 1fff ea | 	cmpb.le.16	a,8191,L17
         :            | ;            }
         :            | L15:
         :            | ;        }
         :            | L12:
         :            | ;        for (i=0; i <= SIZE; i++) {
    0090 : 81 0a      | 	ld.16	a,-4+14(sp)
    0092 : 3d 01      | 	add.16	a,1
    0094 : d9 0a      | 	st.16	-4+14(sp),a
    0096 : 81 0a      | 	ld.16	a,-4+14(sp)
    0098 : bc 1fff ba | 	cmpb.le.16	a,8191,L11
         :            | ;    }
         :            | L4:
         :            | ;    for (iter=1; iter <= N; iter++) {
    009c : 81 04      | 	ld.16	a,-10+14(sp)
    009e : 3d 01      | 	add.16	a,1
    00a0 : d9 04      | 	st.16	-10+14(sp),a
    00a2 : 81 04      | 	ld.16	a,-10+14(sp)
    00a4 : bd 0a 8d   | 	cmpb.le.16	a,10,L3
         :            | ;    return(count);
    00a7 : 81 06      | 	ld.16	a,-8+14(sp)
         :            | L1:
    00a9 : 0d         | 	pop	sp
    00aa : 0b         | 	pop	pc
         :            | 	end
    =========================== Fixups =================================
    Fixup applied, 2-byte store of 0x0010 to 0x0078
    Fixup applied, 1-byte store of 0x0030 to 0x005f
    Fixup applied, 2-byte store of 0x0001 to 0x002b
    Fixup applied, 2-byte store of 0x0021 to 0x001f
    Fixup applied, 2-byte store of 0x0021 to 0x001d
    Fixup applied, 2-byte store of 0x0021 to 0x001b
    Fixup applied, 2-byte store of 0x0021 to 0x0019
    Fixup applied, 2-byte store of 0x0021 to 0x0017
    Fixup applied, 2-byte store of 0x0021 to 0x0015
    Fixup applied, 2-byte store of 0x0021 to 0x0013
    Fixup applied, 2-byte store of 0x0021 to 0x0011
    Fixup applied, 2-byte store of 0x0021 to 0x000f
    Fixup applied, 2-byte store of 0x0021 to 0x000d
    Fixup applied, 2-byte store of 0x0021 to 0x000b
    Fixup applied, 2-byte store of 0x0021 to 0x0009
    Fixup applied, 2-byte store of 0x0021 to 0x0007
    Fixup applied, 2-byte store of 0x0021 to 0x0005
    Fixup applied, 2-byte store of 0x0021 to 0x0003
    Fixup applied, 2-byte store of 0x001f to 0x0001
    =========================== Symbols =================================
    L1 	-> 0x00a9
    L4 	-> 0x009c
    L12 	-> 0x0090
    L18 	-> 0x0084
    L17 	-> 0x007a
    L20 	-> 0x008a
    L15 	-> 0x0090
    L11 	-> 0x0056
    L8 	-> 0x0046
    L7 	-> 0x003c
    L3 	-> 0x0034
    L2 	-> 0x002e
    _main 	-> 0x002e
    unhandled_exception 	-> 0x0021
    over_ivec 	-> 0x0022
    _start 	-> 0x0000

 
⚠️ **GitHub.com Fallback** ⚠️