RetroBSD

2.11BSD operating system for microcontrollers
It is currently Mon May 29, 2017 9:37 am

All times are UTC




Post new topic Reply to topic  [ 18 posts ] 
Author Message
 Post subject: 8Queens - experiments
PostPosted: Tue Feb 09, 2016 1:24 pm 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
Let us compare some 8queens algorithms with different languages (because we can).. :)
On Rosetta code I found this as the fastest in C:
Code:
#include <stdio.h>
#define MAXN 31
 
int nqueens(int n)
{
  int q0,q1;
  int cols[MAXN], diagl[MAXN], diagr[MAXN], posibs[MAXN]; // Our backtracking 'stack'
  int num=0;
  //
  // The top level is two fors, to save one bit of symmetry in the enumeration by forcing second queen to
  // be AFTER the first queen.
  //
  for (q0=0; q0<n-2; q0++) {
    for (q1=q0+2; q1<n; q1++){
      int bit0 = 1<<q0;
      int bit1 = 1<<q1;
      int d=0; // d is our depth in the backtrack stack
      cols[0] = bit0 | bit1 | (-1<<n); // The -1 here is used to fill all 'coloumn' bits after n ...
      diagl[0]= (bit0<<1 | bit1)<<1;
      diagr[0]= (bit0>>1 | bit1)>>1;
 
      //  The variable posib contains the bitmask of possibilities we still have to try in a given row ...
      int posib = ~(cols[0] | diagl[0] | diagr[0]);
 
      while (d >= 0) {
        while(posib) {
          int bit = posib & -posib; // The standard trick for getting the rightmost bit in the mask
          int ncols= cols[d] | bit;
          int ndiagl = (diagl[d] | bit) << 1;
          int ndiagr = (diagr[d] | bit) >> 1;
          int nposib = ~(ncols | ndiagl | ndiagr);
          posib^=bit; // Eliminate the tried possibility.
 
          // The following is the main additional trick here, as recognizing solution can not be done using stack level (d),
          // since we save the depth+backtrack time at the end of the enumeration loop. However by noticing all coloumns are
          // filled (comparison to -1) we know a solution was reached ...
          // Notice also that avoiding an if on the ncols==-1 comparison is more efficient!
          num += ncols==-1;
 
          if (nposib) {
            if (posib) { // This if saves stack depth + backtrack operations when we passed the last possibility in a row.
              posibs[d++] = posib; // Go lower in stack ..
            }
            cols[d] = ncols;
            diagl[d] = ndiagl;
            diagr[d] = ndiagr;
            posib = nposib;
          }
        }
        posib = posibs[--d]; // backtrack ...
      }
    }
  }
  return num*2;
}
 
 
main(int ac , char **av)
{
  if(ac != 2) {
    printf("usage: nq n\n");
    return 1;
  }
  int n = atoi(av[1]);
  if(n<1 || n > MAXN) {
    printf("n must be between 2 and 31!\n");
  }
  printf("Number of solution for %d is %d\n",n,nqueens(n));
}
 

Compiled straight in PellesC and it did
Code:
C:\Users\pito>C:\Users\pito\MyCode\C\8queens\8queens.exe 16
Number of solution for 16 is 14772512

in 8secs on my i5 notebook.

I've mentioned Mecrisp Forth - with rather conservative textbook (slow?) algorithm
Code:
0 variable solutions
0 variable nodes
 
: bits ( n -- mask ) 1 swap lshift 1- ;
: lowBit  ( mask -- bit ) dup negate and ;
: lowBit- ( mask -- bits ) dup 1- and ;
 
: next3 ( dl dr f files -- dl dr f dl' dr' f' )
  not >r
  2 pick r@ and 2* 1+
  2 pick r@ and 2/
  2 pick r> and ;
 
: try ( dl dr f -- )
  dup if
    1 nodes +!
    dup 2over and and
    begin ?dup while
      dup >r lowBit next3 recurse r> lowBit-
    repeat
  else 1 solutions +! then
  drop 2drop ;
 
: queens ( n -- )
  0 solutions ! 0 nodes !
  -1 -1 rot bits try
  solutions @ . ." solutions, " nodes @ . ." nodes" ;
 
: testQ ( N -- ) millis @ swap queens millis @ swap - ."   t= " . ." ms" ;

 \ STM32F407, 168MHz clock, serial 115k2:
 \ ==============================================
 \ 8 testQ 92 solutions, 1965 nodes  t= 4 ms ok.
 \ 10 testQ 724 solutions, 34815 nodes  t= 43 ms ok.
 \ 12 testQ 14200 solutions, 841989 nodes  t= 1002 ms ok.
 \ 13 testQ 73712 solutions, 4601178 nodes  t= 5457 ms ok.
 \ 14 testQ 365596 solutions, 26992957 nodes  t= 31936 ms ok.
 \ 16 testQ 14772512 solutions, 1126417791 nodes  t= 1331997 ms ok.

from Rosetta Code.

I think we may try with SmallerC and RetroBSD, and with GCC and LiteBSD with above C code.
And with others..

_________________
Pukao Hats Cleaning Services Ltd.


Top
 Profile  
 
PostPosted: Tue Feb 09, 2016 2:02 pm 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
Above C code with pic32mx @80MHz, XC32 1.34, simulator:
Code:
Number of solution for 8 is 92            982us
Number of solution for 10 is 724          15.93ms
Number of solution for 12 is 14200        380.8ms
Number of solution for 13 is 73712        2.077s

_________________
Pukao Hats Cleaning Services Ltd.


Top
 Profile  
 
PostPosted: Fri Feb 12, 2016 11:05 am 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
RetroBSD 48MHz, SmallerC, the above fast C:
Code:
# cc -o 8q 8qf.c
# ./8q
Number of solution for 8 is 92, elapsed 0 us
Number of solution for 9 is 352, elapsed 10000 us
Number of solution for 10 is 724, elapsed 50000 us
Number of solution for 11 is 2680, elapsed 210000 us
Number of solution for 12 is 14200, elapsed 1050000 us
Number of solution for 13 is 73712, elapsed 5770000 us
Number of solution for 14 is 365596, elapsed 34190000 us

Similar results to XC32 1.34.

_________________
Pukao Hats Cleaning Services Ltd.


Top
 Profile  
 
PostPosted: Fri Feb 12, 2016 12:12 pm 
Committer
User avatar

Joined: Thu Oct 11, 2012 8:45 am
Posts: 1790
Location: Room 217, Floor 8, Arm 8, Wheel S7, Mars Base Alpha 3
Code:
/*
 * Clock ticks per second. The HZ value must be an integer factor of 1000.
 */
#ifndef HZ
#define HZ              200
#endif

200Hz is one tick every 5ms.

If you have changed that setting with
Code:
HZ=100

in your board config you will get 10ms ticking. With 1000 you will get 1ms ticking but more resources wasted doing the ticking.

_________________
Why not visit my eBay shop? http://stores.ebay.co.uk/Majenko-Technologies
Universal IDE: http://uecide.org
"I was trying to find out if it was possible to only eat one Jaffa Cake. I had to abandon the experiment because I ran out of Jaffa Cakes".


Top
 Profile  
 
PostPosted: Fri Feb 12, 2016 1:48 pm 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
I know, but that is the OS "tick".
We have CPU tick which is 25ns (at 80MHz).
We may read the cpu tick - we do it somewhere, afaik..
I've started a new topic viewtopic.php?f=7&t=37533&p=43343#p43343
It seems to me we can read the cpu tick register and use it for time.tv_usecs.

_________________
Pukao Hats Cleaning Services Ltd.


Top
 Profile  
 
PostPosted: Wed Sep 21, 2016 8:25 pm 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
Microblaze-MCS @100Mhz on Spartan6, the C source as in the first post above:
Code:
Hello World from Microblaze-MCS !!!
8 Queens problem for N=8 :)
Number of solutions for N=8 is 92
Elapsed 1 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=9 :)
Number of solutions for N=9 is 352
Elapsed 3 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=10 :)
Number of solutions for N=10 is 724
Elapsed 13 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=11 :)
Number of solutions for N=11 is 2680
Elapsed 61 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=12 :)
Number of solutions for N=12 is 14200
Elapsed 311 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=13 :)
Number of solutions for N=13 is 73712
Elapsed 1695 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=14 :)
Number of solutions for N=14 is 365596
Elapsed 9911 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=15 :)
Number of solutions for N=15 is 2279184
Elapsed 61972 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=16 :)
Number of solutions for N=16 is 14772512
Elapsed 413194 msecs..


Code:
mb-size hello_world_from_microblaze.elf  |tee "hello_world_from_microblaze.elf.size"
   text      data       bss       dec       hex   filename
   9994       412      2810     13216      33a0   hello_world_from_microblaze.elf
'Finished building: hello_world_from_microblaze.elf.size'


Aprox 1.7x faster clock to clock than the retrobsd with SmallerC (N=14).

_________________
Pukao Hats Cleaning Services Ltd.


Top
 Profile  
 
PostPosted: Tue Sep 27, 2016 7:32 pm 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
Microblaze-MCS @200MHz, XC6LX9:
Code:
Hello World from Microblaze-MCS !!!
8 Queens problem for N=8 :)
Number of solutions for N=8 is 92
Elapsed 1 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=9 :)
Number of solutions for N=9 is 352
Elapsed 2 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=10 :)
Number of solutions for N=10 is 724
Elapsed 7 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=11 :)
Number of solutions for N=11 is 2680
Elapsed 31 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=12 :)
Number of solutions for N=12 is 14200
Elapsed 155 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=13 :)
Number of solutions for N=13 is 73712
Elapsed 845 msecs..

Hello World from Microblaze-MCS !!!
8 Queens problem for N=14 :)
Number of solutions for N=14 is 365596
Elapsed 4941 msecs..


Interestingly still 1.7x faster than SmallerC on Retrobsd clock2clock.. :)

_________________
Pukao Hats Cleaning Services Ltd.


Top
 Profile  
 
PostPosted: Thu Sep 29, 2016 5:06 am 
Contributor
User avatar

Joined: Sun Oct 20, 2013 3:15 am
Posts: 322
I would be interested in seeing the assembly produced by all the different C compilers you're using, if you could post that here (or just email it to me). It wouldn't account for the entire difference, but I'm sure some non-trivial difference is because of the assembly generated by each compiler, and how (un-)optimized each is.

_________________
@__briancallahan on Twitter


Top
 Profile  
 
PostPosted: Thu Sep 29, 2016 8:39 am 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
The SmallerC is tiny so no big deal with optimization, afaik.
For N=13 the XC32 1.34 @80MHz is 2.1secs (-Os), while uBlaze @100MHz (-O2) on Spartan6 is 1.7secs.
That are basically the same results clock to clock.
From experience the fastest code I ever got was from Microchip's XC32 (when talking pic32).
SmallC below is the SmallerC.
Attachment:
res1.JPG
res1.JPG [ 20.87 KiB | Viewed 11893 times ]

_________________
Pukao Hats Cleaning Services Ltd.


Top
 Profile  
 
PostPosted: Thu Sep 29, 2016 2:33 pm 
Contributor
User avatar

Joined: Sun Oct 20, 2013 3:15 am
Posts: 322
Right--but I do really mean like, post the assembly each one makes, because I think there would be interesting comparisons to make, and might point out what things are just "the compilers make different code, so we have that performance loss no matter what" and how much that performance loss might be.

Btw, are you using SmallC or alexfru's SmallerC? You mentioned both before in this thread, but you only mention SmallC in your chart.

_________________
@__briancallahan on Twitter


Top
 Profile  
 
PostPosted: Thu Sep 29, 2016 3:41 pm 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
Yes - SmallerC from Alex, of course :) I will fix that.
Assembler files - the SmallerC and XC32 - I do not have them handy, the uBlaze-MCS - maybe yes, I have to look into the folders somewhere (but that is not MIPS).

_________________
Pukao Hats Cleaning Services Ltd.


Top
 Profile  
 
PostPosted: Thu Sep 29, 2016 5:33 pm 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
Attached the XC32 files I've found from the above experiment.


Attachments:
default.zip [94.19 KiB]
Downloaded 70 times

_________________
Pukao Hats Cleaning Services Ltd.
Top
 Profile  
 
PostPosted: Thu Sep 29, 2016 5:47 pm 
Contributor
User avatar

Joined: Sun Oct 20, 2013 3:15 am
Posts: 322
Thanks. I can generate the SmallerC assembly code here, since you gave us the C program earlier in this thread.

_________________
@__briancallahan on Twitter


Top
 Profile  
 
PostPosted: Thu Sep 29, 2016 7:02 pm 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
I see only an .elf file from uBlaze-MCS, is that interesting for you?
Attached the Source for SmallerC and XC32.


Attachments:
8queens.zip [1.19 KiB]
Downloaded 77 times

_________________
Pukao Hats Cleaning Services Ltd.
Top
 Profile  
 
PostPosted: Thu Sep 29, 2016 10:27 pm 
Contributor
User avatar

Joined: Sun Oct 20, 2013 3:15 am
Posts: 322
The .elf file is less interesting, but that's OK. There was assembly in the other thing you posted.
Here's SmallerC's take at generating assembly from your 8queens program: http://devio.us/~bcallah/nq.s

_________________
@__briancallahan on Twitter


Top
 Profile  
 
PostPosted: Fri Sep 30, 2016 1:21 am 
Contributor

Joined: Mon Nov 12, 2012 1:34 pm
Posts: 1079
Hi Brian and Pito,

Nice. MIPS is pretty obscure code. For me the real question is how much smaller and faster the can code be if I 'optimize' it myself.

While I don't have as much experience with MIPS, the other micros I have experience with a reduction in size of 50% has been pretty easy.

I am curious what your experience has been?

Lots of fun :).

Wiz


Top
 Profile  
 
PostPosted: Fri Sep 30, 2016 6:57 am 
Contributor

Joined: Mon Apr 29, 2013 1:56 am
Posts: 196
Pito wrote:
Interestingly still 1.7x faster than SmallerC on Retrobsd clock2clock.. :)


Why am I not surprised?
Do you have a point to make?


Top
 Profile  
 
PostPosted: Fri Sep 30, 2016 9:17 am 
Contributor
User avatar

Joined: Thu Nov 08, 2012 7:04 am
Posts: 2393
Location: Rapa Nui
Overclocking does not improve clock2clock ratio :lol:
Happy to see all still alive and watching this site!

_________________
Pukao Hats Cleaning Services Ltd.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 18 posts ] 

All times are UTC


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:




Powered by phpBB® Forum Software © phpBB Group

BSD Daemon used with permission