Friday, June 15, 2012

Mapgen in C, C++, LISP, Haskell, Factor, and Forth

I set out to pick up a few new languages (the first being LISP), but my overall goals require time, too. I want to make a simple roguelike and one simple sub-problem in doing so is generating a map. Rather than writing a library to generate a map, i'd prefer to have an application that does it instead.

Here's a sample:

mapgen$ ./mapgen -n 3 -d '(30,30)'                                                                                                


##############################
##############################
##############################
##############################
#######################.....##
#######################.....##
#######################.....##
###############.............##
###############.#######.....##
###############.#######.....##
###############.#######.....##
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.########.#####
###############.####.....#####
###########......###.....#####
###########......###.....#####
###########......###.....#####
###########......###....######
###########......###....######
####################....######
####################....######
####################....######
##############################
##############################





For this simple test, i don't care that the rooms overlap--this just puts (-n) 3 boxes of '.' chars in a map that's (-d) 10x10 tiles. What fallows is a quick look at how the experiment went. You can look at all the implementations here. Among them, C and C++ were my only string languages. I started on a Python version (my first language), but it felt unnecessary. I'd appreciate feedback on any of these examples so that they may be improved.

The problem is overly simplified and some of the implementations are more complete than others, but i thought someone might find it useful to see how one can solve the same problem with a bunch of different languages. Rosetta Code lets you do this, but only for examples even more trivial.

Size and Complexity

Line counts are notoriously bad at measuring code quality, but it does give some useful information.

  • C++ : 374 (574 if you count my pre-built vector class)
  • C : 176
  • LISP: 75
  • Haskell: 125
  • Forth: 96 (incomplete)
  • Factor: 123
mapgen.cpp actually has 134 lines, but the overhead of writing a generic Grid class (which might get used later) and iterators to go along with it had a cost. However, some functions were almost entirely reduced, thanks to this method. For example, dig_room() takes a room (a struct with the fields left, right, up, down) and replaces every wall tile with a floor tile. In C:

void dig_room( Grid g, const Room r )
{
    int x, y;
    for( x = r.left; x < r.right; x++ ) {
        for( y = r.up; y < r.down; y++ ) {
            Vector v = { x, y };
            *grid_get( g, v ) = '.';
        }
    }
}

And C++

void dig_room( const Room& r )
{
    std::fill( mgMap.reg_begin(r), mgMap.reg_end(r), '.' );
}

In Grid.h, i was able to define an iterator the incremented over a room and any function that requires such an iteration will be able to use it easily. However, in C, i will probably use the above loop in every necessary instance.

Theoretically, the templated Grid and generic iterators should save lines of code and developer time down the road, but without the ability to run wild with that in C, i made a very succinct Grid that required almost no time to make at all (in fact, mapgen.c only took an hour to write, if that).


void init_grid( Grid* g, Vector dims )
{
    const int AREA = dims.x * dims.y;
    
    g->dimensions = dims;
    g->tiles = malloc( AREA );
    memset( g->tiles, '#', AREA );
}


void destroy_grid( Grid* g )
{
    free( g->tiles );
    g->tiles = 0;
}


char* grid_get( Grid g, Vector p )
{
    return g.tiles + g.dimensions.x*p.y + p.x;
}


Comparing LISP and Haskell, i find that Haskell's purity added to the amount of work done. 



splitGap :: Int -> Int -> [a] -> ([a],[a],[a])
splitGap start size lst = (before, middle, after)
  where 
    (before,rest) = splitAt start lst
    (middle,after) = splitAt (abs size) rest


digRow :: Range -> MRow -> MRow
digRow (start,end) row = 
  before ++ replicate size TFloor ++ after
  where
    size = end - start + 1
    (before,_,after) = splitGap start size row


digRoom :: RMap -> Area -> RMap
digRoom rmap ((x,y),(u,v)) =
  ybefore ++ map (digRow (x,u)) rows ++ yend
  where 
    (ybefore,rows,yend) = splitGap y (v-y+1) rmap

verses


(defun dig-at (m x y)
  (setf (row-major-aref m (+ x (* y (array-dimension m 0)))) #\.))


(defun dig-room (map room)
    (loop for j from (area-up room) below (1+ (area-down room)) do
          (loop for i from (area-left room) below (1+ (area-right room))
                do (dig-at map i j))))

While Haskell's random monad also produces several extra lines, like in C++, some algorithms get reduced in code-size thanks to higher-level featuers:


(defun splatter-pattern (map n)
  (let* ((height (array-dimension map 1))
         (width  (array-dimension map 0))
         (rooms (loop for i from 1 to n 
                      collect (random-area width height))))
    (loop for i from 0 below (length rooms) do
          (dig-room map (elt rooms i))
          (when (> (-(length rooms)i) 1)
            (let* ((a (random-point (elt rooms i)))
                   (b (random-point
                       (elt rooms (randr (1+ i)
                                         (- (length rooms) 1))))))
              (dig-hallway map a b))))))

verses



splatter :: RandomGen r => Int -> r -> RMap -> RMap
splatter n gen m = 
  digRandomHallways (foldl digRoom m rooms) g2 rooms
  where 
    (g1,g2) = split gen
    rooms = take n $ randomRooms g1 (length (m!!0),length m)
    center ((x,y),(u,v)) = ((x+u) `quot` 2, (y+v) `quot` 2)

Next: Forth and Factor. For those who don't know, Forth is from the 70's and Factor is based on a recent academic language, Concat. Concat seems very Forth based, and so Factor and Forth seemed like logical analogues. It seems like this type of language gets ignored almost entirely by the mainstream community, and in research, but i find fascination in them.  Factor, in particular, which shows a working high-level, functional Forth, with first-class functions.

Unfortunately, one unfamiliar with these languages will find the fallowing source code indecipherable.

Forth:

char . CONSTANT FLOOR
...



: ij>tile *width* * + *tiles* + ; 
...


: (dig) ij>tile FLOOR swap C! ;
: dig { room }
    room down  @ 1+ room  up  @ DO 
    room right @ 1+ room left @ DO 
        I J (dig)
    LOOP LOOP ;

Factor: (comments removed) 
: ij>tile ( i j -- i' map   ) *width* * + *map* get ;
...

: left-right ( room -- l r ) [ left ] [ right ] bi ;
: up-down    ( room -- u d ) [  up  ] [ down  ] bi ;
...

: (room-map) ( room quot: ( i j -- ) -- )
    [ [ up-down [a,b] ] keep ] dip 
    '[  _ left-right [a,b] swap
        _ curry each ] 
    each ; inline recursive

: (dig) ( i  j -- ) [ FLOOR ] 2dip ij>tile set-nth ;
: dig   ( room -- ) [ (dig) ] (room-map)  ;

Again, we see that we have written more lines in the higher level language! However, Factor's left-right is used in several places and adds clarity, though admittedly, (room-map) just felt like a good thing to implement when i could have simplified it by not making a generic map function. (The idea that something like that would be useful violates the YAGNI principal.) 

But both my Factor and Forth solutions could possibly be improved, i didn't invest as much time as i would have liked to.

Even though higher level languages tend towards more lines in this small example, their theory lies more in scalability. But then again, in compared to C, there's no reason i couldn't have done the exact same thing in C++.

Future Plans

For my roguelike, mapgen will read stdin to initialize the options for constructing the map. For now, the mapgen repository acts as a scratchpad for trying different approaches. C has probably been my favorite so far. While i enjoy sculpting a clean interface in C++, C shines in its simplicity. Russia's pencil to NASA's zero-gravity pen. (But you can't blame me for thinking a zero-gravity pen is neat!) 

I want to continue to develop the C and C++ versions in parallel, but at least one other as well. Of all the languages i tried, i had the most hope for Factor. Stack-based languages take a unique approach that piques my interest. Programmers commonly write procedures that take one variable, convert it to an intermediate, and then an output value. In an imperative language, it might look like this:

a = f()
b = g(a)
c = h(b)

Functionally,

c = h( g( f(a) ) )

But, stack based:

a f g h c

This leads to interesting quirks, and it's fun to play around with. Stack manipulation is like a puzzle on top of your otherwise normal, every-day algorithm. It can lead to cryptic functions and terse generalizations, but still, i think deserves much more attention.

And yet i don't plan to continue to continue implementing either Forth or Factor. I will learn them and keep an eye of the development of concatenative languages.

Conclusions

If high level languages do provide productivity gains, then they should scale better than lower levels. (I.E., given enough complexity, the C implementation would grow larger than C++, LISP larger than Haskell, and Forth larger than Factor.) Still, for a short program, low-level languages show an elegant brevity.