Inquiry 2009!

  • April 14, 2009 20:50

The fifth annual issue of  the Inquiry magazine from the University of New Hampshire is online now.  I wrote a commentary for the magazine, discussing my research experience in Germany in the summer of 2007, and some of the misconceptions people have about computer science.  You can read it online here.

Subtle laziness issues.

  • April 13, 2009 12:38

I am using heuristic search for a combinatorial optimization problem I’m playing with. The search space is enormous, far too large to navigate with traditional A* (unless I can think up a ridiculously good heuristic). Because I don’t have a machine with unlimited memory, I am using Iterative Deepening A*, described by Richard Korf in a 1985 paper. With a reasonable heuristic, iterative deepening A* has memory complexity linear with the depth of the shallowest solution.

I coded up IDA* in Haskell, hoping I could then solve my problem on my laptop, which has a paltry 1GB of memory. My first attempt used the State monad from Control.Monad.State.Strict to keep track of search statistics, e.g. number of nodes expanded. This is cleaner and less error-prone than threading the state manually as an extra parameter.

{-# LANGUAGE BangPatterns #-}
module Search where
 
import Control.Monad.State.Strict
import Data.List (foldl')
 
 
data Node s e c = Node { fVal        :: {-# UNPACK #-} !c  -- f = g + h
                       , gVal        :: {-# UNPACK #-} !c  -- cost so far
                       , hVal        :: {-# UNPACK #-} !c  -- est. cost to go
                       , state       :: !s
                       , pathToStart :: ![e] -- path from start, in reverse
                       }
  deriving (Show)
 
path :: Node s e c -> [e]
path = reverse . pathToStart
 
-- Constructs a start node from the given heuristic function and state.
mkStart :: (Num c) => (s -> c) -> s -> Node s e c
mkStart h s = Node { fVal = hv
                   , gVal = 0
                   , hVal = hv
                   , state = s
                   , pathToStart = [] }
  where hv = h s
 
-- Expands a node with the given heuristic and state expansion function.
expandNode :: (Num c) =>
              (s -> c)                -- heuristic function
              -> (s -> [(s, e, c)])   -- state expansion function
              -> Node s e c           -- node to expand
              -> [Node s e c]         -- list of children
 
expandNode !heur !exp !n = map mkNode $ exp $ state n
    where mkNode !(s,e,c) = let g = gVal n + c
                                h = heur s
                            in n { state = s
                                 , pathToStart = e : pathToStart n
                                 , fVal = g + h
                                 , gVal = g
                                 , hVal = h
                                 }
 
 
-- Search statistics.
data Stats = Stats { numExpanded  :: {-# UNPACK #-} !Int
                   , numGenerated :: {-# UNPACK #-} !Int
                   , numPruned    :: {-# UNPACK #-} !Int
                   }
 
-- Initial search statistics.
mkStats :: Stats
mkStats = Stats { numExpanded = 1, numGenerated = 0, numPruned = 0 }
 
 
-- For the result of depth-limited search.
data DLS s e c = Solution {-# UNPACK #-} !(Node s e c)  -- solution found
               | Cutoff {-# UNPACK #-} !c               -- more nodes available
               | DeadEnd                                -- no solution; no more
                                                        -- nodes
 
-- Iterative-deepening A* search.  Supposed to have worst-case B*D
-- memory complexity, where B is the maximum branching factor in the
-- search space, and D is the solution depth.
idaStar :: (Num c, Ord s, Ord e, Ord c) =>
         s                                -- start state
         -> (s -> Bool)                   -- goal predicate
         -> (s -> [(s, e, c)])            -- expand function
         -> (s -> c)                      -- heuristic function
         -> (Maybe (Node s e c), Stats)   -- possible solution, and statistics
 
idaStar start isGoal expand heur =
    case runState (go $ heur start) mkStats of
      (Solution n, s) -> (Just n, s)
      (DeadEnd, s) -> (Nothing, s)
  where startNode = mkStart heur start
 
        go !c = do
          res <- dls c startNode
          case res of
            Cutoff c' -> go c'
            other -> return other
 
        dls !c !n
            | c < fVal n       = return $ Cutoff $ fVal n
            | isGoal (state n) = return $ Solution n
            | otherwise        = do
          let kids = expandNode heur expand n
          modify $ \s -> s { numExpanded = numExpanded s + 1
                           , numGenerated = numGenerated s + length kids
                           }
          let f DeadEnd n = dls c n
              f s@(Solution _) _ = return s
              f a@(Cutoff v) n = do r <- dls c n
                                    case r of
                                      DeadEnd -> return a
                                      Cutoff v' -> return $ Cutoff $ min v v'
                                      s@(Solution _) -> return s
          foldM f DeadEnd kids

However, this version showed higher memory usage than normal A*! Within seconds my poor laptop would be swapping furiously. There is a space leak. I tried heap profiling without success (I probably just don’t know how to use the heap profiler properly), and asked about space leaks with the State monads on #haskell, to which I heard just the crickets.

Frustrated, I rewrote the code without use of the State monad, by threading the Stats parameter by hand. That version showed the linear memory usage that was expected, resident process size less than 20MB.

I searched through the Haskell Cafe archives and came across an old thread where Dean Herington asked about the State monad and strictness. He was running into similar problems.

It turns out that laziness and strictness are subtle things. Heinrich Apfelmus wrote an excellent response in the thread. In summary, the State monad in Control.Monad.State.Strict is strict in the (result, state) tuple, but not strict in the components of the tuple. Hence, in my first attempt at IDA*, where I use the modify function, rather than updating the state strictly, a thunk is created, causing the space leak I was experiencing.

In my first attempt I updated the state in dls using modify like so:

          modify $ \s -> s { numExpanded = numExpanded s + 1
                           , numGenerated = numGenerated s + length kids
                           }

Rather than doing that, if I force the state before updating it, the space leak disappears, and the performance is comparable to the hand-threaded version:

          s <- get
          put $! s { numExpanded = numExpanded s + 1
                   , numGenerated = numGenerated s + length kids
                   }

Tricky.