SEARCH ALGORITHMS ================= Introduction ------------ Many problems in AI involve searching a state-space, in which each node is connected to successor or child nodes by arcs which represent some kind of operation. For example, in solving Rubik's cube, the nodes are the different configurations of the cube, and the arcs are the operations used to move from one configuration to another (in this case, twisting a side of the cube). Other examples of problems which may require search are parsing a natural language sentence (because of local ambiguity) and "look-ahead" in a game such as chess. The several search algorithms all are based on two basic "prototypes" -- depth first search with backtracking, and breadth-first search. Neither is very efficient on its own. Variants of depth-first search use less memory, but normally take longer to execute; the particular algorithm chosen should depend on the characteristics of the problem. ---------- Depth-First Search ------------------ Depth first search involves searching down the first (in the tree below, the leftmost) branch before searching any other nodes. If the goal is not reached, the next branch is searched, and so on, until the goal is reached, or the whole tree has been searched. Once a dead-end (a node which has no successors) has been reached, backtracking is necessary unless it is a goal node. Illustrated below is a search tree, with the numbers alongside the node representing the order of the search if depth-first search is used. A (1) | +---------+---------------+ | | B (2) C (14) | | +------------+-------------+ | | | | | D (3) E (7) F (8) G (15) | | +---+---+ +-------+---+---+-------+ | | | | | | H (4) I (5) L (9) M (10) N (12) O (13) | | J (6) P (11) The recursive algorithm for depth-first search is shown below. depth-first-search(S, GOAL) = if S = GOAL then return success else Generate the descendents of S; (***) Search through every descendent P of S until one is found such that depth-first-search(P, GOAL) is true or there are no more descendents to search; if there are no more descendents to search then return failure end if end if; ---------- Hill Climbing ------------- This is a variant on depth-first search, and can be used wherever heuristic information is available (information which assists the search process). The difference is that, instead of searching branches in the order in which they are found, a heuristic function is used to sort the branches in the order most-promising to least- promising. It is called hill-climbing because, if the heuristic function is height above sea level, this algorithm always searches by the steepest route upwards first before trying other routes. Hill-climb(S, GOAL, HEURISTIC-FN) = same as above routine except for the extra argument HEURISTIC-FN, and insertion of the following at (***): Use HEURISTIC-FN to sort the descendents of S, so that the ones with the highest value are tried first. The heuristic function is normally a static evaluation function: i.e. its evaluation does not involve search of the tree from a edescendent of the current node. The heuristic function then takes a single argument: the node it is evaluating. It may sometimes be worthwhile, however, to search at least part of the way into the tree. In this case, the heuristic function must take an extra argument: the depth to which it must search. The heuristic-function must eventually rely on a static evaluation function when it reaches its depth bound. The search algorithm of a heuristic function is not itself heuristically guided for two reasons: (1) exhaustive search is required, and so there is no need for heuristics as all nodes are searched up to the depth bound; (2) The heuristic function cannot know its own results until it has completed execution. (The static evaluation function it uses for evaluation could be used to guide the search could be used, but this is unnecessary because of (1).) The algorithm for a heuristic function which uses search is: heuristic-function(S, DEPTH-BOUND) = if DEPTH-BOUND = 0 then return static-evaluation-function(S) else Generate the descendents of S; Search through every descendent P of S and return the maximum value of heuristic-function(P, DEPTH-BOUND - 1) end if; Static evaluation is, of course, domain-independent. For chess, experimentation with a weighted sum of several variables, such as (1) Material advantage over opponent. Pieces might be weighed as follows: King: infinity Queen: 9 Rook: 5 Bishop, Knight: 3 (2) Positional advantage over opponent. This could crudely be evaluated by subtracting the number of squares your opponent controls (could move to without being captured) from the number of squares that you control. More sophisticated techniques could be aware of the greater importance of the centre squares in the opening and middle-game, and control of the seventh rank in the end-game. might be useful. ---------- Breadth-First Search -------------------- Breadth-first search involves searching all the branches in parallel. First the start node is tested, and then each of its successors in turn, and then each of their successors, and so on. It tests all of the nodes at a given depth before moving on to nodes at the next depth. Illustrated below is the order of expansion for breadth-first search. A (1) | +---------+---------------+ | | B (2) C (3) | | +------------+-------------+ | | | | | D (4) E (5) F (6) G (7) | | +---+---+ +-------+---+---+-------+ | | | | | | H (8) I (9) L (10) M (11) N (12) O (13) | | J (14) P (15) The algorithm for a heuristic function which uses search is: breadth-first-search(GOAL, S) = Initialize AGENDA to contain only S; Initialize NODE to S; until AGENDA is empty or NODE = GOAL do Take the first item off AGENDA and call it NODE; Generate the descendents of NODE; Add the descendents of NODE to the end of AGENDA (***) end until; ---------- Best-First Search ----------------- best-first-search(GOAL, S, HEURISTIC-FN) = same as breadth-first search routine except for the extra argument HEURISTIC-FN, and replacement of the line marked (***) by Use HEURISTIC-FN to decide where to insert the descendents of S, in AGENDA. The items on the agenda for which HEURISTIC-FN is highest are placed nearest the front of the agenda. If the heuristic function is to look ahead, it is reasonable for it to use depth-first search, for reasons already explained. ---------- Graph Search ------------ Usually, search spaces are graphs, possibly cyclic, rather than trees. This complicates the search slightly, as, in order to avoid revisiting a node and searching all its descendents again, it is important to keep track of which nodes have already been searched. This can be done by using a variable to hold the nodes which have been searched. The best-first search algorithm can then be modified to graph-search(GOAL, S, HEURISTIC-FN) = initialize AGENDA to contain only S; initialize NODE to S; until AGENDA is empty or NODE = GOAL do Take the first item off AGENDA and call it NODE; Add NODE to list SEARCHED; Generate the descendent nodes of NODE; Use HEURISTIC-FN to decide where to insert the descendents of NODE in AGENDA. The items on the agenda for which HEURISTIC-FN is highest are placed nearest the front of the agenda. If a descendent is a member of SEARCHED, do not add it to AGENDA. (***) end until; ---------- Best Path --------- The algorithms discussed so far have been designed to reach the goal without remembering the path they took. If the path is needed, the following must be added at (***) to the graph-search algorithm: If the descendent node is not in either AGENDA or on SEARCHED, then establish a link from the descendent to its parent node when adding it to the agenda. If a less costly route to the descendent has been found, then redirect the pointers which mark the route. In order to find the best path to a goal, it is necessary to use a different type of heuristic function from those previously considered. The new heuristic function is given by the formula f(N) = g(N) + h(N) where g(N) is the cost of the path from the start node S to node N, and h(N) is the estimated cost of the path from node N to the goal node. The cost of a path is evaluated by summing the costs of each of the different steps in the path. Also unlike the previous heuristic functions, f(N) should be minimised. Other Search Strategies ----------------------- There are certain circumstances in which other search strategies might be necessary. Programs which play games such as chess, draughts (chequers), and othello must take into account that there are two players with opposing objectives. From white's point of view, when it is white's turn to play, he or she will want to maximise the value of white's heuristic function, while black will want to minimise white's heuristic function when he or she plays. The basic search algorithm for games is called minimax, but there is a more efficient algorithm called alpha-beta. Some problems can be decomposed. An example of one is integration. Whenever a sum appears inside an integral sign, it is possible to separate the integral and search each part independently. Another consideration in search problems is that often it is useful to know the reasons the failure of a potential solution path, as a different solution path may fail for the same reason, and also because most of the solution path may be quite satisfactory. Dependency-directed backtracking takes account of this. Lastly, constraint propagation problems such as analysing images often search from several start points, and aim to build up a graph, rather than find a single goal node or a path to that node. This is called opportunistic search. ----------