Representations for Classical Planning

A restricted state-transition system is one that meets all of the restrictive assumptions A0 through A7 given earlier. It is a deterministic, static, finite, and fully observable state-transition system with restricted goals and implicit time.

Such a system is denoted (S, A, γ) instead of (S, A, E, γ) because there are no contingent events.

Classical planning (i.e. STRIPS planning) refers generically to planning for restricted state-transition systems.

Motivations for studying classical planning:

  1. As usual in science when one is facing a very complex problem, it is very useful to make restrictive assumptions in order to work out well-founded models and approaches. In planning, assumptions A0 through A7 led to this baseline class.

Main issues (i.e. problems) in classical planning are:

  1. How to represent the states and the actions in a way that does not explicitly enumerate S, A, and γ. Without such a representation, it is not possible to develop domain-independent approaches to planning.
  2. How to perform the search for a solution efficiently: which search space, which algorithm, and what heuristics and control techniques to use for finding a solution.

Why do we need a problem representation?

A necessary input to any planning algorithm is a description of the problem to be solved. In practice, it usually would be impossible for this problem description to include an explicit enumeration of all the possible states and state transitions: Such a problem description would be exceedingly large, and generating it would usually require more work than solving the planning problem. Instead, a problem representation is needed that does not explicitly enumerate the states and state transitions but makes it easy to compute them on-the-fly.

There are three different ways to represent classical planning problems:

  1. In a set-theoretic representation.
    Each state of the world is a set of propositions, and each action is a syntactic expression specifying which propositions belong to the state in order for the action to be applicable and which propositions the action will add or remove in order to make a new state of the world.
  2. In a classical representation.
    The states and actions are like the ones described for set-theoretic representations except that first-order literals and logical connectives are used instead of propositions. This is the most popular choice for restricted state-transition systems.
  3. In a state-variable representation.
    Each state is represented by a tuple of values of n state variables {x1, …, xn}, and each action is represented by a partial function that maps this tuple into some other tuple of values of the n state variables. This approach is especially useful for representing domains in which a state is a set of attributes that range over finite domains and whose values change over time.
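As a rough illustration (all proposition and action names here are invented, not the book's notation), the set-theoretic view can be sketched in Python: a state is a set of propositions, and applying an action removes its delete set and adds its add set.

```python
# Set-theoretic sketch: a state is a set of propositions; an action has
# precondition, add, and delete sets (all names here are invented).
state = {"at(r1,loc1)", "unloaded(r1)"}

move = {
    "precond": {"at(r1,loc1)"},
    "add": {"at(r1,loc2)"},
    "delete": {"at(r1,loc1)"},
}

def applicable(s, a):
    return a["precond"] <= s                 # precond is a subset of s

def gamma(s, a):
    return (s - a["delete"]) | a["add"]      # the resulting state
```

The transition function gamma(s, a) = (s − delete(a)) ∪ add(a) is exactly the kind of on-the-fly computation that makes explicit enumeration of S unnecessary.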


Set-Theoretic Representation

We will usually call such problems set-theoretic planning problems, and we will refer to the representation scheme as set-theoretic planning.

Note that a minimal solution cannot be redundant.

Properties of the set-theoretic representations:

  1. Readability.
    One advantage of the set-theoretic representation is that it provides a more concise and readable representation of the state-transition system than we would get by enumerating all of the states and transitions explicitly.
  2. Ease of Computation.
    Most computations reduce to basic operations on sets.
  3. Expressivity.
    Some problems cannot be expressed in this form (e.g., a problem whose goal involves finding a prime number).


Classical Representation

The classical representation scheme generalizes the set-theoretic representation scheme using notation derived from first-order logic. States are represented as sets of logical atoms that are true or false within some interpretation. Actions are represented by planning operators that change the truth values of these atoms.

A state is a set of ground atoms of L. Since L has no function symbols, the set S of all possible states is guaranteed to be finite. As in the set-theoretic representation scheme, an atom p holds in s iff p belongs to s. If g is a set of literals (i.e., atoms and negated atoms), we will say that s satisfies g (denoted s |= g) when there is a substitution σ such that every positive literal of σ(g) is in s and no negated literal of σ(g) is in s.
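For ground g (no variables, so the substitution σ is trivial), the satisfaction test can be sketched as follows (the string encoding of literals is an invented convention):

```python
# Ground-case sketch of s |= g: every positive literal of g must be in s,
# and no negated literal of g may be in s. A negated atom is encoded as
# the string "not <atom>" (an invented convention for this sketch).
def satisfies(s, g):
    for lit in g:
        if lit.startswith("not "):
            if lit[len("not "):] in s:
                return False                 # a negated literal holds in s
        elif lit not in s:
            return False                     # a positive literal is missing
    return True
```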


A predicate whose truth value can vary from state to state (i.e., that can be considered a function of the state) is called a fluent or flexible relation.


Predicates that are constant from state to state are called rigid relations.


Closed-world assumption means that an atom that is not explicitly specified in a state does not hold in that state.


Rigid relations cannot appear in the effects of any operator o because they are invariant over all the states; they can be used only in precond(o). In other words, any predicate in effects(o) is a flexible relation.

An action a is relevant for g, i.e., a can produce a state that satisfies g, if g ∩ effects(a) ≠ ∅ (the action contributes to some literal of g), g⁺ ∩ effects⁻(a) = ∅, and g⁻ ∩ effects⁺(a) = ∅ (its effects do not conflict with g).
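One way this relevance test might be sketched, treating the goal and the effects as sets of positive and negative atoms (the parameter names are invented):

```python
# Relevance sketch: an action with positive effects eff_pos and negative
# effects eff_neg is relevant for a goal split into g_pos / g_neg iff it
# contributes at least one goal literal and contradicts none.
def relevant(eff_pos, eff_neg, g_pos, g_neg):
    contributes = (eff_pos & g_pos) | (eff_neg & g_neg)
    conflicts = (eff_pos & g_neg) | (eff_neg & g_pos)
    return bool(contributes) and not conflicts
```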

DWR = Dock-Worker Robots.

Extensions to classical representation:

  1. Simple Syntactical Extensions.
  2. Conditional Planning Operators.
  3. Quantified Expressions.
  4. Disjunctive Preconditions.
  5. Axiomatic Inference.
    1. This technique distinguishes two new classes of flexible relations: primary relations and secondary relations.
  6. Function Symbols.
  7. Attached procedures.
  8. Extended Goals.


State-Variable Representation

A state-variable representation relies on the following ingredients:
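As a rough illustration (all variable and action names invented), a state in this scheme can be modeled as a mapping from state variables to values, and an action as a partial function on such mappings:

```python
# State-variable sketch: a state maps each state variable to a value;
# an action is a partial function on states (names are invented).
state = {"rloc": "loc1", "cargo": None}

def load(s):
    # defined only when the robot is at loc1 and carrying nothing
    if s["rloc"] == "loc1" and s["cargo"] is None:
        return {**s, "cargo": "c1"}
    return None                      # undefined: the function is partial
```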

Search Procedures and Computational Complexity


Some problem-solving properties:

  1. Soundness:
    A deterministic procedure is sound if, whenever it is invoked on some problem P and returns a value v not equal to failure, v is guaranteed to be a solution for P. A nondeterministic procedure is sound if every successful execution trace returns a value that is guaranteed to be a solution to P.
  2. Completeness:
    A deterministic procedure is complete if, whenever it is invoked on a solvable problem P, it is guaranteed to return a value v not equal to failure. A nondeterministic procedure is complete if, whenever it is invoked on a solvable problem P, at least one of its execution traces will return a value v not equal to failure.
  3. Admissibility:
    If there is some measure of optimality for solutions to a problem P, then a deterministic procedure is admissible if it is guaranteed to return an optimal solution whenever P is solvable. Note that if a procedure is admissible, then it is also sound and complete.

A well-known class of search problems is state-space search problems. The state space is a set of nodes called states, and the objective is to find a state s that satisfies some goal condition g.

The space complexity of breadth-first search is O(b^d), where b is the branching factor (the number of children of each node) and d is the depth of the tree, whereas the space complexity of depth-first search is only O(d).

Search Procedures:

Breadth-First Search.

Depth-First Search.

Best First Search:

In some cases, the objective is to find a goal state s that minimizes some objective function f(s). In this case, a nondeterministic search will not work properly: instead of returning the solution found in a single execution trace, we must look at all of the execution traces to see which one leads to the best solution.

One way to do this deterministically is to use a best-first search. This is like a breadth-first search in the sense that it maintains an active list of nodes that have been generated but not yet visited. However, instead of using this list as a queue the way a breadth-first search does, a best-first search uses the list as a priority queue: the next node chosen from the active list is the one whose f value is smallest. Best-first search is sound. If f is monotonically increasing, i.e., if f(s) <= f(s’) whenever s’ is a child of s, then a best-first search will never return a non-optimal solution and is admissible in finite search spaces. If there is a number γ > 0 such that f(s) + γ <= f(s’) whenever s’ is a child of s, then a best-first search is admissible even in infinite search spaces.
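The priority-queue idea can be sketched in Python (a minimal sketch; `children`, `is_goal`, and `f` are caller-supplied assumptions, not fixed APIs):

```python
import heapq

# Best-first search sketch: the active list is a priority queue ordered
# by f, so the node with the smallest f value is expanded next.
def best_first(s0, children, is_goal, f):
    frontier = [(f(s0), s0)]
    visited = set()
    while frontier:
        _, s = heapq.heappop(frontier)
        if is_goal(s):
            return s
        if s in visited:
            continue                 # skip states already expanded
        visited.add(s)
        for c in children(s):
            heapq.heappush(frontier, (f(c), c))
    return None                      # failure: no goal state reachable
```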

The well-known A* search procedure is a special case of best-first state-space search, with some modifications to handle situations where there are multiple paths to the same state.

Depth-First Branch-and-Bound Search:

Another search procedure for minimizing an objective function is branch-and-bound. The most general form of branch-and-bound is general enough to include nearly all top-down search procedures as special cases. The best-known version of branch-and-bound is a simple depth-first version similar to the procedure shown in Figure 1. In this procedure, s* is a global variable that holds the best solution seen so far, with s* equal to some dummy value and f(s*) = infinity when the procedure is initially invoked. If the state space is finite and acyclic and if f is monotonically increasing, then Depth-first-BB is admissible. If f is non-monotonic, then Depth-first-BB may not be admissible, and if the state space is infinite, then Depth-first-BB may fail to terminate.

Fig. 1: A branch-and-bound version of state space search.
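A rough Python sketch of a depth-first branch-and-bound in this spirit (not the book's exact pseudocode; here the mutable list `best` plays the role of the global variable s* and its f value):

```python
# Depth-first branch-and-bound sketch: prune any node whose f value
# already equals or exceeds that of the best solution seen so far.
def dfbb(s, children, is_solution, f, best=None):
    if best is None:
        best = [None, float("inf")]  # [s*, f(s*)], initially a dummy
    if f(s) >= best[1]:
        return best[0]               # prune: s cannot improve on s*
    if is_solution(s):
        best[0], best[1] = s, f(s)   # new best solution found
        return best[0]
    for c in children(s):
        dfbb(c, children, is_solution, f, best)
    return best[0]
```

Because f is assumed monotonically increasing along each path, pruning at f(s) >= f(s*) never discards a descendant better than s*.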


Greedy Search:

A greedy search is a depth-first search procedure with no backtracking. It works as follows. If s is a solution, then return it; otherwise, repeat the search at the child o(s) whose f value is smallest. There are no guarantees of whether this procedure will find an optimal solution, but it can sometimes save a huge amount of time over what would be needed to find a guaranteed optimal solution.
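A minimal sketch of this procedure (helper names `children`, `is_solution`, and `f` are assumptions):

```python
# Greedy search sketch: depth-first with no backtracking; always commit
# to the child with the smallest f value.
def greedy(s, children, is_solution, f):
    while not is_solution(s):
        succ = children(s)
        if not succ:
            return None              # dead end; no backtracking to escape it
        s = min(succ, key=f)
    return s
```

Note how the dead-end case illustrates the lack of guarantees: a single bad commitment cannot be undone.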

Hill-Climbing Search:

This is similar to greedy search, except that in a hill-climbing problem, every node is a solution, and a hill-climbing procedure will only go from s to o(s) if f(o(s)) < f(s).


Most of the above search procedures can be modified using a variety of heuristic techniques. These can be divided roughly into two classes:

  1. Pruning techniques:
    These are ways to determine that some nodes will not lead to a solution (or to a desirable solution), so that the procedure can prune these nodes (i.e., remove them from the search space).
  2. Node-selection techniques:
    These are ways to guess which nodes will lead to desirable solutions (e.g., solutions that are optimal or near-optimal, or solutions that can be found quickly), so that the procedure can visit these nodes first.

There are a number of special cases in which these properties can be preserved.

Iterative deepening is a technique that can be used in conjunction with a depth-first search procedure to make it complete. It can be done in two ways: breadth-first or best-first.

Breadth-first iterative deepening does a depth-first search that backtracks whenever it reaches depth i and repeats this search for i = 1, 2 … until a solution is found. Like ordinary breadth-first search, breadth-first iterative deepening is complete but not admissible.
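The breadth-first variant can be sketched as follows (a minimal sketch; the safety cap `max_depth` and helper names are invented):

```python
# Depth-limited depth-first search: backtrack whenever depth i is reached.
def depth_limited(s, children, is_goal, limit):
    if is_goal(s):
        return s
    if limit == 0:
        return None                  # depth bound reached: backtrack
    for c in children(s):
        found = depth_limited(c, children, is_goal, limit - 1)
        if found is not None:
            return found
    return None

# Breadth-first iterative deepening: repeat for i = 0, 1, 2, ...
def iddfs(s0, children, is_goal, max_depth=50):
    for i in range(max_depth + 1):
        found = depth_limited(s0, children, is_goal, i)
        if found is not None:
            return found
    return None
```

Each iteration uses only depth-first (linear) space, yet the iteration scheme restores the completeness of breadth-first search.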

Best-first iterative deepening does a depth-first search that backtracks during its ith iteration whenever it reaches a node s such that f(s) >= f_i, where f_0 = f(s_0) and, for i > 0,

f_i = min{ f(s) | the search backtracked at s during iteration i − 1 }.

The well-known IDA* procedure uses best-first iterative deepening.


Problem-Reduction Search

Another kind of search space is a problem-reduction space, in which each state s represents a problem to be solved, and each operator o(s) produces not just a single child state s’ but an entire set of children {s1, …, sk} (the number of children may vary from one state to another). The children are called subproblems of s, and a solution for one of them represents a portion of the solution to s. Thus, to solve s it is not sufficient just to find a solution below some descendant of s. Instead, the search space is an AND/OR graph, and a solution for s consists of a set of solutions {v1, …, vk} that are the leaf nodes of a solution graph rooted at s.


A function f(n) is said to be logarithmically bounded if f(n) = O(log n); polynomially bounded if there is a constant c such that f(n) = O(n^c); and exponentially bounded if there is a constant c such that f(n) = O(c^n).


Computational Complexity of Problems

The language-recognition problem: given a character string s, does s belong to a given language L?

Here are some complexity classes dealing with time:

  • P is the set of all languages L such that L has a deterministic recognition procedure whose worst-case running time is polynomially bounded.
  • NP is the set of all languages L such that L has a nondeterministic recognition procedure whose worst-case running time is polynomially bounded.
  • EXPTIME is the set of all languages L such that L has a deterministic recognition procedure whose worst-case running time is exponentially bounded.
  • NEXPTIME is the set of all languages L such that L has a nondeterministic recognition procedure whose worst-case running time is exponentially bounded.

Here are some complexity classes dealing with space:

  • NLOGSPACE is the set of all languages L such that L has a nondeterministic recognition procedure whose worst-case space requirement is logarithmically bounded.
  • PSPACE is the set of all languages L such that L has a recognition procedure whose worst-case space requirement is polynomially bounded. It makes no difference whether the procedure is deterministic or nondeterministic; in either case we will get the same set of languages.
  • EXPSPACE is the set of all languages L such that L has a recognition procedure whose worst-case space requirement is exponentially bounded. It makes no difference whether the procedure is deterministic or nondeterministic; in either case we will get the same set of languages.

If C is one of the complexity classes and L is a language, then L is C-hard if every language in C is reducible to L in polynomial time. L is C-complete if L is C-hard and L belongs to C. Intuitively, if L is C-complete, then L is one of the hardest languages in C: if we can recognize L, then we can also recognize any other language in C with at most a polynomial amount of additional overhead.


Planning Domains as Language-Recognition Problems

Given an alphabet L in which to write statements of planning problems, we can define the following languages:

  • PLAN-EXISTENCE is the set of all strings s ∈ L* such that s is the statement of a solvable planning problem.
  • PLAN-LENGTH is the set of all strings of the form (s, k) such that s is the statement of a solvable planning problem, k is a nonnegative integer, and s has a solution plan that contains no more than k actions.

The definition of PLAN-LENGTH follows the standard procedure for converting optimization problems into language-recognition problems.

What really interests us, of course, is not the problem of determining whether there is a plan of length k or less but the problem of finding the shortest plan. If the length of the shortest plan is polynomially bounded, then it can be shown that the two problems are polynomially reducible to each other. However, if the length of the shortest plan is not polynomially bounded, then finding the shortest plan can be much harder than determining whether there is a plan of length k or less. For example, in the well-known Towers of Hanoi problem and certain generalizations of it, the length of the shortest plan can be found in low-order polynomial time, but actually producing a plan of that length requires exponential time and space because the plan has exponential length.
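The Towers of Hanoi case can be checked directly: for the standard three-peg puzzle, the optimal plan length 2^n − 1 is cheap to compute, but the plan itself is exponentially long, so merely writing it out takes exponential time and space. A small sketch:

```python
# Length of the shortest plan for n disks: computed directly, without
# ever constructing the plan.
def hanoi_length(n):
    return 2 ** n - 1

# Producing the plan itself takes time proportional to its exponential
# length (peg names are arbitrary labels).
def hanoi_plan(n, src="A", dst="C", aux="B"):
    if n == 0:
        return []
    return (hanoi_plan(n - 1, src, aux, dst)
            + [f"move disk {n} from {src} to {dst}"]
            + hanoi_plan(n - 1, aux, dst, src))
```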

Preface, Introduction and Overview

Motivations for studying planning:

  1. The need for information processing tools that provide affordable and efficient planning resources (Practice).
  2. Planning is an important component of rational behavior, and rational behavior is one of the important purposes of AI (Theory).


First Intuitions on Planning



Planning is the reasoning side of acting. It is an abstract, explicit deliberation process that chooses and organizes actions by anticipating their expected outcomes. This deliberation aims at achieving as best as possible some prestated objectives. Automated planning is an area of Artificial Intelligence (AI) that studies this deliberation process computationally.


When we don’t need planning:

When the purpose of an action is immediate given our knowledge of that action, or when we perform well-trained behaviors for which we have prestored plans, or when the course of an action can be freely adapted while acting, then we usually act and adapt our actions without explicitly planning them.


Reactive vs. Deliberative agents:

Reactive agents simply retrieve pre-set behaviors similar to reflexes without maintaining any internal state. On the other hand, deliberative agents behave more like they are thinking, by searching through a space of behaviors, maintaining internal state, and predicting the effects of actions. Although the line between reactive and deliberative agents can be somewhat blurry, an agent with no internal state is certainly reactive, and one which bases its actions on the predicted actions of other agents is deliberative.


Forms of Planning:

  1. Path and motion planning:
    Is concerned with the synthesis of a geometric path from a starting position in space to a goal and of a control trajectory along that path that specifies the state variables in the configuration space of a mobile system, such as a truck, a mechanical arm, a robot, or a virtual character.
  2. Perception planning:
    Is concerned with plans involving sensing actions for gathering information. Perception planning addresses questions such as which information is needed and when it is needed, where to look for it, which sensors are most adequate for this particular task, and how to use them.
  3. Navigation planning:
    Combines the two previous problems of motion and perception planning in order to reach a goal or to explore an area.
  4. Manipulation planning:
    Is concerned with handling objects, e.g., to build assemblies.
  5. Communication Planning:
    Arises in dialog and in cooperation problems between several agents, human or artificial. It addresses issues such as when and how to query needed information and which feedback should be provided.


Domain Specific Approaches

Disadvantages of domain specific approaches:

  1. They rely on understanding that is external to the planner, rather than on general forms of planning.
  2. It is more costly to address each planning problem anew instead of relying on and adapting some general tools.
  3. Domain-specific approaches are not satisfactory for studying and designing an autonomous intelligent machine.

For all these reasons, automated planning is interested in domain-independent general approaches to planning.

Models used in domain-independent planning:

  1. Project planning:
    In which models of actions are reduced mainly to temporal and precedence constraints, e.g., the earliest and latest start times of an action or its latency with respect to another action.
  2. Scheduling and resource allocation:
    In which the action models include the above types of constraints plus constraints on the resources to be used by each action.
  3. Plan Synthesis:
    In which the action models enrich the preceding models with the conditions needed for the applicability of an action and the effects of the action on the state of the world.


Conceptual Model for Planning

A conceptual model is a simple theoretical device for describing the main elements of a problem.

Since planning is concerned with choosing and organizing actions for changing the state of a system, a conceptual model for planning requires a general model for a dynamic system.

We’ll use state-transition systems (also called discrete-event systems) to describe the conceptual model of planning.


Fig. 1: A simple conceptual model for planning.


A plan is a structure that gives the appropriate actions that lead to the objective. The objective can be specified in several different ways:

  • Specifying goal state or set of goal states.
  • More generally, the objective is to satisfy some condition over the sequence of states followed by the system.
  • An alternative specification is through a utility function attached to states, with penalties and rewards. The goal is to optimize some compound function of these utilities (e.g., sum or maximum) over the sequence of states followed by the system.
  • Another alternative is to specify the objective as tasks that the system should perform. These tasks can be defined recursively as sets of actions and other tasks.

A more realistic model interleaves planning and acting, with plan supervision, plan revision, and replanning mechanisms. There is a need for a closed loop between the planner and the controller (Figure 1). The latter returns to the planner the execution status of the plan to enable dynamic planning.

Fig2: A conceptual model for dynamic planning.


Fully observable systems are systems where the controller is able to observe the complete state of the system. Otherwise, the system is called partially observable.

The restricted conceptual model plan is unconditional, and the controller executing the plan is an open-loop controller, i.e., it does not get any feedback about the state of the system.

Restricted Planning Model:

  1. Finite system.
  2. Fully observable.
  3. Deterministic.
  4. Static.
  5. Restricted goals.
  6. Sequential plan.
  7. Implicit time.
  8. Offline planning.

Conformant planning deals with nondeterministic and partially observable systems.

Hierarchical Task Network (HTN)

In artificial intelligence, the hierarchical task network, or HTN, is an approach to automated planning in which the dependency among actions can be given in the form of networks.

Planning problems are specified in the hierarchical task network approach by providing a set of tasks, which can be:

  1. primitive tasks, which roughly correspond to the actions of STRIPS;
  2. compound tasks, which can be seen as composed of a set of simpler tasks;
  3. goal tasks, which roughly correspond to the goals of STRIPS but are more general.

A primitive task is an action that can be executed. A compound task is a complex task composed of a sequence of actions. A goal task is a task of satisfying a condition. The difference between primitive and other tasks is that the primitive actions can be directly executed. Compound and goal tasks both require a sequence of primitive actions to be performed; however, goal tasks are specified in terms of conditions that have to be made true, while compound tasks can only be specified in terms of other tasks via the task network outlined below.

Constraints among tasks are expressed in form of networks, called task networks. A task network is a set of tasks and constraints among them. Such a network can be used as the precondition for another compound or goal task to be feasible. This way, one can express that a given task is feasible only if a set of other actions (those mentioned in the network) are done, and they are done in such a way that the constraints among them (specified by the network) are satisfied. One particular formalism for representing hierarchical task networks that has been fairly widely used is TAEMS.

A task network can for example specify that a condition is necessary for a primitive action to be executed. When this network is used as the precondition for a compound or goal task, it means that the compound or goal task requires the primitive action to be executed and that the condition must be true for its execution to successfully achieve the compound or goal task.

The best-known domain-independent HTN-planning systems include Nonlin, SIPE-2, O-Plan, UMCP, and SHOP2.

HTN is a useful way to provide the planning engine with information about the hierarchical structure of the planning domain. HTN-like planning (as it is practically used) has the same expressivity (i.e. can solve the same domains) as STRIPS. The theoretical model of HTN is strictly more expressive than STRIPS, but cannot be directly used because of its undecidability.

Example of HTN:
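As an illustrative toy sketch (all task and method names are invented), a compound task can be decomposed through methods into an ordered sequence of primitive tasks:

```python
# Toy HTN decomposition: each compound task has a method giving an
# ordered list of subtasks; primitive tasks decompose to themselves.
methods = {
    "travel": ["get-ticket", "board", "ride", "get-off"],   # compound
    "get-ticket": ["goto-counter", "buy-ticket"],           # compound
}

def decompose(task):
    if task not in methods:          # primitive task: directly executable
        return [task]
    plan = []
    for sub in methods[task]:        # totally ordered network, for simplicity
        plan.extend(decompose(sub))
    return plan
```

A real HTN planner would additionally check the constraints of the task network (ordering, preconditions) while choosing among alternative methods; this sketch shows only the hierarchical decomposition itself.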