Self-Organization in Large Population of Mobile Robots: Behavioral Self-Organization

4. Behavioral Self-Organization

In this chapter, we will describe several problems which may be encountered in Army-ant scenario, and propose solutions to some of these problems. Other than spatial self-organization, Army-ant robots have to act cooperatively to find and transport the pallets. In order to accomplish such tasks, they must form a team which can operate as a single entity, i.e. they have to achieve a behavioral self-organization. A robot will have several behavioral modules which enable it to carry some functions. The data a robot receives from its teammates, directly affects these modules, and they affect the teammates' behaviors reciprocally.

4.1 Behavioral Model of Army-ant Robots

Similar to the gravitational forces in spatial self-organization, each robot's position in the behavioral space is affected by activation or inhibition forces coming from teammates (and obstacles). Figure 4.1.1 shows the behavioral modules of an Army-ant robot as discussed in this chapter. As seen in the figure, robot behaviors are affected by other agents' status/behaviors as well as their own sensory data. The lines define the activation caused by a specific behavior module or sensory data. For example, avoid-obstacle behavior is induced by search and move-to-goal behaviors, and is also affected by sensor data signaling the presence of other agents and/or obstacles. On the other hand, the overall behavior of the team being affected by each individual behavior to some degree, the size of the team and the environment changes as the robots operate. Therefore, Army-ant robot behaviors have to be modeled (if possible) as a complex dynamic system with continuously changing parameters. Although there are artificial intelligence methods to define decision systems by using behavioral models, the behavioral model of an Army-ant team will be much more complicated than a single agent case [26].

Figure 4.1.1 Behavior modules of the Army-ant robots as discussed in this chapter


4.2 System-level analysis of Army-ant problem

4.2.1 Finding pallets in the warehouse

Although positions of each pallet to be transported are signaled by beacons, it may still not be easy for army-ant robots to locate the pallets in a warehouse because of obstacles. We will explain this problem by using the sample warehouse plans, starting with the most simple case (
Figure 4.2.1). If there were no obstacles in the warehouse (Figure 4.2.1a), finding the pallets would not be a problem for robots. If the size of the warehouse is relatively larger than the detection region of the robots, robots can still find the pallets by following teammates which are able to detect a goal. If the robots are able to signal their status (i.e., whether they are detecting a beacon or not), agents that cannot see any beacon, can follow the agents that detect a beacon. The decision tree for such a scenario will be:

if a beacon is detected
    move toward beacon
else if one or more robots that move toward a beacon are detected
    move toward the closest robot
else if
    search for pallet
end
This simple algorithm alone can increase the detection region of the "swarm" significantly. In the example shown in Figure 4.2.2, robots spread in a 1000x1000 unit space have a detection region defined by a radius of 250 units, and agents that cannot detect the goal initially (marked by +) can find the goal by following the agents which are initially moving toward the goal (marked by *).


Figure 4.2.1 Sample warehouse plans


Figure 4.2.2 Initially only three robots detected the beacon. Three more robots were able to find the beacon by following one of those robots

However, following robots and wandering may not always lead to a satisfactory result if the warehouse has a more complex floor plan as in figures
4.2.1b and 4.2.1c. Fortunately, robots can find the beacons by following the walls for these types of floor plans. Yet again, following signaling teammates and walls is not enough to find the beacons if the floor plan of the warehouse is closer to a "maze" (Figure 4.2.1d). As shown in the figure, robot A which follows the outer wall can reach the beacon, while robot B is circling the lower left block without seeing the beacon. If robot B keeps following the same wall, its only chance is to detect robot A as it moves toward the beacon. It is obvious that, for a more realistic case (Figure 4.2.1e), finding the pallets by using simple navigational rules (such as following other teammates and walls) will be highly difficult for Army-ant robots.

One method which may improve the result is letting the robots that are following their teammates to signal. Signals from the agents moving toward the beacon will have higher priority than the signals coming from the agents that are following their teammates. Using moving-toward-goal and moving-toward-robot signals, it is possible to create a network which we hope will include all agents. Also the moving-toward-robot signals will have different priorities depending on the signal coming from the robot followed. In a way, signals will indicate the number of nodes between a robot and the goal. As seen in Figure 4.2.3, only one robot can detect the beacon initially. From the rest, one is following that robot, another is tracking this second robot, etc. Robots are following their teammate at different levels, and are able to find the beacon. When a robot following a teammate detects another agent with higher level signal, it automatically starts following this higher level signal (Figure 4.2.3). This method also proves to be highly rewarding in more complex cases given in figures 4.2.1d and 4.2.1e.

Figure 4.2.3 Although initially only one robot (*) detects the beacon, the rest of the pack finds the beacon following that robot, and other teammates.

However, again, enabling robots to signal their status cannot completely solve the problem of finding the pallets in warehouse. To increase the chances of finding a pallet or a signaling robot, we can define more elaborate navigation algorithms or increase the communication abilities of the agents.

In order to explain what we mean by more elaborate navigation models, we will give some examples. In the warehouse plans shown in figures 4.2.4-4.2.8, initial positions of agents are marked by *'s. Defining the algorithm, we assume that robots can signal their status, and that they are able to follow signaling teammates. Agents that can not detect any signal, simply follow the walls (left or right, defined randomly or by given ID numbers). However, if a robot following the walls cannot still detect any signal after a number of turns with an angle greater than a predefined value, it enters the first alley on the right or on the left, to stop a possible vicious circle. In Matlab simulations given here, this behavior becomes active after five 60-degree turns.


Figure 4.2.4 One agent searching a beacon in the warehouse


Figure 4.2.5 One agent searching a beacon in the warehouse.

Figures 4.2.4 and 4.2.5 show one agent searching for a beacon in a warehouse. As seen from the traces, robots were able to find the beacon (pallet) using the described behavior instead of continuously circling the warehouse.

When the number of agents increases, the (average) time to find the beacon decreases. Figure 4.2.6 shows eight agents searching for a beacon. After a short time, seven of those agents have found the pallet; some by themselves, some by following other robots. Figures 4.2.7 and 4.2.8 show the first five steps, final positions, and four typical paths followed by agents. Eight out of ten agents have found the pallet after 300 steps. Some of the robots, as seen in Figure 4.2.8, were able to find the pallet because of the search behavior we defined. These paths are obviously not the closest routes to the beacon; but considering the lack of map knowledge, the result is quite satisfactory.


Figure 4.2.6 Eight agents searching for a beacon.


Figure 4.2.7 First five steps and final positions of ten agents searching for a beacon.


Figure 4.2.8 Four typical paths from the simulation with initial conditions given in Figure 4.2.7.

Although the behavior we defined for these simulations is quite simple, floor area scanned by robots increased significantly in contrast with simple wall-following algorithms. By defining more complex algorithms, the search method can be improved. It is also possible to create algorithms with self-adjusting parameters, in order to have a swarm which can adapt to its environment for faster search phases.

Furthermore, with a more complex communication structure, agents can broadcast messages, and can affect the behavior of other agents. In the examples given in figures 4.2.6-4.2.8, each agent can follow each other; but there is always the possibility of losing the track of a teammate for a robot, when the former disappears behind an obstacle. Although we force the robots that lose the signal coming from another agent to continue in the same direction -hoping that they will redetect that agent-, this may not be enough. However, broadcasting specific signals enables robots to alert their teammates that they have been followed. A robot following a teammate which moves toward another beacon (goal or robot) may send signals to stop its teammate until the former moves and detects the higher level signal seen by its teammate (This behavior is the same as the human behavior of shouting to someone while following him/her). Upon detection of the higher level signal, the teammate can be "released" by another coded message (See Figure 4.2.9). Content of such a coded signal is explained in Chapter 5.


Figure 4.2.9 Wait-for-me signal: Agent B receiving the signal from agent A waits, and upon detection of the goal by agent A, agent B is released.

Another problem we did not consider here is the structure of the obstacles. They may physically stop the agents, but pass beacon signals. In such a case, the agents must be able to find their way to the signal source around the obstacle. However, most of the problems we listed here may not be highly problematic if the number of robots is large, which will be the case in the Army-ant scenario.

4.2.2 Navigating Aisles and Collision Avoidance

Since the navigation of Army-ant robots is not based on a map, there is no need to keep track of the distance traveled. Lack of a map simplifies navigation procedures while creating a problem in finding destination points -such as pallets, set-down points- in the warehouse. Army-ant robots, equipped with two range finders in front (one on the left, the other on the right), will be able to change their direction by simply comparing the output of the two range detectors. In addition to two detectors in front, it is possible to design the robots with additional range detectors on both sides. Such detectors can be used to find alleys on the sides, as is necessary for the algorithm described in the previous section (
Figure 4.2.10).

Figure 4.2.10 Use of range sensors for navigation

Of course, a special decision routine has to be added to enable robots to change their direction in case of head-on encounter. Robots are able to maneuver in small spaces since the navigation mechanism is based on tracks.

Other than walls and pallets, robots also create (moving) obstacles for other teammates. There are several methods for coordination of multiple mobile robots in surrounding with moving obstacles. However, all these methods are based on central/hierarchical controller and/or map knowledge. Our scenario having a decentralized control approach, we can only use range sensors, and the robots' ability to sense teammates for obstacle avoidance and to solve deadlocks in robot-to-robot confrontations. Assigned ID numbers can be very useful to solve the "right-of-way" problem.

4.3 Team Coordination

4.3.1 Leader Characteristics

Although the Army-ant swarm is a homogeneous population, it is sometimes necessary to define a "leader". Army-ant robots carrying a pallet toward a destination point need to elect a leader which is responsible for beacon detection. Even though all agents are able to detect the beacon, election of a leader is necessary due to the physical limitation of sensors which complicates the determination of the direction of travel [
34]. Furthermore, the last algorithm defined in Chapter 3 requires a leader (a first) for formation of teams around beacons.

Leader election can be based on different criteria. It may be related to assigned ID numbers, or agents accomplishing a task first can be awarded as leader. Any robot may become a leader, and leadership of a robot is temporary; it ends after set-down of the pallet, if not ended earlier by a malfunction.

Besides the role assigned in Chapter 3, a leader will have other functions. Suppose that a team of robots could crawl under a pallet, but they are incapable of lifting and/or carrying the load. The robot chosen as a leader can broadcast a high priority signal, which force other robots moving toward other beacons to change their destination and join the team. A leader may also free robots that are committed to a particular pallet. This can be done by sending a do-not-consider-this-beacon signal to other robots or by silencing the beacon (depending on the team formation procedures and communication abilities).

4.3.2 A Collective Behavior Example: Distributed Load Bearing

The next phase after formation of a team around a beacon/pallet is lifting and carrying it. Robots need to act together to establish a stable transportation mechanism. A method for controlling the collective behavior of Army-ant robots for load transportation is described in [[34]. It is achieved by selecting a leader, without direct communications, using only force sensors mounted at the point of contact with the pallet. The method described in [[34] results in stable behavior in the horizontal movement of agents under the pallet.

On the other hand, during lifting and transportation of pallets it may be important that the pallet be supported equally (to some degree) by all robots. It is possible to detect this condition by comparing the sensed vertical force on agents. However, since there is no central controller, this has to be done collectively.

Assume that each robot under the pallet can receive signals from a particular teammate, and robots can form a ring network (probably by using assigned ID numbers) where the signals correspond to vertical force sensors of each robot (Figure 4.3.1). Using the incoming signal and output of the vertical sensor signal, robots can determine the difference between their own sensor and the sensor of the "previous" robot. Robots that detect a difference beyond the predefined parameters can broadcast a warning signal. Transportation starts and continues as long as there is no robots broadcast a warning signal caused by this or any other problem. However, this simple approach is far from guaranteeing the detection of an undesirable configuration. For a given error range, it is possible that the forces sensed by team members differ greatly while no robot is detecting violation (Figure 4.3.2). Therefore robots must somehow be able to detect the range violations no matter the relative positions of the two robots on the communication ring.


Figure 4.3.1 Ring formation and coupling coefficients for eight agents.


Figure 4.3.2 Difference between 1st and 4th sensor is quite large while all sensor readings between successive robots are in allowable range.

To solve this problem, we will use the idea of coupled van der Pol oscillators. VDP oscillators have been used to model both electrical and biological systems (such as vacuum-tube circuits and locomotion generation in fishes). Another aspect of VDP oscillator networks is also useful in Army-ant scenario: It may be important to synchronize the robots for a particular task. Bay [5] developed the idea of "heartbeat" for Army-ant robots, where each robot initially oscillates at a different frequency -related to its pre-assigned ID number. Then a team of robots can reach a common oscillator frequency (entrainment) when coupled together. Several coupling methods and results are given in [5].

For our purpose, we will assume the coupling is done in a ring configuration, i.e., coupling coefficents (lambda) are different than zero for (i,j) pairs if i=j+1(mod N) (Figure 4.3.1). We are forced to use a ring configuration, because full coupling inhibits agents to detect the variations accurately. VDP oscillators used here are slightly different from a general VDP oscillator. We added a feedback gain factor (lambda ii) which can be changed over time as a function of the output of the force sensor. Therefore, feedback gain of the each oscillator will change according to the vertical force component sensed by the robot. Oscillator outputs xi will be broadcast signals "heard" by agents with non-zero coupling coefficients (lambda ij different than 0).


Figure 4.3.3 Coupled VDP oscillator with added feedback gain


Figure 4.3.4 Mapping from force sensor output to feedback gain

We define a mapping from sensed force to feedback gain lambda-ii as shown in Figure 4.3.4. In a ring formation with lambda-ij's are chosen between 0 and 0.3, the entrainment is reached after a time with frequencies in the range 0.8-1.2 [5]. For our purpose, we set all oscillator frequencies w2 to 1. However, the amplitude of the oscillations will differ from agent to agent if the feedback gains are not the same. Each robot is again able to compare the output of its force sensor with the output of the coupled agent. In addition, the ring configuration enables each robot to detect the sensor output variations occurring in any other teammate by comparing its sensor output only to the previous robot in the configuration.

We will try to demonstrate the performance of this method by some examples. lambda-ij's are taken as 0.3 in all simulations given here. Output and error signals (i.e., [x(i)-x(i-1)] mod N) of the eight oscillators in ring formation are shown in Figure 4.3.5. Starting with different initial conditions, the oscillator outputs match one another and error signals approach zero in time with all feedback gains lambda-ii are set to 1.


Figure 4.3.5 Oscillator outputs and error signals of eight agents in ring formation


Figure 4.3.6a Oscillator feedback gains lambda-ii and outputs of eight agents in ring formation


Figure 4.3.6b Oscillator feedback gains lambda-ii and outputs of eight agents in ring formation

Figures 4.3.6a and 4.3.6b show the changes of oscillator outputs and error signals when feedback gains (or force sensor outputs) of the eight oscillators are changing in time. If the gains approach to a common value (1 in the simulations), all error signals approach zero (Figure 3.4.6a).

To see the effect of one oscillator on others, we change the gain of one oscillator (number 1) after all gains reach the steady-state (Figure 4.3.7a). Such a change in the feedback gain corresponds to a situation where robots move over uneven surfaces. Oscillation magnitudes still being approximately 1.5, the change in oscillator 1 is detected by all agents. Figure 4.3.7b shows the error signal computed by agents 5 and 8. There is a time lag between detection times of agent 5 and agent 8; it takes time for a change in the feedback gain of one agent to affect other agents.


Figure 4.3.7a Oscillator feedback gains lambda-ii and outputs of eight agents in ring formation.


Figure 4.3.7b Oscillator feedback gains lambda-ii and outputs of eight agents in ring formation, and error signals of two agents.


Figure 4.3.8 Oscillator feedback gains lambda-ii and outputs of eight agents in ring formation, and error signals of two agents.

In the case of two oscillators with "significantly" different gain factors, we found that at least one agent can detect the problem. Figure 4.3.8 shows that the gains of oscillators 1 and 2 differ from the others. Although the difference between oscillators 8 and 1 or oscillators 2 and 3 are not large, the difference between oscillator gains 1 and 2 is. As the error signals for agents 3 and 4 reveal, the problem is detected by agent 3 (as well as agents 1 and 2). Even after the feedback gain of the second oscillator changes to decrease the gain (force) range, error signal 4 shows that forces are not equal.

The errors in feedback gains (force sensors) can be detected by other oscillators, even if they are not nearby. In a more problematic case, three agents' sensor outputs differ from their teammates as shown in Figure 4.3.9. Again error signals of agents 4 and 8 show the changes as well as agents 1, 2, and 3. The ring coupling of the oscillator outputs enables a team of robots to sense the "stability" of pallet-robot system. Robots may adjust their oscillator feedback gains by exerting more of less force *thereby equalizing vertical forces until oscillators agree.


Figire 4.3.9 Oscillator feedback gains lambda-ii of eight agents in ring formation and error signals of two agents.


Figire 4.3.10 Oscillator feedback gains lambda-ii of five agents in ring formation and error signal of an agent.

It is obviously necessary to adjust the allowable ranges for error signals. It should be adjusted so that the problems may be detected no matter what the number of robots is. Figure 4.3.10 shows a similar case to simulation given in Figure 4.3.7b with only five agents. Again the error signal in the "farthest" agent shows the problem in one agent, and the level of the error signal is slightly greater than the case with eight agents. As long as the number of agents is more than or equal to 5, the same parameters for allowable ranges are likely to work for all cases.

When additional oscillators are successively added to the system, the change in the signal levels is less and less. Addition of new oscillators to ring network of eight oscillators does not have a perceivable effect. Similar characteristics are observed in the period and amplitude of the entrained oscillations by Bay [5]. Furthermore, a discrete mapping function from sensor output to the feedback gain may improve the detection of range violations. Yet another problem is that the parameters working successfully in the horizontal floor will not work on an inclined surface. Use of inclinometers may solve the problem. Inclinometer output can be used to adapt the range parameters.

Although we defined the feedback gain range to be 0.7-1.3 and assumed that the gains generally approach unity, this will not be the case in Army-ant operation, because the average load on robots will change depending on the payload and number of agents. However, the method described here works no matter where the average value of the gain is with respect to range limits.

4.3.3 Carrying the pallets

In transporting the pallets, Army-ant robots may encounter several problems. Besides the problems of aligning all robots along a common direction of travel [34] or distributing the load equally, there still exists problems such as acceleration/deceleration or obstacle avoidance during a pallet transport. A group of robots forming a team has to move in accord.

The idea of "heartbeats" seems to be a solution to this kind of problem. Again, a broadcast warning signal can be used to create a team "conscience." Yet, the use of such a signal must be analyzed in depth. Suppose one of the robots encounters a small obstacle (perhaps a malfunctioning robot) creating problem for only that robot during the transportation of a pallet. Should the robot "warn" its teammates and force the whole team to stop and take action (Figure 4.3.11a), or should it simply "leave" the team temporarily to solve the problem by itself -by circling the obstacle? (Figure 4.3.11b)


Figure 4.3.11 A small obstacle is encountered during pallet transport

The second approach seems to be more appropriate for our scenario -although it may be more difficult to carry out. Enabling agents to stop a whole team may destroy the robustness of this decentralized system. For example, a malfunctioning agent can stop a pallet transportation by an erroneous warning signal. Therefore, we must keep the "responsibility" of the agents as low as possible. A team should be able to leave that agent and continue the process.

Furthermore, two teams of robots carrying pallets may come across one another and block each other's way. In such a case, ID numbers of the leaders may be used to solve the conflict. If an individual robot encounters a team carrying a pallet, it should give the right-of-way to the team by sensing the leader's signal indicating that it is committed to load transportation.

Another problem may be finding the set-down point in the warehouse. Use of extra beacons defining specific points in the warehouse may be chosen as a solution. Teams carrying loads can find their way to the destination point by following special beacons deployed so that robots can find the set-down point if they can detect one of the beacon signals initially. This method is explained in more detail in section 4.3.4.2.

4.3.4 Other Issues

4.3.4.1 Dispersal and Rearrangement to Other Teams

It will probably be the leader's task to decide whether the set-down point is reached or not. The set-down condition will be signaled by the leader, and another specific signal can be used to "release" the agents. Receiving "release" signals, a search behavior will again become active, suppressing the leader-following behavior, and agents will disperse in the warehouse.

4.3.4.2 Battery Recharge

Army-ant robots will be powered with batteries. Therefore, it will be necessary to recharge the units during operation. Robots can check their battery levels before engaging in a search or a team forming, and decide when to go to the recharge site. Since the Army-ant robots do not have a map knowledge of the warehouse, it may be difficult to find this site. We may define a behavior (e.g., search-for-recharge-site) which becomes active if the battery level is below a preset value. This behavior will force the agent to look for a different type of beacon which is deployed in the warehouse. It will work as a "reflex." A robot will approach the first special beacon it sees, and then lock onto the next beacon signal with higher "priority." Assigning priorities to beacons will enable robots to move toward the recharge site, no matter which beacon they detect.


Figure 4.3.12 An example of beacon positioning and priority assignment

4.3.4.3 Complexity

In this chapter, we offered solutions to some of the problems that will probably be encountered while realizing Army-ant robots. Several problems in different phases of the scenario can be solved using beacons, communication methods, and simple decision algorithms. The Army-ant swarm will take its final shape during realization. We expect more problems to emerge. On the other hand, some of the problems described here may become insignificant due to technical capabilities of Army-ant robots. A brief chapter of technical assessment, related to solutions given here, follows.


Back to Thesis Main Page

To Chapter 3

To Chapter 5