In the current proposal for the new PIM architecture, we identified the following tradeoffs between opposing requirements.

Caching mechanism

Benefits:
  • speeds up the execution of the CP
  • reduces the number of migrations
  • improves the overall reactivity

Disadvantages:
  • enlarges the CP state
  • reduces the reactivity to local changes
  • possible loss of sequentiality

Proposals:
  • introduce enable/disableCaching() methods
  • force sequential execution of critical commands
  • prevent the caching of volatile data
  • prevent the caching of large data collections
  • incremental caching
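
The first, third and fourth proposals can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name CpCache, the read() method, the fetch callback (which stands for whatever the CP must do to obtain fresh data, such as a migration) and the size threshold are hypothetical and not part of the PIM proposal.

```python
class CpCache:
    """Hypothetical cache for node data held inside the CP state."""

    def __init__(self, max_collection_size=100):
        self.enabled = True
        # Assumed threshold above which a collection counts as "large".
        self.max_collection_size = max_collection_size
        self._entries = {}

    def enable_caching(self):
        self.enabled = True

    def disable_caching(self):
        # Drop the entries as well, so the CP state shrinks again.
        self.enabled = False
        self._entries.clear()

    def _cacheable(self, value, volatile):
        if not self.enabled or volatile:
            return False
        is_collection = isinstance(value, (list, tuple, set, dict))
        return not (is_collection and len(value) > self.max_collection_size)

    def read(self, key, fetch, volatile=False):
        """Return the cached value, or call fetch() on a miss."""
        if self.enabled and not volatile and key in self._entries:
            return self._entries[key]
        value = fetch()  # e.g. migrate to the owning node and read the data
        if self._cacheable(value, volatile):
            self._entries[key] = value
        return value
```

In this sketch a value marked volatile, or a collection above the threshold, is fetched on every read, which trades extra migrations for a smaller CP state and fresher data.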

Node discovery mechanism (based on single broadcasts)

Where:
  • NodeMonitor
  • GroupManager
  • NodeLifeCycleListener
  • NodeManager

Benefits:
  • the arrival of a node is detected by every neighbor
  • decreases the network traffic (in almost fully connected topologies)
  • reduces the power consumption (in almost fully connected topologies)

Disadvantages:
  • the detection of a new node is delayed until the CP migrates to one of its neighbors
  • relying only on 1st level broadcasting may be inadequate (e.g. Figure 1)
  • it doesn’t detect newly available routes (e.g. Figure 2)
  • introduces a high communication overhead when a partially connected topology changes frequently (e.g. a group of robots moving inside a building)
  • the CP is obliged to periodically visit each node, including those not included in the topology but marked as “available”

Proposals:
  • the CP can call a lookForAvailableNodes() method, triggering a ping-broadcast storm in order to obtain an up-to-date list of all available nodes
8-ring topology.png

Figure 1: The CP cycles among nodes 1, 2 and 3. At some point it needs to add more blue nodes to the team, but it isn’t aware of the presence of nodes A, B and C. This is because the 1st level broadcast doesn’t propagate this information through node X to nodes 2 and 3.
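
The limitation can be reproduced on a small graph. The sketch below is only an illustration: the link table is an assumed topology loosely inspired by the figure, not its exact layout, and both function names are hypothetical. first_level_view() models what the CP learns when every node announces itself to its direct neighbors only; look_for_available_nodes() models a multi-hop flood of the kind a lookForAvailableNodes() call might trigger.

```python
from collections import deque

# Assumed topology (not the exact one in the figure): symmetric links.
LINKS = {
    "1": {"2", "3"},
    "2": {"1", "X"},
    "3": {"1", "X"},
    "X": {"2", "3", "A"},
    "A": {"X", "B"},
    "B": {"A", "C"},
    "C": {"B"},
}

def first_level_view(links, visited):
    """Nodes known to the CP: the visited ones plus their direct neighbors."""
    known = set(visited)
    for node in visited:
        known.update(links[node])
    return known

def look_for_available_nodes(links, start):
    """Breadth-first flood: every reached node re-broadcasts the ping once."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

print(sorted(first_level_view(LINKS, ["1", "2", "3"])))  # ['1', '2', '3', 'X']
print(sorted(look_for_available_nodes(LINKS, "1")))      # ['1', '2', '3', 'A', 'B', 'C', 'X']
```

With single broadcasts the CP sees node X but none of A, B and C, while the flood reaches all of them at the cost of one re-broadcast per node.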

Changing topology.png

Figure 2: At time t = x, there is no direct connection between node 3 and node A. At time t = x + 1, node A has moved closer to node 3, so a direct connection between them is now possible. However, the migration system cannot detect this route, because the broadcast has already occurred: the neighbor list of node 3 cannot be updated until node A stops receiving the CP.

Node departure detection mechanism (without keep-alive packets)

Where:
  • MigrationManager
  • NodeMonitor
  • GroupManager
  • NodeLifeCycleListener

Benefits:
  • decreases the network traffic (if departures aren’t a frequent event)
  • reduces the power consumption (if departures aren’t a frequent event)

Disadvantages:
  • slows down the detection of node departures
  • the detection mechanism requires a failure during the migration process
  • introduces a high communication overhead in case of frequent failures
  • cannot distinguish between a real failure and a simple topology change

Proposals:
  • make sure the detection of a node departure is never so slow that every other node broadcasts a request for the presence of the CP
  • use this mechanism only if we are sure that node departures and topology changes are not frequent events
  • consider the use of ICMP (a network-layer protocol)
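
A minimal sketch of detection without keep-alive packets follows. It assumes, purely for illustration, that a migration attempt to an unreachable node raises ConnectionError; the function name migrate_along and both callbacks are hypothetical and do not describe the actual MigrationManager interface.

```python
def migrate_along(route, try_migrate, on_departure):
    """Move the CP along the route; a failed hop is the only departure signal.

    try_migrate(node) is assumed to raise ConnectionError when the node
    cannot be reached; on_departure(node) stands for notifying listeners.
    """
    reached = []
    for node in route:
        try:
            try_migrate(node)
        except ConnectionError:
            # The failure does not say why: the node may have failed,
            # or it may simply have moved out of range.
            on_departure(node)
            continue
        reached.append(node)
    return reached


# Usage: node "B" left some time ago, but this is noticed only now.
unreachable = {"B"}
departed = []

def try_migrate(node):
    if node in unreachable:
        raise ConnectionError(node)

print(migrate_along(["A", "B", "C"], try_migrate, departed.append))  # ['A', 'C']
print(departed)  # ['B']
```

No traffic is spent while nothing fails, but the departure of B stays unnoticed until the CP actually tries to migrate there.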

Dynamic migration manager

Where:
  • MigrationScheduler
  • MigrationManager

Benefits:
  • migrates the CP right where its presence is needed
  • introduces the NegletTolerance concept
  • nodes can request the presence of the CP as soon as possible

Disadvantages:
  • slows down the migration process
  • increases the execution state of the movable part
  • it is not simple to define a list of critical events that can require the presence of the CP. Even a minor change in this list can deeply change the overall behavior of the team.

Proposals:
  • leave the CP free to migrate wherever it wants, and leave the nodes free to decide whether to request the presence of the CP. This could both improve the reactivity of the system and reduce the overhead required by the caching mechanism
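
One possible scheduling policy is sketched below. It rests on an assumption the text does not confirm: that NegletTolerance is the longest time a node can go without hosting the CP. The function name next_node and the data layout are hypothetical as well.

```python
def next_node(nodes, now, requests=()):
    """Pick the next migration target.

    nodes maps a node name to (last_visit_time, neglect_tolerance);
    requests lists nodes that explicitly asked for the CP, oldest first.
    """
    if requests:
        # An explicit request overrides the regular schedule.
        return requests[0]

    def slack(name):
        last_visit, tolerance = nodes[name]
        # Time left before the node has been neglected for too long.
        return last_visit + tolerance - now

    return min(nodes, key=slack)


nodes = {"A": (0, 10), "B": (5, 3), "C": (2, 20)}
print(next_node(nodes, now=6))                  # B
print(next_node(nodes, now=6, requests=["C"]))  # C
```

At time 6 node B has the least slack (2 time units), so it is served first unless another node has asked for the CP.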

Fault-tolerance mechanism

Where:
  • NodeMonitor
  • GroupManager
  • MigrationManager
  • MigrationScheduler
  • NodeManager
  • NodeLifeCycleListener

Benefits:
  • makes the PIM model able to work in dynamic and hostile environments
  • broadcasting the CP speeds up the system restart after a node failure

Disadvantages:
  • for now, this mechanism has to interact with almost every PIM module (see the “Where” list). This reduces the modularity of the architecture, because we cannot be sure that replacing one module would not break the fault-tolerance mechanism
  • broadcasting the CP increases the power consumption of the system, because every neighbor has to receive the broadcast packets