17-811 Self-Healing Systems: Class Discussion Summary

David Garlan
Spring Semester 2003

Summary of Class Discussion for March 31 by Justin Kenlon

Revisiting basic components of self-healing systems:

What kind of signature should we require?
Is the I/O, I’/O’, M/A, M’/A’ signature general enough?
Should there be a separate architecture style for such components?
Should SHS components be distinguished from normal system components?
Identify what is going on, and what needs to happen:
  1. Monitoring -> What do we do with this data?
       Problems – detection
       Opportunities – resolution
  2. Adaptation
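
One way to picture the signature question is a component that exposes its ordinary input/output alongside a monitoring port and an adaptation port. The sketch below is only an illustration, assuming M/A stands for the monitoring and adaptation channels listed above. The names SHSComponent, process, monitor, and adapt are hypothetical and were not part of the discussion, and the primed channels are left out because the notes do not define them.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SHSComponent:
    """Hypothetical component with I/O plus monitoring (M) and adaptation (A) ports."""
    name: str
    scale: float = 1.0   # illustrative tunable parameter (an assumption)
    calls: int = 0
    errors: int = 0

    def process(self, value: float) -> float:
        """I/O channel: ordinary functional behaviour."""
        self.calls += 1
        if value < 0:
            self.errors += 1   # record a problem for the monitor to see
            return 0.0
        return value * self.scale

    def monitor(self) -> Dict[str, Any]:
        """M channel: expose observations without changing state."""
        return {"name": self.name, "calls": self.calls, "errors": self.errors}

    def adapt(self, **changes: Any) -> None:
        """A channel: accept an externally decided change."""
        for key, val in changes.items():
            if key != "scale":
                raise ValueError(f"unsupported adaptation: {key}")
            setattr(self, key, val)


comp = SHSComponent("filter")
comp.process(2.0)
comp.process(-1.0)
report = comp.monitor()
if report["errors"] > 0:     # detection
    comp.adapt(scale=0.5)    # resolution
```

Here the decision logic lives outside the component, which corresponds to the external-control option; an internal variant would call adapt from within process.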

Architecture-Based Repair Discussion

SHS as an external control mechanism on an executing system
Should it be piggy-backed, internal, or some mix of the two?
Where is each strategy useful? Valuable? Possible?
How specific or general should the style be for a class of SHS?
What we need: Generalizable control models
How do we enforce compliance through architectural levels?
Can we apply a Chinese menu approach to SHS architecture?
How does the specification of SHS architecture constrain the self-healing?
What needs to be maintained?
What can change, anyway?
 How do we analyze changes?

Rainbow Discussion

What’s the deal with formatting/converting/translating monitor data?
Introduced abstractor and refiner for general model; rejected as too “layer” or “hierarchy” specific
What happens when we have a peer-to-peer architecture?
How do we abstract the notion of translation?  Bridges between components for SHS must often translate, sometimes data format, sometimes granularity; this introduces possibilities of limited context knowledge between components, and both “upward” and “downward” control mechanisms and monitoring
Bi-directional support in the basic component?
How do we certify/validate adaptation opportunities?
 Where in the architecture do we validate?
 What do we try to validate?
 Should we adapt? (possible, correct, necessary, beneficial, long/short term goals?)
 Is this internal or external?
“Naïve systems” with separation between system and “healing mechanism”
What about environment?
 Placement
 Context/scope
 Vehicle (I, M, direct, etc.)
 Environment as a component (using M/A feeds)
Battery example w/ internal change meter, change propagation
Engineered adaptive systems, or legacy piggyback SHS mechanisms
 This is an important distinction.  Does this imply two different architectures?
How much do we monitor?  What is useful?  How can lazy/just-in-time monitoring help or hurt?  What about diagnostic/predictive healing?
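
The translation question raised in this discussion (bridges that change data format or granularity, for both monitoring and control) can be sketched without assuming layers. The following is a minimal illustration under stated assumptions: Bridge, observe, and control are hypothetical names, averaging samples is just one possible granularity change, and expanding a command per member is just one possible control translation. None of it is taken from Rainbow itself.

```python
from statistics import mean
from typing import Dict, List, Optional


class Bridge:
    """Hypothetical bridge between two components that translates in both directions."""

    def __init__(self, window: int, members: List[str]) -> None:
        self.window = window      # raw samples per summary (granularity)
        self.members = members    # fine-grained targets behind the bridge
        self._buffer: List[float] = []

    def observe(self, latency_ms: float) -> Optional[Dict[str, float]]:
        """Monitoring direction: ms samples in, coarse summary in seconds out."""
        self._buffer.append(latency_ms)
        if len(self._buffer) < self.window:
            return None           # not enough data for a summary yet
        summary = {"avg_latency_s": mean(self._buffer) / 1000}
        self._buffer.clear()
        return summary

    def control(self, action: str) -> List[str]:
        """Control direction: one coarse command in, one command per member out."""
        return [f"{action}:{member}" for member in self.members]


bridge = Bridge(window=2, members=["a", "b"])
first = bridge.observe(100.0)          # buffered, no summary yet
second = bridge.observe(300.0)         # window full, summary emitted
commands = bridge.control("restart")
```

The party on each side of the bridge sees only its own vocabulary, which is the limited context knowledge noted above.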
 

Discussion of C2/Weaves

Rainbow (central control) vs. distributed authority
Is distributed authority a degenerate case of the rainbow idea?
Maybe the other way around?

Discussion of event-bus w/ “component managers”…

 What is the layout mapped to our basic component?
 Internalized M/A channels?
  Distinction between internal and external control flow/connections
Vocabulary of M/A channels and their abilities, functions, limitations

Keep environment model as simple as possible to limit amount of knowledge each component requires (paper 3 for today had to limit complexity to 100 components)
Consider client/server where client is blind to server environment, only sees “interface” or “server”, not the whole tiered system…

Touched upon “figure 8”, separation into levels, etc.; delayed until next time.