Subject: Re: [GLIF controlplane] RE: Network Control Architecture
From: Gigi Karmous-Edwards <gigi@xxxxxxxx>
Date: Sat, 28 Apr 2007 06:20:58 -0400
Harvey and All,
I like your list of points on service-oriented architecture. I think the
framework presented provides a strategy to accomplish each of your
stated points. The first two are really the policy of the resources, which
needs to be defined by the resource manager and honored by the resource
broker. The rest of your points relate mainly to the monitoring system's
ability to check for SLA violations, topology discovery and performance,
and the feedback loop between the monitored information and the resource
allocation.
Regarding ML vs. PerfSONAR, I think they will both co-exist and we must
ensure interoperation between them for global interoperability to work.
In my opinion based on the use of MonALISA in the Enlightened Computing
project, it works very well and we were able to request the addition of
some features for checking lightpath connectivity on our testbed (they
were implemented by the ML team). I do think ML has an extremely well
developed architecture and implementation and is feature rich. However,
we found that, moving forward, the new features we require will need to
be implemented by ourselves rather than by others (the ML team), due to
time constraints. We therefore concluded that an open
source monitoring solution will be necessary for us. Within the GLIF
community, the solution for monitoring will be one that is constantly
evolving to meet the needs of the emerging applications, and emerging
network architectures and technologies. An open source solution will
allow for many teams to develop rich feature sets which then can be
shared by others for their specific needs. It would be really helpful if
ML could be made open source for the GLIF community.
Kind regards,
Gigi
--------------------------------------------
Gigi Karmous-Edwards
Principal Scientist
Advanced Technology Group
http://www.mcnc.org
MCNC
RTP, NC, USA
+1 919-248-4121
gigi@xxxxxxxx
--------------------------------------------
Harvey Newman wrote:
This is a limited view that will run into the same problems that are
well known from RSVP. One will never get to reserve a multi-domain path
this way.
Operational steps in a services-oriented architecture:
(reservations are stateful, time-dependent, and responsive to
capability to use the allocated resource):
(1) AAA, with priority schemes and policy expressed by each VO.
(2) Inter-VO allocations according to quotas; coupled to tracking of
what has been used during a specified time period
(3) Service to verify end-system capability and load as being
consistent with the request
(4) Agents to build the path and verify its state (up, down,
which segment(s) are down or impaired); also agents to
verify end-system capability (hardware, system and kernel
config., network interface and settings); verification
of end-to-end capability with an active probe (viz.
FDT); build or tear down circuits in parallel in a
time < the TCP timeout.
(5) Tracking of capability (if relevant, as in large scale data
transfer)
(6) Adjustment of channel capability if allowed, according to
performance end-to-end. For example with LCAS
[allocation of a non-adjustable channel takes longer,
and becomes an economic question.]
(7) Adjustments driven by (a) entry of higher priority
allocation-requests, which could affect many or even
all channels; (b) re-routing of certain flows if better
paths become available; or (c) optimization of workflow according
to deadline scheduling for certain flows
Except for the higher-level "strategic" parts above (policy and
quotas; which need to come from the VOs), many of the technical pieces
above exist, and will be hard to match.
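Point (2) above, quotas coupled to tracking of usage over a specified time period, can be illustrated with a small sketch. This is only an assumption about how such tracking might look: the class QuotaTracker, its units, and its sliding-window rule are hypothetical and are not taken from any existing VO or GLIF service. Timestamps are assumed to be passed in non-decreasing order.

```python
from collections import deque


class QuotaTracker:
    """Hypothetical per-VO quota over a sliding time window."""

    def __init__(self, quota, window):
        self.quota = quota      # total amount allowed within one window
        self.window = window    # window length, same unit as the timestamps
        self._usage = deque()   # (timestamp, amount) records, oldest first

    def used(self, now):
        """Return the amount consumed within the window ending at `now`."""
        # Drop usage records that have aged out of the window.
        while self._usage and self._usage[0][0] <= now - self.window:
            self._usage.popleft()
        return sum(amount for _, amount in self._usage)

    def try_allocate(self, amount, now):
        """Record the allocation and return True only if it fits the quota."""
        if self.used(now) + amount > self.quota:
            return False
        self._usage.append((now, amount))
        return True
```

A broker could call try_allocate before building a path and refuse the request when it returns False; the policy that sets the quota itself would still have to come from the VOs.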
Harvey
Steve Thorpe wrote:
Hello Bert, everyone,
The point Bert made "...if the pre-reservation of resources is not an
atomic action..." is very important.
My belief is the pre-reservation of resources, or Phase 1 of a
2-phase commit protocol, *must* be atomic. That is, there must be a
guarantee that at most one requestor will ever be granted a
pre-reservation of a given resource. Then, the requestor should come
back with a subsequent "Yes, commit the pre-reservation", or "No, I
release the pre-reservation". In the case where the requestor does
not come back within a certain amount of time, then the
pre-reservation could expire and some other requestor could then
begin the 2-phase commit process on the given resource.
There may be situations where a resource broker cannot get the
desired resource reservation(s) booked. But I don't see deadlocking
here - where both resources can *never* be booked. Unless, of course,
a resource broker books them once and is allowed to hold on to them forever.
The atomicity of the pre-reservation (phase 1) stage of the 2-phase
commit process is a very critical part for this to work.
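A minimal sketch of the atomic pre-reservation with expiry described above, assuming a single in-process resource guarded by a lock. The class Resource, its method names, and the hold_seconds timeout are hypothetical and not part of any GLIF component; a real multi-domain deployment would need the same test-and-set guarantee from the resource manager itself.

```python
import threading
import time


class Resource:
    """Hypothetical resource that grants at most one pre-reservation at a time."""

    def __init__(self, name, hold_seconds=30.0, clock=time.monotonic):
        self.name = name
        self.hold_seconds = hold_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._holder = None      # requestor holding the pre-reservation
        self._expires_at = 0.0   # when an uncommitted hold lapses
        self._committed = False

    def _expire_if_stale(self):
        # Caller must hold self._lock. Committed holds never expire here.
        if (self._holder is not None and not self._committed
                and self._clock() >= self._expires_at):
            self._holder = None

    def pre_reserve(self, requestor):
        """Phase 1: atomic test-and-set; succeeds for at most one requestor."""
        with self._lock:
            self._expire_if_stale()
            if self._holder is not None:
                return False
            self._holder = requestor
            self._expires_at = self._clock() + self.hold_seconds
            return True

    def commit(self, requestor):
        """Phase 2, 'Yes': only the current holder may commit."""
        with self._lock:
            self._expire_if_stale()
            if self._holder != requestor:
                return False
            self._committed = True
            return True

    def release(self, requestor):
        """Phase 2, 'No': the holder gives the resource back."""
        with self._lock:
            if self._holder != requestor:
                return False
            self._holder = None
            self._committed = False
            return True
```

The lock makes the check and the grant a single step, so two requestors can never both see the resource as free and both be granted it. The expiry is what keeps a silent requestor from holding the resource forever.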
Steve
PS I have also added Jon MacLaren to this thread, as I'm not sure
he's on the GLIF email list(s).
Bert Andree wrote:
Hi Gigi,
What exactly do you mean by one RB per request?
Suppose there are two independent RBs, RB-A and RB-B, and two
resources, RS-1 and RS-2.
Suppose that there is a request to RB-A to book both resources and a
request to RB-B to do the same. Now, if the pre-reservation of
resources is not an atomic action, two different strategies may
introduce specific problems.
Strategy 1: an availability request does not reserve the resource:
RB-A asks for RS-1 (available)
RB-B asks for RS-2 (available)
RB-A asks for RS-2 (available)
RB-B asks for RS-1 (available)
RB-A confirms RS-1 (success)
RB-B confirms RS-2 (success)
RB-A confirms RS-2 (fail)
RB-B confirms RS-1 (fail)
The obvious solution would be to free all resources and try again.
In complex systems there is a fair chance that both resources can
never be booked (deadlock).
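The interleaving above can be replayed in a few lines. This is only an illustrative sketch: CheckThenConfirmResource and replay_strategy_1 are hypothetical names, and the resources are plain in-memory objects rather than real network elements.

```python
class CheckThenConfirmResource:
    """Hypothetical resource where an availability query holds nothing."""

    def __init__(self, name):
        self.name = name
        self.owner = None

    def is_available(self):
        return self.owner is None

    def confirm(self, broker):
        # The first broker to confirm wins; later confirms fail.
        if self.owner is None:
            self.owner = broker
            return True
        return False


def replay_strategy_1():
    """Replay the eight steps listed above and return a log of results."""
    rs1 = CheckThenConfirmResource("RS-1")
    rs2 = CheckThenConfirmResource("RS-2")
    order = [("RB-A", rs1), ("RB-B", rs2), ("RB-A", rs2), ("RB-B", rs1)]
    log = []
    # All four availability checks happen before any confirm.
    for broker, rs in order:
        log.append((broker, "asks", rs.name, rs.is_available()))
    for broker, rs in order:
        log.append((broker, "confirms", rs.name, rs.confirm(broker)))
    return log
```

Every availability check answers True, yet each broker ends up holding only one of the two resources it needs, because nothing ties the check to the later confirm.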
Strategy 2: an availability request reserves the resource:
RB-A asks for RS-1 (available)
RB-B asks for RS-2 (available)
RB-A asks for RS-2 (not available)
RB-B asks for RS-1 (not available)
RB-A and RB-B free all resources and try again. In complex systems
there is a fair chance that both resources can never be booked
(deadlock).
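A small sketch of the retry pattern above, assuming both brokers always retry in lock-step and always ask in the same order (a worst case, chosen to make the problem visible). The function replay_strategy_2 and its data layout are hypothetical.

```python
def replay_strategy_2(rounds=3):
    """Replay lock-step retries where asking also reserves the resource.

    Returns the list of brokers that managed to hold both resources.
    """
    wanted = {"RB-A": ["RS-1", "RS-2"], "RB-B": ["RS-2", "RS-1"]}
    booked = []
    for _ in range(rounds):
        held = {}  # resource name -> broker holding it in this round
        got_all = {broker: True for broker in wanted}
        # Step 0: each broker asks for its first resource; step 1: its second.
        for step in range(2):
            for broker in ("RB-A", "RB-B"):
                rs = wanted[broker][step]
                if rs in held:
                    got_all[broker] = False  # "not available"
                else:
                    held[rs] = broker
        booked.extend(b for b, ok in got_all.items() if ok)
        # Both brokers now free everything (held is rebuilt) and try again.
    return booked
```

However many rounds are run, the result stays empty: each broker keeps grabbing the resource the other one needs. Both sides remain active and keep retrying, a pattern that is often called livelock rather than deadlock.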
The only way to prevent this is to have some queuing of requests, and
even then "individual starvation" (e.g. RB-A can never book any
resources) is possible in complex systems.
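One way to picture the queuing idea is a single queue that serves whole requests one at a time, all-or-nothing. This is a sketch under that assumption only; serve_in_order is a hypothetical helper, and nothing here says where such a queue would live in a multi-domain setting.

```python
from collections import deque


def serve_in_order(requests, resources):
    """Serve whole requests one at a time, all-or-nothing, in arrival order.

    requests: iterable of (broker, [resource names]).
    resources: iterable of free resource names.
    Returns a dict mapping broker -> True if it got everything it asked for.
    """
    free = set(resources)
    queue = deque(requests)
    outcome = {}
    while queue:
        broker, wanted = queue.popleft()
        if all(name in free for name in wanted):
            free.difference_update(wanted)
            outcome[broker] = True
        else:
            # Nothing was held for this request, so nothing needs rolling back.
            outcome[broker] = False
    return outcome
```

With RB-A queued first, RB-A books both resources and RB-B is refused cleanly instead of each broker holding half. A broker that always arrives behind others can still be refused every time, which is the individual starvation mentioned above.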
Best regards,
Bert
Gigi Karmous-Edwards wrote:
Hi Admela,
I agree, there are two phases: 1) check availability from the xRMs, and
2) if all xRMs give an ack, then go to the second phase of commit; 2')
if one or more xRMs give a nack, then do not proceed to the phase
two commit. In the architecture sent out, the responsibility of
coordinating and administering the two phases is in ONE RB per
request. Each xRM will rely on the RB to tell them whether to
proceed to a commit or not. If they get a commit from an RB, it
then becomes the xRM's responsibility to make the reservation and
allocation in the actual resources. I think that if, for example, RB-A
talks to an xRM in domain "B", then it may be the responsibility of
xRM-B to tell its own RB-B of its interaction with RB-A. Is
this in line with your thoughts?
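The two phases described above, coordinated by ONE RB per request, might look roughly like the following. The XRM class, its check/commit/cancel methods, and broker_request are hypothetical names used for illustration; in particular, sending a cancel to xRMs that already acked is an assumption of this sketch, not something stated in the architecture.

```python
class XRM:
    """Hypothetical resource manager for one domain."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.state = "idle"

    def check(self, request_id):
        # Phase 1: answer ack (True) or nack (False).
        self.state = "checked" if self.available else "idle"
        return self.available

    def commit(self, request_id):
        # Phase 2: the xRM itself reserves and allocates its resources.
        self.state = "committed"

    def cancel(self, request_id):
        # The RB reports that the request will not proceed.
        self.state = "idle"


def broker_request(request_id, xrms):
    """One RB runs both phases for one request; returns True on commit."""
    acked = []
    for xrm in xrms:
        if xrm.check(request_id):
            acked.append(xrm)
        else:
            # One nack is enough: tell those that acked, and stop.
            for other in acked:
                other.cancel(request_id)
            return False
    # Every xRM acked, so the RB tells each one to commit.
    for xrm in xrms:
        xrm.commit(request_id)
    return True
```

Each xRM only ever acts on what the RB tells it, which matches the division of responsibility described above: the RB decides whether to proceed, and the xRM performs the actual reservation.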
Gigi