Subject: Re: [GLIF controlplane] RE: Network Control Architecture
From: Cees de Laat <delaat@xxxxxxxxxxxxxx>
Date: Sun, 6 May 2007 17:03:57 +0200
Hi,
What I see here is a discussion between two fundamentally different
systems: a tree-like allocation and authorization method versus a
chain-like model. In Phosphorus we identify both and also need to
work out how to bridge between neighboring domains that implement
different models.
We need big steps and rough code there.
Best regards,
Cees.
At 09:32 -0500 06-05-2007, Joe Mambretti wrote:
Hello:
I agree with your suggestion that it is important to start with
small steps. However, with any
steps, there must be some assumptions behind the design. One reason
these designs have been challenging is that different communities
have varying ideas about resource costs: the higher the cost, the
greater the consideration for advanced scheduling (e.g., airline
travel vs. the local metro - note that UvA has created a token-based
ticketing system). Some communities where resources are ubiquitous do
not want to consider scheduling at all. Also, there are different
design approaches, such as chained authorization vs. simultaneously
pushing or pulling credentials across domains. (There are many other
issues as well.)
These types of issues have
slowed progress toward an actual prototype implementation. I suggest
that during your proposed
call, the participants agree to design and implement "a prototype"
(vs perhaps the ultimate
prototype) by agreeing on some of these basic concepts, as the IETF
says "rough consensus and
running code."
Thanks.
==============Original message text===============
On Sun, 06 May 2007 8:18:59 am CDT Gigi Karmous-Edwards wrote:
All,
I forgot to mention one more thing: As was discussed in the meeting in
February, both strategies can co-exist. We drew this up on the
whiteboard the first day and then decided not to have it initially as
part of the architecture. Those who were present may remember that we
drew two separate network domain clouds, DNRM-A and DNRM-B (DNRM =
Domain Network Resource Manager). We then discussed that if the two
had an agreement with each other, such as the inter-domain DRAGON
testbed, we could introduce another DNRM-AB (one cloud that
encapsulates the two smaller ones) for advertising and therefore
configuring. In that case, if a user request comes in that requires a
lightpath across domains A and B, the RB on behalf of the user can
make a single request to DNRM-AB. Let me
know what the community's thoughts are ....
Kind regards,
Gigi
--------------------------------------------
Gigi Karmous-Edwards
Principal Scientist
Advanced Technology Group
http://www.mcnc.org
MCNC
RTP, NC, USA
+1 919-248-4121
gigi@xxxxxxxx
--------------------------------------------
Gigi Karmous-Edwards wrote:
Hi Jerry and All,
Ok Jerry, I stuck with you on your insightful email ( I started your
email a couple of weeks ago and just finished it this morning :-) ).
If I can summarize your assertions: when an interdomain lightpath is
requested, the resource broker (RB) (which is a servant of a user
rather than a domain) talks only to the first domain's NRM (network
resource manager), and then that NRM talks to the second NRM, and so
on until the destination. This requires each domain to have
established some sort of agreement with all adjacent domains. In your
second scenario, it seems the user requests a source RM that is not
in the RB's "domain", so the RB has to forward the request to the
right RM, after which the above process repeats.
I think what you described is the ultimate goal of the community,
however, due to complexities of the current infrastructures (NRENs,
Research testbeds, Global government networks, etc) that require
interoperation, it seems that we first need to take small "baby
steps". Existing infrastructures include a variety of technologies,
with different management tools (TL1, SNMP, CLI, etc.) and control
plane tools (very few deployments of GMPLS) for configuration and
fault management; current procedures for information exchange between
network domains also range from protocols to phone calls and emails.
These complexities and other "policy"-related challenges force us to
break the problem up into smaller functional blocks. I think the
framework presented will give us a path forward, based on "baby
steps", to finally reach the scenario you describe.
I see the problem as having three key challenges:
1) Information dissemination (where is what resource? what are its
characteristics? what are its policies for use?)
2) Capability to request reservations on resources globally once
discovered (standard interfaces to query resource managers, with "NO"
restrictions on how each resource manager accommodates each request;
reuse of existing implementations)
3) Scalability (division of labor among functional components and
responsibilities per domain)
The assumption in the framework sent out has been that an RB takes
requests from a particular domain's user/application but behaves as a
servant of the domain not a single user. In this case there will be
several RBs worldwide, but not one for each user, rather one or two
per domain. It is assumed that knowledge of the different resources
globally will be published per domain in a very distributed fashion
(each RB will publish the resources and their characteristics,
hopefully using the schema from the OGF Network Markup Language
working group). A query from one RB to the "distributed GLIF
resources" will use a type of crawl mechanism to match the requested
resources against the "published" resource information that each
domain RB publishes on behalf of its RMs. The assumption is that the
information published by the RBs is not static and will be updated by
each RB when necessary. This email is already getting too long; I
suggest that we have a conference call and use a web-based slide
sharing application to go through some scenarios. Any interest?
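The publish/crawl model above can be sketched in a few lines of code. This is purely illustrative: the class and method names (ResourceBroker, publish, crawl) and the plain-dict resource descriptions are assumptions, standing in for whatever the OGF NML schema and a real discovery protocol would provide.

```python
# Hypothetical sketch of the per-domain publish/crawl model: one RB per
# domain publishes its RMs' resources, and a query crawls peer RBs to
# match requested resources against the published information.

class ResourceBroker:
    """One (or two) per domain; publishes resources and crawls peers."""

    def __init__(self, domain, peers=None):
        self.domain = domain
        self.peers = peers or []   # other domains' RBs this RB knows of
        self.published = []        # resource descriptions for this domain

    def publish(self, resource):
        # In practice this description would follow the OGF NML schema
        # rather than a plain dict; it is also expected to be updated
        # by the RB when the underlying RMs change state.
        self.published.append(resource)

    def crawl(self, wanted, seen=None):
        """Match a request against our own and our peers' published info."""
        seen = seen or set()
        if self.domain in seen:          # avoid re-visiting a domain
            return []
        seen.add(self.domain)
        matches = [r for r in self.published
                   if all(r.get(k) == v for k, v in wanted.items())]
        for peer in self.peers:
            matches += peer.crawl(wanted, seen)
        return matches

rb_a = ResourceBroker("A")
rb_b = ResourceBroker("B", peers=[rb_a])
rb_a.publish({"type": "lightpath", "endpoint": "A.edu", "capacity_gbps": 10})
print(rb_b.crawl({"type": "lightpath"}))
```

A query issued at RB-B finds the lightpath published by RB-A, even though only RB-B was asked; the `seen` set keeps the crawl from looping when domains peer with each other.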
To summarize, the strategy in your email will be the goal of the
community, but it will take a while. I think that, as a community, we
can start to develop standard interfaces for the various RMs, such as
the Generic Network Interface (GNI); this will help us towards
interoperability in today's environment.
Please let me know whether we should have a GLIF control plane
conference call in the next few weeks.
Kind regards,
Gigi
--------------------------------------------
Gigi Karmous-Edwards
Principal Scientist
Advanced Technology Group
http://www.mcnc.org
MCNC
RTP, NC, USA
+1 919-248-4121
gigi@xxxxxxxx
--------------------------------------------
Jerry Sobieski wrote:
Good comments both Steve and Bert...let me chime in: (this is a
bit long, but I think it is relevant)
I too think the reservation phase in each domain must be atomic -
there are effective ways to do this. The overall process though
becomes two-phase: HOLD a resource for some finite holding time and
provide an ACK to the requestor. At some later time the RM will
receive a CONFIRM from the requestor, or a RELEASE. If the hold time
expires, the resource is released unilaterally. On a macro basis, the
reservation of the entire end-to-end lightpath must also be held in
the HOLD state while the rest of the application resources are
reserved, as there may be a dependency between the availability of
non-network resources and the reserved lightpath.
As Steve suggests, this atomic two-phase mechanism is used in many
other similar reservation systems.
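The HOLD/CONFIRM/RELEASE cycle with a unilateral expiry can be sketched as a small state machine. All names here are illustrative assumptions, not any GLIF interface; the `now` parameter is injected only to make the expiry behavior easy to demonstrate deterministically.

```python
import uuid

# Minimal sketch of the two-phase reservation described above:
# HOLD returns an ACK (a ticket) with a finite hold time; the requestor
# must CONFIRM before expiry, or RELEASE; otherwise the RM releases the
# resource unilaterally when the hold time runs out.

class ResourceManager:
    def __init__(self, hold_seconds=30.0):
        self.hold_seconds = hold_seconds
        self.holds = {}       # ticket -> (resource, expiry time)
        self.confirmed = {}   # ticket -> resource

    def hold(self, resource, now):
        ticket = str(uuid.uuid4())
        self.holds[ticket] = (resource, now + self.hold_seconds)
        return ticket                      # the ACK to the requestor

    def confirm(self, ticket, now):
        resource, expiry = self.holds.pop(ticket, (None, 0.0))
        if resource is None or now > expiry:
            return False                   # hold expired: released unilaterally
        self.confirmed[ticket] = resource  # reservation locked in
        return True

    def release(self, ticket):
        self.holds.pop(ticket, None)
        self.confirmed.pop(ticket, None)

rm = ResourceManager(hold_seconds=30.0)
t1 = rm.hold("lambda-42", now=0.0)
assert rm.confirm(t1, now=10.0)        # confirmed within the hold window
t2 = rm.hold("lambda-43", now=0.0)
assert not rm.confirm(t2, now=60.0)    # hold expired; reservation lost
```

On a macro basis, an end-to-end lightpath would hold `hold()` results in every domain along the path, and only issue the `confirm()` calls once all the application's other resources have been held as well.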
The issue I am concerned about is the roles of the RB and RM. I think
the RBs will be numerous - possibly one for every user. I believe
we must assume that all networks will default to a stringent
"self-secure" stance and will only allow access to their RM from
known and trusted peers. It doesn't scale for every network to "know"
about every other RB in the world (RBs are agents of the user, not of
the network). Therefore, for scalability and security reasons, these
resource reservation requests must be made between directly peering
networks, and each network is responsible for recursively reserving
the resources forward toward the destination. This is still a
two-stage commit as described above, but it solves two problems: a)
it scales much better, as each network only needs to expect queries
from its direct peers (and customers), and b) it allows each network
to negotiate aggregation policies with its peers for services
(enabling economies of scale and global reach). This is not unlike
how we place a phone call to anywhere in the world - we don't go
asking each network if we can use it; we ask our service provider,
they ask theirs, and so on.
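The recursive, peer-to-peer reservation chain described above can be sketched as follows. The class name, the static next-hop table, and the ticket strings are assumptions for illustration; a real RM would consult its reachability information and perform the two-phase HOLD under the hood.

```python
# Sketch of the chained model: each network RM reserves its own segment,
# then recursively asks its next-hop peer toward the destination. The RB
# only ever talks to its local RM; a failure anywhere propagates back as
# a NACK (None here).

class NetworkRM:
    def __init__(self, name, nexthop=None):
        self.name = name
        self.nexthop = nexthop or {}   # destination domain -> peering RM

    def reserve(self, dest):
        """Reserve a path toward dest; return a ticket chain, or None (NACK)."""
        ticket = f"{self.name}:hold"   # stand-in for a real HOLD ticket
        if dest == self.name:
            return [ticket]            # we are the terminating domain
        peer = self.nexthop.get(dest)
        if peer is None:
            return None                # no known route: NACK
        rest = peer.reserve(dest)      # recursive reservation forward
        if rest is None:
            return None                # downstream failure propagates back
        return [ticket] + rest

# Three networks peering in a chain: A - B - C.
rm_c = NetworkRM("C")
rm_b = NetworkRM("B", {"C": rm_c})
rm_a = NetworkRM("A", {"C": rm_b})
print(rm_a.reserve("C"))   # the RB asks only rm_a for the whole path
```

Note how this matches the scaling argument: `rm_a` knows only `rm_b`, never `rm_c`, yet the end-to-end chain of holds comes back to the single RM the RB queried.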
The above scenario assumes the RB poses the service request to the RM
serving the source end of a path. There is a [common?] case where
the RB is not at the endpoint(s) and does not know of any RMs at the
endpoint (or in the middle, for that matter). This brings us to
another assumption I think we must make: an RB only knows its *local*
network RM. An appropriately designed algorithm could forward the
request to the source-address RM using the same forwarding process as
the reservation (but against the grain, toward the source), and then
the request can be serviced forward normally as described above.
(This is the "third party" provisioning scenario.)
An alternative model assumes a "minion" agent at the path endpoints
that is owned by the end user and knows of its local RM - the minion
agent acts as a proxy for the RB and makes the reservation request to
the minion's RM. (Got that? :-) I think we *can* assume that the RB
knows of these minions, since they reside at the endpoints (source or
destination) at a well-known port.
It is important to note that this process relies on each network RM
(not the RB) knowing constrained reachability of all endpoints - not
unlike current interdomain routing protocols. This allows the RM to
postulate which "nexthop" network will provide the best path and try
that first. If the RM knows more than just reachability - i.e., if it
knows topology - then the RM can select a more specific candidate
path and, via authorized recursive queries, can reserve the
resource. Only the RM responsible for a network knows the state and
availability details associated with the internal network resources,
and therefore only the local RM can authoritatively and atomically
reserve the resources in that network.
The beauty of this process is that from the RB perspective, the RB
need only ask one RM for the entire end-to-end network path. The RM
will either return a ticket indicating a path was successfully
reserved that meets the requested service characteristics, or a NACK
indicating that the resource was not available for some reason. The
user must change the requested service parameters somehow before
trying again - i.e., change the source or destination address, the
start time, the capacity, etc.
As Gigi states, once all application resources are reserved in the
HOLD state, then all must be CONFIRM'ed which will lock in the
reservation.
At some delta-t later (which could be 0) there is a separate process
that causes the reconfiguration of the network elements to make the
reserved resources available for actual use (i.e. the provisioning or
signaling process). This process must be correlated to a previous
reservation, and so the provisioning request (separate from the
reservation request) must contain some indicator that is trusted by
the network and indicates which reservation is being placed into
service (see Leon's work on AAA).
Note that none of the above is predicated on any particular routing
or signaling protocol... That being said (:-), DRAGON has
implemented much of this functionality using GMPLS protocols.
- The DRAGON Network Aware Resource Broker (NARB) is analogous to the
network RM and performs the path computation, recursively reserving
the resources along the way. It returns a path reservation in the
form of an Explicit Route Object (ERO) to the source requestor. This
loose-hop ERO specifies a path consisting of ingress and egress
points at each network boundary.
- RSVP then uses this ERO to provision the multi-domain end-to-end
path.
- The DRAGON Application Specific Topology "Master" is an agent
analogous to the RB mentioned above. The AST Master queries all the
various resource managers (compute nodes, storage, instruments,
network, etc.) to reserve groups of dependent resources. There is a
significant protocol exchange defined for ASTs to construct a
workable physical resource grid for the application.
What DRAGON has not yet implemented: we have implemented scheduling
and policy constraints in the traffic engineering database, but we
have not yet implemented the path computation to use those
constraints (this will be coming soon).
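To make the loose-hop ERO idea concrete, here is a toy representation of the kind of object the NARB hands back. This is not DRAGON's actual data structure or the RSVP-TE wire format; the field names and addresses are invented for illustration only.

```python
# Toy illustration of a loose-hop ERO: an ordered list of ingress/egress
# points at each network boundary. The signaling layer (RSVP, in
# DRAGON's case) walks these hops in order and expands each loose hop
# into a concrete intra-domain path during provisioning.

from dataclasses import dataclass

@dataclass
class LooseHop:
    domain: str
    ingress: str   # ingress point at the domain boundary
    egress: str    # egress point toward the next domain

# A three-domain end-to-end path, as the NARB might return it.
ero = [
    LooseHop("A", "a-ingress", "a-egress"),
    LooseHop("B", "b-ingress", "b-egress"),
    LooseHop("C", "c-ingress", "c-egress"),
]

def provision(ero):
    """Walk the loose hops in order, as the signaling process would."""
    return [(hop.domain, hop.ingress, hop.egress) for hop in ero]

print(provision(ero))
```

The point of the loose-hop form is exactly the division of authority described earlier: the ERO fixes only the inter-domain boundary points, leaving each domain's RM free to choose the internal path through its own resources.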
We have atomic reservations, but have not implemented the two-phase
commit - though we have long recognized it as critical to the
book-ahead capability and a robust integrated resource scheduling
process.
Thanks for sticking with me on this ...:-)
Jerry
===========End of original message text===========
--
http://www.science.uva.nl/~delaat/