Legion: Lessons Learned Building a Grid Operating System

Publication date: 31.01.2016

Andrew S. Grimshaw

Anand Natrajan

Department of Computer Science

University of Virginia


Legion was the first integrated grid middleware architected from first principles to address the complexity of grid environments. Just as a traditional operating system provides an abstract interface to the underlying physical resources of a machine, Legion was designed to provide a powerful virtual machine interface layered over the distributed, heterogeneous, autonomous and fault-prone physical and logical resources that constitute a grid. We believe that without a solid, integrated, operating system-like grid middleware, grids will fail to cross the chasm from bleeding-edge supercomputing users to more mainstream computing. This paper provides an overview of the architectural principles that drove Legion, a high-level description of the system with complete references to more detailed explanations, and the history of Legion from first inception in August of 1993 through commercialization. We present a number of important lessons, both technical and sociological, learned during the course of developing and deploying Legion.


Grids (once called Metasystems [20-23]) are collections of interconnected resources harnessed together in order to satisfy various needs of users [24, 25]. The resources may be administered by different organizations and may be distributed, heterogeneous and fault-prone. The manner in which users interact with these resources as well as the usage policies for the resources may vary widely. A grid infrastructure must manage this complexity so that users can interact with resources as easily and smoothly as possible.

Our definition, and indeed a popular definition, is: A grid system, also called a grid, gathers resources – desktop and hand-held hosts, devices with embedded processing resources such as digital cameras and phones or tera-scale supercomputers – and makes them accessible to users and applications in order to reduce overhead and accelerate projects. A grid application can be defined as an application that operates in a grid environment or is "on" a grid system. Grid system software (or middleware) is software that facilitates writing grid applications and manages the underlying grid infrastructure. The resources in a grid typically share at least some of the following characteristics:

  • They are numerous.

  • They are owned and managed by different, potentially mutually-distrustful organizations and individuals.

  • They are potentially faulty.

  • They have different security requirements and policies.

  • They are heterogeneous, e.g., they have different CPU architectures, are running different operating systems, and have different amounts of memory and disk.

  • They are connected by heterogeneous, multi-level networks.

  • They have different resource management policies.

  • They are likely to be geographically-separated (on a campus, in an enterprise, on a continent).

The above definitions of a grid and a grid infrastructure are necessarily general. What constitutes a "resource" is a deep question, and the actions performed by a user on a resource can vary widely. For example, a traditional definition of a resource has been "machine", or more specifically "CPU cycles on a machine". The actions users perform on such a resource can be "running a job", "checking availability in terms of load", and so on. These definitions and actions are legitimate, but limiting. Today, resources can be as diverse as "biotechnology application", "stock market database" and "wide-angle telescope", with actions being "run if license is available", "join with user profiles" and "procure data from specified sector" respectively. A grid can encompass all such resources and user actions. Therefore a grid infrastructure must be designed to accommodate these varieties of resources and actions without compromising on some basic principles such as ease of use, security, autonomy, etc.
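One way to picture this broad notion of a resource is as a named kind paired with the set of actions users may perform on it. The following is a minimal sketch under that assumption, not Legion's actual object model; the `Resource` class, the `perform` method and the action names are hypothetical and chosen only to mirror the examples in the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Resource:
    """A grid resource: a named kind plus the actions users may invoke on it."""
    kind: str
    actions: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def perform(self, action: str, *args: str) -> str:
        # Reject actions the resource does not expose instead of guessing.
        if action not in self.actions:
            raise ValueError(f"{self.kind!r} does not support {action!r}")
        return self.actions[action](*args)


# Hypothetical resources mirroring the examples in the text.
machine = Resource("machine", {
    "run_job": lambda job: f"running {job}",
    "check_load": lambda: "load: unknown",
})
telescope = Resource("wide-angle telescope", {
    "procure_data": lambda sector: f"data from sector {sector}",
})

print(machine.perform("run_job", "simulation"))
print(telescope.perform("procure_data", "A7"))
```

The same interface serves a CPU and a telescope, which is the point of the passage: the infrastructure should not need to know in advance what kinds of resources and actions it will host.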

A grid enables users to collaborate securely by sharing processing, applications and data across systems with the above characteristics, which in turn yields faster application execution and easier access to data. More concretely, this means being able to:

  • Find and share data. Access to remote data should be as simple as access to local data. Incidental system boundaries should be invisible to users who have been granted legitimate access.

  • Find and share applications. Many development, engineering and research efforts consist of custom applications – permanent or experimental, new or legacy, public-domain or proprietary – each with its own requirements. Users should be able to share applications with their own data sets.

  • Find and share computing resources. Providers should be able to grant access to their computing cycles to users who need them without compromising the rest of the network.

This paper describes one of the major Grid projects of the last decade – Legion – from its roots as an academic Grid project to its current status as the only commercial complete Grid offering [3, 5, 6, 8-11, 14, 17-19, 22, 23, 26-29, 31-53].

Legion builds on decades of research in distributed and object-oriented systems, and borrows many, if not most, of its concepts from the literature [54-88]. Rather than re-invent the wheel, the Legion team sought to combine solutions and ideas from a variety of different projects such as Eden/Emerald [54, 59, 61, 89], Clouds [73], AFS [78], Coda [90], CHOICES [91], PLITS [69], Locus [82, 87] and many others. What differentiates Legion from its progenitors is the scope and scale of its vision. While most previous projects focused on a particular aspect of distributed systems such as distributed file systems, fault-tolerance, or heterogeneity management, the Legion team strove to build a complete system that addressed all of the significant challenges presented by a grid environment. To do less would mean that the end-user and applications developer would need to deal with the problem. In a sense, Legion was modeled after the power grid system – the underlying infrastructure manages all the complexity of power generation, distribution, transmission and fault-management so that end-users can focus on issues more relevant to them, such as which appliance to plug in and how long to use it. Similarly, Legion was designed to operate on a massive scale, across wide-area networks, and between mutually-distrustful administrative domains, while most earlier distributed systems focused on the local area, typically a single administrative domain.

Beyond merely expanding the scale and scope of the vision for distributed systems, Legion contributed technically in a range of areas as diverse as resource scheduling and high-performance I/O. Three of the more significant technical contributions were 1) the extension of the traditional event model to ExoEvents [13], 2) the naming and binding scheme that supports both flexible semantics and lazy cache coherence [11], and 3) a novel security model [16] that started with the premise that there is no trusted third party.

What differentiates Legion first and foremost from contemporary Grid projects such as Globus [92-99] is that Legion was designed and engineered from first principles to meet a set of articulated requirements, and that Legion focused from the beginning on ease-of-use and extensibility. The Legion architecture and implementation were the result of a software engineering process that followed the usual form of:

  1. Develop and document requirements.

  2. Design and document solution.

  3. Test design on paper against all requirements and interactions of requirements.

  4. Repeat 1-3 until there exists a mapping from all requirements onto the architecture and design.

  5. Build and document 1.0 version of the implementation.

  6. Test against application use cases.

  7. Modify design and implementation based on test results and new user requirements.

  8. Repeat steps 6-7.
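The exit condition in step 4, a mapping from every requirement onto the design, can be sketched as a simple coverage check. This is an illustration only and not a tool the Legion team describes; the function `unmapped_requirements` and the requirement and component names below are hypothetical.

```python
from typing import Dict, List, Set


def unmapped_requirements(requirements: Set[str],
                          design: Dict[str, List[str]]) -> Set[str]:
    """Return requirements that no design component claims to address."""
    covered = {req for reqs in design.values() for req in reqs}
    return requirements - covered


# Hypothetical requirement and component names, for illustration only.
requirements = {"security", "ease of use", "autonomy"}
design = {
    "naming and binding": ["ease of use"],
    "security model": ["security"],
}

# A non-empty result means steps 1-3 must be repeated.
print(sorted(unmapped_requirements(requirements, design)))  # → ['autonomy']
```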

This is in contrast to the approach used in other projects of starting with some basic functionality, seeing how it works, adding/removing functionality, and iterating towards a solution.

Secondly, Legion focused from the very beginning on the end-user experience via the provisioning of a transparent, reflective, abstract virtual machine that could be readily extended to support different application requirements. In contrast, the Globus approach was to provide a basic set of tools to enable the user to write grid applications, and manage the underlying tools explicitly.

The remainder of this paper is organized as follows. We begin with a discussion of the fundamental requirements for any complete Grid architecture; these requirements continue to guide the evolution of our Grid software. We then present some of the principles and philosophy underlying the design of Legion. Next, we introduce some of the architectural features of Legion and delve slightly deeper into implementation in order to give an understanding of grids and Legion. Detailed technical descriptions exist elsewhere in the literature and are cited. We follow with a brief history of Legion and its transformation into a commercial grid product, Avaki 2.5, and then present the major lessons, not all technical, learned during the course of the project. We conclude with a few observations on trends in grid computing.

Keep in mind that the objective here is not to provide a detailed description of Legion, but to provide a perspective with complete references to papers that provide much more detail.


Legion: Lessons Learned Building a Grid Operating System iconLearned languages: English, French, Spanish, and Italian

Legion: Lessons Learned Building a Grid Operating System iconGeneral Surgery and the Digestive System

Legion: Lessons Learned Building a Grid Operating System iconModo nids (Network Intrusion Detection System)

Legion: Lessons Learned Building a Grid Operating System iconScience is progressing at a rapid pace. We have learned more in the...

Legion: Lessons Learned Building a Grid Operating System iconIdentifying institutional relationships in a geographically distributed...

Legion: Lessons Learned Building a Grid Operating System iconPlan de manejo organic / organic system plan

Todos los derechos reservados. Copyright © 2019