The following text is an English translation of part of the PIRATES project proposal. For more complete information, please see the original proposal.

PIRATES: Methods and Tools for Network-transparent Dependable Distributed Programming
------------------------------------------------------

Peter Van Roy
Department of Computing Science and Engineering
Université catholique de Louvain
Place Sainte-Barbe, 2
B-1348 Louvain-la-Neuve
Tel: (+32) (10) 47 83 74
Fax: (+32) (10) 45 03 45
Email: pvr@info.ucl.ac.be

Abstract
--------

The objective of PIRATES is to create a platform for distributed computing that is transparent, dependable (i.e., fault tolerant and secure), efficient, and open. A principal cause of the difficulty of distributed programming is that existing platforms are all based around a centralized kernel. The programmer has to manage explicitly the distribution, the details of fault tolerance, the details of the security model, and the details of the network protocol used. Many levels must therefore be managed simultaneously. This will become more and more difficult as the number of computers and networks of computers continues to grow exponentially. In the future, the lone computer will disappear completely, to be replaced by a network of networks.

The approach of PIRATES is to start from a transparent view of the network, not from a centralized kernel. We say a language is transparent for distribution (i.e., network-transparent) if the behavior of a program does not depend on how it is distributed across sites. The multiplicity of models (distribution, fault tolerance, security, network protocol) will be replaced by a single model. This simplification has many advantages: one can develop and validate an application on a single site, there are fewer possibilities for errors, productivity is increased, and the dependencies between different parts of a project are reduced.
Nevertheless, a practical implementation cannot completely hide the different models. One has to keep control over the network communications and over resources. The approach of PIRATES is therefore to separate this control from the rest. In an application, the bulk of the code does the work of the application itself. This is augmented with small specifications of the distribution, fault tolerance, security, and network layer. The behavior of the application code will not depend on these specifications.

In order to maximize results with a minimum of means, PIRATES is divided into two parts: a middleware part and an applications part. In the applications part, PIRATES will develop two completely different applications: "Mobile Groups" and "Software on Demand". These applications will serve to validate and exploit the middleware.

The term middleware refers to a portable layer between the operating system and the applications. A particular implementation of such a layer is called a platform. To give the project the best possible start, the middleware part will be based on Distributed Oz, a recently developed platform for transparent distributed programming. In the middleware part, PIRATES will supply four modules: support for fault tolerance, support for security, support for new network technologies (multicast and high-bandwidth networking), and support for open applications (services and interoperability). The four modules will be integrated with a maximum of transparency.

The application "Software on Demand" will be developed in collaboration with Prof. Axel van Lamsweerde.

PIRATES is planned over three years. The first year will be used for the design and first prototypes. The second and third years will be used for development, and therefore need more personnel. Each team member requires a workstation equipped for high-performance networking and processing of video/audio.

Introduction
------------

...
The introduction of a higher-level language does not diminish interoperability, contrary to what one may think. The interoperability of a system is the capacity of the system to connect to other systems. The new language offers possibilities that can only be expressed with difficulty in existing languages. Similarly, the ability to program in C does not exclude programming in assembly language. The world of C is both more expressive and more portable than the world of assembly language. But of course, a program may contain parts written in both C and assembly language.

...

Parallelism is not an explicit goal of PIRATES. It is true that a parallel execution can be expressed in PIRATES. Nevertheless, given the rapid increase of hardware speeds at constant price, we are convinced that the general questions of geographic distribution are more pertinent than the sole question of increasing the execution speed of some applications. [I do not mention the possibility of parallelizing constraint programs; although it will be done in the larger context of the collaboration with SICS and DFKI, it is outside the scope of PIRATES.]

...

State of the art
----------------

...

DCE and CORBA: These systems do not easily support the exchange of complex data (for example, of variable size) and code (procedures) between the address spaces of the processes that are linked together. Furthermore, it is not possible to pass pointers between the address spaces. The heart of the problem is that distribution is implemented as a set of services added to a centralized kernel. These services often have a poorer model of communication between them than is possible within the kernel.

...

Ericsson Open Telecom Platform (OTP): The Ericsson OTP is currently one of the most sophisticated platforms for building large concurrent programs. The Swedish company Ericsson has extensive experience in the construction of concurrent and distributed systems.
Their success in the area of mobile telephony is testimony to this fact. A revolution in the development of telephony software has recently taken place at Ericsson. The use of the well-known C++ language has been largely abandoned because of the insurmountable problems to which it led. Large distributed systems at Ericsson are now built with OTP, based on the Erlang language. One thousand programmers who formerly used C++ are now using Erlang. OTP supplies a number of modules that permit the rapid construction of concurrent programs that are both distributed and robust. A concurrent program contains multiple entities that can execute simultaneously. Recent Ericsson products (like the Mobility Server) contain Erlang programs of 300,000 to 400,000 lines, equivalent to 1 million lines of C++. Erlang is a high-level language with an abstract store, threads, first-class procedures, and support for protecting itself against external perturbations. Erlang makes a large number of C++ errors impossible (memory leaks, dangling pointers, etc.). The experience of Ericsson shows that it is possible to build large concurrent programs by separating the sequential parts (which make up 90% of the program) from the concurrent parts (which make up the remaining 10%).

The ideas of Erlang are a source of inspiration for PIRATES. What Erlang does for concurrency, we hope to do for distribution. Despite its advances over C++, Erlang is not without faults. In particular, distribution is still explicit. Many network errors are not avoided. With its transparency and mobility control, Distributed Oz solves this problem. The developers of Erlang themselves consider Distributed Oz to be a promising direction for the future of distributed programming.

...

Research objectives
-------------------

PIRATES has three classes of objectives:

1. To design and implement a platform for distributed programming that is transparent, robust, secure, efficient, and open.

2. To validate this platform by designing and implementing two applications. The applications were chosen for their intrinsic merit and will make possible a true validation of the platform.

3. To devise methods and tools for distributed programming as it is done in PIRATES, and to disseminate the project results in the international scientific community.

To realize these objectives, the project is divided into two major parts: a middleware part and an applications part. The cooperation between these two parts is essential to the success of the project:

1. There will be two distributed applications. In order for an application of realistic size to be implemented with the limited means of the project, the support provided by the middleware must be significant. In particular, the goal is to add fault tolerance and security to the applications with a minimum of effort. Any problem that, upon closer examination, turns out to be a problem of the middleware will be transferred to the other part of the project and solved there.

2. There will be a middleware layer to supply the base level of functionality for applications. The needs of application developers will be the main guide for adding functionality to the middleware. The same middleware layer will serve the two applications of the project, which have been chosen to be quite different from each other.

Rather than designing a new operating system (which is a major undertaking, without hope of commercial success if not supported by hardware and software manufacturers), we prefer to add a layer on top of existing systems, without modifying these systems. The middleware part offers a number of services that a standard operating system, such as Solaris or Windows 95, does not offer. In our case, these services are related to distribution. Existing operating systems do not provide these services simply because they were designed at a time when distributed applications were relatively rare.
It is important that the two parts (middleware and applications) be attacked independently. This is because the two parts require two ways of thinking that are difficult to combine in the same people. In the applications part, the development must start from the needs of the users. In the middleware part, development must be based on a well-defined model of distributed execution. This must maintain the right level of theoretical rigor. In summary, the organization of the project must take into account its dual character.

Middleware
----------

...

The fault tolerance and security modules both address the problem of external perturbations. These perturbations can be benign (site crashes) or malicious (deliberate attacks); it is sometimes difficult to distinguish the two (for example, in a denial of service attack). Since the two modules respond to different kinds of perturbations, they are complementary, and it is an advantage to consider them together in the same project.

...

Applications
------------

Mobile Groups
-------------

A user's computing environment is invariably fixed on a single site. If the user changes site, his environment does not follow him: he loses his files, his links with other users, and so forth. Generally, he must change identification and password. Work has been done to solve this problem. For example, "Virtual Places" is an application developed by America Online. This work is most often based on the World-Wide Web as a support (using HTTP, Java applets, and Netscape plug-ins). Even though the use of Java is an advance over C++, many important questions remain unanswered. For example, solutions for mobility, robustness, and even security are still nonexistent or quite primitive. Sun describes Java as "Network Savvy". This does not mean much more than that Java offers easy network access via common protocols such as HTTP and techniques such as Remote Method Invocation (RMI) on a remote object.
Distribution in Java is far from being transparent and robust.

The goal of the application "Mobile Groups" is to transform fixed physical workspaces (like offices in a work environment) into logical spaces that can expand and move dynamically. While being as easy to use as a telephone, the application will give multiple opportunities for collaborative work such as videoconferencing and shared tools. Existing applications can be "connected" to Mobile Groups to augment their mobility and robustness. A first use of Mobile Groups will consist of adding mobility to a simple videoconferencing tool.

The added value of Mobile Groups, relative to other applications such as Virtual Places or domains such as videoconferencing, originates in the middleware on which it is built: mobility, fault tolerance, security, and efficient utilization of the network.

Mobile Groups will be constructed as an open application, that is, it will be easy to connect existing applications to it. Each of these applications will keep the abilities offered by the platform in which it is written. For example, an application written in PIRATES will have maximal mobility and robustness. A Unix program will generally be less mobile and less robust. Connecting it to Mobile Groups will make it possible to fix some of its deficiencies. For example, a word processor can be made to act as if it were a true distributed application. Part of the development effort of Mobile Groups will consist of classifying different ways to fix these deficiencies and implementing interfaces for them.

...

Links with other projects in the department
-------------------------------------------

TELESUN: A World-Wide Multimedia Teleteaching System for Universities
BIZNET: Opening and Exploiting the Internet for Business
FRISCO: Formal Reasoning in Software Construction
IGLOO: Institute of Software Engineering at Charleroi

The Crypto group in the department of Electricity has expertise in the area of security and electronic commerce.
This group is led by Profs. Jean-Jacques Quisquater and Benoit Macq.

...

Workplan
--------

Three-year duration, to begin in autumn 1997.

One of the principal goals of PIRATES is network-transparency. This will permit the application development to start before the middleware part is completely implemented. One of the advantages of network-transparency is to reduce the dependencies between different parts of a big project. We intend to exploit this property as much as possible.

...
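
The property the workplan relies on (application code whose behavior does not depend on where its parts run) can be illustrated with a minimal sketch. It is written in Python rather than Distributed Oz, and every name in it (Counter, LocalCounter, SimulatedRemoteCounter, application) is hypothetical and not part of PIRATES. The "remote" variant only simulates a site boundary by serializing values; no real network or middleware is involved.

```python
import pickle
from typing import Protocol


class Counter(Protocol):
    """Interface the application code is written against (hypothetical)."""

    def increment(self, amount: int) -> int: ...


class LocalCounter:
    """Single-site stand-in: state lives in the caller's process."""

    def __init__(self) -> None:
        self._value = 0

    def increment(self, amount: int) -> int:
        self._value += amount
        return self._value


class SimulatedRemoteCounter:
    """Pretends the counter lives on another site.

    Arguments and results are round-tripped through pickle to mimic
    crossing a site boundary. This is an assumption made for the
    sketch, not a description of any real middleware.
    """

    def __init__(self) -> None:
        self._target = LocalCounter()

    def increment(self, amount: int) -> int:
        request = pickle.dumps(amount)
        reply = pickle.dumps(self._target.increment(pickle.loads(request)))
        return pickle.loads(reply)


def application(counter: Counter) -> list[int]:
    """Application code: it never asks where the counter is placed."""
    return [counter.increment(step) for step in (1, 2, 3)]


# The same application code runs against either placement.
local_result = application(LocalCounter())
remote_result = application(SimulatedRemoteCounter())
```

In this sketch the application function could be developed and tested against LocalCounter first and later run unchanged against a distributed implementation, which is the sense in which application development need not wait for the middleware.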