COMPONENT STRATEGIES – TRANSITIONING TO COMPONENT TECHNOLOGY AT COPELAND ASSOCIATES

The Copeland Companies is a premier provider of retirement planning products, services, and support. Copeland specializes in assisting not-for-profit organizations in the healthcare, government, and educational markets, as well as for-profit employers with a variety of defined contribution plans. Plans include 403(b) Tax Sheltered Annuity programs, 401(a) plans, 457(b) Deferred Compensation programs, and 401(k) plans. Copeland is focused on providing single source access to multiple products from different financial providers. This places a complex set of technical demands on its IT staff. Copeland distinguishes itself by offering both diversification and extensive personal counseling. Providing these distinguishing offerings requires robust and complex software systems.

The Copeland Companies are wholly-owned subsidiaries of Travelers Group Inc., one of the nation’s largest diversified financial services companies. As of June 1997, Copeland Associates, Inc. services retirement savings programs through more than 7,000 employers with approximately 940,000 active participants who have accumulated almost $18.3 billion toward their retirement. In addition to supporting the record keeping and administration associated with the Copeland products, Copeland IT staff has recently been tasked with developing applications to manage Travelers-only products. In the case of Travelers-only products, administration applications are executed at Copeland’s call center facility. These applications require access to the system of record located at Travelers in Connecticut. This arrangement has placed an additional set of complex requirements in the lap of Copeland’s IT organization.

Historically, Copeland Associates has managed its participants’ plan data using DB2 running on the AS400. Participants could access plan information or make changes to their plan via a Voice Response Unit (VRU) or by speaking directly with call center representatives. The call center representatives would run AS400 applications developed with RPG. Long development cycles, high maintenance costs, and difficulty in finding and retaining AS400 and RPG talent made Copeland’s decision to migrate away from the proprietary environment an easy choice. Copeland’s senior IT management and technical architects realized that the IT organization should plan an orderly, long-term transition to component based development and open systems. They hoped that component development and open systems would provide a new development platform for the organization. UML, Java, JDBC, CORBA, and NT would all play a role in the new platform.

Transition Overview

Copeland would not reinvent itself overnight, but over many years. The process would take time, but with proper planning, Copeland would achieve interim deliverables and long-term benefits. Copeland employed Genesis Development Corporation to help them plan their transition to this new computing platform. The approach would be based on Genesis’ SureTrackTM transition methodology. The transition would include multiple projects and multiple phases. The initial phase took roughly one month and included a technical, organizational, and business assessment. The assessment identified existing skills and critical systems within the organization. Key business goals and objectives were also identified. Once this information was gathered, a high-level transition plan was put in place.
The plan identified several pilot projects. Delivery of these projects would increase the component skill sets of individual developers and business analysts. Project managers would become intimately familiar with the development lifecycle of component-based projects. The Copeland organization would make improvements to critical processes such as requirements gathering and quality assurance. Within six months of initiation, tangible business benefits would result from the deployment of critical applications. This would ensure management commitment and help the transition move past the challenges that would periodically result from attempting to develop applications using new technologies and approaches.

Over the first 18 months of the transition, five projects have come on-line. These projects have achieved significant business value and helped the Copeland IT organization become more effective through the use of component technology. The business projects have helped refine the development approaches and methodologies that will be used at Copeland during the years to come. The initial projects are listed below:

* A UML project related to improving requirements gathering and business analysis critical to the success of other business oriented projects.
* A development lifecycle refinement project related to improving and enforcing a more formal development lifecycle on other business oriented projects.
* An Internet based application for allowing participants and plan administrators to manage their benefits programs.
* An Intranet based application for supporting customer service representatives assisting participants and plan administrators with the management of their benefits programs.
* A component infrastructure project targeted at reducing the development effort associated with the creation and management of enterprise components, which are used in or across other business oriented projects.

Some of these projects, such as the Internet and Intranet administration applications, are directly related to specific business goals. Others are related to the overall transition of Copeland to component technology. These projects, such as the UML project, will benefit Copeland across other business projects.

Leveraging UML for requirements gathering and project estimates

The first transition project started at Copeland focused on improving the requirements gathering process. As a result of the initial assessment, it was discovered that Copeland’s requirements gathering process was mostly ad-hoc. This resulted in the delivery of applications that did not always effectively address business needs. Since Copeland was pursuing a component based approach, UML, and particularly a Use Case approach, seemed appropriate.

Genesis delivered just-in-time training in UML and the Use Case approach. Training was only given to staff who would immediately be entering the requirements gathering phase of a real business project. The training was not a traditional course; rather, it was hands-on and focused specifically on the particular domain being addressed by the project. In addition, at the beginning of this process, select Copeland staff were identified and asked to play the role of facilitators. Genesis provided mentoring throughout the requirements gathering process to ensure that progress was being made and the proper techniques were being utilized.

At the conclusion of this project, two critical achievements had been made. First, Copeland was confident that business and user needs would be addressed accurately on these projects. Second, Copeland now had a formal approach, along with staff who were experienced and had been successful utilizing this approach. Not surprisingly, project management found that by utilizing the artifacts resulting from this type of requirements gathering process, they were much more accurate at estimating timeframes associated with the various phases of the project lifecycle.

The biggest challenges associated with leveraging this new and formal approach to requirements gathering were related to training staff on the new techniques and to the fact that tools supporting the approach were just being released and were not as stable as was hoped. With respect to helping Copeland staff successfully attack the learning curve, it was critical that training was done just before actual real world work began. It was also critical that mentoring was available during the real world application. The mentoring was front loaded, but reviews needed to occur at periodic intervals.

Copeland has also utilized a variety of modeling tools. None of these tools were without their problems; the closer the inspection, the more problems were found. There was often a tendency to switch from one product to another. This was an issue, and corporate or project wide decisions seemed to drag on. The important thing was that, while not all staff utilized the same tools, a common methodology unified the staff. One of the first business projects to leverage the new approach and tools for gathering requirements and performing proper analysis via UML was the Intranet based customer service workstation.

Supporting Intranet application development with Java and CORBA

One of the most immediate business needs was the requirement to quickly release a new application to support in-house customer service representatives (CSRs). This application is known internally as the customer service workstation (CSW). CSW would allow CSRs to service the accounts of plan participants. Typical interactions might include balance inquiry, transfer of funds, allocation changes, re-balancing of accounts, loan inquiry, etc. It was decided that CSW would attempt to leverage open systems platforms, the UML approach, and component technology.

Specifically, the user interface portion of the application would be deployed on Windows NT and Windows 95. It would be developed in Java and utilize CORBA to gain access to a middle-tier application server. The application server would be developed in Java and would initially be deployed on NT. The application server would utilize JDBC to access legacy DB2 tables and stored procedures, which were hosted on the AS400. Since the application server was developed in Java, it would be possible, at a later date, to re-deploy the middle-tier on UNIX or even on the AS400. The high-level application architecture for CSW includes the following types of generic objects:

* Context Objects: the business objects themselves (Account, Product, MoneySource, Transaction, etc.)
* Façade Objects: Use Case oriented interfaces presented to front-end applications
* Controller Objects: front-end objects, one per Use Case
* Source Objects: provide each Context object with its data
* Table Objects: JDBC based access to the legacy DB2 tables
* Queue Objects: MQSeries based access to legacy systems

These object types are described generically. Based on the business requirements, a traditional business object model was created, and then an associated Context object and the supporting Source, Table and/or Queue objects would be modeled. Once the Use Cases and Scenarios were created, the Façade and Controller objects would be modeled. The traditional business object model included objects such as Account, Product, MoneySource, Transaction, OutstandingLoan, ModeledLoan, etc. See figure 1.

These objects are all designated as Context Objects. Since Context Objects are in fact CORBA objects, they were first defined in OMG IDL and then implemented in Java. The Context Object implementations would be deployed within the application server. In an effort to keep the front-end application simpler and in order to reduce network traffic between the front-end and the application server, Context Objects are not utilized directly by the front-end application. Instead, Façade objects were introduced to the application architecture. This ensured that front-end applications would be shielded from some of the complexity of the business object model and that a minimal number of network operations would be performed. Even though the context objects are not currently accessed across a network, Copeland is comfortable knowing that their business objects can be accessed across a network via the IIOP standard.

[Figure 1]

Gamma et al. define the Façade pattern as follows: “[the façade pattern] provides a unified interface to a set of interfaces in a subsystem. Façade defines a higher-level interface that makes the subsystem easier to use.” When modeling the Façade layer, a new Façade was created for each Use Case. The Façade would support all of the scenarios associated with the Use Case. Each Façade would be implemented using the underlying set of Context Objects (Business Objects). Being CORBA Objects, Façades would also be defined in OMG IDL and implemented in Java. The Façade objects would also be deployed within the application server. This ensured that all of the calls between the Façade and the Context objects would be optimized to avoid TCP/IP. Calls to the Façade would be made across IIOP (and thus TCP/IP), but these calls would be defined so that repeated distributed calls could be avoided and network performance would not be an issue. In a sense, the Context Object model is a pure object model with no specialization for a particular usage or deployment scenario. The Façade Object model is presented as the programmer’s model and is optimized for a particular usage or deployment scenario. See figure 1. Context Objects would be re-used as new Use Cases were introduced and new Façade Objects implemented.
For example, figure 1 shows an Account object, which is related to many MoneySource objects. The MoneySource object is used to differentiate employee contributions from employer contributions. Each of the valid MoneySource objects is associated with one or more Transaction objects. When the front-end application needs to access transaction history information, it deals only with the TransactionHistory Façade. By doing this, the front-end is shielded from the details associated with the Account, MoneySource and Transaction objects.
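
To make the layering concrete, the following Java sketch shows a TransactionHistory style Façade built on top of the Context objects. All interface and method names here are illustrative assumptions; the actual Copeland interfaces are defined in OMG IDL and are not shown in this article.

// Java sketch (hypothetical names). Context interfaces accept and return other
// contexts; the Façade flattens them into basic types for the front-end.
interface Account {
    MoneySource[] getMoneySources();
}

interface MoneySource {
    Transaction[] getTransactions();
}

interface Transaction {
    String getDate();
    double getAmount();
}

// Hypothetical context used by the Façade to locate an Account.
interface AccountDirectory {
    Account findAccount(String participantId);
}

// Façade for the "view transaction history" Use Case: one coarse-grained call
// returning only basic types, so the front-end never touches the context model.
interface TransactionHistoryFacade {
    String[] getHistoryRows(String participantId);
}

// Façade implementation, deployed in the application server with the contexts,
// so its calls into the context layer never cross the network.
class TransactionHistoryFacadeImpl implements TransactionHistoryFacade {
    private final AccountDirectory directory;

    TransactionHistoryFacadeImpl(AccountDirectory directory) {
        this.directory = directory;
    }

    public String[] getHistoryRows(String participantId) {
        java.util.List rows = new java.util.ArrayList();
        MoneySource[] sources = directory.findAccount(participantId).getMoneySources();
        for (int i = 0; i < sources.length; i++) {
            Transaction[] txns = sources[i].getTransactions();
            for (int j = 0; j < txns.length; j++) {
                rows.add(txns[j].getDate() + "," + txns[j].getAmount());
            }
        }
        return (String[]) rows.toArray(new String[rows.size()]);
    }
}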

Front-end applications would implement a Controller object in Java for each Use Case. The Controller objects would acquire CORBA references to any Façades they needed via a local Java class, known as the FacadeDispenser. The FacadeDispenser class ensures that even if two different controllers need the same type of Façade, both controllers refer to the same Façade object. When the front-end application starts up, the FacadeDispenser would ask the application server to explicitly construct a new Façade Object for each Use Case. This means that the application server has one Façade, of each type, for every front-end application executing. A thread per object policy is used to ensure that clients receive proper server responsiveness without compromising concurrency. Since the Copeland call center has less than 100 CSRs, this architecture does not currently present a scalability problem.
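
A rough Java sketch of such a dispenser is shown below. The class shape, the FacadeFactory interface, and the method names are assumptions for illustration; the real dispenser obtains CORBA references from the application server.

// Java sketch (illustrative). At startup the dispenser asks the application server
// to construct one Façade of each type; controllers then share those references.
class FacadeDispenser {
    private final java.util.Map facadesByType = new java.util.HashMap();

    FacadeDispenser(FacadeFactory factory, String[] facadeTypes) {
        for (int i = 0; i < facadeTypes.length; i++) {
            // Remote CORBA call that creates the Façade in the application server.
            facadesByType.put(facadeTypes[i], factory.createFacade(facadeTypes[i]));
        }
    }

    // Two controllers asking for the same type receive the same Façade reference.
    Object getFacade(String facadeType) {
        return facadesByType.get(facadeType);
    }
}

// Hypothetical server-side factory (would be defined in OMG IDL).
interface FacadeFactory {
    Object createFacade(String facadeType);
}
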
Context objects are business objects and must interact with the legacy systems in order to support their behavior. For example, Transaction contexts, used so that CSRs can view historical participant interactions, obtain their information from a DB2 table located at Copeland in New Jersey. ModeledLoan contexts, on the other hand, obtain their information from a series of MQ interactions, which ultimately result in access to IMS, located at Travelers in Connecticut. In an effort to isolate Context objects from specific calculated data and from the location of the raw underlying data, several other generic objects were introduced to the application architecture. The first object introduced was the Source object. Each Context object has an associated Source object. The purpose of the Source object is to provide the Context with its data. The Source object also isolates the Context object from the details regarding that data. The Source object would in turn leverage specific calculations, conversions, transformations, and several other architectural objects used to obtain legacy data. These additional architectural objects are known as Table and Queue objects. The Source object would utilize any underlying Table or Queue objects it requires in order to provide the Context object with its legacy data. See figure 2 and figure 3.

Note that the Account Source will leverage a number of Table objects and a Queue object. Some Source objects, such as the TransactionSource object, would only use one Table object. Table objects are developed with JDBC and are specific to the legacy DB2 tables at Copeland. The Queue objects’ purpose is to provide an interface to legacy information via MQ. The Queue objects are developed using an infrastructure service developed at Copeland. This service is known as the MQ Data Access Service. A more detailed discussion of this service follows later in this article.
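
As an illustration, a Table object built with JDBC might look roughly like the sketch below. The driver setup, table name, and column names are invented for the example; the actual DB2 schema is not described in the article.

// Java sketch of a Table object: JDBC access to a legacy DB2 table on the AS400.
// URL, credentials, table and column names are assumptions for illustration.
import java.sql.*;

class TransactionTable {
    private final String url;
    private final String user;
    private final String password;

    TransactionTable(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    // Returns raw rows; the owning Source object converts them into Transaction contexts.
    String[] fetchTransactionRows(String accountNumber) throws SQLException {
        Connection con = DriverManager.getConnection(url, user, password);
        try {
            PreparedStatement stmt = con.prepareStatement(
                "SELECT TXN_DATE, TXN_AMT FROM TXN_HISTORY WHERE ACCT_NO = ?");
            stmt.setString(1, accountNumber);
            ResultSet rs = stmt.executeQuery();
            java.util.List rows = new java.util.ArrayList();
            while (rs.next()) {
                rows.add(rs.getString("TXN_DATE") + "," + rs.getBigDecimal("TXN_AMT"));
            }
            rs.close();
            stmt.close();
            return (String[]) rows.toArray(new String[rows.size()]);
        } finally {
            con.close();
        }
    }
}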

[Figure 2]

[Figure 3]

Supporting Internet application development with Java and CORBA

Along with the need to quickly release an Intranet based application to support CSRs, Copeland wanted to deploy an Internet based application which would allow their clients to directly make inquiries and modifications to their accounts. Java applets downloaded via the Internet and executed within Browsers seemed like a perfect approach. While the actual functionality was not identical to that of the CSW application, there was a lot of overlap. More accurately, the Internet application would be a rough subset of the functionality provided by CSW. It seemed that CORBA would be a perfect mechanism to allow the Java applets to access objects located in an application server. These ideas would form the initial approach to developing applications for Internet deployment scenarios.

Use Case analysis was performed and sets of Façade objects were designed. These Façade objects would leverage the same set of Context objects utilized within the CSW application. In some cases, additional methods were added to the existing Context objects. While some Context methods were only utilized by one of the applications, a great deal of reuse was achieved at the Context/Source/Table level.
The application was developed, tested and ultimately deployed. Once deployment began, a set of issues related to Java, CORBA and the Internet began to bubble up. While a significant percentage of Copeland clients were very happy with the new application, many clients were either not happy with certain details or were unable to execute the application. Let us take a closer look at the types of issues involved:

1. Some clients were using dial-up lines and were not able to download Java applets in a reasonable amount of time.
2. Compatibility problems did not allow Java applets to run identically on Netscape and Microsoft web browsers.
3. Some clients were not using web browsers that supported Java.
4. Some clients’ corporate Internet policies did not allow the execution of externally developed Java applets.
5. Some clients’ corporate Internet policies did not allow IIOP to be used across their firewalls.
6. Some clients’ corporate Internet policies did not allow HTTP tunneling to be used across their firewalls.

While the majority of clients were very happy with the new application, the above issues kept a significant number of clients from leveraging it. This caused Copeland to revisit alternatives to the Java applet model. Copeland decided that they needed to deploy a pure HTML based Internet application. While the HTML based application would not support usability features as robust as those of the applet, it would provide all of their clients with direct access to their plans. It was envisioned that the pure HTML based Internet application would be deployed alongside the recently developed Java applet based Internet application.

Leveraging Application Server Technology for Internet application development

Now that Copeland had decided to pursue the development of a pure HTML based Internet application, the question was how to allow existing code to be leveraged while still providing a pure HTML delivery mechanism. Copeland had a number of options:

1. A CGI based solution
2. A Dynamic HTML based solution
3. An NSAPI based solution
4. An Application Server solution

The key to a successful approach would be one which allowed quick development of the HTML based front-ends and also allowed the existing Facades (or perhaps Context objects) to be directly leveraged. Copeland evaluated a number of solutions and, based on its ability to directly address these requirements, decided on the NetDynamics product.

The NetDynamics product provides an environment for developing and deploying Internet based applications. It allows components to be developed and deployed within the NetDynamics Application Server. Traditionally, components are developed within the product and rely on NetDynamics to provide native RDBMS transactional support. In Copeland’s case, they would only utilize this capability for storing temporary session based information, since the existing Context objects already supported access to their persistent data. Copeland would leverage critical features such as automatic load balancing, security, and high availability. Front-ends can be deployed as either Java applets or pure HTML applications. Copeland is currently only utilizing the HTML based variety.
The final critical feature is the tool’s Platform Adapter Component SDK, referred to as the PAC SDK. The PAC SDK is a server-side kit that would enable the existing Copeland Facades to be deployed as plug-and-play components in the NetDynamics Application Server. See figure 4. These server objects, known as PACs, would support browsing via the NetDynamics wizards. This would make it easy to build applications that needed access to the set of procedures offered by the PAC. Some of the benefits of the PAC SDK are listed below:

[Figure 4]

Development Benefits of the PAC
* API of external system is visible in the NetDynamics Studio
* Visual development support using wizards and editors
Management Benefits of the PAC
* Automatic integration of the PAC into the NetDynamics Command Center
* Support for real-time monitoring, statistics, logging, and parameter configuration of the PAC

While NetDynamics supported the key critical features (HTML based applications and the PAC SDK for interfacing to existing components), there were some problems associated with its usage. Even though NetDynamics is built around the concept of components and was in fact developed on top of a CORBA based infrastructure, the PAC SDK is not object oriented but procedurally oriented. The product comes from a database orientation, and the PAC was originally seen as a way to access databases or services not directly supported. PAC objects basically provide a set of procedures. They are inherently stateless and can only return data as opposed to references to other PACs or components. If we look at the Context objects and the Façade objects, we see the same type of distinction. The Context objects are truly object oriented. They support methods that accept and return other context objects. The Façade objects are more procedural. They accept and return parameters that are basic types. Based on the design of the PAC SDK, the PACs could only support Façade objects. If the PAC were object oriented, then Copeland could develop PACs for the Context objects, which would eliminate an additional layer from the architecture.
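
The distinction shows up directly in the method signatures, as in the hypothetical Java illustration below (the names are invented for the example):

// Java sketch contrasting the two styles (illustrative names only).

// Object oriented Context interface: methods accept and return other context objects.
interface AccountContext {
    MoneySourceContext[] getMoneySources();
    OutstandingLoanContext getOutstandingLoan();
}

interface MoneySourceContext {
    String getSourceName();
}

interface OutstandingLoanContext {
    double getBalance();
}

// Procedural, PAC style Façade: stateless operations that accept and return only
// basic types, which is the shape the PAC SDK can expose to NetDynamics applications.
interface AccountFacade {
    String[] getMoneySourceNames(String accountNumber);
    double getOutstandingLoanBalance(String accountNumber);
}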

Leveraging a Distributed Component Infrastructure

While the application architecture utilized within the CSW and Internet applications was successful in meeting its goals, it really only defines a development approach or a set of best practices. Developers still have many decisions to make when designing typical application elements. Often different developers make different choices and end up re-implementing particular aspects of the system over and over again. While Context objects have been shown to be reusable, there was almost no code that was leveraged over and over. Reusable infrastructure would ideally exist for the typical developer tasks. Some examples of potential services would include the following:

1. Application Services
2. Meta-Data Services
3. Relational Data Services
4. Security Service
5. Logging Services
6. Message Queuing Data Services

Since we are operating in a distributed manner, the assumption is that all of these services would be accessible in a distributed fashion. Application services would include functionality related to starting, stopping, and managing middle-tier servers. Meta-data services are related to a repository of data used to control the behavior of middle-tier servers, objects, and other services. The Meta-data (or repository) service defines two critical pieces of information: the bits of information needed by servers, objects, or services, and the format of those bits of information. Copeland has a design and proof of concept developed for both of these services.
Relational Data services are used by objects which need to access data managed by an RDBMS. Copeland is still in an exploratory stage with respect to these relational data services. This service can take several different forms. Copeland is evaluating a simplified layer above JDBC, a more abstract persistent data service based on dynamic information obtained via the meta-data service, and a persistence manager approach based on the observer pattern. While several designs and proofs of concept have been developed, these services are not yet being leveraged. Copeland has preliminary design specifications in place for both security and logging services.
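
As a rough illustration of the first option, a simplified layer above JDBC might look something like the following sketch. The class and method names are hypothetical, not Copeland's actual design; only standard JDBC calls are used.

// Java sketch of a hypothetical simplified layer above JDBC: one query method that
// hides connection handling and returns each row as a column-name/value map.
import java.sql.*;
import java.util.*;

class SimpleQueryService {
    private final String url;
    private final String user;
    private final String password;

    SimpleQueryService(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    // Executes a parameterized query; column names are discovered via ResultSetMetaData.
    List query(String sql, String[] params) throws SQLException {
        Connection con = DriverManager.getConnection(url, user, password);
        try {
            PreparedStatement stmt = con.prepareStatement(sql);
            for (int i = 0; i < params.length; i++) {
                stmt.setString(i + 1, params[i]);
            }
            ResultSet rs = stmt.executeQuery();
            ResultSetMetaData meta = rs.getMetaData();
            List rows = new ArrayList();
            while (rs.next()) {
                Map row = new HashMap();
                for (int c = 1; c <= meta.getColumnCount(); c++) {
                    row.put(meta.getColumnName(c), rs.getObject(c));
                }
                rows.add(row);
            }
            rs.close();
            stmt.close();
            return rows;
        } finally {
            con.close();
        }
    }
}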

The Message Queuing Data Service (MQDS) consists of several interfaces which provide a CORBA compliant interface to MQSeries interactions. It allows two bi-directional asynchronous MQSeries interactions to be presented as a single CORBA compliant request/response. The MQDS also supports timer based caching of duplicate MQDS requests. This ensures that redundant MQSeries requests can be eliminated. The MQDS hides the complexities and specifics of MQSeries via a simple interface. This interface provides business objects with a simple mechanism to allow information to be sent and retrieved via MQSeries. The interfaces are MQRegistry, MQAccess, MQParser and MQPersist. Please see figure 5.

[Figure 5]

MQAccess is the administrator of the package: it provides access to the external world and coordinates the interaction between the components. MQRegistry provides a series of operations that allow the retrieval of hierarchical information. In this initial release the transaction variables will be created by editing a text file. MQParser is a lightweight class which uses the message structure received from the MQRegistry service to create Name Value Pair objects. MQPersist stores the Name Value Pair objects in a hash table keyed by a transaction identifier, which is formulated at the start of the transaction by the MQAccess package. MQPersist also provides a method to remove an entry once it is no longer needed. The MQDS service is currently being used successfully by the CSW application. It is expected that additional applications will leverage this service when they need access to MQSeries.
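
The article does not show the MQDS interfaces themselves, so the Java sketch below is only a guess at how a Source object might use MQAccess; the operation names and the NameValuePair shape are assumptions.

// Java sketch (hypothetical). A Source object makes one synchronous call; behind it,
// MQDS performs the asynchronous MQSeries put/get, parses the reply into name/value
// pairs, and caches duplicate requests for a period of time.
class ModeledLoanSource {
    private final MQAccess mqAccess; // CORBA reference to the MQ Data Access Service

    ModeledLoanSource(MQAccess mqAccess) {
        this.mqAccess = mqAccess;
    }

    java.util.Properties loadModeledLoan(String participantId) {
        NameValuePair[] reply = mqAccess.sendAndReceive("MODELED_LOAN", participantId);
        java.util.Properties data = new java.util.Properties();
        for (int i = 0; i < reply.length; i++) {
            data.setProperty(reply[i].getName(), reply[i].getValue());
        }
        return data;
    }
}

// Hypothetical shapes for the MQDS types used above.
interface MQAccess {
    NameValuePair[] sendAndReceive(String transactionName, String key);
}

interface NameValuePair {
    String getName();
    String getValue();
}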

Conclusion

Overall, the transition at Copeland has proceeded very well. Copeland has moved from an organization developing RPG based AS400 applications to an organization capable of leveraging UML, a formal component development approach, the Java programming language, and CORBA based communications for developing new systems. Copeland staff has grown significantly in its ability to design, develop, and manage the delivery of component-based systems. Copeland has also begun the work associated with developing reusable infrastructure elements. These elements will ensure that future development efforts can be delivered faster and more reliably by eliminating the need to redevelop functionality required by different business projects.

In terms of specific business projects, the CSW application has been deployed and extended several times. Business goals have been met and the system is being positioned as a template for future development efforts. The Internet project has also been deployed. While its underlying design has been modified over time, it is now meeting all of the critical customer requirements. The Internet and CSW applications share a common set of business objects (Context Objects). Over time, additional applications will also leverage the common set of business objects. Current candidates include VRU applications. The fact that Copeland is achieving some level of reuse at the business object level indicates the success of their transition to component systems. While organizational change has been needed to successfully manage shared objects, the benefits of a shorter development lifecycle are starting to be seen.

While the transition at Copeland is moving forward successfully, it is not yet complete. Copeland is still in the process of developing a reusable infrastructure. Additional Copeland staff still need to increase their component development skill sets. While many of the project managers at Copeland have become quite experienced with managing component based projects, more need to gain skills in these areas. As additional business projects move towards a component based approach, more staff will become exposed to the various aspects related to components. Additional mentoring will ensure that Copeland completes the transition in the shortest possible time.

COMPONENT STRATEGIES – THE NEXT GENERATION OF CORBA – Cover Story

In 1990, the Object Management Group (OMG), then just a few dozen members strong, introduced the world to the idea of distributed objects with its Common Object Request Broker Architecture (CORBA) 1.0 specification. Since then, acceptance of CORBA as a distributed computing standard has grown rapidly, culminating in widespread commercial acceptance of the CORBA 2.0 specification, its Internet Inter-ORB Protocol (IIOP), and associated services, such as the CORBA Object Transaction Service (OTS), Event Service, and Security Service. In addition, CORBA is now being commercially integrated with other popular object models, such as Java/JavaBeans and Microsoft’s DCOM/MTS, and with various traditional systems services, such as messaging middleware and transaction managers.
OMG membership has now soared to over 800 (and still growing), and new CORBA-based specifications are being adopted and implemented on a regular basis. If the OMG were viewed as a company, and CORBA as its product line, nobody would even question its enormous success or marketplace dominance in its chosen area of distributed object computing. However, since OMG does not itself create or market running software, it is ultimately dependent on getting other firms — CORBA vendors — to adopt its specifications and generate working products that sell successfully to end-users with real-life applications.

Therefore, it is an even more important measure of OMG’s success that, since the first of such vendors seriously entered the commercial marketplace in the mid-1990s, CORBA products have already evolved through at least two generations. Compared to even a couple of years ago, today’s CORBA products have matured significantly from a perspective of performance, reliability, scalability, administration, and ease of use. Moreover, the CORBA vendors – and their marketplace — have grown significantly, and larger companies have begun to enter the market through acquisitions and licensing.

In the rest of this article, we will trace the evolution of CORBA vendors and their products over the last few years, and provide an analysis of current and upcoming offerings. We will also provide a glimpse of what CORBA products are likely to evolve into over the next few years, as new specifications now being developed by the OMG reach commercial maturity. As applicable, we will tie all this in with developments in Java, the Internet, Microsoft DCOM/MTS, and other related products and standards.

First Generation CORBA (1990-1996)

In 1989, when the OMG was founded, distributed object computing was only a gleam in the eye of its founding members. The new OMG moved quickly, adopting a formal architecture – the Object Management Architecture (OMA) — and its first specification, CORBA 1.0, in just one year. As a result, OMG was able to build a lot of industry momentum and mindshare for distributed objects. Overnight, CORBA’s common Interface Definition Language (IDL) and the Object Request Broker (ORB) specification gained a significant following, and OMG started to grow. Over the next several years, OMG also developed specifications for a few CORBA services, including a Naming Service, an Event Service and a Transaction Service, all designed to support the management of objects running on a commercial ORB. Specifications were also developed to allow access to IDL-defined interfaces from C, C++, and Smalltalk.
During this time, a number of vendors announced their intention to deliver commercial CORBA products, and a few such products, with limited functionality, actually became available. However, real commercial success for CORBA, in the form of successful products, came only with the adoption of CORBA 2.0 in 1995. CORBA 2.0 settled a number of open questions about how to locate and send requests to an object on another ORB, how to program to the ORB interface, and how to find and connect to ORB services. In addition, CORBA 2.0 also provided IIOP as a standard protocol for providing interoperability among different ORBs across the emerging Internet. This gained the attention of Internet vendors like Netscape, which became a public supporter of the CORBA 2.0 specifications and a commercial licensee of CORBA products that are still embedded in its Navigator browser.
By 1996, vendors were able to ship – and sell – reasonably robust first generation commercial products supporting the CORBA 2.0 standard. In addition to implementations of the core specification, a number of vendors successfully delivered implementations of several of the key service specifications. At that time, there were between ten and fifteen vendors attempting to deliver compliant products. Products were available from hardware vendors, software vendors, and even a few operating system companies. These vendors included Iona, Visigenic, ICL, Sun, IBM, HP, Digital and a variety of smaller companies. However, available products varied widely in their support of the standards, their robustness, their performance, their support of rapid development, and their ease of administration.

The Evolution of CORBA (1996-1998)

Since then, a lot of excellent work has been done within the OMG to both refine the existing specifications and to provide basic extensions to them. Critical specifications such as the core specification and the C++ language binding have been steadily improved. A Java language binding has been approved and vendors have been able to deliver compliant products. The OMG has also been hard at work to replace several deficient specifications. For example, the original Basic Object Adapter (BOA) specification, first defined in CORBA 1.0, was seriously under-defined, forcing vendors to provide so many extensions that the BOA became non-portable. In response, the OMG adopted a new object adapter specification, called the Portable Object Adapter (POA), which covers the space much more robustly.
Another example of a successful but under-defined specification is the Event Service. While the Event Service met its original objectives as a standardized channel-based distribution mechanism for object events, it included no event-channel filtering, creating the possibility of an unwanted flood of event notifications across a widely distributed network. Naturally, vendors provided non-standard filtering mechanisms, prompting the OMG to issue an RFP for a complementary Notification Service. This standardization process is almost complete and will shortly result in adoption of a formal extension to the original Event Service.

The OMG has also been busy defining adjunct specifications where needed, including Objects-By-Value (OBV), Firewall, and Asynchronous Messaging specifications. The OBV submission specifies changes to IDL allowing the declaration of types of objects that are to be passed along in requests by value, instead of only by reference, as originally specified by CORBA 1.0 and 2.0. The Firewall submission proposes several changes to IIOP that will enable ORBs to communicate more effectively across today’s firewall-filled Internet. The Asynchronous Messaging proposal delineates several changes to the CORBA core to support asynchronous requests and to designate quality-of-service (QOS) levels for message transport.
Along with the maturing of the OMG specifications, we have recently seen a major consolidation within the CORBA vendor community. This consolidation has occurred in several different ways. In 1995, most hardware companies were doing significant development on ORB products for resale. This included HP, Digital, Sun, IBM, and ICL. Since that time, Digital has sold their ORB technology (ObjectBroker) to venture-funded BEA Systems, which is now combining it with another acquisition, the Tuxedo transaction manager. HP has stopped promoting its ORB-Plus. Sun has largely dropped their NEO product, although JavaSoft has decided to include the basic CORBA standard as part of the Java platform. Meanwhile, IBM has refocused from their DSOM version of CORBA in favor of CORBA-based business frameworks like San Francisco.

We have also seen consolidation within the software vendor community. Tiny Post-Modern was acquired by slightly larger Visigenic, which was in turn acquired by the much larger Borland (now Inprise). Iona’s sales continue to rise and the market recently responded with a very successful IPO. At this point in time, the CORBA vendor community consists of fewer but stronger vendors. This is very good for the industry. As this consolidation occurs, we are seeing a maturation of compliant products from these vendors.

Rapidly Maturing CORBA Products

As a result of these developments, we are starting to see some true second generation ORBs, ready for robust mission-critical applications. For example, Iona’s Orbix has been totally reengineered in both C++ and Java to improve performance and scalability. Iona has added or improved basic administration capabilities, firewall support, and other important features. Inprise’s VisiBroker has also gone through several iterations in both Java and C++. It has undergone extensive optimization and provides excellent raw performance. ICL DAIS now supports both C++ and Java and even has an implementation of the new Portable Object Adapter.

If we look at the CORBA Services market today, we can purchase four or more different CORBA services implemented in a variety of languages and layered on top of different underlying ORBs. We have seen the emergence of graphical administrative interfaces to the Naming Service, the Interface Repository, and CORBA/COM Interworking bridges. Both Inprise and Iona are shipping extended Naming Services implemented in both C++ and Java. The extended Naming Service provides the ability to support a straightforward (although limited) approach to load balancing. The basic extension allows groups of objects to be bound to a single COS::Name in order to implement round-robin load balancing at resolve time. NEC is also delivering a preliminary Notification Service built on top of Orbix.
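
From the client's point of view nothing changes: a standard lookup such as the Java sketch below (the name "AccountServer" is illustrative) simply resolves the shared name, and the extended service hands back a member of the bound group at resolve time.

// Java sketch using the standard CosNaming API; with the vendors' extended Naming
// Services, each resolve of the shared COS::Name can return a different replica.
import org.omg.CORBA.ORB;
import org.omg.CosNaming.NameComponent;
import org.omg.CosNaming.NamingContext;
import org.omg.CosNaming.NamingContextHelper;

public class NamingLookup {
    public static void main(String[] args) throws Exception {
        ORB orb = ORB.init(args, null);
        NamingContext root = NamingContextHelper.narrow(
            orb.resolve_initial_references("NameService"));
        // The administrator binds the group of object replicas under this single name.
        NameComponent[] name = { new NameComponent("AccountServer", "") };
        org.omg.CORBA.Object obj = root.resolve(name);
        // Narrow obj to the application-specific interface before use.
    }
}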

In addition to purely CORBA-compliant services, Iona, NEC and Inprise have all been able to deliver firewall services. Running IIOP over the Internet presents some interesting issues. Some are defined by application requirements (such as the need to utilize callbacks) and some are defined by security requirements (such as the need to block incoming file-descriptor usage).
The VisiBroker IIOP Gateway allows CORBA applets to send GIOP requests over HTTP instead of IIOP. This is very important for firewall configurations which do not allow IIOP communication. Unfortunately, it cannot be utilized if HTTP tunneling is not allowed or if ORB interoperability is required. The NEC solution is based on the publicly available SOCKS package for providing circuit gateways. The circuit gateway is installed into an ORB via a library or class files. The gateway allows multiple IIOP conversations to be multiplexed over a single file descriptor and proxy server. This can allow pure IIOP to be easily funneled through Intranet firewalls. Internet applicability of the SOCKS package still requires minor alterations to IIOP itself.

Iona has attacked the Internet/firewall problem from many different angles. On the first front, Iona’s WonderWall supports HTTP tunneling in a very similar way to Inprise’s IIOP Gateway. Unlike VisiBroker, WonderWall allows the ORB to support callbacks over the single outgoing file-descriptor instead of requiring an additional inbound file-descriptor. This is important since many sites do not allow inbound file-descriptors to be used directly. WonderWall also provides the ability to multiplex IIOP requests via the use of multiple IIOP profiles. This means that WonderWall can be used to authenticate, log and route IIOP traffic. All of these things are important in Internet solutions. While providing a number of excellent features, WonderWall still suffers from some problems. WonderWall features such as Internet callback support and HTTP tunneling require Orbix or OrbixWeb clients. Another problem is that CORBA objects which are accessed (via a single IOR) from both the Internet and Intranet must all flow through WonderWall. Unlike the NEC/SOCKS solution, WonderWall does not dynamically configure itself and requires security administration.

We are also beginning to see products supporting the OMG Object Transaction Service (OTS) specification. Hitachi/Inprise and Encina/Iona both have generally available products. These products support C++ servers and Java clients. Another interesting entrant into the OTS market is BEA’s new M3 product. M3 combines and updates a major CORBA ORB – formerly Digital’s ObjectBroker – with a proven transaction manager – formerly AT&T’s Tuxedo. This is intended to provide CORBA developers the reliability, scalability and manageability they could expect from a Tuxedo based system, while leveraging the open OTS interface. In essence, this product represents a second generation ORB implemented on top of a fifth generation transaction manager.

BEA’s M3 supports critical components of the CORBA 2.0 specification including IDL, C++ binding, IIOP, FactoryFinder, OTS, and Security. M3 has been specifically designed to provide a very high degree of scalability. Objects are almost always stateless and rely upon XA-compliant RDBMS for storage and retrieval of persistent state. M3 provides a specialized subset of POA object activation/deactivation policies. Typically, object servants are automatically deactivated at either method-end or at transaction-end. M3 can then automatically re-activate the object servant the next time an appropriate reference is utilized. Re-activating the object allows for dynamic load balancing based upon Tuxedo’s highly optimized distributed blackboard.

[Figure 1]

Looking at Figure 1, we can see that within a particular client application, we can have multiple client stubs leveraging the same object reference via a smart pointer. We can also have multiple client applications leveraging an object reference based on the same object identifier (host, port, key tuple). With M3’s preferred stateless architecture, multiple servants are activated for a given object reference/identifier. This can occur each time a request comes in, or each time a request associated with a new transaction comes in.

[Figure 2]

In an effort to make the development of transaction processing CORBA systems easier, BEA extends the specification with its own services and APIs. The main service included is known as TPFramework. Its job is to coordinate object de-activation/re-activation with state management and to integrate state management with XA transaction management. TPFramework attempts to minimize the amount of coding required by the developer. Base classes are provided so that the developer does not need to implement mainline code associated with basic server functionality. Servant base classes are also provided to minimize the amount of code required to develop the actual objects themselves. While the C++ code developed for services relies on BEA specific frameworks, the objects developed are defined with IDL and are accessible from any compliant client ORB (Java or C++).

The other main service is known as TPBootstrap. Its entire purpose is to allow non-M3 ORBs to gain easy access to M3 services. For example, an OrbixWeb applet could use the BEA supplied Java implementation of TPBootstrap to easily gain access to the M3 IIOP gateway. The gateway ensures that even pure IIOP access obtains the benefits of M3’s Tuxedo-based transport, known as TGIOP, a specialization of GIOP, the CORBA protocol that underlies IIOP. Once the Bootstrap is obtained, resolve_initial_references calls provide access to the M3 TS::Current object (required to start/stop transactions), the Security Current object, and the M3 FactoryFinder object. These steps are diagrammed in figure 2. The use of the TPBootstrap eliminates the need to utilize string_to_object/object_to_string as well as the need to distribute the ‘stringified’ object reference through a non-ORB mechanism. The Java code associated with foreign ORB bootstrapping is included below:

//Java code
// Initialize the non-M3 ORB
ORB orb = ORB.init();
// Create the M3 bootstrap object using the URL of the IIOP gateway
Tobj_Bootstrap bs = new Tobj_Bootstrap(orb, "//myserver.beasys.com:4000");
// Resolve the M3 FactoryFinder
org.omg.CORBA.Object off = bs.resolve_initial_references("FactoryFinder");
FactoryFinder ff = FactoryFinderHelper.narrow(off);
// Resolve the Security Current
org.omg.CORBA.Object osc = bs.resolve_initial_references("SecurityCurrent");
org.omg.SecurityLevel2.Current sc = SecurityCurrentHelper.narrow(osc);
// Resolve the Transaction Current
org.omg.CORBA.Object otc = bs.resolve_initial_references("TransactionCurrent");
org.omg.CosTransactions.Current tc = TransactionCurrentHelper.narrow(otc);

Embedded CORBA

A big step in the maturing of CORBA is the emergence of new development tools that embed CORBA, but do not necessarily expose it. One such category of tools has become known as an “Application Server”, a three-tier development package which implements a pre-defined architecture that can allow Java, C++, or even HTML front end applications to access middle-tier components. The Application Server environment provides out-of-the-box capabilities like load balancing, transactional access to back-tier relational data, distributed access to business logic, etc. Examples of these tools include products from NetDynamics, SilverStream, Progress, Persistence Software, Kiva, Vision Software, WebLogic, and others.

[Figure 3]

While many of these tools have been developed using CORBA, they do not typically expose the components via CORBA bindings. For example, NetDynamics generates C++, Java or HTML front ends, which access middle-tier components. While VisiBroker is used internally by NetDynamics, users do not develop their own CORBA applications to access the middle-tier components. Access to components is only through the proprietary layer using common languages. NetDynamics does provide a generic mechanism to interface to arbitrary legacy components implemented outside of the NetDynamics realm. The Platform Adapter Component SDK (PAC SDK) has been designed to ensure that new NetDynamics applications can be deployed on top of existing legacy applications or services. The NetDynamics architecture is shown in figure 3.
Like many of the Application Server tools on the market, NetDynamics provides support for management, load balancing, database integration, and different front-end environments. Support for pure HTML front ends ensures that even the most security-conscious — or Java-leery — Internet user can access corporate services via the Application Server paradigm.

Extending CORBA

Within the last few years, the OMG has expanded its focus beyond providing open access to distributed object interfaces, and into the area of modeling those objects and their interfaces. OMG now includes within its charter the specification of common business objects for a wide variety of domains, including finance, healthcare, telecommunications, and manufacturing, to name a few. However, while CORBA IDL is fine for specifying the interfaces to such business objects, it was never designed to adequately express the complex behavior and relationships needed to define a business object model. Therefore, OMG is now extending the notion of CORBA specifications to include object modeling techniques.

OMG’s first major move in this direction was to standardize on a version of the Unified Modeling Language (UML), originally developed by Rational as a fusion of the popular Booch, Rumbaugh, and Jacobson modeling methodologies. In addition, OMG is now also in the process of standardizing on a variant of UML called the Component Definition Language (CDL). CDL is based on an extension to CORBA called the Business Object Component Architecture (BOCA), which is specifically designed to support CORBA-based common business objects. Both of these modeling specifications are being designed to work with a new Meta-Object Facility (MOF), which will allow tools and infrastructures to store and retrieve model information through a common interface.

[Figure 4 – New OMG Specifications]

Figure 4 describes the relationship between UML, CDL, MOF and the existing CORBA 2.0 and CORBA Service specifications. Analysis and design work is done in UML, and mapped to CDL, where additional information associated with business object configuration and deployment is added. All this information is stored in the MOF, and IDL interfaces are generated. Depending on the tools used, implementations of the business objects may also be directly generated, or the objects may be associated with custom implementations. The business objects are then hosted by a Business Object Facility (BOF) running over CORBA and CORBA services. In addition, the BOF has run-time access to all the design and configuration information stored in the Meta-Object Facility.

CORBA is also moving into different territory with the CORBA Component Model (CCM) specification, currently in the early stages of development. While the precise scope of the CCM effort is still being decided, it will focus on the area of a common ‘packaging’ model for CORBA components. This is similar to the notion of a ‘bean’ in Java, where a set of related interfaces and their implementations can be placed in a package that can be distributed across any CORBA environment, including, of course, the Internet. Not surprisingly, the idea of a CCM is being promoted by many of the same vendors who support the notion of Enterprise JavaBeans (EJB). Therefore, it is likely that the two standards will converge, with CCM a generic model for packaging CORBA components, and EJB a particular version designed for packages that contain Java implementations only.

CORBA and Microsoft’s DCOM

Just as the OMG specifications (and products supporting them) have evolved, we have seen a similar maturation of Microsoft’s software environment. Microsoft’s initial delivery of OLE/COM supported a thriving market of desktop development components. These technologies required that different development tools support a common set of interfaces. Supporting these interfaces allowed a variety of products and components to interoperate within the Microsoft operating system environment. While this was very effective at the desktop level, the lack of support for distributed components greatly limited the ability to support truly enterprise level solutions.

In an effort to address distributed system requirements within the OLE/COM environment, Microsoft developed DCOM and ActiveX. While these technologies provide some low-level distributed computing capabilities, they do not really address the requirements of large-scale, robust software systems built from distributed software components. More recently, Microsoft has introduced its Transaction Server (MTS), a more rigorous attempt to host such components in a transaction-oriented run-time environment. Details about these components and their deployment are stored in an Active Directory, and are used by MTS to manage them at run-time. MTS has yet to be proven as a truly scalable and portable environment, even across Microsoft platforms. However, it at least begins to seriously address in the Microsoft world some of the key areas covered by CORBA services.

For some time, CORBA specifications – and commercial implementations of these specifications — have existed to cover interoperability between CORBA and COM. These specifications mainly describe how CORBA interfaces can be mapped to COM interfaces, and vice versa. The most typical commercial use of this technology is to allow CORBA objects running on a server to appear as COM objects on a desktop. However, as both the CORBA and COM worlds have grown, it is becoming a priority to extend the standards to cover interoperability among services, such as transaction management and security. Most notably, Microsoft itself announced a partnership with Iona to integrate MTS with CORBA OTS, allowing objects to participate in transactions that span the two kinds of transaction managers. Microsoft has also announced support for the OMG’s UML and MOF specifications, meaning that tools and services in both the CORBA and DCOM environments will be able to (at least in theory) share common design and run-time information about objects.

Future Trends

Considering what has occurred over the last ten years, where can we expect CORBA technology to go in the next three or four years? First of all, we must remember that the challenges of improving or increasing business drive the evolution of information technology, and CORBA is certainly no exception. Corporations will continue to increase in size through massive mergers, and will need to support ever-increasing numbers of products, orders, transactions, and volumes of data. In addition, as the world gets smaller, they will require complex global distribution of information and functionality across heterogeneous platforms and networks, including an increasingly robust Internet.
All this will drive the current generation of CORBA products to become more robust in terms of administration, scalability, and ease of development. We will see CORBA applications deployed to tens of thousands of users, and CORBA frameworks of business objects supporting hundreds of related applications. These systems will be capable of supporting millions of distributed objects involved in thousands of concurrent transactions. The security capabilities of these systems will improve significantly in the next years as well. In essence, these future products will provide the back plane for true enterprise-wide computing based on software components. As these environments mature, development tools for building these interoperable components will become available as well, based on the emerging class of CORBA standards in this area, such as BOCA, the MOF, and CCM.
The next great challenge for OMG, and the software industry as a whole, will be standardization of the business components in each vertical domain. This is a more daunting task than the standardization of systems services that has largely characterized OMG to date. While end-users are more than happy to accept a standard interface to a request broker or a transaction manager, it will be interesting to see how quickly they can reach agreement on standard interfaces to a Customer, Account, Policy, or other business component that directly represents their business domain. However, until there is agreement on such common business objects, it will not be possible to build truly reusable, interoperable business components.

Fortunately, work has already begun within the OMG to define such common frameworks in a variety of business areas, including finance, insurance, manufacturing, healthcare, and transportation. This work is being driven by a number of far-sighted end-users and ISVs who see business component interoperability across their whole industry as a key to the long-term viability of their individual businesses. This is particularly true for industries with complex supply chains, and/or where mergers and acquisitions are major business drivers. The more these industries standardize on business components, the lower the overall cost of doing business for everyone.

Bibliography
1. Business Object Component Architecture (BOCA) Proposal, OMG Document bom/98-01-07, page 14.
2. CORBA Component Model: Multiple Interfaces and Composition, OMG TC Document orbos/97-12-21.
3. Combined Business Object Facility Interoperability Specification, OMG Document bom/98-01-10.
4. Dan Kara, “Selecting Java App Servers”, Object Magazine, June 1998.
5. Graphics and bootstrap code sample from BEA Iceberg EAP documentation.
6. Graphics from NetDynamics 4.0 whitepapers.

Harvard University: Parallel Database Systems and a Distributed Data Coherence Strategy

In the following paper I will explore how database applications attempt to exploit a variety of parallel architectures. I will look briefly at two approaches to the text search problem. Next, I will explore in more depth commercial Relational Database Management Systems (RDBMS), which sit in a layer below a variety of database applications. The database applications layered on top of the RDBMS might require high user and transaction volumes or lengthy batch-type processing. A scalable database application should support both scaleup and speedup. Scaleup enables the data volume to be increased along with the hardware without impacting execution time; speedup requires execution time to decrease as hardware is added [2]. It is important to note that it is the RDBMS which actually exploits the parallel hardware.

The hardware architectures I focus on include SMP architectures, clustered systems, loosely coupled systems, and combinations of the three. In each case, several software strategies address a particular hardware architecture. A detailed look at shared-memory utilization, use of a Distributed Lock Manager (DLM), and data partitioning follows. These strategies allow an RDBMS to utilize many nodes and/or processes in parallel.

These systems all provide some level of scalability by utilizing parallel hardware. At the same time, the strategies suffer from several drawbacks. A clustered system is limited by high contention for a single critical resource or process. There can be a large administrative complexity associated with data partitioning for loosely coupled machines. I will also propose a solution which builds on an area of extensive research.

Much work has been done on protocols to ensure cache coherence in shared-memory systems. Implementing these in the RDBMS layer would provide scalability across a large number of distributed computers. The RDBMS would take advantage of an existing network of data and application servers. A particular data server might run the RDBMS on a 32-node hypercube or on a single RISC processor. I propose that the RDBMS run in parallel across the application and data servers, merging the line between data servers and application servers. A directory-based protocol could be used to enforce distributed data coherence across the servers. The RDBMS would take advantage of commands directly supported by current or future hardware. Caching data on the application server would improve performance by minimizing remote server access.

2. THE TEXT SEARCH PROBLEM

A. The LEXIS Legal Decision Retrieval System

The LEXIS service provides the legal profession with timely search and retrieval operations on millions of legal documents. The system constructs very large index structures on all key words in each document [1]. This operation is very time and space intensive. Because of this, LEXIS is unable to address a quickly changing database. The scalability of LEXIS could improve with the use of several strategies employed by RDBMS. A parallel index build like the one implemented by Informix Software could allow LEXIS to take specific advantage of SMP architectures and support a more dynamic database. Additionally, LEXIS could support more users accessing a single database if it took advantage of a clustered system’s shared disk. This is especially true considering the application is read-only. If the application were read/write, it might use a DLM in order to take advantage of a clustered architecture. Both schemes are examined in greater detail below.

B. The Connectionist Machine Approach

While the Connection Machine (CM) does address a variety of applications, its massive architecture is especially applicable to the text search problem. The CM proves to be very scalable; both scaleup and speedup are supported by the text search implementation. Subdividing the entire task is very straightforward. The CM’s primary strategy revolves around distributing both articles and their required processing across the available nodes. Each node hashes its articles into vectors. The vectors are saved to disk for future use, but remain in the node’s memory for most of the search operation. The search criteria are combined into another vector. The relationship between the search and article vectors predicts a “hit” with 99.99% accuracy [1]. There is no path selection phase required by the search; a “full scan” of the data is performed each time. The optimization phase required by an RDBMS makes massive partitioning of data and processing much more difficult. A single relational database object cannot be effectively distributed at such a high granularity.
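
To make the vector idea concrete, the following C++ sketch (my own illustration, not CM code) hashes the words of an article into a fixed-width bit signature and tests whether a query signature could match it; the occasional false positives are why the prediction is probabilistic rather than exact.

#include <bitset>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>

// Illustrative signature matching: each article is hashed into a fixed-width
// bit vector; a query can only "hit" an article if every bit set in the query
// signature is also set in the article signature. Names and sizes are my own.
constexpr std::size_t kSignatureBits = 4096;
using Signature = std::bitset<kSignatureBits>;

Signature makeSignature(const std::string& text) {
    Signature sig;
    std::istringstream words(text);
    std::string word;
    std::hash<std::string> hasher;
    while (words >> word) {
        std::size_t h = hasher(word);
        sig.set(h % kSignatureBits);                     // set two bits per word
        sig.set((h / kSignatureBits) % kSignatureBits);  // to lower false positives
    }
    return sig;
}

bool possibleHit(const Signature& article, const Signature& query) {
    // Occasional false positives must still be verified against the article text.
    return (article & query) == query;
}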

3. RDBMS EXPLOITATION OF PARALLEL HARDWARE ARCHITECTURES

In this section I will focus on how RDBMS take advantage of SMP hardware, clustered systems, and loosely coupled systems, and point out various advantages and limitations. Database parallelism can take several forms. Relational operators map easily into a pipelined structure, but each stage might be lengthy and require individual completion. Search and sort are good examples of such operators. These operations can be subdivided if a combine phase is added to the operator. This type of parallelism is often called partitioned parallelism, as opposed to pipeline parallelism [2]. Because I/O plays such a significant role, partitioning data across disks increases throughput by allowing parallel access. This requires that CPU-intensive work be partitioned hand in hand with the data. This strategy is used with both SMP systems and loosely coupled systems.
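
As a small illustration of partitioned parallelism with a combine phase (my own sketch, not PDQ or Gamma code), each worker below sorts its own partition in parallel and the partial results are then merged:

#include <algorithm>
#include <iterator>
#include <thread>
#include <vector>

// Partitioned parallelism: sort each partition on its own thread, then run a
// combine phase that merges the sorted partitions into one result.
std::vector<int> partitionedSort(std::vector<std::vector<int>> partitions) {
    std::vector<std::thread> workers;
    for (auto& part : partitions)
        workers.emplace_back([&part] { std::sort(part.begin(), part.end()); });
    for (auto& w : workers)
        w.join();

    std::vector<int> result;                    // combine phase
    for (const auto& part : partitions) {
        std::vector<int> merged;
        std::merge(result.begin(), result.end(), part.begin(), part.end(),
                   std::back_inserter(merged));
        result = std::move(merged);
    }
    return result;
}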

A. Exploiting A Symmetric Multi Processor (SMP) Architecture

An RDBMS takes advantage of an SMP machine in several ways, all of which utilize the shared memory to cache data rows; I/O is minimized and concurrency is optimized. The individual CPUs implement their own internal cache coherence transparently to the RDBMS. Read/write isolation and concurrency at the table level are implemented by the RDBMS. Typically, read and write locks are supported via latches. The latches control access to a shared resource in memory. A process desiring a latched resource typically spins while another process is in a critical section. A single multithreaded database server can also be split up into virtual servers.
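
A latch of the kind described above can be sketched with an atomic flag; the code below only illustrates the spinning behavior and is not an actual RDBMS latch implementation:

#include <atomic>

// A process wanting the shared resource spins until the current holder leaves
// its critical section. Real latches add back-off and fall back to blocking waits.
class Latch {
    std::atomic<bool> locked_{false};
public:
    void acquire() {
        bool expected = false;
        while (!locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire)) {
            expected = false;   // another process is in its critical section; keep spinning
        }
    }
    void release() { locked_.store(false, std::memory_order_release); }
};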

Figure 1

One class of server, perhaps an optimizer, can operate while an I/O server is blocking. Additional servers of a particular class can be added when needed to increase throughput. This allows database use to scale without excessive overhead. Hardware parallelism is exploited to allow more applications to access the database simultaneously. Batch oriented decision support applications also can exploit the hardware parallelism of SMP platforms.

Informix Software and Sequent Computer Systems are involved in the joint development of a Parallel Data Query (PDQ) project. PDQ allows an index build operation to be serviced by numerous processes. First, many processes perform an optimized merge sort on the data to be indexed. Then these processes cooperatively manipulate the index structures in memory. Other processes write dirty pages to disk in synchronization with the manipulating processes. This feature has cut index build time by several orders of magnitude [6].

The project also applies partitioned parallelism to the sort, scan, and join operators [6]. At the heart of the system is an optimizer with detailed knowledge of how data is partitioned across disks. When a table is created, the partitioning method is specified. Data can be separated by range, by hashing, or in a round-robin fashion.

Figure 2

Round-robin partitioning works especially well for applications that need sequential access, while hashing is very effective if tables are to be joined in the retrieve operation [2]. The optimizer spawns processes based upon the query, the data partitioning, and the data distribution. These processes can communicate and share data via the hardware-supported shared memory. The PDQ project has realized large gains in both speedup and scaleup.
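
The three partitioning methods can be illustrated with a simple routing function (my own sketch; the real optimizer records the declared method per table and uses it when spawning processes):

#include <cstddef>
#include <functional>
#include <string>

enum class Method { Range, Hash, RoundRobin };

// Route a row to one of N partitions using the declared partitioning method.
std::size_t choosePartition(Method m, const std::string& key,
                            std::size_t rowCounter, std::size_t partitions) {
    switch (m) {
    case Method::Range:
        // Stand-in for user-declared range boundaries: contiguous key ranges
        // (here, ranges of the first character) map to contiguous partitions.
        return key.empty() ? 0
               : (static_cast<unsigned char>(key[0]) * partitions) / 256;
    case Method::Hash:
        return std::hash<std::string>{}(key) % partitions;
    case Method::RoundRobin:
    default:
        return rowCounter % partitions;
    }
}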

B. Loosely Coupled Architectures
The Gamma database system extends the communication and cooperation of PDQ to a loosely coupled architecture. The system currently runs on a 32-node Intel iPSC, where each node has its own private disk. All three partitioning methods mentioned above are supported, along with several hybrid strategies [2]. Communication between the nodes is handled by a combination of hardware and software routing. The system is more complicated than PDQ because it does not allow the same level of data sharing. In order to access a particular disk, a process must run on the node to which that disk is attached. The data must then be routed to the appropriate nodes. This creates a possible hot spot or contention point. Even so, Gamma displays excellent scaleup and speedup. Next I will look at another loosely coupled system.

The NCR 3600 is a highly configurable hardware platform. It can contain up to eight Application Processors (APs), several Parse Engines (PEs), and up to ten Access Module Processors (AMPs) with disk subsystems. Data is partitioned across the AMPs in a round-robin fashion. These components are connected by a high-speed Interconnection Network (IN) called the Y-NET [7].

Figure 3

The APs are collections of two to eight 486 processors. They support a global shared memory with a directory-based implementation; this is required instead of a snoopy cache protocol due to the dual-bus architecture of the Y-NET. The APs support virtual memory and RAID disk subsystems. The APs can exchange messages with TCP and UNIX signals, and unique process identifiers are supported across the APs. The APs are used to run applications making database requests. Together, the PEs and AMPs retrieve the data. The PEs determine the execution method for an SQL statement and which AMPs should search their disk subsystems. The AMPs send the data over the Y-NET, and it is presented to the AP as a single collection of data. The Y-NET uses a binary tree structure with dual channels.

Figure 4

It is able to combine data from up to ten AMPs and return it in a sorted fashion [7]. This specialized piece of hardware separates the 3600 from a generic hypercube. It speeds response time by implementing a basic database function at the hardware level.

C. Clustered Approaches

The NCR 3600 also supports another method of parallelism: up to four APs can be clustered around a single disk. Many hardware providers allow a single disk system to be accessed simultaneously from multiple computers; the computers must synchronize their activity to ensure integrity on the disk. This type of architecture was highly developed in Digital’s VAX Cluster. The technology enables a new computer to be added to the cluster so that more database users can access a single database. The size of the cluster is limited by the actual disk and bus hardware; current technology does not allow the system to scale much past ten nodes. A Distributed Lock Manager (DLM) is used to manage all I/O. The VAX’s lock manager supports system calls to enqueue, dequeue, and report the status of lock requests. It directly supports local buffer caching to decrease I/O in low-frequency-update situations. Alternatively, resource versioning is supported for high-update situations [3]. It is up to the application to choose the type of locking.
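
The shape of such a lock manager's interface might look like the sketch below. This is a hypothetical C++ rendering of the enqueue/dequeue/status calls described above, not the actual VMS system services:

#include <string>

enum class LockMode { Shared, Exclusive };
enum class LockStatus { Granted, Waiting, NotHeld };

// Hypothetical DLM interface: callers queue requests against named resources
// and can ask for the status of an outstanding request.
class LockManager {
public:
    virtual LockStatus enqueue(const std::string& resource, LockMode mode) = 0;
    virtual void dequeue(const std::string& resource) = 0;
    virtual LockStatus status(const std::string& resource) const = 0;
    virtual ~LockManager() = default;
};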

The Oracle Parallel Server (OPS) currently uses such a strategy. It is implemented on many clustered systems, including the Encore MultiMax. Because of its overhead, OPS minimizes its use of the DLM. It implements a local versioning strategy for read isolation [8]. OPS has also been optimized to generate unique sequence numbers across nodes and allows concurrent inserts on each node without contention during space allocation [4]. Even with these optimizations, scaleup and speedup are limited by the disk technology and by contention for the DLM. The fact remains that this is a well-understood technology and has historically enabled performance gains for an existing hardware solution [5].

The Sequoia is a fault-tolerant multiprocessor machine with shared-memory support. Cache coherence is supported with a non-write-through cache because of the special memory checkpointing required for fault tolerance. A dual bus supports up to 64 tightly coupled processing elements [10]. An RDBMS cannot use the Series 400 memory like a traditional SMP platform: two processes running on different processors cannot write to the same piece of memory efficiently, because of the non-write-through cache and the large block size used when writing memory from the cache to main memory. OPS can also take advantage of the Series 400’s many processors. Multiple instances of OPS are run on the different processors, each with its own private memory. All the instances are able to share a single database on disk, and the DLM is used to control parallel access to the database. The entire task must be split by the developer across the available instances of OPS. Even though the instances cannot use shared memory to store shared data, they are able to pass messages via the shared memory. A DLM would not be necessary if the disks were assigned to individual instances of the RDBMS. This type of system is more scalable than the clustered systems, but requires very complicated partitioning strategies. The partitioning is both critical to performance and a manual task.

4. DISTRIBUTED DATA COHERENCE

My proposal is an RDBMS which allows a single data page to be cached in the memory of many different computers. Data coherence could be enforced with an invalidation-based scheme. Each computer would also use its private disk to store actual data pages; the physical disks attached to each computer in the network would form a logical disk for the entire system. This would be conceptually similar to the distributed global memory of the DASH project, and the system would use similar notions of home and remote pages. The RDBMS would be able to take advantage of hardware-supported cache coherence commands.
In a shared-memory machine we have a fixed amount of memory and an easy way to reference it. Instead of recording the state of individual memory addresses, the RDBMS would ensure distributed data coherence by tracking the state of individual disk pages at their home nodes. Each computer’s optimizer would determine a query path based on a single global disk layout. The query optimizer would run on each computer and issue the appropriate coherency commands based upon the state and location of the required disk pages. This does increase the demands on the optimizer. Research could help determine what the optimal page size is for reads and write-through.
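
A directory entry kept at a page's home node might look like the sketch below; on a write, every other node caching the page is invalidated. The names, and the sendInvalidate messaging hook, are assumptions made for illustration:

#include <cstdint>
#include <set>
#include <unordered_map>

using PageId = std::uint64_t;
using NodeId = int;

// Assumed messaging primitive provided by the cluster transport layer.
void sendInvalidate(NodeId node, PageId page);

class PageDirectory {
    std::unordered_map<PageId, std::set<NodeId>> sharers_;
public:
    // A node reads the page: record it as a sharer.
    void recordRead(PageId page, NodeId reader) { sharers_[page].insert(reader); }

    // A node writes the page: invalidate every other cached copy, after which
    // the writer holds the only valid copy.
    void recordWrite(PageId page, NodeId writer) {
        for (NodeId n : sharers_[page])
            if (n != writer) sendInvalidate(n, page);
        sharers_[page] = {writer};
    }
};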

The optimizer could track where a disk page is accessed most often. The data on that page could be moved to the computer that uses it the most. In other words, certain rows might move their “home” location. This would provide an optimal data partitioning that could change along with actual access patterns. The partitioning would be automatic.

The DASH system must deal with potentially long write latency. The RDBMS gives us some relief from this problem. In a database application we can perform asynchronous disk writes as long as we know that we are writing to a log file in a synchronous manner. If each system does its synchronous writes to a “home” disk page, then we could avoid a long latency for the application.
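
The ordering argued for above can be sketched as follows; the two I/O routines are placeholders rather than a real storage engine:

#include <cstdint>
#include <future>
#include <string>
#include <vector>

// Placeholders for the storage engine's I/O routines (assumptions of this sketch).
void writeLogRecord(const std::string& record) { (void)record; /* synchronous, forced */ }
void writePage(std::uint64_t pageId, const std::vector<char>& bytes) { (void)pageId; (void)bytes; }

// The log record is forced to disk before control returns; the data page
// itself may be written lazily in the background without stalling the caller.
std::future<void> commitPageUpdate(std::uint64_t pageId,
                                   std::vector<char> newBytes,
                                   const std::string& logRecord) {
    writeLogRecord(logRecord);
    return std::async(std::launch::async,
                      [pageId, bytes = std::move(newBytes)] { writePage(pageId, bytes); });
}
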
Hopefully this paper has served as an introduction to existing database parallelism. The alternative proposed here could integrate both parallel and non-parallel RDBMS into a single distributed system: an RDBMS capable of caching data at many sites simultaneously.

Bibliography
[1] A. R. Hurson, et al., “Specialized Parallel Architectures for Textual Databases”, Advances in Computers, Vol. 30, 1990.
[2] D. DeWitt and J. Gray, “Parallel Database Systems: The Future …”, Communications of the ACM, Vol. 35, No. 6, June 1992.
[3] Digital Equipment Corp., “Lock Management Services”, Chapter 12, VMS Cluster Reference, 1990.
[4] Encore Computer Corp., “The Encore Infinity 90 Series and The Oracle Parallel Server”, SM-ECC-0992, 1992.
[5] Gartner Group, “The Rush to Clusters”, Research Note, March 15, 1993.
[6] Informix Software, Inc. and Sequent Corp., “Parallel Data Query”, Document 000-20182-70 or DB-1030, June 1991.
[7] NCR Corp., “NCR System 3600 Product Description”, NCR E&M San Diego ST-2119-91, 1992, page 1-1.
[8] Oracle Corp., “Oracle 7 for Digital Alpha AXP”, Part No. 53129.0293, February 1993.
[10] Sequoia Systems, Inc., “Series 400 Technical Summary”, Marlborough, Mass., June 1992, page 1-1.
Figure 2 page 86 of [2]
Figure 3 page 14 of [7]
Figure 4 page 28 of [7]

WALL STREET TECHNOLOGIES – Distributed Objects in Financial Services

Considered almost experimental just a few years ago, distributed object technologies have gained a strong foothold in the financial services industry. Expersoft, for example, has delivered distributed object infrastructures to such institutions as Chemical Bank, Goldman Sachs, Canadian Imperial Bank, American Express, Fidelity Investments, and National Australian Bank.
The financial services industry is ideal for this technology. Development projects often have a strong focus on component reuse, must support a variety of network communication models, and must deliver reliable code quickly. Distributed objects are a perfect tool for meeting all three of these requirements.

The following case studies and scenarios demonstrate how Expersoft’s PowerBroker products can be applied to a variety of financial applications, including electronic commerce, derivatives trading, portfolio management, foreign trading, real-time update notification, and shared calculation servers.

Case Study: Electronic Commerce and On-line Banking

More than $10 billion worth of retail banking transactions occurred electronically this year, and that number is expected to increase as customers grow more technically savvy and as financial services companies realize the potential cost savings. In creating these applications, distributed object technology can play a major role.

Consider the experience of a leading developer of personal finance software, which is using Expersoft’s PowerBroker tools to provide on-line electronic banking services. The system will allow individuals to keep track of expenses, write both standard and electronic checks, inquire about balances, and access other financial information. The service will ultimately support some of the largest commercial banks in the U.S., with a client base that is expected to grow rapidly over the next decade.

The diagram below outlines how the system will work. An individual’s computer gains dial-in access to a set of distributed services, some offered by the bank and others by the provider, which are included along with the home finance package. These services include security features, information distribution, and features which electronically interact with the supported banks.
All interfaces between the banks and the provider are defined in CORBA 2.0 IDL (Interface Definition Language), compiled by the PowerBroker IDL compiler, and accessed using remote method invocations supported by the PowerBroker ORB. The IDL in effect defines a service contract between the banks and the provider, with each implementing their respective responsibilities. The PowerBroker ORB ensures that service requests and responses are delivered to the proper parties, while the PowerBroker administration tools allow the banks and the provider to manage the various server processes. Expersoft products enable the software provider to deliver critical services to its customers that would have been difficult to implement using a traditional architecture.
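
From a C++ client, consuming a compiled IDL contract looks roughly like the sketch below. The BankingSession interface, its balance operation, and the generated header name are hypothetical; only the general CORBA C++ mapping calls are standard, and PowerBroker's generated code may differ in detail.

#include <iostream>
#include "BankingSession.hh"   // header generated by the IDL compiler (assumed name)

int main(int argc, char** argv) {
    // Initialize the ORB and obtain a reference to the remote service
    // (here from a stringified object reference passed on the command line).
    CORBA::ORB_var orb = CORBA::ORB_init(argc, argv);
    CORBA::Object_var obj = orb->string_to_object(argv[1]);
    BankingSession_var session = BankingSession::_narrow(obj);
    if (CORBA::is_nil(session))
        return 1;

    // The remote invocation reads like a local call; the ORB routes the
    // request to whichever bank or provider process implements the contract.
    CORBA::Double balance = session->balance("checking");
    std::cout << "Balance: " << balance << std::endl;
    return 0;
}
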
Multi-tier Cache and Data Servers

Financial applications must support large numbers of users who share data stored in different types of databases spread across the enterprise. Applications might also require real-time notification of data and events. In addition to accessing relatively static data stored in a database, these applications also need immediate access to highly volatile, time-critical data. Let us use a trading application as an example.

Many brokers sell the same stocks or bonds. In a typical system facilitating these sales, all client applications directly access the database. By contrast, a distributed object approach might utilize PowerBroker’s capability to distribute an arbitrary service as a remote object. Here, instead of going directly to a database, clients access a PowerBroker object server. The server provides distributed access to a trading service, which, in turn, accesses the database. Isolating the client application from the database in this way adds considerable flexibility and allows for additional functionality not supported directly by a database. Such an architecture is supported directly via the PowerBroker APIs, class libraries, preprocessor, and IDL compiler. It can be supported either directly through the CORBA 2.0 standard or through extended distributed object services.

In addition to accessing stored data, a trading service might also combine data from different databases or perform complex calculations. For example, a trade service might cache data, such as foreign currency exchange rates, across different trades. Such an architecture would yield significantly higher performance because an object-based service can make use of domain-specific knowledge not available in a generic database server. A PowerBroker-based service also enables the installation to change data models and database vendors easily at a later date. Moreover, developers can maintain and update the trade service without altering clients of the trade service. Once developed, the trade service can be leveraged across the entire enterprise.

In addition to accessing the relatively static data managed by the trading service, applications might need access to real-time data, such as a stock quote. This information might come from a market data service or an exchange rate service deployed alongside the trade service. Such object-based services can be leveraged across the entire enterprise, not by responding to requests, but by distributing data to applications as often as they require. This results in services that distribute data to client applications as notifications instead of as responses to requests. For example, when placing a foreign trade, the system need not retrieve the latest exchange rate from a request/response-based rate service. Instead, the system simply distributes the current exchange rates directly to the trading service.

Market data must often be distributed in this way as well. It doesn’t really make sense for applications to poll for the most recent prices. Rather, the market data server, which “knows” when prices have changed, should take the initiative in distributing this data, utilizing a data distribution platform such as Reuters or Telerate to obtain the latest prices.
PowerBroker’s object-based, event-driven messaging allows for the development of services that can interrupt interested clients when a particular event has occurred. This capability differentiates the PowerBroker ORB from almost all CORBA products on the market. Using a pure CORBA approach to provide this functionality can lead to increased application complexity, dangerous application polling, or costly performance bottlenecks associated with using a database as a communication facilitator.
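
The push model can be sketched generically as a subscription service that calls interested clients back when new data arrives. The code below illustrates the idea in plain C++ and is not the PowerBroker messaging API:

#include <functional>
#include <string>
#include <vector>

struct Quote { std::string symbol; double price; };

// Clients register a callback; the service interrupts them with new data
// instead of being polled for it.
class QuoteService {
    std::vector<std::function<void(const Quote&)>> subscribers_;
public:
    void subscribe(std::function<void(const Quote&)> callback) {
        subscribers_.push_back(std::move(callback));
    }
    void onPriceChange(const Quote& quote) {   // driven by the market data feed
        for (auto& notify : subscribers_) notify(quote);
    }
};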

Some Expersoft customers have utilized a request/response interface alongside event-based object messaging to develop very sophisticated services. Consider what happens when an exceptionally large trade is about to take place on a PowerBroker-based trading system. Since the trade service is shared by the entire sales force, it can be used to notify others of the trade in addition to responding to requests that originate from the sales applications. In doing so, the trading service can take advantage of the same notification capabilities that were used to develop the real-time data service described above. The PowerBroker object broadcast mechanism can be leveraged within the trade service to ensure that all connected sales applications are notified of critical orders placed by other sales people. Again, PowerBroker allows a complex service to be developed that might otherwise have proven too difficult to implement.

A similar application architecture is currently in production at a Wall Street bank (see diagram below). The architecture supports many users, provides quick response times, supports real-time update notification, and requires a minimal number of actual database connections. It allows many sales traders to run several applications on their individual workstations sharing common data. Because of this architecture, the applications running on a given workstation will not require excessive resources or present any hidden integrity problems. Even though the applications share common data and all have update capabilities, problems associated with duplicate data and a high degree of concurrency are avoided, as are excessive resource requirements associated with storing the same data in several applications.

The architecture consists of a data service and a feed service, each running on high-end SMP machines, and a cache service with several GUI applications running on each workstation. To place an order, a GUI application makes a request to its cache service. The cache service then submits the trade to the data service, which provides all Oracle access while offering increased performance, reduced resource requirements, and the elimination of integrity problems. The cache service runs on each client workstation and makes requests to the data service. GUI applications access the local cache service directly, while the cache service accesses the data service only when it needs data which is not cached.
In addition to servicing requests from the various cache services, the data service notifies all cache services if any of their cached data has changed. Similarly, a cache service will notify all GUIs if any of their displayed data has been changed at the request of another GUI application. In other words, if several GUI clients are displaying the same data and the cache service modifies the data at the request of one GUI, the other GUI clients are immediately notified with the new data as well. Because of these procedures, all displayed data is guaranteed to be current at all times. The three-tiered architecture, supporting both request/response and real-time notification, provides up-to-the-second information along with excellent performance and resource utilization.
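
In outline, the workstation-resident cache service behaves like the sketch below: reads go to the data service only on a cache miss, and remote update notifications are pushed on to every registered GUI. DataService and the string-keyed data model are stand-ins, not the production interfaces:

#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for the remote, SMP-hosted data service.
struct DataService {
    virtual std::string fetch(const std::string& key) = 0;
    virtual ~DataService() = default;
};

class CacheService {
    DataService& remote_;
    std::unordered_map<std::string, std::string> cache_;
    std::vector<std::function<void(const std::string&, const std::string&)>> guis_;
public:
    explicit CacheService(DataService& remote) : remote_(remote) {}

    // GUI applications read through the cache; the data service is contacted
    // only when the requested data is not cached.
    const std::string& get(const std::string& key) {
        auto it = cache_.find(key);
        if (it == cache_.end())
            it = cache_.emplace(key, remote_.fetch(key)).first;
        return it->second;
    }

    // Invoked when the data service notifies us that cached data has changed;
    // every registered GUI is pushed the new value immediately.
    void onRemoteUpdate(const std::string& key, const std::string& value) {
        cache_[key] = value;
        for (auto& notify : guis_) notify(key, value);
    }

    void registerGui(std::function<void(const std::string&, const std::string&)> callback) {
        guis_.push_back(std::move(callback));
    }
};
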
Enterprise-Wide Internal Infrastructure

The brokerage community has been among the first to reduce development times and costs by reusing components across various departments, thereby eliminating the need to build the same or similar systems more than once. This approach also ensures that different systems developed by different departments can interoperate.
Traditionally, installations standardize on a particular set of tools, only to discover that different applications still need to be heavily modified to support cross-department communication. At this point, the firm might decide to start an infrastructure group and mandate the use of its API across all departments. The infrastructure group is then responsible for delivering robust API class libraries to support general application development along with financial business services. General development requirements include GUI building, messaging, and DBMS access. Business services include trading, calculators, and portfolio management. The infrastructure group then implements the desired APIs or class libraries, leveraging commercial off-the-shelf software whenever it is available and satisfies their requirements.

Expersoft provides a number of products that accomplish this goal. Internal infrastructure groups at several of the largest financial services companies have implemented portions of their internal infrastructure on top of Expersoft products. A high level diagram of such an infrastructure is included below.

The distributed object paradigm allows business services to be built separately from the applications that use them. By providing a framework for distributing C++ classes and CORBA interfaces, PowerBroker allows services to be spread across different machines, programming languages, and network topologies. The distributed object paradigm supports a request/response model, which has proven very effective for calculation engines. PowerBroker supports the distributed object architecture defined by the CORBA standard, along with extended functionality for pass-by-value, asynchronous invocation, and real-time object messaging.
Typically a department develops components using the internal infrastructure framework and guidelines. The infrastructure group hides the PowerBroker APIs and class libraries within their own component development environment. Application developers can then build department solutions with the corporate framework. By utilizing the common framework and a component based methodology, developers ensure that department solutions will take advantage of currently deployed services. Departments can avoid rebuilding application components and allow other departments to leverage their development efforts. With a PowerBroker-developed framework, all corporate components support a high degree of scalability and provide the needed performance. PowerBroker administration tools can also be used to manage deployed components, with the PowerBroker replicated namespace helping ensure component availability while providing location transparency.

PowerBroker also supports a high-performance, object-based messaging and broadcast capability. Some firms have utilized this for real-time distribution of messages, updates, or notifications. One advantage of PowerBroker is its ability to deliver true object orientation. While other messaging tools let structures be passed between networked computers, PowerBroker allows messages to be composed of C++ class instances, allowing for easy integration with third-party or internal class libraries. Passing objects by value instead of passing structures allows a message to be easily re-implemented to provide different or enhanced functionality while still providing the same semantic operations. In addition, the application developer does not need to know the internal composition of the message. With object-based messaging, the object, not the application, “knows” how to implement its own functionality, a capability not found in many other messaging systems. PowerBroker object-based messaging provides robust capabilities for specifying priorities, managing event processing, and performing logical broadcasts.

Organizations can reap many benefits by ensuring that different departments use a mandated infrastructure built on top of PowerBroker, including the elimination of development costs associated with building application functionality more than once. The PowerBroker-enabled infrastructure will ensure that components can be leveraged across the organization. Additional benefits such as interoperability and reduced maintenance costs will be achieved as well.

Object Magazine – ORB AND ODBMS: THE PERFECT SOLUTION

With system designers, developers, and IT managers becoming more familiar with both Object Request Brokers (ORBs) and Object-Oriented Database Management Systems (ODBMS), many are looking for help differentiating the two. Should I use an ORB or an ODBMS? I hear this question over and over again. I usually begin by saying that the two technologies are different but highly complementary. I feel the question is not “Which technology should I use?” but “How can I use these two technologies together successfully?” While working for Expersoft, I have been involved with many leading-edge technologists who have successfully combined the two technologies to create highly competitive applications.

In most cases, combining both technologies is more powerful and flexible than using either technology alone. I like what John Rymer said in the May 1995 Seybold SnapShots, in an editorial about using databases with distributed object technology: “In a distributed object architecture, a database server plays a vital, but limited role. It provides a service to store and retrieve data… The distribution service is usually an object request broker (ORB). An ORB is a communications backbone.” Below I will discuss several application problems and how the combination of an ORB with an ODBMS can address them.
Using an ODBMS within an Object Server or Object Implementation
Is your application object-oriented? Would you like native database support to store and retrieve object state? Would you like services to be transparently accessible over a heterogeneous network? Embedding ODBMS access within a distributed object server or object implementation will solve these problems for you.

An ORB allows a particular process, an object server, to instantiate objects at the request of client applications. An object server might also explicitly instantiate objects itself and allow clients to connect their local object surrogate, or stub, to the previously instantiated object. The object server will then invoke methods on instantiated objects transparently when a client invokes a method on the local surrogate. An object server should be able to service instances of different classes. The actions that occur when a method is invoked on an object are defined within the implementation of that object, not within the object server.
Both the object server process and the actual object implementations can utilize an ODBMS to their benefit. An object implementation could utilize an ODBMS as a persistent store for that object’s internal or private information. For example, in a derivative trading application, a derivative object might be accessed remotely through an ORB. The ORB would allow calculations on the derivative object to be distributed to a calculation server. The implementation of the derivative object might utilize an ODBMS to persist a contained object which represents its current positions. The ODBMS ensures integrity, archiving, and recovery of this complex information. The developer would make extensive use of the ODBMS API when implementing the derivative object; the derivative object implementation would be considered an ODBMS client. Applications accessing the derivative object via the ORB would not need to access the ODBMS directly. This minimizes ODBMS access and isolates client applications from the underlying persistence mechanism.

Remote Object Faulting

Does your application require many distributed objects? In an object-oriented application, all services are provided through objects. An object server could have instantiated several thousand objects over a heterogeneous network. With this many objects, your application might require too much memory for a particular machine configuration. An ODBMS could be used to fault instantiated objects in and out of memory based on their actual runtime usage. An object server could access an ODBMS to retrieve an object which hasn’t been used recently. When a method call occurs, the object server checks whether that object is in memory; if not, it can retrieve the object from the ODBMS and then invoke the method for the client application. This functionality could be transparent to the developer, depending on the ORB implementation, or it could be built into the object server explicitly using the ODBMS API.
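
A sketch of that dispatch path might look like the code below; loadFromDatabase stands in for the actual ODBMS retrieval call and is an assumption of the example:

#include <memory>
#include <string>
#include <unordered_map>

class BusinessObject {
public:
    virtual void invoke(const std::string& method) = 0;
    virtual ~BusinessObject() = default;
};

// Assumed hook that reconstructs an object from the ODBMS.
std::unique_ptr<BusinessObject> loadFromDatabase(const std::string& objectId);

class ObjectServer {
    std::unordered_map<std::string, std::unique_ptr<BusinessObject>> inMemory_;
public:
    void dispatch(const std::string& objectId, const std::string& method) {
        auto it = inMemory_.find(objectId);
        if (it == inMemory_.end())   // fault the object in before servicing the call
            it = inMemory_.emplace(objectId, loadFromDatabase(objectId)).first;
        it->second->invoke(method);
    }
    // Evicting least-recently-used objects back to the ODBMS is omitted here.
};
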
A Unified Object Model

Would your application require an ORB to provide transparent location of object services and a database to provide reliable storage and retrieval of objects? One option is to write an object server yourself, using both the ORB API and the ODBMS API to provide a seamless service which accesses the underlying persistence mechanism. Another option is to have this persistence mechanism provided as part of the ORB. Expersoft provides this transparent service with ObjectStore, a leading object database.

Expersoft’s XShell product supports the transmission of objects by value. Full-fledged objects can be passed as parameters and return values to remote method calls; this can be thought of as a network deep copy. When would you want to pass an object by value? If a client application constructs a large matrix object through client interaction and wants to pass that matrix to a remote calculator object, it makes sense to pass the object by value. By modeling the object as an XShell ADT instead of as a regular C++ class, the XShell preprocessor is able to generate all the code required to stream the object by value across the network. The remote calculator object might want to store the matrix in an ODBMS so that in the future it doesn’t have to be sent across the network again.
The ObjectStore integration from Expersoft allows any ADT to be written to and read from standard memory or ObjectStore’s persistent memory. This allows a single object model to be used, whether between ORB clients and ORB servers or between ODBMS clients and ODBMS servers. For example, assume that a remote matrix server has been developed. A client can instantiate a local matrix server surrogate, connect it to the remote matrix server, and then invoke the retrieve method. The retrieve method might take a name as a parameter and return a matrix object by value. The implementation of the remote matrix retrieval object could read the matrix object out of ObjectStore’s persistent memory and return it directly by value to the client via the retrieve call. A unified object model makes it easy to use both technologies without having to map an ODBMS object to an ORB object.
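
For readers unfamiliar with pass-by-value semantics, the essential point is that the object's whole state crosses the wire and is reconstructed on the other side. The sketch below flattens a matrix into a byte stream by hand; XShell generates the equivalent streaming code from the ADT description, so the class shown here is purely illustrative:

#include <cstddef>
#include <cstdint>
#include <vector>

class Matrix {
    std::size_t rows_ = 0, cols_ = 0;
    std::vector<double> data_;
public:
    Matrix() = default;
    Matrix(std::size_t r, std::size_t c) : rows_(r), cols_(c), data_(r * c) {}
    double& at(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }

    // Flatten the full object state so it can travel as a remote parameter or
    // as the return value of retrieve().
    std::vector<std::uint8_t> serialize() const {
        std::vector<std::uint8_t> out;
        auto append = [&out](const void* p, std::size_t n) {
            const auto* bytes = static_cast<const std::uint8_t*>(p);
            out.insert(out.end(), bytes, bytes + n);
        };
        append(&rows_, sizeof rows_);
        append(&cols_, sizeof cols_);
        append(data_.data(), data_.size() * sizeof(double));
        return out;
    }
};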

COSS I – Persistence

Does your application require object state to be saved between invocations of your program? As part of the COSS 1 specifications, every ORB should provide a persistent object service. Could an ODBMS be used to support the COSS Persistence Service? Yes. An application might access the COSS Persistence Service through a product provided by an ODBMS vendor. Several vendors contributed to the actual specification, including Objectivity, Ontos, Persistence Software, and Versant. An application might use one vendor’s Persistence Service with another vendor’s ORB implementation. It is likely that many ORB vendors will provide implementations of the COSS Persistence Service, and it is possible that some of these will interface to a commercial ODBMS. Other ORB vendors will provide native implementations of the COSS Persistence Service without interfacing to an ODBMS.

Why would an application use an ODBMS once implementations of the COSS Persistence Service become available? The COSS Persistence Service does provide object storage and retrieval, but it does not support querying to the same extent as a true ODBMS. Additionally, the COSS Persistence Service is defined simply by its API; nowhere in the specification are implementation details specified. These details are left up to the vendor implementing the service. An implementation of the COSS Persistence Service is not required to provide archival or restore facilities; this type of functionality is considered an implementation detail.

Embedding an ODBMS within an ORB Implementation

An ORB can utilize an ODBMS to implement CORBA services. In this case, the use of the ODBMS is completely hidden from all ORB clients and exists for the convenience/benefit of the actual ORB implementation. This architecture is discussed for the sake of completeness.
An ORB implementation provides many services to ORB clients. Providing these services often requires the management of complex data, and the ORB could utilize an ODBMS for this data management. For example, the ORB needs to keep track of numerous client processes, numerous server processes, and possibly a very large number of specific connection identifiers. The ORB could be implemented to track this information via standard C structures, C++ objects, or ODBMS-aware objects. By utilizing an ODBMS, the ORB itself might reduce its memory requirements or provide better fault recovery. The ORB must also keep track of named objects. The ORB could use an internal, proprietary mechanism for storing naming information persistently, or it could utilize an ODBMS. Considering the simplicity of the naming information, an RDBMS could potentially be used for this as well.
If an ORB has been implemented using an ODBMS, it will probably not expose the ODBMS to its users; the ODBMS would be hidden from ORB developers behind the ORB’s API. The benefits obtained by using an ODBMS within the ORB implementation can also be obtained by using an ODBMS within a remote object server or object implementation.

Summary

In this article, I have investigated several different ways to take advantage of both ORB and ODBMS technology within a single distributed application. Most distributed applications can make use of ORB technology as the underlying communications infrastructure, while ODBMS technology can be used to store important object state. With this combination, your application can store objects, minimize resource usage, and execute remote methods. In addition, the integration between XShell and ObjectStore provides a unified object model which can be used across the ORB and the ODBMS. This can drastically reduce your development time.

Bibliography
John R. Rymer, “Heart of the Matter”, SnapShots, Patricia Seybold Group, May 1995, pages 6-7.