Sharing the Wisdom of Computation and Philosophy http://blog.sciencenet.cn/u/huangfuqiang

Blog post

From Data Management to Information Integration: A Natural Evolution

Viewed 4026 times | 2010-6-22 16:37 | Category: Databases and Knowledge Bases | Section: Overseas Observations

Blogger's note: This article is an excellent summary, well worth reading and reflecting on.
Authors:

Mary Roth, Senior Engineer and Manager, IBM Silicon Valley Lab
Dan Wolfson, Senior Technical Staff Member and Manager, IBM Silicon Valley Lab
Summary:  The boundaries that have traditionally existed between DBMSs and other data sources are increasingly blurring, and there is a great need for an information integration solution that provides a unified view of all of these services. This article proposes a platform that extends a federated database architecture to support both relational and XML as first class data models, and tightly integrates content management services, workflow, messaging, analytics, and other enterprise application services.

© 2002 International Business Machines Corporation. All rights reserved.

Introduction

The explosion of the Internet and e-business in recent years has caused a secondary explosion of information. Industry analysts predict that more data will be generated in the next three years than in all of recorded history [INFO]. Enterprise business applications can respond to this information overload in one of two ways: they can bend and break under the sheer volume and diversity of such data, or they can harness this information and transform it into a valuable asset by which to gain a competitive advantage in the marketplace.

Because the adoption of Internet-based business transaction models has significantly outpaced the development of tools and technologies to deal with the information explosion, many businesses find themselves unintentionally using the former approach. Significant development resources are spent on quick and dirty integration solutions that cobble together different data management systems (databases, content management systems, enterprise application systems) and transform data from one format to another (structured, XML, byte streams). Revenue is lost when applications suffer from scalability and availability problems. New business opportunities are simply overlooked because the critical nuggets of information required to make a business decision are lost among the masses of data being generated.

In this article, we propose a technology platform and tools to harness the information explosion and provide an end-to-end solution for transparently managing both the volume and diversity of data that is in the marketplace today. We call this technology information integration. IBM provides a family of data management products that enable a systematic approach to solve the information integration challenges that businesses face today. Many of these products and technologies are showcased in the Information Integration technology demo.

The foundation of the platform is a state-of-the art database architecture that seamlessly provides both relational and native XML as first class data models. We believe that database technology provides the strongest foundation for an information integration platform for three significant reasons:

  • First, DBMSs have proven to be hugely successful in managing the information explosion that occurred in traditional business applications over the past 30 years. DBMSs deal quite naturally with the storage, retrieval, transformation, scalability, reliability, and availability challenges associated with robust data management.
  • Secondly, the database industry has shown that it can adapt quickly to accommodate the diversity of data and access patterns introduced by e-business applications over the past 6 years. For example, most enterprise-strength DBMSs have built-in object-relational support, XML capabilities, and support for federated access to external data sources.
  • Thirdly, there is a huge worldwide investment in DBMS technology today, including databases, supporting tools, application development environments, and skilled administrators and developers. A platform that exploits and enhances the DBMS architecture at all levels is in the best position to provide robust end-to-end information integration.

This paper is organized as follows:

  • We briefly review the evolution of the DBMS architecture.
  • We provide a real-world scenario that illustrates the scope of the information integration problem and sketches out the requirements for a technology platform.
  • We formally call out the requirements for a technology platform.
  • We present a model for an information integration platform that satisfies these requirements and provides an end-to-end solution to the integration problem as the next evolutionary step of the DBMS architecture.

Evolution of DBMS technology

Figure 1 captures the evolution of relational database technology. Relational databases were born out of a need to store, manipulate and manage the integrity of large volumes of data. In the 1960s, network and hierarchical systems such as [CODASYL] and IMS™ were the state-of-the-art technology for automated banking, accounting, and order processing systems enabled by the introduction of commercial mainframe computers. While these systems provided a good basis for the early systems, their basic architecture mixed the physical manipulation of data with its logical manipulation. When the physical location of data changed, such as from one area of a disk to another, applications had to be updated to reference the new location.

A revolutionary paper by Codd in 1970 [CODD] and its commercial implementations changed all that. Codd's relational model introduced the notion of data independence, which separated the physical representation of data from the logical representation presented to applications. Data could be moved from one part of the disk to another or stored in a different format without causing applications to be rewritten. Application developers were freed from the tedious physical details of data manipulation, and could focus instead on the logical manipulation of data in the context of their specific application.

Not only did the relational model ease the burden of application developers, but it also caused a paradigm shift in the data management industry. The separation between what and how data is retrieved provided an architecture by which the new database vendors could improve and innovate their products. [SQL] became the standard language for describing what data should be retrieved. New storage schemes, access strategies, and indexing algorithms were developed to speed up how data was stored and retrieved from disk, and advances in concurrency control, logging, and recovery mechanisms further improved data integrity guarantees [GRAY][LIND] [ARIES]. Cost-based optimization techniques [OPT] completed the transition from databases acting as an abstract data management layer to being high-performance, high-volume query processing engines.

As companies globalized and their data quickly became distributed among their national and international offices, the boundaries of DBMS technology were tested again. Distributed systems such as [R*] and [TANDEM] showed that the basic DBMS architecture could easily be exploited to manage large volumes of distributed data. Distributed data led to the introduction of new parallel query processing techniques [PARA], demonstrating the scalability of the DBMS as a high-performance, high-volume query processing engine.


Figure 1. Evolution of DBMS architecture

The lessons learned in extending the DBMS with distributed and parallel algorithms also led to advances in extensibility, whereby the monolithic DBMS architecture was replumbed with plug-and-play components [STARBURST]. Such an architecture enabled new abstract data types, access strategies and indexing schemes to be easily introduced as new business needs arose. Database vendors later made these hooks publicly available to customers as Oracle data cartridges, Informix® DataBlades®, and DB2® Extenders™.

Throughout the 1980s, the database market matured and companies attempted to standardize on a single database vendor. However, the reality of doing business generally made such a strategy unrealistic. From independent departmental buying decisions to mergers and acquisitions, the scenario of multiple database products and other management systems in a single IT shop became the norm rather than the exception. Businesses sought a way to streamline the administrative and development costs associated with such a heterogeneous environment, and the database industry responded with federation. Federated databases [FED] provided a powerful and flexible means for transparent access to heterogeneous, distributed data sources.

We are now in a new revolutionary period enabled by the Internet and fueled by the e-business explosion. Over the past six years, Java™ and XML have become the vehicles for portable code and portable data. To adapt, database vendors have been able to draw on earlier advances in database extensibility and abstract data types to quickly provide object-relational data models [OR], mechanisms to store and retrieve relational data as XML documents [XTABLES], and XML extensions to SQL [SQLX].

The ease with which complex Internet-based applications can be developed and deployed has dramatically accelerated the pace of automating business processes. The premise of our paper is that the challenge facing businesses today is information integration. Enterprise applications require interaction not only with databases, but also content management systems, data warehouses, workflow systems, and other enterprise applications that have developed on a parallel course with relational databases. In the next section, we illustrate the information integration challenge using a scenario drawn from a real-world problem.


Scenario

To meet the needs of its high-end customers and manage high-profile accounts, a financial services company would like to develop a system to automate the process of managing, augmenting and distributing research information as quickly as possible. The company subscribes to several commercial research publications that send data in the Research Information Markup Language (RIXML), an XML vocabulary that combines investment research with a standard format to describe the report's meta data [RIXML]. Reports may be delivered via a variety of mechanisms, such as real-time message feeds, e-mail distribution lists, web downloads and CD-ROMs.


Figure 2. Financial services scenario

Figure 2 shows how such research information flows through the company.

  1. When a research report is received, it is archived in its native XML format.
  2. Next, important meta data such as company name, stock price, earnings estimates, etc., is extracted from the document and stored in relational tables to make it available for real-time and deep analysis.
  3. As an example of real-time analysis, the relational table updates may result in database triggers being fired to detect and recommend changes in buy/sell/hold positions, which are quickly sent off to equity and bond traders and brokers. Timeliness is of the essence to this audience and so the information is immediately replicated across multiple sites. The triggers also initiate e-mail notifications to key customers.
  4. As an example of deep-analysis, the original document and its extracted meta data are more thoroughly analyzed, looking for such keywords as "merger", "acquisition" or "bankruptcy" to categorize and summarize the content. The summarized information is combined with historical information made available to the company's market research and investment banking departments.
  5. These departments combine the summarized information with financial information stored in spreadsheets and other documents to perform trend forecasting, and to identify merger and acquisition opportunities.
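Steps 1 and 2 of the flow above can be sketched with an in-memory database: archive the report in its native XML format, then extract key meta data into a relational table for analysis. This is only an illustrative sketch; the tag names are made up, and a real RIXML document is far richer.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A simplified stand-in for a RIXML research report (tag names illustrative).
report_xml = """<researchReport>
  <company>Acme Corp</company>
  <stockPrice>42.50</stockPrice>
  <rating>buy</rating>
</researchReport>"""

conn = sqlite3.connect(":memory:")

# Step 1: archive the document in its native XML format.
conn.execute("CREATE TABLE report_archive (doc TEXT)")
conn.execute("INSERT INTO report_archive VALUES (?)", (report_xml,))

# Step 2: extract important meta data into a relational table.
conn.execute("""CREATE TABLE report_meta (
    company TEXT, stock_price REAL, rating TEXT)""")
root = ET.fromstring(report_xml)
conn.execute(
    "INSERT INTO report_meta VALUES (?, ?, ?)",
    (root.findtext("company"),
     float(root.findtext("stockPrice")),
     root.findtext("rating")),
)

row = conn.execute("SELECT company, rating FROM report_meta").fetchone()
print(row)  # ('Acme Corp', 'buy')
```

Once the meta data sits in a relational table, the real-time and deep-analysis steps that follow can use ordinary SQL over it.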

Requirements

To build the financial services integration system on today's technology, a company must cobble together a host of management systems and applications that do not naturally coexist. DBMSs, content management systems, data mining packages and workflow systems are commercially available, but the company must develop software in-house to integrate them. A database management system can handle the structured data, but XML repositories are only now becoming available on the market. Each time a new data source is added or the information must flow to a new target, the customer's home-grown solution must be extended.

The financial services example above and others like it show that the boundaries that have traditionally existed between DBMSs, content management systems, mid-tier caches, and data warehouses are increasingly blurring, and there is a great need for a platform that provides a unified view of all of these services. We believe that a robust information integration platform must meet the following requirements:

  • Seamless integration of structured, semi-structured, and unstructured data from multiple heterogeneous sources. Data sources include data storage systems such as databases, file systems, real time data feeds, and image and document repositories, as well as data that is tightly integrated with vertical applications such as SAP or Calypso. There must be strong support for standard meta-data interchange, schema mapping, schema-less processing, and support for standard data interchange formats. The integration platform must support both consolidation, in which data is collected from multiple sources and stored in a central repository, and federation, in which data from multiple autonomous sources is accessed as part of a search, but is not moved into the platform itself. As shown in the financial services example, the platform must also provide transparent transformation support to enable data reuse by multiple applications.
  • Robust support for storing, exchanging, and transforming XML data. For many enterprise information integration problems, a relational data model is too restrictive to be effectively used to represent semi-structured and unstructured data. It is clear that XML is capable of representing more diverse data formats than relational, and as a result it has become the lingua franca of enterprise integration. Horizontal standards such as [EBXML] [SOAP], etc., provide a language for independent processes to exchange data, and vertical standards such as [RIXML] are designed to handle data exchange for a specific industry. As a result, the technology platform must be XML-aware and optimized for XML at all levels. A native XML store is absolutely necessary, along with efficient algorithms for XML data retrieval. Efficient search requires XML query language support such as [SQLX] and [XQuery].
  • Built-in support for advanced search capabilities and analysis over integrated data. The integration platform must be bilingual. Legacy OLTP and data warehouses speak SQL, yet integration applications have adopted XML. Content management systems employ specialized APIs to manage and query a diverse set of artifacts such as documents, music, images, and videos. An inverse relationship naturally exists between overall system performance and the path length between data transformation operations and the source of the data. As a result, the technology platform must provide efficient access to data regardless of whether it is locally managed or generated by external sources, and whether it is structured or unstructured. Data to be consolidated may require cleansing, transformation and extraction before it can be stored. To support applications that require deep analysis such as the investment banking department in the example above, the platform must provide integrated support for full text search, classification, clustering and summarization algorithms traditionally associated with text search and data mining.
  • Transparently embed information access in business processes. Enterprises rely heavily on workflow systems to choreograph business processes. The financial services example above is an example of a macroflow, a multi-transaction sequence of steps that capture a business process. Each of these steps may in turn be a microflow, a sequence of steps executed within a single transaction, such as the insert of extracted data from the research report and the database trigger that fires as a result. A solid integration platform must provide a workflow framework that transparently enables interaction with multiple data sources and applications. Additionally, many business processes are inherently asynchronous. Data sources and applications come up and go down on a regular basis. Data feeds may be interrupted by hardware or network failures. Furthermore, end users such as busy stock traders may not want to poll for information, but instead prefer to be notified when events of interest occur. An integration platform must embed messaging, web services and queuing technology to tolerate sporadic availability, latencies and failures in data sources and to enable application asynchrony.
  • Support for standards and multiple platforms. It goes without saying that an integration platform must run on multiple platforms and support all relevant open standards. The set of data sources and applications generating data will not decrease, and a robust integration platform must be flexible enough to transparently incorporate new sources and applications as they appear. Integration with OLTP systems and data warehouses requires strong support for traditional SQL. To be an effective platform for business integration, it must also support emerging cross-industry standards such as [SQLX] and [XQuery], as well as standards supporting vertical applications such as [RIXML].
  • Easy to use and maintain. Customers today already require integration services and have pieced together in-house solutions to integrate data and applications, and these solutions are costly to develop and maintain. To be effective, a technology platform to replace these in-house solutions must reduce development and administration costs. From both an administrative and development point of view, the technology platform should be as invisible as possible. The platform should include a common data model for all data sources and a consistent programming model. Metadata management and application development tools must be provided to assist administrators, developers, and users in both constructing and exploiting information integration systems.

Architecture

Figure 3 illustrates our proposal for a robust information integration platform.

  • The foundation of the platform is the data tier, which provides storage, retrieval and transformation of data from base sources in different formats. We believe that it is crucial to base this foundation layer upon an enhanced full-featured federated DBMS architecture.
  • A services tier built on top of the foundation draws from content management systems and enterprise integration applications to provide the infrastructure to transparently embed data access services into enterprise applications and business processes.
  • The top tier provides a standards-based programming model and query language to the rich set of services and data provided by the data and services tiers.

Figure 3. An information integration platform

The data tier

As shown in the figure, the data tier is an enhanced high performance federated DBMS. We have already described the evolution of the DBMS as a robust, high-performance and extensible technology for managing structured data. We believe that a foundation based on a DBMS architecture allows us to exploit and extend these key advances to semi-structured and unstructured data.

Storage and retrieval. Data may be stored as structured relational tables, semi-structured XML documents, or in unstructured formats such as byte streams, scanned documents, and so on. Because XML is the lingua franca of enterprise applications, a first class XML repository that stores and retrieves XML documents in their native format is an integral component of the data tier. This repository is a true native XML store that understands and exploits the XML data model, not just a rehashed relational record manager, index manager, and buffer manager. It can act as a repository for XML documents as well as a staging area to merge and consolidate federated data. In this role, meta data about the XML data is as critical as the XML data itself. This hybrid XML/relational storage and retrieval infrastructure not only ensures high performance and data durability for both data formats, but also provides the 24x7 availability and extensive administrative capabilities expected of enterprise database management systems.

Federation. In addition to a locally-managed XML and relational data store, the data tier exploits federated database technology with a flexible wrapper architecture to integrate external data sources [WRAP]. The external data sources may be traditional data servers, such as external databases, document management systems, and file systems, or they may be enterprise applications such as CICS® or SAP, or even an instance of a workflow. These sources may in turn serve up structured, semi-structured or unstructured data.
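The wrapper idea can be illustrated with a small sketch: each external source, whatever its native form, implements one common search contract, so a federated engine can fan a query out to all registered sources uniformly. The class and method names here are hypothetical, not the actual [WRAP] interfaces.

```python
from abc import ABC, abstractmethod

class Wrapper(ABC):
    """Hypothetical wrapper contract shared by all external sources."""
    @abstractmethod
    def search(self, predicate):
        """Return rows from this source matching the predicate."""

class FileWrapper(Wrapper):
    """Wraps a flat-file source, e.g. rows parsed from a document repository."""
    def __init__(self, rows):
        self.rows = rows
    def search(self, predicate):
        return [r for r in self.rows if predicate(r)]

class FeedWrapper(Wrapper):
    """Wraps a real-time feed, e.g. buffered messages from a data provider."""
    def __init__(self, feed):
        self.feed = feed
    def search(self, predicate):
        return [m for m in self.feed if predicate(m)]

# The federated engine fans one query out to every registered wrapper.
sources = [
    FileWrapper([{"company": "Acme Corp", "price": 42.5}]),
    FeedWrapper([{"company": "Widget Inc", "price": 7.0}]),
]
hits = [row for s in sources for row in s.search(lambda r: r["price"] > 10)]
print(hits)  # [{'company': 'Acme Corp', 'price': 42.5}]
```

The point of the contract is that adding a new source means writing one new wrapper class, not changing the engine or the applications that query it.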

The services tier

The services tier draws on features from enterprise application integration systems and content management systems, and exploits the enhanced data access capabilities enabled by the data tier to provide embedded application integration services.

Query processing. In addition to providing storage and retrieval services for disparate data, the data tier provides sophisticated query processing and search capabilities. The heart of the data tier is a sophisticated federated query processing engine that is as fluent with XML and object-relational queries as it is with SQL. Queries may be expressed in SQL, SQLX, or XQuery and data may be retrieved as either structured data or XML documents. The federated query engine provides functional compensation to extend full query and analytic capabilities over data sources that do not provide such native operations, and functional extension to enable extended capabilities such as market trend analysis or biological compound similarity search.

In addition to standard query language constructs, native functions that integrate guaranteed message delivery with database triggers [MQDB2] allow notifications to fire automatically based on database events, such as the arrival of a new nugget of information from a real-time data feed.
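A minimal sketch of this event-driven pattern, using a SQLite trigger and a table standing in for the message queue (the actual mechanism couples DB2 triggers with guaranteed MQSeries delivery [MQDB2]; table and column names here are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE estimates (company TEXT, rating TEXT);
CREATE TABLE outbox (msg TEXT);  -- stand-in for a message queue

-- Fire a notification whenever a new 'sell' rating arrives.
CREATE TRIGGER notify_sell AFTER INSERT ON estimates
WHEN NEW.rating = 'sell'
BEGIN
    INSERT INTO outbox VALUES ('SELL alert: ' || NEW.company);
END;
""")

# The arrival of data is the event; no application polling is involved.
conn.execute("INSERT INTO estimates VALUES ('Acme Corp', 'hold')")
conn.execute("INSERT INTO estimates VALUES ('Widget Inc', 'sell')")

messages = [r[0] for r in conn.execute("SELECT msg FROM outbox")]
print(messages)  # ['SELL alert: Widget Inc']
```

In the platform described here, the outbox would be a real queue with guaranteed delivery, so a subscriber such as a trader's desktop receives the alert even if it was offline when the trigger fired.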

Text search and mining. Web crawling and document indexing services are crucial to navigate the sea of information and place it within a context usable for enterprise applications. The services tier exploits the federated view of data provided by the data tier to provide combined parametric and full text search over original and consolidated XML documents and extracted meta data. Unstructured information must be analyzed and categorized to be of use to an enterprise application, and for real-time decisions, the timeliness of the answer is a key component of its quality. The technology platform integrates services such as Intelligent Miner for Text to extract key information from a document and create summaries, categorize data based on predefined taxonomies, and cluster documents based on knowledge that the platform gleans automatically from document content. Built-in scoring capabilities such as Intelligent Miner Scoring integrated into the query language [SQLMM] turn interesting data into actionable data.

Versioning and meta data management. As business applications increasingly adopt XML as the language for information exchange, vast numbers of XML artifacts, such as XML schema documents, DTDs, Web service description documents, etc., are being generated. These documents are authored and administered by multiple parties in multiple locations, quickly leading to a distributed administration challenge. The services tier includes a WebDAV-compliant XML Registry to easily manage XML document life cycle and meta data in a distributed environment [WebDAV] [XRR]. Features of the registry include versioning, locking, and name space management.

Digital asset management. Integrated digital rights management capabilities and privilege systems are essential for controlling access to the content provided by the data tier. To achieve these goals, the information integration platform draws on a rich set of content management features (such as that provided in IBM Content Manager) to provide integrated services to search, retrieve and rank data in multiple formats such as documents, video, audio, etc., multiple languages, and multi-byte character sets, as well as to control and track access to those digital assets.

Transformation, replication and caching. Built-in replication and caching facilities [CACHE] and parallelism provide transparent data scalability as the enterprise grows. Logic to extract and transform data from one format to another can be built on top of constraints, triggers, full text search, and the object relational features of today's database engines. By leveraging these DBMS features, data transformation operations happen as close to the source of data as possible, minimizing both the movement of data and the code path length between the source and target of the data.

The Application Interface

The top tier visible to business applications is the application interface, which consists of both a programming interface and a query language.

Programming Interface. A foundation based on a DBMS enables full support of traditional programming interfaces such as ODBC and JDBC, easing migration of legacy applications. Such traditional APIs are synchronous and not well-suited to enterprise integration, which is inherently asynchronous. Data sources come and go, multiple applications publish the same services, and complex data retrieval operations may take extended periods of time. To simplify the inherent complexities introduced by such a diverse and data-rich environment, the platform also provides an interface based on Web services ([WSDL] and [SOAP]). In addition, the platform includes asynchronous data retrieval APIs based on message queues and workflow technology [MQ] [WORKFLOW] to transparently schedule and manage long running data searches.

Query Language. As with the programming interface, the integration platform enhances standard query languages available for legacy applications with support for XML-enabled applications. [XQuery] is supported as the query language for applications that prefer an XML data model. [SQLX] is supported as the query language for applications that require a mixed data model as well as legacy OLTP-type applications. Regardless of the query language, all applications have access to the federated content enabled by the data tier. An application may issue an XQuery request to transparently join data from the native XML store, a local relational table, and data retrieved from an external server. A similar query could be issued in SQLX by another (or the same) application.
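The cross-model join described above can be sketched in miniature: relational holdings on one side, a document from the XML store on the other, joined on company name. In the platform this would be a single XQuery or SQLX request handled transparently; here the join is spelled out by hand, with illustrative tag and table names.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Relational side: a local table of account holdings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holdings (company TEXT, shares INTEGER)")
conn.execute("INSERT INTO holdings VALUES ('Acme Corp', 100)")

# XML side: a research report from the native XML store (tags illustrative).
doc = ET.fromstring(
    "<report><company>Acme Corp</company><rating>buy</rating></report>")

# Join the two worlds on company name.
company, rating = doc.findtext("company"), doc.findtext("rating")
joined = [
    {"company": c, "shares": s, "rating": rating}
    for c, s in conn.execute(
        "SELECT company, shares FROM holdings WHERE company = ?", (company,))
]
print(joined)  # [{'company': 'Acme Corp', 'shares': 100, 'rating': 'buy'}]
```

The value of the federated engine is precisely that applications never write this glue: they state the join declaratively and the platform locates, retrieves, and combines the data.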


Conclusion

The explosion of information made available to enterprise applications by the broad-based adoption of Internet standards and technologies has introduced a clear need for an information integration platform to help harness that information and make it available to enterprise applications. The challenges for a robust information integration platform are steep. However, the foundation to build such a platform is already on the market. DBMSs have demonstrated over the years a remarkable ability to manage and harness structured data, to scale with business growth, and to quickly adapt to new requirements. We believe that a federated DBMS enhanced with native XML capabilities and tightly coupled enterprise application services, content management services and analytics is the right technology to provide a robust end-to-end solution.


Resources

  • [ARIES] C. Mohan, et al.: ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging., TODS 17(1): 94-162 (1992).

  • [CACHE] C. Mohan: Caching Technologies for Web Applications, A Tutorial at the Conference on Very Large Databases (VLDB), Rome, Italy, 2001.

  • [CODASYL] ACM: CODASYL Data Base Task Group April 71 Report, New York, 1971.

  • [CODD] E. Codd: A Relational Model of Data for Large Shared Data Banks. ACM 13(6):377-387 (1970).

  • [EBXML] http://www.ebxml.org.

  • [FED] J. Melton, J. Michels, V. Josifovski, K. Kulkarni, P. Schwarz, K. Zeidenstein: SQL and Management of External Data, SIGMOD Record 30(1):70-77, 2001.

  • [GRAY] Gray, et al.: Granularity of Locks and Degrees of Consistency in a Shared Database., IFIP Working Conference on Modelling of Database Management Systems, 1-29, AFIPS Press.

  • [INFO] P. Lyman, H. Varian, A. Dunn, A. Strygin, K. Swearingen: How Much Information? at http://www.sims.berkeley.edu/research/projects/how-much-info/.

  • [LIND] B. Lindsay, et. al: Notes on Distributed Database Systems. IBM Research Report RJ2571, (1979).

  • [MQ] IBM MQSeries Integrator 2.0: The Next Generation Message Broker, http://www.ibm.com/software/ts/mqseries/library/whitepapers/mqintegrator/msgbrokers.html.

  • [MQDB2] D. Wolfson: Using MQSeries from DB2 Applications, IBM Corporation, 2001, at http://www.ibm.com/developerworks/db2/library/techarticle/wolfson/0108wolfson.html.

  • [OPT] P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, T. Price: Access Path Selection in a Relational Database Management System. In Proc.of the 1979 ACM SIGMOD International Conference on Management of Data, Boston, Massachusetts, 1979, pp 23-34.

  • [OR] ANSI/NCITS X3.135.10-1998 15-DEC-1998 Database Languages - SQL - Part 10: Object Language Bindings (SQL/OLB), 1998 and ISO/IEC 9075-10:2000 Information technology -- Database languages -- SQL -- Part 10: Object Language Bindings (SQL/ OLB), 2000.

  • [PARA] D. DeWitt and J. Gray: Parallel Database Systems: The Future of High Performance Database Systems, ACM 35(6): 85-98, 1992.

  • [RIXML] RIXML Specification Users Guide & Data Dictionary Report. RIXML.org, 2001, at http://www.rixml.org.

  • [RTREE] A. Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching, SIGMOD 84, p 47-57.

  • [R*] R. Williams, et. al.: R*: An Overview of The Architecture, IBM Research, San Jose, Ca., RJ3325, December 1981.

  • [SCORE]Intelligent Miner Scoring: Administration and Programming for DB2, International Business Machines Corp, 2001, at http://publib.boulder.ibm.com/epubs/pdf/idmr0a00.pdf.

  • [SOAP] SOAP Version 1.2 Part 1: Messaging Framework, W3C Working Draft 2 October 2001, at http://www.w3c.org/TR/soap12-part1/, and SOAP Version 1.2 Part 2: Adjuncts, W3C Working Draft 2 October 2001, at http://www.w3c.org/TR/soap12-part2/.

  • [SQL] ISO International Standard (IS), Information technology - Database language SQL - Part 1: Framework (SQL/Framework), ISO/IEC 9075-1:1999, July 1999 and ISO International Standard (IS), Information technology - Database language SQL - Part 2: Foundation (SQL/Foundation), ISO/IEC 9075-2:1999, July 1999.

  • [SQLMM] SQL Multimedia and Application Packages, ISO/IEC 13249-6:200x Part 6: Data Mining, Final Committee Draft, 2001.

  • [SQLX] Andrew Eisenberg, Jim Melton: SQL/XML and the SQLX Informal Group of Companies. SIGMOD Record 30(3): 105-108 (2001).

  • [STARBURST] L. Haas, et. al: Starburst Mid-Flight: As the Dust Clears, TKDE 2(1): 143-160, 1990.

  • [SYSTEMR] M. Astrahan, et. al: System R: A Relational Approach to Database Management., TODS 1(2): 97-137 (1976).

  • [TANDEM] Tandem Database Group: NonStop SQL: A Distributed, High-Performance, High-Availability Implementation of SQL, HPTS:60-104, 1987.

  • [WebDAV] Web-based Distributed Authoring and Versioning, http://www.webdav.org.

  • [WORKFLOW] Frank Leymann, Dieter Roller, Andreas Reuter: Production Workflow: Concepts and Techniques. Prentice Hall PTR, Upper Saddle River, NJ. 2000. IBM white paper: e-business Process Automation.

  • [WRAP] M. Tork Roth and P. Schwarz. "Don't scrap it, wrap it! A wrapper architecture for legacy data sources". In Proc. Of the Conf. On Very Large Data Bases (VLDB), Athens, Greece, August 1997.

  • [WSDL] E. Christensen, F. Curbera, G. Meredith, S. Weerawarana: Web Services Description Language (WSDL) 1.1., W3C Note 15 March 2001, at http://www.w3.org/TR/wsdl.

  • [XQuery] D. Chamberlin, J. Clark, D. Florescu, J. Robie, J. Siméon, M. Stefanescu: XQuery 1.0: An XML Query Language. W3C Working Draft 2001, at http://www.w3.org/TR/xquery/.

  • [XRR] XML Registry/Repository, at http://www.ibm.com/alphaworks.

  • [XTABLES] C. Fan, J. Funderburk, J. Kiernan, H. Lam, E. Shekita, J. Shanmugasundaram, XTABLES: Bridging Relational Technology and XML, at http://www.ibm.com/developerworks/db2/library/techarticle/0203shekita/0203shekita.pdf.

About the authors

Mary Roth is a senior engineer and manager in the Database Technology Institute for e-Business at IBM's Silicon Valley Lab. She has over 12 years of experience in database research and development. As a researcher at the Almaden Research Center, she contributed key advances in heterogeneous data integration techniques and federated query optimization and led efforts to implement federated database support in DB2. Mary is leading a team of developers to deliver a key set of components for Xperanto, IBM's information integration initiative for distributed data access and integration.

Dan Wolfson is a Senior Technical Staff Member and manager in the IBM Database Technology Institute for e-Business. With more than 15 years of experience in distributed computing, Dan's interests have ranged broadly across databases, messaging, and transaction systems. Dan is a lead architect for Xperanto, focusing on DB2 integration with WebSphere, MQ Series®, workflow, Web services, and asynchronous client protocols.

Source: http://www.ibm.com/developerworks/data/library/techarticle/0206roth/0206roth.html#ibm-pcon

https://m.sciencenet.cn/blog-89075-337806.html
