Resource-Aware Decomposition of Geoprocessing Services Based on Declarative Request Languages
- Distributed, service-oriented systems are now widely used for geospatial data access and processing. However, methods for easy, flexible, and automatic composition and orchestration of geo-service workflows remain hard to find. A promising approach to this problem is the Open Geospatial Consortium (OGC) Web Coverage Processing Service (WCPS). This service offers a multidimensional raster processing query language with formal semantics, and we believe that this language contains sufficient information for automatic orchestration. This thesis therefore investigates how to dynamically and efficiently distribute a WCPS-based web service request across several heterogeneous nodes as sub-requests, and how to manage the execution of those sub-requests and the aggregation of their results. Task distribution is based on, among other factors, individual node capabilities and availability, network capabilities, and source data locations. A key goal is to dynamically optimize a global service quality function while also considering a global service cost function. The thesis thus takes a highly interdisciplinary approach that combines results from geo-processing, distributed computing, workflow scheduling, service-oriented computing, and distributed query optimization and execution. We propose D-WCPS (Distributed WCPS), a framework in which a coverage processing query can be dynamically distributed among several WCPS servers. Servers can join the network by registering with any server in the network, and every server publishes its processing capacities and locally available data to the WCPS registry, which is mirrored across all servers. Likewise, each server in the framework can decompose a query into a distributed query and coordinate its execution using information from its WCPS registry.
Other contributions of this thesis include query optimization and decomposition algorithms; inter-operator, intra-operator, and inter-tuple parallelism methods for coverage processing; a cost model and server calibration for distributed coverage processing; a P2P-based orchestration model; and mirrored registry synchronization techniques. D-WCPS has been implemented and tested in clusters and clouds. Evaluation of our scheduling and distributed execution model shows remarkable speedups in the execution of different distributed WCPS queries. Several servers can therefore efficiently share data and computation for dynamic, resource-aware coverage processing.
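To make the idea of resource-aware task distribution concrete, the following minimal Python sketch assigns sub-requests to servers using a registry of processing speeds and locally available coverages. It is an illustration only, not the D-WCPS scheduling algorithm: it uses a simple greedy earliest-finish-time rule, and the names `Server`, `SubRequest`, `estimated_finish`, `schedule`, the cost terms, and the single uniform `bandwidth_mb_s` value are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One registry entry (hypothetical fields, not the D-WCPS schema)."""
    name: str
    ops_per_sec: float                            # assumed calibrated speed
    coverages: set = field(default_factory=set)   # coverages stored locally
    busy_until: float = 0.0                       # seconds until the server is free


@dataclass
class SubRequest:
    """One sub-request of a decomposed query (hypothetical fields)."""
    coverage: str      # source coverage the sub-request reads
    size_mb: float     # data volume to ship if the coverage is remote
    work_ops: float    # estimated processing work


def estimated_finish(server, sub, bandwidth_mb_s):
    """Estimated completion time: queue wait + data transfer + computation."""
    if sub.coverage in server.coverages:
        transfer = 0.0                            # data is already local
    else:
        transfer = sub.size_mb / bandwidth_mb_s   # ship data over the network
    compute = sub.work_ops / server.ops_per_sec
    return server.busy_until + transfer + compute


def schedule(subs, servers, bandwidth_mb_s=10.0):
    """Greedily place each sub-request on the server that finishes it first."""
    plan = []
    for sub in subs:
        best = min(servers, key=lambda s: estimated_finish(s, sub, bandwidth_mb_s))
        # Reserve the chosen server so later sub-requests see its new load.
        best.busy_until = estimated_finish(best, sub, bandwidth_mb_s)
        plan.append((sub.coverage, best.name))
    return plan


if __name__ == "__main__":
    servers = [
        Server("A", ops_per_sec=100.0, coverages={"ndvi"}),
        Server("B", ops_per_sec=400.0, coverages={"dem"}),
    ]
    subs = [
        SubRequest("ndvi", size_mb=500.0, work_ops=1000.0),
        SubRequest("dem", size_mb=100.0, work_ops=2000.0),
    ]
    print(schedule(subs, servers))
```

In this sketch the slower server A still receives the `ndvi` sub-request, because shipping 500 MB to the faster server B would cost more than B's speed advantage saves. This is the kind of trade-off between node capability, network capability, and data location that the abstract describes.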