I am exploring computational models and practical methods for assembling heterogeneous distributed systems. This research involves building a specialised library operating system kernel that supports the integration of such systems across virtualisation platforms. I shall post information of significance here as my research progresses.
Modern science relies on workflow technology to capture, process, and analyse data obtained from scientific instruments. Scientific workflows are precise descriptions of experiments in which multiple computational tasks are coordinated based on the dataflows between them. These workflows are commonly composed of services that perform computation over geographically distributed resources. Orchestrating them presents a significant research challenge: they are typically executed such that all data pass through a centralised compute server known as the engine, which generates unnecessary network traffic and creates a performance bottleneck. Centralised orchestration therefore does not scale when coordinating services dispersed across distant geographical locations.
My doctoral research focused on the construction of decentralised service-oriented orchestration systems. It produced a scalable system that relies on a high-level data coordination language for the specification and execution of workflows. The architecture consists of distributed engines, each responsible for executing part of the overall workflow. The system exploits parallelism in the workflow by decomposing it into smaller sub-workflows, and uses computation placement analysis to determine the most appropriate engines to execute them. This permits the workflow logic to be placed closer to the services providing the data, which reduces the overall data transfer in the workflow and improves its execution time. Decentralised orchestration relies on a data-driven approach in which each sub-workflow is executed as soon as the data it needs become available from other sources. Hence, no separate scheduling mechanism is required to manage the order in which the sub-workflows are orchestrated.
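
The data-driven idea described above can be illustrated with a minimal single-process sketch in Python. This is not the orchestration system or its coordination language: the names `SubWorkflow` and `run_data_driven` are hypothetical, and the distributed engines are reduced to plain function calls. The sketch only shows how execution order can emerge from data availability, without an explicit schedule.

```python
from collections import defaultdict, deque


class SubWorkflow:
    """Hypothetical stand-in for a sub-workflow: named inputs, one named output."""

    def __init__(self, name, inputs, output, func):
        self.name = name
        self.inputs = list(inputs)  # names of the data items it consumes
        self.output = output        # name of the data item it produces
        self.func = func            # computation applied to the inputs


def run_data_driven(sub_workflows, initial_data):
    """Run each sub-workflow as soon as all of its inputs are available.

    Returns the final data store and the order in which sub-workflows ran.
    No ordering is supplied by the caller; it follows from the dataflow.
    """
    data = dict(initial_data)
    waiting = defaultdict(list)  # data name -> sub-workflows blocked on it
    missing = {}                 # sub-workflow name -> inputs not yet available
    ready = deque()
    order = []

    for sw in sub_workflows:
        needed = {name for name in sw.inputs if name not in data}
        missing[sw.name] = needed
        if not needed:
            ready.append(sw)
        for name in needed:
            waiting[name].append(sw)

    while ready:
        sw = ready.popleft()
        data[sw.output] = sw.func(*(data[name] for name in sw.inputs))
        order.append(sw.name)
        # The new data item may unblock sub-workflows that were waiting on it.
        for dependant in waiting.pop(sw.output, []):
            missing[dependant.name].discard(sw.output)
            if not missing[dependant.name]:
                ready.append(dependant)

    return data, order


# Declaration order is deliberately "wrong": combine is listed first,
# but it can only run once both of its inputs have been produced.
workflow = [
    SubWorkflow("combine", ["a", "b"], "result", lambda a, b: sum(a) + sum(b)),
    SubWorkflow("fetch_a", ["raw"], "a", lambda raw: [x * 2 for x in raw]),
    SubWorkflow("fetch_b", ["raw"], "b", lambda raw: [x + 1 for x in raw]),
]
final_data, run_order = run_data_driven(workflow, {"raw": [1, 2, 3]})
```

In this sketch `fetch_a` and `fetch_b` become runnable immediately because `raw` is present, while `combine` runs only after both of their outputs exist. In a decentralised setting the two independent sub-workflows could be executed in parallel by different engines; here they simply run one after the other.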
Microsoft Windows Azure Research Award ($20,000), Principal Investigator (PI), 2015 - 2016.