dc.date.accessioned: 2017-12-20T09:48:17Z
dc.date.available: 2017-12-20T09:48:17Z
dc.date.issued: 2017
dc.identifier.uri: http://hdl.handle.net/10852/59394
dc.description.abstract: Cloud computing has grown tremendously in popularity over the last several years. A scalable and efficient data center network is essential for a high-performance cloud computing infrastructure. This thesis provides practical solutions to enable an efficient, flexible, multi-tenant network architecture suitable for high-performance cloud computing, using InfiniBand (IB) as a demonstration technology. The work is motivated by the need of future data centers to provide efficient cloud solutions as the uptake of cloud technology increases for both big data and traditional High-Performance Computing (HPC) applications. The research contributions of this thesis fall into three main categories. First, we propose a set of improvements to the fat-tree routing algorithm to make it suitable for HPC workloads in the cloud. The fat-tree is a popular network topology in HPC systems. Our proposed improvements make fat-tree routing more efficient, provide performance isolation among tenants in multi-tenant systems, and enable routing of both physical and virtualized end nodes according to the policies set by the provider. Second, we design new network reconfiguration methods that significantly reduce the time it takes to reroute the IB network. Reduced network reconfiguration time means that the interconnection network in an HPC cloud can optimize itself quickly to adapt to changing tenant configurations, faults, running workloads, and current network conditions. Last, we demonstrate a self-adaptive network prototype for IB-based HPC clouds, fully equipped with autonomous monitoring and adaptation, and configurable by service providers through a high-level condition-action language. The research conducted in this thesis has potential impact on both private cloud infrastructures, such as medium-sized clusters used for enterprise HPC, and public clouds offering innovative HPC solutions to customers at scale.
The industrial applicability of the thesis is reflected in the eight patent applications that resulted from this work. (en_US)
dc.language.iso: en (en_US)
dc.relation.haspart: Paper I: Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, and Tor Skeie. Weighted Fat-Tree Routing Algorithm for Efficient Load-Balancing in InfiniBand Enterprise Clusters. 23rd Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), 2015. The paper is not available in DUO due to publisher restrictions. The published version is available at: https://doi.org/10.1109/PDP.2015.111
dc.relation.haspart: Paper II: Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, and Tor Skeie. Partition-Aware Routing to Improve Network Isolation in InfiniBand Based Multi-tenant Clusters. 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2015. The paper is not available in DUO due to publisher restrictions. The published version is available at: https://doi.org/10.1109/CCGrid.2015.96
dc.relation.haspart: Paper III: Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, and Tor Skeie. Efficient network isolation and load balancing in multi-tenant HPC clusters. Future Generation Computer Systems (FGCS), Volume 72, 2017. The paper is not available in DUO due to publisher restrictions. The published version is available at: https://doi.org/10.1016/j.future.2016.04.003
dc.relation.haspart: Paper IV: Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, and Tor Skeie. SlimUpdate: Minimal Routing Update for Performance-Based Reconfigurations in Fat-Trees. 1st IEEE International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB), held in conjunction with IEEE International Conference on Cluster Computing (CLUSTER), 2015. The paper is not available in DUO due to publisher restrictions. The published version is available at: https://doi.org/10.1109/CLUSTER.2015.142
dc.relation.haspart: Paper V: Feroz Zahid, Ernst Gunnar Gran, Bartosz Bogdanski, Bjørn Dag Johnsen, Tor Skeie, and Evangelos Tasoulas. Compact network reconfiguration in fat-trees. The Journal of Supercomputing (JSC), Volume 72, Issue 12, 2016. The paper is not available in DUO due to publisher restrictions. The published version is available at: https://doi.org/10.1007/s11227-016-1759-y
dc.relation.haspart: Paper VI: Evangelos Tasoulas, Feroz Zahid, Ernst Gunnar Gran, Kyrre Begnum, Bjørn Dag Johnsen, and Tor Skeie. Efficient Routing and Reconfiguration in Virtualized HPC Environments with vSwitch-enabled Lossless Networks. Submitted to Concurrency and Computation: Practice & Experience (CONCURRENCY), Wiley, 2017. The paper is not available in DUO awaiting publishing.
dc.relation.haspart: Paper VII: Feroz Zahid, Amir Taherkordi, Ernst Gunnar Gran, Tor Skeie, and Bjørn Dag Johnsen. A Self-Adaptive Network for HPC Clouds: Architecture, Framework, and Implementation. Submitted to IEEE Transactions on Parallel and Distributed Systems (TPDS), 2017. The paper is not available in DUO awaiting publishing.
dc.relation.uri: https://doi.org/10.1109/PDP.2015.111
dc.relation.uri: https://doi.org/10.1109/CCGrid.2015.96
dc.relation.uri: https://doi.org/10.1016/j.future.2016.04.003
dc.relation.uri: https://doi.org/10.1109/CLUSTER.2015.142
dc.relation.uri: https://doi.org/10.1007/s11227-016-1759-y
dc.title: Network Optimization for High Performance Cloud Computing (en_US)
dc.type: Doctoral thesis (en_US)
dc.creator.author: Zahid, Feroz
dc.identifier.urn: URN:NBN:no-62076
dc.type.document: Doktoravhandling (en_US)
dc.identifier.fulltext: Fulltext https://www.duo.uio.no/bitstream/handle/10852/59394/1/PhD-Feroz-Zahid-2017.pdf

