Show simple item record

dc.contributor.author: Burke, Neil
dc.date.accessioned: 2021-04-29T16:45:26Z
dc.date.available: 2021-04-29T16:45:26Z
dc.date.issued: 2021-04-29T16:45:26Z
dc.identifier.uri: http://hdl.handle.net/10222/80446
dc.description.abstract: Big data decision support systems are used to extract meaning from extremely large data sets. Users rely on such systems to provide short, human-readable summaries that aid the decision-making process. An interactive big data decision support system must do all of this within seconds of a user request. This short response window promotes interactivity between the system and its user, enabling the user to make several ad hoc or follow-up queries shortly after receiving a response. In this thesis, we explore the design and development of interactive big data decision support systems that satisfy four key characteristics: performance, scalability, availability and consistency. We do this within the context of two applications. We first design and develop a novel interactive reinsurance portfolio analytics system. Our system runs on a cloud architecture and efficiently distributes work to achieve excellent scalability, scaling up to thousands of cores. To achieve high performance, the system processes all data entirely in memory. The system is made consistent by a decentralized data storage service that guarantees strong consistency for all input data. A queuing system that automatically retries failed tasks ensures that the system is highly available. In a comparison with one of the leading commercial portfolio analytics systems, our system performed approximately 50 times faster. Later, we further improve performance by caching intermediate results between portfolio analyses, allowing extremely complex location-level analytics queries to be processed in only 11 seconds. Without caching, the same queries would have to process hundreds of millions of transformations over terabytes of data. Our second application is Online Analytical Processing (OLAP), where we focus solely on data consistency. We describe a method for quantifying consistency in distributed OLAP systems and present a corresponding Monte Carlo simulation that approximates the level of consistency for quorum-replicated OLAP systems, allowing users to explore their system's level of consistency under different usage scenarios. In a case study, we validate the accuracy of our simulation on a real, interactive OLAP system. [en_US]
dc.language.iso: en [en_US]
dc.subject: big data [en_US]
dc.subject: high-performance computing [en_US]
dc.subject: decision support [en_US]
dc.title: Designing and Developing Interactive Big Data Decision Support Systems for Performance, Scalability, Availability and Consistency [en_US]
dc.date.defence: 2021-04-22
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.degree: Doctor of Philosophy [en_US]
dc.contributor.external-examiner: Dr. Owen Kaser [en_US]
dc.contributor.graduate-coordinator: Dr. Milios Evangelos [en_US]
dc.contributor.thesis-reader: Dr. Andrew Rau-Chaplin [en_US]
dc.contributor.thesis-reader: Dr. Qigang Gao [en_US]
dc.contributor.thesis-supervisor: Dr. Norbert Zeh [en_US]
dc.contributor.thesis-supervisor: Dr. Oliver Baltzer [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.copyright-release: Not Applicable [en_US]