Show simple item record

dc.contributor.authorBurke, Neil
dc.date.accessioned2015-12-15T16:12:27Z
dc.date.available2015-12-15T16:12:27Z
dc.date.issued2015
dc.identifier.urihttp://hdl.handle.net/10222/64692
dc.description.abstractQuorum-based replication is popular in many distributed online transaction processing data stores, especially key-value stores like Dynamo or Cassandra. By committing each write operation to N different nodes in a distributed system, a data store can achieve a level of fault tolerance. Since writes invariably arrive at different nodes at different times, it is useful to know the way in which the lack of {\em consistency} between nodes will affect the result of read operations. Read and write quorums, a mechanism used to ensure operational consistency between a subset of nodes in the system, can be used to reduce the system's level of inconsistency at the cost of latency. While many distributed key-value data stores have very poor worst case data consistency, they often work well in practice. In 2012, Probabilistically Bounded Staleness (PBS), a method of modelling consistency of key-value systems, was introduced to better analyze the performance of real systems and to help model the trade-off between consistency and latency. This thesis has three central contributions. First is an extension of vOLAP, a real-time distributed online analytical processing (OLAP) system, to a dynamic cloud environment, in order to support an elastic model of automated resource management. On an 11-node AWS cloud, vOLAP is capable of ingesting 140,000 inserts per second and answering 40,000 complex range queries per second. Next, we present a model of quorum-based replication adapted for OLAP-style data stores. The challenges encountered and experimental results from implementing the model within vOLAP are also described. Not surprisingly, quorum-based replication comes at a cost. A vOLAP system with N=3 executes 80,000 inserts per second and 20,000 queries per second. While this is a significant drop in performance from the N=1 case, this version of vOLAP is essentially doing 3 times the work at approximately 2 times the cost in performance, achieving a high level of fault tolerance. Finally, Aggregate Probabilistically Bounded Staleness (A-PBS) is introduced, which adapts ideas originally developed in PBS for the simple key-value setting to a wide class of eventually consistent aggregate data stores. Associated with A-PBS is a simulation framework for exploring the level of consistency of OLAP quorum systems. A Monte Carlo simulation is used to analyze the impact of varying key system parameters on staleness. The A-PBS model illuminated some surprising results about consistency in an aggregate data store under typical query types such as count, sum, max, and mean. For example, relative error rates were found to be typically extremely small even when performing large aggregate queries.en_US
dc.language.isoenen_US
dc.subjectOLAPen_US
dc.subjectClouden_US
dc.subjectConsistencyen_US
dc.subjectAvailabilityen_US
dc.titleElasticity, Availability and Consistency in Cloud-based Aggregate Data Storesen_US
dc.date.defence2015-12-08
dc.contributor.departmentFaculty of Computer Scienceen_US
dc.contributor.degreeMaster of Computer Scienceen_US
dc.contributor.external-examinern/aen_US
dc.contributor.graduate-coordinatorDr. Norbert Zehen_US
dc.contributor.thesis-readerDr. Norbert Zehen_US
dc.contributor.thesis-readerDr. Qigang Gaoen_US
dc.contributor.thesis-supervisorDr. Andrew Rau-Chaplinen_US
dc.contributor.ethics-approvalNot Applicableen_US
dc.contributor.manuscriptsNot Applicableen_US
dc.contributor.copyright-releaseNot Applicableen_US
 Find Full text

Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record