Ph.D. Research Index

Jason Cairns


This is an index of reports composed as part of the research towards my doctoral dissertation, “A Platform for Large-Scale Statistical Modelling Using R”. The core aim of my research is to overhaul the capacity of statisticians to work with very large datasets. To this end, an infrastructure is required that is both geared toward statisticians and physically capable of handling large datasets. Physical capacity is being developed through the conceptualisation and implementation of a highly scalable distributed system that enables complex operations to be performed on statistical data. For it to be more than a proof of concept, the system needs to be fast, reliable, and scalable. The project is written with R as its entry point in order to command the power of a language purpose-built for statistics. Because the entry point is R, statisticians are already familiar with the language; beyond easing uptake, this means that emergent network effects can potentially lead to substantial extensions built on the base layer provided by the project. I am grateful to be working with Dr. Simon Urbanek and Dr. Paul Murrell as my main supervisor and co-supervisor respectively.