Self-healing Data in Largerscale

Jason Cairns

2021-09-20

1 Introduction

This demonstration continues from the previous general demonstration. Here, missing data is self-healed: a value that has been lost is regenerated from the computation that produced it.

1.1 REMOTE MACHINE

In the previous demonstration, a remote dataframe was generated first. A linear model was fit from this dataframe and a formula, a summary was derived from the linear model, and coefficients were extracted from the summary. The variables used were cdata, sdata, lmdata, and dfdata: cdata depends on sdata, which depends on lmdata, which in turn depends on dfdata. Let's remove the stored value for cdata, which is equivalent to removing the machine storing cdata.

    computationqueue()
    ## Queue: 0 Elements
    datapool()
    ## Pool: 16 Items
    unstore(cdata)
    datapool()
    ## Pool: 15 Items
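
For orientation, the same chain of dependencies can be written in plain local R. This is only a sketch under stated assumptions: the data frame and the formula `y ~ x` are hypothetical stand-ins (the original dfdata is defined in the previous demonstration), and the `_local` names are introduced here purely for illustration.

```r
# Hypothetical stand-in data; the real dfdata comes from the previous demonstration.
set.seed(1)
dfdata_local <- data.frame(x = 1:100, y = 1:100 + rnorm(100))

# Each value depends only on the one before it.
lmdata_local <- lm(y ~ x, data = dfdata_local)  # depends on dfdata
sdata_local  <- summary(lmdata_local)           # depends on lmdata
cdata_local  <- coef(sdata_local)               # depends on sdata

print(cdata_local)
```

Because each step has exactly one input, losing cdata alone requires rerunning only the last step, provided sdata is still available.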

1.2 LOCAL MACHINE

With cdata removed, any attempt to access it throws an error and sends a recovery signal.

    tryCatch(value(cdata), error=identity)
    ## <simpleError in value.data(cdata): Data lost. Recovering...>
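
The behaviour above, an error for the caller plus a recovery request for the remote side, might be sketched as follows. This is not the package's implementation: `toy_pool`, `toy_queue`, and `toy_value` are hypothetical names, and the recovery "signal" is modelled as simply appending an identifier to a list.

```r
toy_pool  <- new.env()  # hypothetical stand-in for the data pool
toy_queue <- list()     # hypothetical stand-in for the computation queue

toy_value <- function(id) {
    if (exists(id, envir = toy_pool, inherits = FALSE))
        return(get(id, envir = toy_pool))
    # Data is missing: request recovery first, then signal an error to the caller.
    toy_queue[[length(toy_queue) + 1L]] <<- id
    stop("Data lost. Recovering...")
}

# error = identity returns the condition object instead of aborting.
res <- tryCatch(toy_value("cdata"), error = identity)
print(conditionMessage(res))
print(length(toy_queue))
```

Passing `error = identity` to `tryCatch` is what lets the demonstration print the error as a value rather than halting the session.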

1.3 REMOTE MACHINE

A remote machine that stores the computation which produced cdata performs the recovery.

    str(computationqueue())
    ## Queue: 1 Elements 
    ## [ Back to Front ]
    ##  $ :Computation:
    ## Identifier: Identifier:  139053  
    ## Input:   List of 1
    ##    $ :Computation:
    ##   Identifier: Identifier:  722666  
    ##   Input:   List of 1
    ##      $ :Data:
    ##     Identifier: Identifier:  876653  
    ##     Computation Identifier: Identifier:  365220  
    ##   Value: function (object, ...)   
    ##   Output: Identifier:  995977  
    ## Value: function (x, ...)   
    ## Output:  NULL
    do(receive())
    ##                  Estimate   Std. Error       t value     Pr(>|t|)
    ## (Intercept) -5.684342e-14 5.598352e-15 -1.015360e+01 5.618494e-17
    ## x            1.000000e+00 9.624477e-17  1.039018e+16 0.000000e+00
    computationqueue()
    ## Queue: 0 Elements
    datapool()
    ## Pool: 17 Items

1.4 LOCAL MACHINE

cdata is now accessible again.

    tryCatch(value(cdata), error=identity)
    ##                  Estimate   Std. Error       t value     Pr(>|t|)
    ## (Intercept) -5.684342e-14 5.598352e-15 -1.015360e+01 5.618494e-17
    ## x            1.000000e+00 9.624477e-17  1.039018e+16 0.000000e+00

1.5 REMOTE MACHINE

If the entire chain of dependencies is deleted, recovery is still possible as long as some computation in the chain is self-sufficient, which is equivalent to checkpointing. Here, the initial computation leading to dfdata has no dependencies, so it can regenerate dfdata and, from it, the rest of the chain.

    unstore(cdata)
    unstore(sdata)
    unstore(lmdata)
    unstore(dfdata)
    datapool()
    ## Pool: 13 Items
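
The idea of rebuilding a whole chain from a dependency-free root can be sketched in plain R. Everything here is an assumption made for illustration: `lineage`, `store`, and `recover` are hypothetical names, the data frame is made up, and recovery is done by simple recursion rather than by the queue-based mechanism used in this demonstration.

```r
# Each entry records how a value is computed and which values it needs.
lineage <- list(
    dfdata = list(fun = function() data.frame(x = 1:10,
                                              y = 1:10 + rep(c(-0.1, 0.1), 5)),
                  inputs = character(0)),        # no dependencies: the "checkpoint"
    lmdata = list(fun = function(d) lm(y ~ x, data = d), inputs = "dfdata"),
    sdata  = list(fun = summary, inputs = "lmdata"),
    cdata  = list(fun = coef,    inputs = "sdata")
)
store <- new.env()  # starts empty, as if the whole chain had been unstored

recover <- function(id) {
    if (!exists(id, envir = store, inherits = FALSE)) {
        node <- lineage[[id]]
        args <- lapply(node$inputs, recover)     # regenerate missing inputs first
        assign(id, do.call(node$fun, args), envir = store)
    }
    get(id, envir = store)
}

cf <- recover("cdata")
print(sort(ls(store)))
```

The recursion bottoms out at dfdata because its input list is empty; without such a root, no amount of recomputation could restore the chain.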

1.6 LOCAL MACHINE

An error occurs when trying to access the now-deleted data.

    tryCatch(value(cdata), error=identity)
    ## <simpleError in value.data(cdata): Data lost. Recovering...>

1.7 REMOTE MACHINE

The regeneration process then takes place, using continuations to return control to the loop when a dependency is missing. Note that a queue is an exceedingly inefficient choice of data structure here; a stack will be used in its place next week, requiring only O(n) operations for recovery.

    while (!is.null(r <- receive())) {
        print(computationqueue())
        callCC(function(k) do(r, k))
    }
    ## Queue: 0 Elements 
    ## Queue: 1 Elements 
    ## Queue: 2 Elements 
    ## Queue: 3 Elements 
    ## Queue: 4 Elements 
    ## Queue: 5 Elements 
    ## Queue: 6 Elements 
    ## Queue: 7 Elements 
    ## Queue: 6 Elements 
    ## Queue: 5 Elements 
    ## Queue: 4 Elements 
    ## Queue: 3 Elements 
    ## Queue: 2 Elements 
    ## Queue: 1 Elements 
    ## Queue: 0 Elements
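
The role of the continuation can be illustrated with a toy version of the loop. This is a sketch, not the system's code: `steps`, `done`, `pending`, `push`, `pop`, and `toy_do` are hypothetical names, and the three-step numeric chain merely stands in for the real computations. When a dependency is missing, `toy_do` requeues the dependency and itself, then calls `k` to abandon the current attempt and return control to the loop.

```r
steps <- list(
    first  = list(fun = function() 1,       inputs = character(0)),
    second = list(fun = function(a) a + 1,  inputs = "first"),
    third  = list(fun = function(b) b * 10, inputs = "second")
)
done     <- new.env()  # recovered values
pending  <- list()     # FIFO queue of identifiers awaiting computation
attempts <- 0L

push <- function(id) pending[[length(pending) + 1L]] <<- id
pop  <- function() {
    if (length(pending) == 0L) return(NULL)
    id <- pending[[1L]]
    pending <<- pending[-1L]
    id
}

toy_do <- function(id, k) {
    attempts <<- attempts + 1L
    node <- steps[[id]]
    for (dep in node$inputs) {
        if (!exists(dep, envir = done, inherits = FALSE)) {
            push(dep)  # schedule the missing dependency
            push(id)   # and retry this computation after it
            k(NULL)    # escape via the continuation; nothing below runs
        }
    }
    args <- lapply(node$inputs, get, envir = done)
    assign(id, do.call(node$fun, args), envir = done)
}

push("third")
while (!is.null(r <- pop())) {
    callCC(function(k) toy_do(r, k))
}
print(get("third", envir = done))
print(attempts)
```

In this toy run, the first-in-first-out order causes an item to be retried before its dependency has been computed, so some work is attempted more than once; this is the kind of overhead that motivates replacing the queue.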

1.8 LOCAL MACHINE

After recovery, the value is available again.

    tryCatch(value(cdata), error=identity)
    ##                  Estimate   Std. Error       t value     Pr(>|t|)
    ## (Intercept) -5.684342e-14 5.598352e-15 -1.015360e+01 5.618494e-17
    ## x            1.000000e+00 9.624477e-17  1.039018e+16 0.000000e+00