MBrace.Core and MBrace.Azure


Using the R type provider with MBrace

In this tutorial, you will learn how you can use MBrace to distribute code that utilises the R Type Provider.

Installing R across your cluster

First, we define some MBrace code that installs the R components on every worker of an MBrace cluster. This assumes that worker processes run with elevated permissions. As of MBrace.Azure v1.1.5, the bundled cloud service packages have elevated permissions enabled. If your workers do not run with elevated permissions, make sure that R is already installed on each of them.

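// Namespaces used by the code below. The `cluster` value used later is the
// cluster handle set up by ThespianCluster.fsx or AzureCluster.fsx.
open System
open System.IO
open System.Net
open System.Diagnostics
open MBrace.Core
open MBrace.Library

/// Path to R installer mirror; change as appropriate
let R_Installer = "http://cran.cnr.berkeley.edu/bin/windows/base/R-3.2.2-win.exe"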

/// checks whether R is installed in the local computer
let isRInstalled() = Microsoft.Win32.Registry.LocalMachine.OpenSubKey(@"SOFTWARE\R-core") <> null

/// Performs R installation operation on an MBrace cluster
/// Assumes workers running with elevated privileges
let installR () = cloud {
    let installRToCurrentWorker() = local {
        if not <| isRInstalled() then
            do! Cloud.Logf "Installing R in local machine."
            use wc = new System.Net.WebClient()
            let tmp = Path.GetTempPath()
            let tmpExe = Path.Combine(tmp, Path.ChangeExtension(Path.GetRandomFileName(),".exe"))
            do! Cloud.Logf "Downloading R bits..."
            do wc.DownloadFile(Uri R_Installer, tmpExe)
            do! Cloud.Logf "Installing R..."
            let psi = new ProcessStartInfo(tmpExe, "/COMPONENTS=x64,main,translation /SILENT")
            psi.UseShellExecute <- false
            let proc = Process.Start(psi)
            proc.WaitForExit()
            if proc.ExitCode <> 0 then invalidOp "failed to install R in local context"
            do! Cloud.Logf "R installation complete."
    }

    // performs install operation for every worker in the current cluster   
    let! _ = Cloud.ParallelEverywhere(installRToCurrentWorker())
    return ()
}

/// Parallel workflow that verifies whether R is successfully installed across the cluster
let isRInstalledCloud() = cloud {
    let! results = Cloud.ParallelEverywhere (cloud { return isRInstalled ()})
    return Array.forall id results
}

We can now install R across the cluster by calling:

installR() |> cluster.Run

We can then verify that the operation was successful:

isRInstalledCloud() |> cluster.Run
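
The check above collapses the whole cluster into a single boolean, so a `false` result does not say which worker is missing R. The following is a minimal sketch, not part of the original tutorial, that returns one entry per worker instead. It assumes the `isRInstalled` function and the `cluster` value defined earlier, and that `IWorkerRef` exposes a `Hostname` property as in MBrace.Core. The names `workersMissingR` and `rInstallReport` are hypothetical.

```fsharp
open MBrace.Core

/// Pure helper: hostnames of the workers that reported R as missing
let workersMissingR (report : (string * bool) []) =
    report
    |> Array.filter (fun (_, installed) -> not installed)
    |> Array.map fst

/// Runs the registry check on every worker and tags each result
/// with the hostname of the worker that produced it
let rInstallReport () = cloud {
    let checkWorker = cloud {
        let! worker = Cloud.CurrentWorker
        return worker.Hostname, isRInstalled ()
    }
    return! Cloud.ParallelEverywhere(checkWorker)
}

// Usage: an empty array means every worker reported R as installed
rInstallReport () |> cluster.Run |> workersMissingR
```

Keeping the filtering step as a plain function means it can be tried out locally on a hand-written report without a running cluster.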

Deploying R provider code to your cluster

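We are now ready to begin using the R type provider with MBrace. Here is a simple, non-parallel example taken from the R Type Provider tutorial. It generates a small random dataset, fits a linear model with R's lm function, and returns the coefficients and residuals from a single cloud workflow.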

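// Namespaces needed by the example. Opening the base and stats package
// namespaces is an assumption based on the R Type Provider tutorial.
open System
open RDotNet
open RProvider
open RProvider.``base``
open RProvider.stats

let testR() = cloud {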
    // Random number generator
    let rng = Random()
    let rand () = rng.NextDouble()

    // Generate fake X1 and X2 
    let X1s = [ for i in 0 .. 9 -> 10. * rand () ]
    let X2s = [ for i in 0 .. 9 -> 5. * rand () ]

    // Build Ys, following the "true" model
    let Ys = [ for i in 0 .. 9 -> 5. + 3. * X1s.[i] - 2. * X2s.[i] + rand () ]

    let dataset =
        namedParams [
            "Y", box Ys;
            "X1", box X1s;
            "X2", box X2s; ]
        |> R.data_frame

    let result = R.lm(formula = "Y~X1+X2", data = dataset)

    let coefficients = result.AsList().["coefficients"].AsNumeric()
    let residuals = result.AsList().["residuals"].AsNumeric()
    return coefficients.ToArray(), residuals.ToArray()
}

cluster.Run (testR())
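
The example above runs a single regression in one cloud workflow. As a sketch of how the same workflow could be distributed, the block below runs several independent fits in parallel with `Cloud.Parallel` and averages the fitted coefficients. It assumes the `testR` workflow and the `cluster` value defined above. The names `meanCoefficients` and `testRParallel` are hypothetical and not part of the original tutorial.

```fsharp
open MBrace.Core

/// Pure helper: element-wise mean of several coefficient vectors of equal length
let meanCoefficients (runs : float [] []) =
    let n = float runs.Length
    Array.init runs.[0].Length (fun j ->
        (runs |> Array.sumBy (fun r -> r.[j])) / n)

/// Runs `count` independent copies of testR in parallel
/// and averages their coefficient estimates
let testRParallel (count : int) = cloud {
    let! results = Cloud.Parallel(List.init count (fun _ -> testR ()))
    // each result is a (coefficients, residuals) pair; keep the coefficients
    return results |> Array.map fst |> meanCoefficients
}

// Usage: average the estimates of 8 independent fits
cluster.Run (testRParallel 8)
```

Because the noise term `rand ()` is uniform on [0, 1) and so has mean 0.5, the averaged intercept should tend toward roughly 5.5 rather than 5, while the two slopes should tend toward roughly 3 and -2.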

In this tutorial, you've learned how to use the R type provider with MBrace.

Continue with further samples to learn more about the MBrace programming model.

Note that you can use the above techniques from both scripts and compiled projects. To see the components referenced by this script, see ThespianCluster.fsx or AzureCluster.fsx.
