MBrace.Core and MBrace.Azure


Cloud Gotchas

This chapter explores more advanced topics of MBrace and the cloud. In particular, we will look at common misconceptions and errors that occur when programming in MBrace.

Follow the instructions and complete the assignments described below.

Local vs. Remote execution

MBrace makes it possible to execute cloud workflows in the local process just as if they were asynchronous workflows: parallelism is achieved using the local thread pool. This can be done using the cluster.RunLocally() method:

cloud { return Environment.MachineName } |> cluster.Run         // remote execution
cloud { return Environment.MachineName } |> cluster.RunLocally  // local execution

As demonstrated above, local and remote execution can differ subtly, both in the computed result and in the observed side effects.
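When the workers run on the same machine as the client (for example, a local test cluster), the machine name alone will not reveal where a workflow ran. The sketch below is a variation that returns the process ID instead. It assumes that `cluster` is the client object created by the script preamble; `processId`, `remoteId` and `localId` are names introduced here for illustration.

```fsharp
open System.Diagnostics
open MBrace.Core

// Assumption: `cluster` is the client object set up by the script preamble.
let processId = cloud { return Process.GetCurrentProcess().Id }

let remoteId = cluster.Run processId         // evaluated by a worker of the cluster
let localId  = cluster.RunLocally processId  // evaluated inside the client process

printfn "remote pid: %d, local pid: %d" remoteId localId
```

If the two numbers differ, the workflow was evaluated in two different processes.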

Let's try a simple example. Just by looking at the example below, can you guess what the difference will be when run locally as opposed to remotely?

cloud { let _ = printfn "I am a side-effect!" in return 42 }

While the above is a mostly harmless example, what can be said about the example below?

open System.IO
let currentDirectory = Directory.GetCurrentDirectory()
let getContents = cloud { return Directory.EnumerateFiles currentDirectory |> Seq.toArray }

cluster.RunLocally getContents
cluster.Run getContents

Why does the error happen? Can you suggest a way the above could be fixed?
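One plausible explanation is that `currentDirectory` is computed once, on the client, and the workflow captures it as a plain string; that path need not exist on a worker machine. A minimal sketch of one possible fix follows, assuming that `cluster` comes from the script preamble; `getWorkerContents` is a hypothetical name. The directory is looked up inside the workflow, so it is resolved on whichever machine executes it.

```fsharp
open System.IO
open MBrace.Core

// Assumption: `cluster` is the client object set up by the script preamble.
let getWorkerContents = cloud {
    // Resolved on the machine that executes the workflow,
    // instead of being captured as a string from the client.
    let workerDirectory = Directory.GetCurrentDirectory()
    return Directory.EnumerateFiles workerDirectory |> Seq.toArray
}

cluster.Run getWorkerContents
```

Note that this lists the worker's directory, not the client's. If the client's files are what you need, they have to be shipped to the cluster by other means.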

Cloud workflows and serialization I

It is often the case that our code relies on objects that are not serializable. But what happens when this code happens to be running in the cloud?

let downloader = cloud {
    let client = new System.Net.WebClient()
    let! downloadProc = Cloud.CreateProcess(cloud { return client.DownloadString("http://www.fsharp.org") })
    return downloadProc.Result
}

What will happen if we attempt to execute the snippet above?

cluster.Run(downloader)

Assignment: can you rewrite the snippet above so that it no longer fails? Tip: can you detect which segments of the code entail a transition to a different machine?
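As a hint, here is a sketch of one possible rewrite. It assumes that the failure comes from the inner workflow capturing `client`, an object that cannot be serialized. `downloaderFixed` is a hypothetical name, and the URL scheme is spelled out so that the download itself can succeed.

```fsharp
open MBrace.Core

let downloaderFixed = cloud {
    let! downloadProc =
        Cloud.CreateProcess(cloud {
            // The client is created by the worker that runs this inner workflow,
            // so no WebClient instance has to travel between machines.
            let client = new System.Net.WebClient()
            return client.DownloadString "http://www.fsharp.org"
        })
    return downloadProc.Result
}
```

The only value that crosses a machine boundary here is the downloaded string, which is serializable.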

Cloud workflows and serialization II

Let us now consider the following type implementation:

type Session() =
    let cluster = cluster
    let value = 41

    member s.Increment() =
        cluster.Run(cloud { return value + 1 })

Can you predict what will happen if we run the following line?

Session().Increment()

Can you fix the problem only by changing the Increment() implementation?
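One way to approach this, sketched under the assumption that the closure inside `Increment()` captures the enclosing `Session` instance in order to read the `value` field: copy the field into a local binding first, so that only an `int` is captured. `SessionFixed` and `localValue` are names introduced here for illustration, and `cluster` is assumed to come from the script preamble.

```fsharp
open MBrace.Core

type SessionFixed() =
    let cluster = cluster
    let value = 41

    member s.Increment() =
        // Copy the field into a local so that the workflow captures an int,
        // not the whole object (which also holds the cluster client).
        let localValue = value
        cluster.Run(cloud { return localValue + 1 })
```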

Now, let's try the following example:

module Session2 =
    let cluster = cluster
    let value = 41
    let increment() = cluster.Run(cloud { return value + 1})

Session2.increment()

Can you explain why the behaviour of the above differs from the original example?

Cloud workflows and object identity

Consider the following snippet:

let example2 = cloud {
    let data = [| 1 .. 100 |]
    let! proc = Cloud.CreateProcess(cloud { return data })
    return Object.ReferenceEquals(data, proc.Result) 
}

Can you guess its result?

cluster.Run example2
cluster.RunLocally example2

Can you explain why this behaviour happens?
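The effect can be imitated without a cluster. The sketch below is plain F# and does not use MBrace: it treats `Array.copy` as a stand-in for the serialize-and-deserialize round trip that a value goes through when it moves between machines. The names `received`, `sameContents` and `sameObject` are introduced for illustration.

```fsharp
let data = [| 1 .. 100 |]

// Stand-in for a value that crossed a machine boundary: the receiver
// gets a reconstructed copy, simulated here with Array.copy.
let received = Array.copy data

let sameContents = (data = received)                  // structural equality
let sameObject = obj.ReferenceEquals(data, received)  // reference identity

printfn "same contents: %b, same object: %b" sameContents sameObject
// prints "same contents: true, same object: false"
```

In-process execution, by contrast, may hand back the very same array, in which case the reference comparison succeeds.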

Cloud workflows and mutation

Consider the following sample:

let example3 = cloud {
    let data = [|1 .. 10|]
    let! _ = Cloud.Parallel [for i in 0 .. data.Length - 1 -> cloud { data.[i] <- 0 } ]
    return data
}

Can you guess its result?

cluster.Run example3
cluster.RunLocally example3

Can you explain why this behaviour happens?
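If each child workflow operates on its own copy of `data`, writes made remotely never reach the array held by the parent. A sketch of an alternative that does not rely on shared mutable state: let every child return its result and let `Cloud.Parallel` gather the results into a new array. `example3Fixed` and `zeroed` are hypothetical names, and `cluster` is assumed to come from the script preamble.

```fsharp
open MBrace.Core

let example3Fixed = cloud {
    let data = [| 1 .. 10 |]
    // Each child returns a value instead of mutating a captured array;
    // Cloud.Parallel collects the results in order.
    let! zeroed = Cloud.Parallel [ for _ in data -> cloud { return 0 } ]
    return zeroed
}

cluster.Run example3Fixed
```

Because nothing is mutated, this version should behave the same way under `cluster.Run` and `cluster.RunLocally`.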

Summary

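In this tutorial, you've learned about common pitfalls of cloud programming in MBrace: differences between local and remote execution, workflows that capture non-serializable objects, and the way object identity and mutation behave across machines. Continue with further samples to learn more about the MBrace programming model.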

Note, you can use the above techniques from both scripts and compiled projects. To see the components referenced by this script, see ThespianCluster.fsx or AzureCluster.fsx.
