Using Data Parallel Cloud Flows to Analyze Historical Event Data
In this example, you learn how to use data parallel cloud flows with historical event data at scale. The data is drawn directly from open government data on the internet. This sample has been adapted from Isaac Abraham's blog.
You start by using FSharp.Data and its CSV Type Provider. Usually the type provider can infer all data types and columns, but in this case the file does not include headers, so you supply them yourself. You use a local version of the CSV file which contains a subset of the data (the live dataset, even for a single month, is over 10MB).
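A sketch of the provider definition might look like the following; the sample file path and the exact column schema are illustrative assumptions, not the original source:

```fsharp
open FSharp.Data

// The live file has no header row, so column names and types are supplied
// via the Schema parameter. The path and column list here are illustrative.
type HousePrices =
    CsvProvider< @"..\..\data\SampleHousePrices.csv",
                 HasHeaders = false,
                 Schema = "TransactionId, Price (int), DateOfTransfer (date), Postcode, PropertyType, OldNew, Duration, PAON, SAON, Street, Locality, Town, District, County" >
```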
With that, you have a strongly-typed way to parse CSV data.
Here is the input data. (Each of these files is ~70MB and can take a significant amount of time to download, due to possible rate-limiting by the server.)
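The source lists might be set up along these lines. The download URLs are not reproduced in this text, so the base URI below is a placeholder for the real location of the Land Registry price-paid files:

```fsharp
// Placeholder: substitute the real location of the price-paid data files.
let baseUri = "http://<price-paid-data-host>/"

// A cut-down local sample, one month of live data (~10MB), and several
// full years of data (~70MB per file).
let tinySources  = [ __SOURCE_DIRECTORY__ + "/../../data/SampleHousePrices.csv" ]
let smallSources = [ baseUri + "pp-monthly-update.csv" ]
let bigSources   = [ for year in 2012 .. 2015 -> baseUri + sprintf "pp-%d.csv" year ]

// Pick the data set the rest of the walkthrough runs against.
let sources = smallSources
```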
Now stream the data from its original web location across the cluster, converting the raw text to the CSV provided type. Entries are grouped by month and the average price for each month is computed.
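Based on the identifiers used later in the walkthrough, the pipeline looks roughly like this. The overload of OfHttpFileByLine taking several URLs and the 100-element sort cutoff are assumptions:

```fsharp
open MBrace.Core
open MBrace.Flow

let cluster = Config.GetCluster()   // Thespian or Azure cluster session

let pricesTask =
    CloudFlow.OfHttpFileByLine sources               // stream each file line-by-line
    |> CloudFlow.collect (fun line ->
        HousePrices.ParseRows line :> seq<_>)        // raw text -> CSV provided rows
    |> CloudFlow.averageByKey
        (fun row -> row.DateOfTransfer.Year, row.DateOfTransfer.Month)  // group by month
        (fun row -> float row.Price)                 // average the sale price
    |> CloudFlow.sortBy fst 100                      // order chronologically
    |> CloudFlow.toArray
    |> cluster.CreateProcess                         // start it running on the cluster
```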
A CloudFlow is an MBrace primitive which allows a distributed set of transformations to be chained together. A CloudFlow pipeline is partitioned across the cluster, making full use of resources available: only when the pipelines are completed in each partition are they aggregated together again.
Now observe the progress. Time will depend on download speeds to your data center or location.
For the large data sets above you can expect approximately 2 minutes.
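If the pipeline is running as a cloud process named pricesTask (an identifier used throughout this walkthrough), its progress can be checked with the standard MBrace starter-kit helper:

```fsharp
// Show the status of the running cloud process.
pricesTask.ShowInfo()
```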
While you're waiting, notice that you're using type providers in tandem with cloud computations.
Once you call the ParseRows function, the subsequent calls in the pipeline work with a strongly-typed object model, so DateOfTransfer is a proper DateTime and so on. For example, if you hit "." after "row" you will see that the available information includes Locality, Price, Street, Postcode and so on. In addition, all dependent assemblies have automatically been shipped with MBrace. MBrace wasn't explicitly designed to work with FSharp.Data and F# type providers; it just works.
Now wait for the results.
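Fetching the result of the cloud process blocks until it completes:

```fsharp
// Blocks until the cloud process has finished, then returns its value.
let prices = pricesTask.Result
```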
Now that you have a summary array of year, month and price data, you can chart the data.
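A minimal charting sketch using XPlot.GoogleCharts; the label formatting is illustrative:

```fsharp
open System
open XPlot.GoogleCharts

// Render each (year, month) key as a readable label, then plot a line chart.
let formatYearMonth (year, month) =
    DateTime(year, month, 1).ToString("MMM yyyy")

let chartPrices (data : ((int * int) * float) []) =
    data
    |> Seq.map (fun (ym, price) -> formatYearMonth ym, price)
    |> Chart.Line
    |> Chart.Show

chartPrices prices
```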
Persisted Cloud Flows
To prevent repeated work, MBrace supports something called Persisted Cloud Flows (known in the Spark world as RDDs). These are flows whose results are partitioned and cached across the cluster, ready to be re-used again and again. This is particularly useful if you have an intermediary result set that you wish to query multiple times.
In this case, you now persist the first stages of the computation (downloading the data from source and parsing it with the CSV Type Provider), ready to be reused by any number of strongly-typed queries:
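A sketch of the persisted flow; CloudFlow.persist with an in-memory storage level is assumed here:

```fsharp
let persistedHousePricesTask =
    CloudFlow.OfHttpFileByLine sources
    |> CloudFlow.collect (fun line -> HousePrices.ParseRows line :> seq<_>)
    |> CloudFlow.persist StorageLevel.Memory   // partition and cache across the cluster
    |> cluster.CreateProcess
```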
Now observe progress:
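Progress is inspected on the cloud process itself:

```fsharp
// Show the status of the running persistence job.
persistedHousePricesTask.ShowInfo()
```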
Now wait for the results:
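Blocking on the task returns the persisted, partitioned flow:

```fsharp
let persistedHousePrices = persistedHousePricesTask.Result
```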
The input file will have been partitioned depending on the number of workers in your cluster. The partitions are already assigned to different workers. With the results persisted on the nodes, we can use them again and again.
First, get the total number of entries across the partitioned, persisted result:
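A sketch of the count query over the persisted flow:

```fsharp
// Count the entries across all partitions of the persisted flow.
let count =
    persistedHousePrices
    |> CloudFlow.length
    |> cluster.Run
```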
Next, get the first 100 entries:
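This query might look like the following:

```fsharp
// Take the first 100 rows from the persisted flow.
let first100 =
    persistedHousePrices
    |> CloudFlow.take 100
    |> CloudFlow.toArray
    |> cluster.Run
```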
Next, get the average house price by year/month.
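This is the same monthly aggregation as before, but starting from the persisted flow rather than re-downloading the data:

```fsharp
let pricesByMonthTask =
    persistedHousePrices
    |> CloudFlow.averageByKey
        (fun row -> row.DateOfTransfer.Year, row.DateOfTransfer.Month)
        (fun row -> float row.Price)
    |> CloudFlow.sortBy fst 100
    |> CloudFlow.toArray
    |> cluster.CreateProcess

let pricesByMonth = pricesByMonthTask.Result
```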
Make a chart of the results. This will be the same chart as before, but based on persisted results.
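A minimal sketch of the chart, formatting each key inline:

```fsharp
// Same chart as before, but computed from the persisted results.
pricesByMonth
|> Seq.map (fun ((year, month), price) -> sprintf "%d-%02d" year month, price)
|> Chart.Line
|> Chart.Show
```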
Next, get the average prices per street. This takes a fair while, since there are a lot of streets. The results are cached with CloudFlow.cache, which is the same as persisting the flow to memory.
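A sketch of the per-street aggregation; the choice of grouping columns (town plus street) is an assumption:

```fsharp
let averagePricesTask =
    persistedHousePrices
    |> CloudFlow.averageByKey
        (fun row -> row.Town, row.Street)     // grouping columns are assumptions
        (fun row -> float row.Price)
    |> CloudFlow.cache                        // persist the aggregated flow in memory
    |> cluster.CreateProcess

let averagePrices = averagePricesTask.Result
```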
Next, use the cached results to get the most expensive city and street.
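The cutoff of 100 results is an assumption:

```fsharp
// Sort the cached (location, average price) pairs, highest price first.
let mostExpensive =
    averagePrices
    |> CloudFlow.sortByDescending snd 100
    |> CloudFlow.toArray
    |> cluster.Run
```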
Next, use the cached results to also get the least expensive city and street.
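The mirror-image query, sorting ascending:

```fsharp
// Sort the cached (location, average price) pairs, lowest price first.
let leastExpensive =
    averagePrices
    |> CloudFlow.sortBy snd 100
    |> CloudFlow.toArray
    |> cluster.Run
```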
Count the sales by city:
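A sketch using CloudFlow.countBy; the column choice and result cutoff are assumptions:

```fsharp
let purchasesByCity =
    persistedHousePrices
    |> CloudFlow.countBy (fun row -> row.Town)   // column choice is an assumption
    |> CloudFlow.sortByDescending snd 20
    |> CloudFlow.toArray
    |> cluster.Run
```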
And so on.
Notice that the first query takes around 45 seconds to execute, since it involves downloading the data and parsing it via the CSV type provider. Once that is done, the flow is persisted in memory across the cluster, and each subsequent query that reuses it takes only a few seconds to run.
Finding the Current Prices of Monopoly Streets
We all know and love the game Monopoly. For those who grew up in the United Kingdom or Australia, you probably played using the London street names and prices where buying Mayfair cost £400. But how do prices today look, and which streets are now the most expensive?
Next, you get the list of all streets on the Monopoly board using the HTML type provider over this web page. This is a type provider in FSharp.Data that helps you crack the content of HTML tables.
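A sketch of the provider; "<monopoly-page-url>" is a placeholder for the page linked above, since the actual URL is not reproduced in this text (the type provider needs a reachable sample at design time):

```fsharp
// Placeholder URL: substitute the page listing the UK Monopoly properties.
type MonopolyTable = HtmlProvider< @"<monopoly-page-url>" >

let monopolyPage = MonopolyTable.GetSample()
```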
This page contains a particular table with all the property names for the UK edition of Monopoly:
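The provider exposes each table on the page as a property; the table name below is an assumption, since it depends on the page's markup:

```fsharp
// Access the rows of the property-names table.
let data = monopolyPage.Tables.Table1.Rows
```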
This gives the rows of the table, each carrying the name of a Monopoly property.
Next you put the names into a set, converting them to lower case as you go:
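A sketch of the set construction; the column accessor Name is an assumption about the table's header:

```fsharp
let monopolyStreets =
    set [ for row in data ->
            row.Name.ToLower() ]   // 'Name' accessor is an assumption
```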
Next, you find the sales that correspond to Monopoly streets, again reusing your calculation of average-prices on streets:
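This filters the cached per-street averages down to the Monopoly streets; the (town, street) key shape matches the per-street aggregation sketched earlier and is an assumption:

```fsharp
let monopoly =
    averagePrices
    |> CloudFlow.filter (fun ((_town, street), _price) ->
        monopolyStreets.Contains (street.ToLower()))
    |> CloudFlow.toArray
    |> cluster.Run
```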
You're done! If you are using the large four-year data set, the results list each Monopoly street alongside its current average sale price.
Some of these results are spurious: they are probably not the streets being referred to in the Monopoly game! But most are accurate. Notice that the "red" set does particularly well in the 21st century! Also, many of the "cheap" properties are still at the lower end of the list, albeit at roughly 7,000x their original in-game price!
Summary
In this example, you've learned how to use data parallel cloud flows with historical event data drawn directly from the internet. By using F# type providers (FSharp.Data) together with a sample of the data, you gave strong types to your information. You then learned how to persist partial results and to calculate averages and sums over groups of the data.
Continue with further samples to learn more about the MBrace programming model.
Note, you can use the above techniques from both scripts and compiled projects. To see the components referenced by this script, see ThespianCluster.fsx or AzureCluster.fsx.