KNN Digit Recognizer
This example is from the MBrace Starter Kit.
This example classifies handwritten digits with a k-nearest-neighbours (KNN) classifier, using the Kaggle Digit Recognizer dataset: https://www.kaggle.com/c/digit-recognizer
First, define the types and constants relevant to images:
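A minimal sketch of these definitions, assuming the Kaggle layout of one header line followed by one row of 784 pixel values (28 x 28) per image. The names `pixelLength`, `ImageId`, `Image` and `Image.Parse` follow the sample; `ParseLines` is a hypothetical helper added here so the parser can be tried without a file, and plain `Array` functions stand in for the Nessos.Streams pipeline that the sample uses.

```fsharp
open System.IO

/// Number of pixels per image (28 x 28); assumed from the Kaggle data description.
[<Literal>]
let pixelLength = 784

/// Image identifier
type ImageId = int

/// Image bitmap representation
type Image =
    { Id : ImageId
      Pixels : int [] }

    /// Hypothetical helper: parses unlabeled images from CSV lines, header row first.
    static member ParseLines (lines : string []) : Image [] =
        lines
        |> Array.skip 1   // drop the "pixel0,pixel1,..." header
        |> Array.mapi (fun i line ->
            let pixels = line.Split(',') |> Array.map int
            // Ids are 1-based to match the Kaggle submission format (an assumption).
            { Id = i + 1; Pixels = pixels })

    /// Parses a set of images from a file in the Kaggle digit recognizer CSV format.
    static member Parse (file : string) : Image [] =
        File.ReadAllLines file |> Image.ParseLines
```

For example, `Image.ParseLines [| "pixel0,pixel1"; "0,255" |]` yields a single image with `Id = 1` and `Pixels = [| 0; 255 |]`. The sketch does not check that each row has `pixelLength` values.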
Next, define the types relevant to classification of images:
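One way these types could look, again as a sketch rather than the sample's exact code. The type names (`Classification`, `Distance`, `TrainingImage`, `Classifier`, `Classifications.Write`) follow the sample; `ParseLines` is a hypothetical helper, and the snippet assumes Kaggle's training layout (label in the first column) and submission header (`ImageId,Label`). `Image` is repeated so the snippet stands alone.

```fsharp
open System.IO

// Repeated from the image definitions so that this snippet stands alone.
type ImageId = int
type Image = { Id : ImageId; Pixels : int [] }

/// Digit classification (0 to 9)
type Classification = int

/// Distance on points; uint64 avoids overflow when summing squared differences
type Distance = Image -> Image -> uint64

/// A training image annotated with its classification
type TrainingImage =
    { Classification : Classification
      Image : Image }

    /// Hypothetical helper: parses labeled rows (label first, then pixels).
    static member ParseLines (lines : string []) : TrainingImage [] =
        lines
        |> Array.skip 1   // drop the "label,pixel0,..." header
        |> Array.mapi (fun i line ->
            let nums = line.Split(',') |> Array.map int
            { Classification = nums.[0]
              Image = { Id = i + 1; Pixels = nums.[1..] } })

    /// Parses a training set from a file in the Kaggle digit recognizer CSV format.
    static member Parse (file : string) : TrainingImage [] =
        File.ReadAllLines file |> TrainingImage.ParseLines

/// Digit classifier: given a training set, maps an image to a predicted digit
type Classifier = TrainingImage [] -> Image -> Classification

type Classifications =
    /// Writes (image id, predicted digit) pairs to a CSV file.
    static member Write (outFile : string, classifications : (ImageId * Classification) []) =
        use sw = new StreamWriter(outFile)
        sw.WriteLine "ImageId,Label"
        classifications |> Array.iter (fun (i, c) -> sw.WriteLine(sprintf "%d,%d" i c))
```

Keeping `Classifier` as a plain function type means any function with that shape can be passed to the local and cloud drivers without wrapping.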
Next, implement a range of image classifiers:
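A sketch of a distance function and a KNN classifier with the same shape as the sample's `l2` and `knn`. It is written against FSharp.Core only, whereas the sample builds on Nessos.Streams, so treat it as an illustration of the technique. The distance is the squared Euclidean distance; omitting the square root keeps the arithmetic in integers and does not change which neighbours are nearest.

```fsharp
// Minimal copies of the data types so that this snippet stands alone.
type Image = { Id : int; Pixels : int [] }
type TrainingImage = { Classification : int; Image : Image }

/// Squared l^2 distance between two images with the same number of pixels.
let l2 (x : Image) (y : Image) : uint64 =
    let mutable acc = 0UL
    for i in 0 .. x.Pixels.Length - 1 do
        let d = x.Pixels.[i] - y.Pixels.[i]
        acc <- acc + uint64 (d * d)
    acc

/// Single-threaded k-nearest-neighbour classifier:
/// majority vote among the k training images closest to img.
let knn (distance : Image -> Image -> uint64) (k : int)
        (training : TrainingImage []) (img : Image) : int =
    training
    |> Array.sortBy (fun t -> distance t.Image img)
    |> Array.truncate k
    |> Array.countBy (fun t -> t.Classification)
    |> Array.maxBy snd
    |> fst

// Tiny worked example with two-pixel "images".
let training =
    [| { Classification = 0; Image = { Id = 1; Pixels = [| 0; 0 |] } }
       { Classification = 0; Image = { Id = 2; Pixels = [| 1; 0 |] } }
       { Classification = 1; Image = { Id = 3; Pixels = [| 9; 9 |] } } |]

let predicted = knn l2 2 training { Id = 4; Pixels = [| 0; 1 |] }
printfn "%d" predicted   // prints 0: the two nearest neighbours are both labelled 0
```

Sorting the whole training set for every query costs O(n log n) per image. That is acceptable for a sketch, but it is the reason the per-image work is worth spreading over cores or cluster workers. An empty training set makes `Array.maxBy` throw, so callers should guard against it.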
Next, implement local multicore classification and validation:
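A sketch of the local drivers, named after the sample's `classifyLocalMulticore` and `validateLocalMulticore`. The sample parallelises with `ParStream` from Nessos.Streams; this version substitutes `Array.Parallel.map` from FSharp.Core so that it runs without extra packages.

```fsharp
// Minimal copies of the data types so that this snippet stands alone.
type Image = { Id : int; Pixels : int [] }
type TrainingImage = { Classification : int; Image : Image }
type Classifier = TrainingImage [] -> Image -> int

/// Classifies images in parallel on the local cores;
/// returns (image id, predicted digit) pairs.
let classifyLocalMulticore (classifier : Classifier)
                           (training : TrainingImage [])
                           (images : Image []) =
    images |> Array.Parallel.map (fun img -> img.Id, classifier training img)

/// Returns the fraction of validation images that the classifier labels correctly.
let validateLocalMulticore (classifier : Classifier)
                           (training : TrainingImage [])
                           (validation : TrainingImage []) =
    let hits =
        validation
        |> Array.Parallel.map (fun t ->
            if classifier training t.Image = t.Classification then 1.0 else 0.0)
        |> Array.sum
    hits / float validation.Length
```

Validation holds back part of the labelled data, classifies it using the rest as the training set, and reports the share of correct predictions, which gives an accuracy estimate before any unlabeled images are classified.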
Next, implement the distributed, cloud versions of the same algorithms, to classify and validate the images using an MBrace cluster:
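A sketch of the cloud versions, using the `CloudFlow` combinators the sample relies on (`OfArray`, `map`, `filter`, `toArray`, `length`). It assumes the MBrace.Core and MBrace.Flow assemblies are already referenced, for example by loading ThespianCluster.fsx or AzureCluster.fsx as the starter kit scripts do. The function bodies are an illustration, not necessarily the sample's exact code.

```fsharp
// Assumption: MBrace.Core and MBrace.Flow are already referenced by the script.
open MBrace.Core
open MBrace.Flow

// Minimal copies of the data types so that this snippet stands alone.
type Image = { Id : int; Pixels : int [] }
type TrainingImage = { Classification : int; Image : Image }
type Classifier = TrainingImage [] -> Image -> int

/// Classify test images using MBrace; yields (image id, predicted digit) pairs.
let classifyCloud (classifier : Classifier)
                  (training : TrainingImage [])
                  (images : Image []) =
    CloudFlow.OfArray images
    |> CloudFlow.map (fun img -> img.Id, classifier training img)
    |> CloudFlow.toArray

/// Validate training images using MBrace; yields the fraction labelled correctly.
let validateCloud (classifier : Classifier)
                  (training : TrainingImage [])
                  (validation : TrainingImage []) =
    cloud {
        let! successCount =
            CloudFlow.OfArray validation
            |> CloudFlow.filter (fun t -> classifier training t.Image = t.Classification)
            |> CloudFlow.length
        return float successCount / float validation.Length
    }
```

Both functions only describe a cloud computation; nothing runs until the workflow is submitted to a cluster. The training array is captured by the lambdas, so it travels to the workers along with the work.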
Now, acquire the samples:
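A sketch of loading the data once `train.csv` and `test.csv` have been downloaded from the Kaggle page. The names `trainPath` and `testPath` follow the sample, but the directory used here is a placeholder, and `readRows` is a hypothetical helper that returns raw integer rows rather than the sample's `Image` and `TrainingImage` records.

```fsharp
open System.IO

// Placeholder locations: adjust to wherever the Kaggle files were saved.
let trainPath = Path.Combine(__SOURCE_DIRECTORY__, "data", "train.csv")
let testPath  = Path.Combine(__SOURCE_DIRECTORY__, "data", "test.csv")

/// Hypothetical helper: reads a CSV file, drops the header row,
/// and parses every cell as an int.
let readRows (path : string) : int [] [] =
    File.ReadAllLines path
    |> Array.skip 1
    |> Array.map (fun line -> line.Split(',') |> Array.map int)

if File.Exists trainPath && File.Exists testPath then
    let trainingRows = readRows trainPath   // label followed by pixel values
    let testRows = readRows testPath        // pixel values only
    printfn "Loaded %d training rows and %d test rows" trainingRows.Length testRows.Length
else
    printfn "Download train.csv and test.csv from the Kaggle page first."
```

Parsing happens on the client machine; the parsed arrays are then passed to the local or cloud functions.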
Run the validation operation in the cluster:
Check on its progress:
Run the classification operation in the cluster:
Check on its progress:
Get the results:
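The cluster steps above follow one pattern: submit a workflow, check on it while it runs, then fetch the result. A sketch of that pattern, assuming it runs in a starter-kit script where ThespianCluster.fsx or AzureCluster.fsx has been loaded so that `Config.GetCluster()` is available. The workflow here is a trivial stand-in so that the snippet does not depend on the earlier definitions; in the tutorial the submitted workflows are the validation and classification computations, and the sample names the handles `validateTask` and `classifyTask` and the results `validateResult` and `classifyResult`.

```fsharp
// Assumption: ThespianCluster.fsx or AzureCluster.fsx has already been loaded,
// which references MBrace and defines Config.GetCluster().
open MBrace.Core

let cluster = Config.GetCluster()

// Stand-in workflow; replace with the validation or classification workflow.
let work = cloud { return Array.sum [| 1 .. 100 |] }

// Submit without blocking: returns a handle to the running cloud process.
let task = cluster.CreateProcess(work)

// Check on its progress.
task.ShowInfo()
cluster.ShowWorkers()

// Block until the process completes, then read its value.
let result = task.Result
```

Because submission does not block, the validation and classification processes can both be started before either result is read.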
In this example, you've learned how to perform a machine learning classification task on an MBrace cluster. Continue with further samples to learn more about the MBrace programming model.
Note, you can use the above techniques from both scripts and compiled projects. To see the components referenced by this script, see ThespianCluster.fsx or AzureCluster.fsx.