Phillip Trelford's Array

POKE 36879,255

Machine Learning from Disaster

Off the back of the popular hands-on Machine Learning session at Skills Matter last month, where we created a digit recognizer, last night we tackled a new data set. Again we took a task from Kaggle’s online predictive modelling competitions. This time the data set was passenger details from the Titanic, and the task was to analyse who was likely to survive.


Guided Task: http://trelford.com/titanic.zip (unblock the file, unzip to C:\titanic, load in VS2012 and run through the tasks in the titanic.fsx interactive F# script).

Kaggle provide a CSV file with the passenger details. We loaded this using FSharp.Data’s CSV provider, which infers the fields and types of the data for you:

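open FSharp.Data

// The path must be a compile-time literal so the type provider can read the file
let [<Literal>] path = "C:/titanic/train.csv"
// InferRows=0 asks the provider to use every row when inferring column types
type Train = CsvProvider<path,InferRows=0>
type Passenger = Train.Row

// Work with the first 600 passengers only
let passengers : Passenger[] = 
    Train.Load(path).Take(600).Data 
    |> Seq.toArray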

We then did some preliminary data analysis, looking at how well specific features predicted survival:

let females = passengers |> where female
let femaleSurvivors = females |> tally survived
let femaleSurvivorsPc = females |> percentage survived
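
The helpers used above (where, tally, percentage, female and survived) come from the guided script and are not shown in this post. As a rough idea of what they might look like, here is a minimal sketch. The bodies are my assumptions rather than the script's actual definitions, and the Passenger record is a hypothetical stand-in for the provider's row type so the snippet runs on its own (rename it if pasting into the same script):

```fsharp
// Hypothetical stand-in for Train.Row, so the sketch is self-contained
type Passenger = { Sex : string; Pclass : int; Survived : int }

// Predicates over a passenger (names follow the script, bodies are assumed)
let female (p:Passenger) = p.Sex = "female"
let survived (p:Passenger) = p.Survived = 1

// Keep only the passengers matching a predicate
let where predicate passengers =
    passengers |> Array.filter predicate

// Count the passengers matching a predicate
let tally predicate passengers =
    passengers |> Array.filter predicate |> Array.length

// Percentage (0.0 to 100.0) of passengers matching a predicate
let percentage predicate passengers =
    if Array.isEmpty passengers then 0.0
    else 100.0 * float (tally predicate passengers) / float (Array.length passengers)

// Made-up rows, for illustration only
let sample =
    [| { Sex = "female"; Pclass = 1; Survived = 1 }
       { Sex = "female"; Pclass = 3; Survived = 0 }
       { Sex = "male";   Pclass = 3; Survived = 0 }
       { Sex = "female"; Pclass = 2; Survived = 1 } |]

let sampleFemales = sample |> where female
let sampleFemaleSurvivors = sampleFemales |> tally survived        // 2 of the 3 females
let sampleFemaleSurvivorsPc = sampleFemales |> percentage survived // 2 / 3 as a percentage
```

Keeping each helper as a function that takes the predicate first is what lets them chain with the pipeline operator as in the snippet above.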

Finally we used a provided decision tree learning algorithm for prediction:

let labels = [|"sex"; "class"|]

let features (p:Passenger) : obj[] = [|p.Sex; p.Pclass|]

let dataSet : obj[][] =
    [|for passenger in passengers ->
        [|yield! features passenger; 
          yield box (passenger.Survived = 1)|] |]

let tree = createTree(dataSet, labels)

I used the decision tree code from the Machine Learning in Action book, porting the Python implementation to F#; here’s the gist of it. The Python Tools for Visual Studio (PTVS) came in handy for checking the outputs were the same on both implementations. Mathias Brandewinder has a great article on Decision Tree classification and also Random Forest classification in F# using the same Titanic data set.
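
The decision tree code itself is not reproduced in this post. To give a feel for what a createTree function of this kind does, here is a minimal ID3-style sketch: pick the feature with the highest information gain, split on it, and recurse. This is not the ported code from the gist. The Tree type and the entropy, split, majority and classify helpers are hypothetical names of mine, and only the createTree(dataSet, labels) call shape is taken from the snippet above. The sample rows are made up.

```fsharp
// A tree is either a predicted class or a test on a named feature
type Tree =
    | Leaf of obj
    | Node of string * (obj * Tree)[]

// Shannon entropy of the class labels (the last column of each row)
let entropy (rows : obj[][]) =
    let total = float rows.Length
    rows
    |> Seq.countBy (fun row -> row.[row.Length - 1])
    |> Seq.sumBy (fun (_, count) ->
        let p = float count / total
        -p * log p / log 2.0)

// Rows whose column `index` equals `value`, with that column removed
let split (rows : obj[][]) index (value : obj) =
    rows
    |> Array.filter (fun row -> row.[index] = value)
    |> Array.map (fun row ->
        row
        |> Array.mapi (fun i x -> i, x)
        |> Array.filter (fun (i, _) -> i <> index)
        |> Array.map snd)

// Most common class label in the rows
let majority (rows : obj[][]) =
    rows |> Seq.countBy (fun row -> row.[row.Length - 1]) |> Seq.maxBy snd |> fst

let rec createTree (rows : obj[][], labels : string[]) =
    let classes = rows |> Array.map (fun row -> row.[row.Length - 1]) |> Array.distinct
    if classes.Length = 1 then Leaf (classes.[0])       // all rows agree
    elif labels.Length = 0 then Leaf (majority rows)    // no features left to test
    else
        // Information gain from splitting on the feature at `index`
        let gain index =
            let remainder =
                rows
                |> Array.map (fun row -> row.[index])
                |> Array.distinct
                |> Array.sumBy (fun value ->
                    let subset = split rows index value
                    float subset.Length / float rows.Length * entropy subset)
            entropy rows - remainder
        let best = [| 0 .. labels.Length - 1 |] |> Array.maxBy gain
        let remaining =
            labels |> Array.indexed |> Array.filter (fun (i, _) -> i <> best) |> Array.map snd
        let branches =
            rows
            |> Array.map (fun row -> row.[best])
            |> Array.distinct
            |> Array.map (fun value -> value, createTree (split rows best value, remaining))
        Node (labels.[best], branches)

// Walk the tree for one feature vector; None if a feature value was never seen
let rec classify tree (labels : string[]) (features : obj[]) =
    match tree with
    | Leaf value -> Some value
    | Node (label, branches) ->
        let index = Array.findIndex ((=) label) labels
        branches
        |> Array.tryFind (fun (value, _) -> value = features.[index])
        |> Option.bind (fun (_, subtree) -> classify subtree labels features)

// Made-up rows in the same shape as dataSet: sex, class, survived
let sampleLabels = [| "sex"; "class" |]
let sampleData : obj[][] =
    [| [| box "female"; box 1; box true  |]
       [| box "female"; box 3; box false |]
       [| box "male";   box 1; box false |]
       [| box "male";   box 3; box false |]
       [| box "female"; box 2; box true  |] |]

let sampleTree = createTree (sampleData, sampleLabels)
let prediction = classify sampleTree sampleLabels [| box "female"; box 1 |]
```

On these five rows the class feature has the higher information gain, so it becomes the root, and the first-class branch then splits on sex. With the real data the tree shape will of course differ.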

Again it was great to see a full house for the event with over 50 members in attendance:

[Photo: full house]

There are a few more pictures from the event over on the Skills Matter Facebook page :)

Check out the F#unctional Londoners meetup page for upcoming meetings; the next one is in 2 weeks, on F# Mobile Apps. If you’re interested in more hands-on sessions with F#, I’d also highly recommend the Progressive F# Tutorials in New York this September and London in October, as there is still a great early bird rate.