Phillip Trelford's Array

POKE 36879,255

F# Exchange 2015

This Friday saw the first ever F# eXchange, a one-day, two-track conference dedicated to all things F#, hosted at Skills Matter in London and attracting developers from across Europe.

There was a strong focus on open source projects throughout the day including MBrace (data scripting for the cloud), Fake (a DSL for build tasks), Paket (a dependency manager for .Net), the F# Power Tools and FunScript (an F# to JS compiler). In fact all the presenters used the open source project FsReveal to generate their slides!

Keynote

Tomas Petricek opened proceedings with a keynote on The Big F# and Open Source Love Story:

Slow development

One of Tomas’s observations, on the slow development of open source projects, resonated with many: successful projects often start as just a simple script that fulfils a specific need and slowly gather momentum over time.

As an example, in Steffen Forkmann’s presentation he talked about how Fake had started as a simple F# script and over the years seen more and more contributors and downloads, with the addition of high quality documentation having a huge impact:

Talks

All the talks were recorded, and all the videos are already online!

Speakers

The videos:

Steffen also took advantage of his talk to make a special announcement about Paket:


Panel

The day ended with some pizza, drinks and a panel organized by prolific F# contributor, Don Syme:

Panel Discussion

Each panel member pitched why they thought F# was good in their core domain area from cloud, games, design, data science, scripting through to web.

There were some interesting discussions, and some mentions of the recent fsharpWorks-led F# Survey.

2016

The date for next year’s F# eXchange 2016, the 16th April, is already in the calendar. Hope to see you there, and please take advantage of the early bird ticket offer: only 85GBP up until the 16th June!

String and StringBuilder revisited

I came across a topical .Net article by Dave M Bush, published towards the tail end of 2014 and entitled String and StringBuilder, where he correctly asserts that .Net’s built-in string type is a reference type and immutable. All good so far.

The next assertion is that StringBuilder will be faster than simple string concatenation when adding more than 3 strings together, which is probably a pretty good guess, but let’s put it to the test with 4 strings.

The test can be performed easily using F# interactive (built into Visual Studio) with the #time directive:

open System.Text

#time

let a = "abc"
let b = "efg"
let c = "hij"
let d = "klm"

for i = 1 to 1000000 do
   let e = StringBuilder(a)
   let f = e.Append(b).Append(c).Append(d).ToString() 
   ()
// Real: 00:00:00.317, CPU: 00:00:00.343, GC gen0: 101, gen1: 0, gen2: 0
   
for i = 1 to 1000000 do
   let e = System.String.Concat(a,b,c,d)
   ()
// Real: 00:00:00.148, CPU: 00:00:00.156, GC gen0: 36, gen1: 0, gen2: 0

What we actually see is that for concatenating 4 strings StringBuilder takes twice as long as using String.Concat (on this run 317ms vs 148ms for a million iterations) and generates approximately 3 times as much garbage (gen0: 101 vs gen0: 36)!

Under the hood, StringBuilder creates a character array to append the strings into. If an append would exceed the current capacity (16 characters by default), a new array must be created. When ToString is called it may, based on a heuristic, decide to return the builder’s array or allocate a new array and copy the value into that. The performance of StringBuilder therefore depends on the initial capacity of the builder and on the number and lengths of the strings appended.
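To illustrate the point about initial capacity, here is a minimal sketch that sizes the builder up front so the appends fit in the buffer it starts with. The names parts, totalLength and builder are made up for this example, and whether pre-sizing actually pays off on a given runtime is something to measure rather than assume.

```fsharp
open System.Text

// The same four strings as in the timing test above
let parts = [| "abc"; "efg"; "hij"; "klm" |]

// Number of characters we expect to append in total
let totalLength = parts |> Array.sumBy String.length

// Ask for exactly that capacity instead of relying on the default
let builder = StringBuilder(totalLength)
for part in parts do
    builder.Append(part) |> ignore

let result = builder.ToString()
```

This only helps when the final length is known, or can be estimated cheaply, before the first append.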

In contrast, String.Concat (which the compiler resolves the ‘+’ operator to) calculates the length of the concatenated string from the lengths of the passed-in strings, then allocates a string of the required size and copies the values in. In many scenarios it will therefore require less copying and less allocation.

When concatenating 2, 3 or 4 strings we can take advantage of String.Concat’s optimized overloads. Beyond that the picture changes, as an array argument must be passed, which requires an additional allocation. However, String.Concat may still be faster than StringBuilder in scenarios where the builder requires multiple reallocations.
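As a rough sketch of the difference between the fixed-arity overloads and the array overload, the snippet below calls both explicitly and compares the result with the ‘+’ operator. The fifth string e and the value names are invented for the example; it shows which call shape is involved, not how fast each one is.

```fsharp
let a, b, c, d, e = "abc", "efg", "hij", "klm", "nop"

// Four strings: one of the fixed-arity overloads, no array needed
let four = System.String.Concat(a, b, c, d)

// Five strings: passed as an array, which is an extra allocation
let five = System.String.Concat([| a; b; c; d; e |])

// The '+' operator on strings produces the same value as the explicit call
let viaPlus = a + b + c + d
```

Timing the five-string case with #time, in the same way as the loops above, is the way to find out whether the extra array matters for a particular workload.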

But wait, there’s more. Going back to the ‘+’ operator: if we assign the integer literal expression 1 + 2 + 3, the compiler can reduce the value to 6. Equally, if we define the strings as const string, the compiler can apply the string concatenations at compile time, leading to, in this contrived example, no cost whatsoever.
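The const string wording above is C#. A minimal F# sketch of the same idea, assuming the [<Literal>] attribute as the nearest counterpart, might look like the following. The names A, B, C, D and Combined are made up for the example.

```fsharp
// [<Literal>] values are compile-time constants, used here as the
// F# counterpart of a C# const string (an assumption of this sketch)
let [<Literal>] A = "abc"
let [<Literal>] B = "efg"
let [<Literal>] C = "hij"
let [<Literal>] D = "klm"

// Every operand is a literal, so the concatenation must itself be
// a constant expression that the compiler evaluates
let [<Literal>] Combined = A + B + C + D

// Constant integer arithmetic, which the compiler is free to fold
let six = 1 + 2 + 3
```

If any operand were an ordinary let-bound value, the Combined definition would no longer be accepted as a literal and the concatenation would happen at run time.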

The moral of the story is when it comes to performance optimization - measure, measure, measure.

Top 100 .Net Bloggers from 2014

In my last post I covered the top 100 .Net bloggers since 2008, based on links posted on Alvin Ashcraft's Morning Dew. This (intentionally) captured many bloggers that are no longer actively blogging, but equally still have interesting content to consume.

For completeness here's the ranking for the years 2014 and 2015 (up to last Friday) which may better capture active .Net bloggers:

Rank Name 2014 2015 Total
1 Sean Sexton 195 0 195
2 Raymond Chen 86 17 103
3 Greg Duncan 74 14 88
4 Scott Hanselman 50 7 57
5 Peter Vogel 44 12 56
6 Brian Harry 46 8 54
7 Ricardo Peres 38 13 51
8 Oren Eini 32 12 44
9 Eric Lippert 44 0 44
10 Sacha Barber 31 7 38
11 Martin Hinshelwood 25 5 30
12 Eric Battalio 27 2 29
13 Carl Franklin & Richard Campbell 16 10 26
14 Jonathan Allen 17 9 26
15 Sasha Goldshtein 19 7 26
16 Dhananjay Kumar 25 1 26
17 James Montemagno 17 7 24
18 Jimmy Bogard 18 6 24
19 Willy-P. Schaub 19 4 23
20 Mike Taulty 21 1 22
21 Nicholas Blumhardt 18 3 21
22 S.Somasegar 17 3 20
23 Rob Eisenberg 13 7 20
24 Kathleen Dollard 20 0 20
25 Jeremy Clark 10 9 19
26 Jon Skeet 16 3 19
27 Phillip Trelford 17 2 19
28 Michael Crump 13 5 18
29 Immo Landwerth 13 5 18
30 Rory Becker 18 0 18
31 Rowan Miller 15 2 17
32 Sanjay Sharma 17 0 17
33 Jesse Liberty 15 1 16
34 Charles Sterling 15 1 16
35 Miguel de Icaza 12 3 15
36 Steve Smith 15 0 15
37 Bnaya Eshet 5 9 14
38 Scott Guthrie 12 2 14
39 Gael Fraiteur 11 3 14
40 Bill Wagner 11 3 14
41 Mary Jo Foley 12 2 14
42 Rick Strahl 7 7 14
43 Kim Spilker 14 0 14
44 Tatworth 14 0 14
45 MS Downloads 13 0 13
46 John Montgomery 8 4 12
47 Jeff Martin 9 3 12
48 Kerry Meade 10 2 12
49 Latish Sehgal 12 0 12
50 Richard Carr 12 0 12
51 Jonathan Wood 8 3 11
52 K. Scott Allen 8 3 11
53 Susan Ibach 7 4 11
54 Filip Ekberg 11 0 11
55 Mads Kristensen 8 2 10
56 Robert Green 9 1 10
57 Bertrand Le Roy 8 2 10
58 Daria Dovzhikova 10 0 10
59 CodePlex 10 0 10
60 Laurent Bugnion 6 3 9
61 Erik EJ 8 1 9
62 Iris Classon 6 3 9
63 Pete D. 4 5 9
64 DevToolsGuy 3 6 9
65 Dave M. Bush 7 2 9
66 Cameron Taggart 8 1 9
67 Deborah Kurata 8 1 9
68 Julie Lerman 7 2 9
69 Anand Narayanaswamy 9 0 9
70 Philip Fu 9 0 9
71 Glenn Block 6 2 8
72 The .NET Team 6 2 8
73 Jeremy Likness 5 3 8
74 Shawn Wildermuth 6 2 8
75 Ondrej Balas 7 1 8
76 Kunal Chowdhury 6 2 8
77 Adam Anderson 8 0 8
78 Jeremy D. Miller 8 0 8
79 Schabse Laks 8 0 8
80 Sam Sabri 8 0 8
81 Frans Bouma 5 2 7
82 Jean-Marc Prieur 5 2 7
83 Sergio De Simone 6 1 7
84 David Voyles 4 3 7
85 Dmitri Nesteruk 2 5 7
86 Nick Randolph 5 2 7
87 Alois Kraus 6 1 7
88 Jef Claes 6 1 7
89 Eric Sink 6 1 7
90 Josh Morales 6 1 7
91 Terje Sandstrom 7 0 7
92 Xinyang Qiu 7 0 7
93 Jon Galloway 7 0 7
94 John Papa 7 0 7
95 Daniel Rubino 7 0 7
96 Matthieu Mezil 7 0 7
97 Angelos Petropoulos 3 3 6
98 Peter Kellner 3 3 6
99 Dror Helper 5 1 6
100 Tom Warren 3 3 6

 

This definitely brings up some new names alongside the old familiar ones :)

Script

For the analysis we employed a simple F# script, using FSharp.Data’s CSV Type Provider for types over the data set and Taha Hachana’s XPlot library for charting.

Here’s the code for the top 100:

open FSharp.Data

// The CSV Type Provider infers the row type from the sample file
let [<Literal>] path = @"LinksTo2015.csv"
type Posts = CsvProvider<path>
let posts = Posts.Load(path)

// Top n authors by number of linked posts from 2014 onwards
let topAuthors n =
   posts.Rows
   |> Seq.where (fun row -> row.Year >= 2014)
   |> Seq.where (fun row -> row.Tag.Contains ".NET" || row.Tag.Contains "Top")
   |> Seq.groupBy (fun row -> row.Author)
   |> Seq.map (fun (author,rows) -> author, rows |> Seq.toArray)
   |> Seq.sortBy (fun (_,rows) -> -rows.Length) // negated count gives descending order
   |> Seq.take n
   |> Seq.toList

let top100 = topAuthors 100

For the table I simply used another short snippet to transform the results to text for an HTML table.
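That snippet is not reproduced here, but a sketch of the general idea could look something like the following. It is not the original code: the toHtmlTable name and the (name, 2014 count, 2015 count) tuple shape are assumptions made for illustration, and the per-year counts would first have to be derived from the grouped rows.

```fsharp
open System.Net

// Hypothetical input shape: (name, links in 2014, links in 2015)
let toHtmlTable (rows: (string * int * int) list) =
    let header =
        "<tr><th>Rank</th><th>Name</th><th>2014</th><th>2015</th><th>Total</th></tr>"
    let body =
        rows
        |> List.mapi (fun i (name, y2014, y2015) ->
            // Rank is the 1-based position; names are HTML-encoded
            sprintf "<tr><td>%d</td><td>%s</td><td>%d</td><td>%d</td><td>%d</td></tr>"
                (i + 1) (WebUtility.HtmlEncode(name)) y2014 y2015 (y2014 + y2015))
    String.concat "\n" ([ "<table>"; header ] @ body @ [ "</table>" ])

// Two sample rows taken from the table above
let html =
    toHtmlTable
        [ "Sean Sexton", 195, 0
          "Carl Franklin & Richard Campbell", 16, 10 ]
```

Encoding the names matters for entries such as "Carl Franklin & Richard Campbell", where the ampersand would otherwise end up unescaped in the markup.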