Immutable Collections Critique

3/14/2013 1:12:45 AM

Immutable Collections Critique

Microsoft released a preliminary version of immutable collections with mutable performance. It uses many of the same performance tricks that I used to build functional collections such as supporting freezable binary tree data structures. The algorithm used is similar to AVL trees in that it minimizes the difference in heights of two child trees. Red-black trees are not used probably because the number of rotations will still cause the O(lg(n)) allocations in an immutable setting and these allocations are a function of the height of the whole tree..

However, one major difference is the lack of support for constant equality comparison and fast set operations, each of which will limit the usage scenarios for the collection. Set operations in the Microsoft library are O(m lg(m+n) ), but can be implemented in O(m lg(n/m)) and even faster with constant-time equality comparisons O(d lg(n/d)), where d is the minimum of the number of differences between the two sets and m. A common problem which this helps is quickly determining all the changes between two different versions of the same collections, but I often use another solution with O(d). This is important for using immutable data structures pervasively such in a layout engine, where there are potentially many items and the interface needs to remain responsive.

At some point in the future, I plan to open-source my functional collections and other data structures as they are typically the fastest implementations available. In addition, multiple different strategies can be used to optimized for different types of usages such as full persistence, back-tracking, and ephemerality.

UPDATE: I just pulled the performance numbers from my head from a while back, but I think I made a mistake. They look wrong to me. I wrote quickly, and, on reflection, the Microsoft libraries should probably be O(m lg (n+m )), not O(n+m), for the union, intersection, etc operations. I also need to verify the bounds for mine. I’ll probably post a correction with a detailed analysis in the near future.






My name is Wesner Moise. I am a software entrepreneur developing revolutionary AI desktop applications. I worked as a software engineer in Microsoft Excel group for six years during the 1990s. I worked on PivotTables and wrote the most lines of code in Excel 97-- about 10 times the median developer. I have a Harvard BA in applied math/computer science and a UCLA MBA in technology entrepreneurship. I am a member of the Triple Nine Society, a 99.9 percentile high-IQ society.

Social Media