Treemap. How does it work ?

Inspiration
I did some research on the net and mainly used Jonathan Hodgson's post on codeproject.com, dating from june 2004, that describes the Squarified treemap technique.
The  Squarified treemap algorithm is explained in an academic paper related to Human Computer Interaction (HCI) by Ben Shneiderman of the University of Maryland.
You can find here his excellent explanation that I followed to write this User Defined Function for Excel.
A great Treemap summary post on Juice Analytics' blog also helped me to provide the right features.



Parameters
First of all, remember that your data must be sorted with highest values at the top of the column (I know, this could easily be integrated in the UDF... sometime maybe).

The syntax of the UDF is :
TREEMAP(DataRange, DestinationRange, HeightPixel, WidthPixel, ColorRange, ColorScale, LabelRange, TooltipRange, HighlightPosition).

Only DataRange and DestinationRange are compulsory parameters. Others are optionals.

Lets describe each parameter :
- DataRange : Range of data containing the values used for the Size of the sectors. Must be 1 column by X rows.
As clearly explained on Juice Analytics blog : "Size of the boxes should be a quantity measure. 
The measures should sum up along the hierarchical structure of the data. 
The sum of all the elements in one branch need to sum to the value of the branch as a whole. 
Therefore, you can’t use ratios or dates or any other measure you wouldn’t use in a pie chart."

- DestinationRange : Range of cells that will contain the Treemap or 1 single cell...
If DestinationRange is a single cell, we have 2 cases with the following parameters :
  • HeightPixel or WidthPixel are not specified or both  = 0 :
Treemap will be displayed inside this unique cell (you should better make it big, or merge several cells)
  • HeightPixel AND WidthPixel are specified (integer number, lets say between 100 and 500):
Treemap will be displayed inside a rectangle of corresponding height and width (in Pixel or points, I am not sure).

This is especially important when comparing 2 Treemaps and you want to have exact proportionnality of their respective total surface (lets say representing my stock portfolio's value before and after the crisis... just a painful example).

- ColorRange : contains the data (number or text !) used to define Color of the individual sectors. If this parameter is left blank, all sectors will be grey... clear and simple (Hi Robert !)
Again Juice Analytics : "Color of the boxes is best suited to a measure of performance or change such as growth over time, average conversion rate, or customer satisfaction."
Color can also be used to identify categories.
ColorRange must have the same size as DataRange : One color per sector, so logically organize your data in identically sized columns.

- ColorScale : No flexibility here. This has to be a range of 2 columns by Y rows, like for the Heatmap() UDF.

The first columns contains intervals or categories, in line with the values of the ColorRange area.
In front of each interval value, the 2nd column contains the RBG color code that will fill the sectors.
Color codes are available here and from Cynthia Brewer's ColorBrewer.
The color scale must be organized so that the lowest value is at the top, like in the screenshot from the previous post.

- LabelRange : Staightforward, range of cells (same size as DataRange again), containing the text for labels to be applyed to each sector.
I tried to maximize the size of the text inside the sector, like for tag-clouds, it emphasises the importance of big sectors in the treemap.
Labels will be displayed in black or in white, always maximizing the contrast against the background sector color.

- ToolTipRange : same as LabeRange, but this text will be displayed in a floating tooltip when hovering the mouse over the treemap.
Defenitely usefull to provide more detailed information than the label could.

- HighlightPosition : Integer number. Basically, it's the rank in the data range of the sector that you want to highlight. A red frame will be displayed around the corresponding sector... long to explain, easy to understand if you have come so far (Hopefuly!)

Regarding the VBA code.
1) it works reasonnably well. I tested it with 1 500 sectors with no trouble and great resolution in the smallest sectors
2) it is damn slow due to numerous iterations and loops, that could certainly be avoided by using collection or dictionnary objects instead of arrays.
3) Unlike C#, Silverlight or Java implementations, this code is not recursive... might be a reason why it's soooo slow.

Actually, this UDF is not a Treemap, rather a Branchmap... A treemap would have a fractal structure, with each sector including at least one embeded treemap, like real  tree-trunk with branches and leaves.

I you are interested to go further with Treemap, specific Treemaping softwares are listed here and more information available from the University of Maryland.

Last and not least.
The permantly-beta version of the add-in is available here (for XL2007 only)
An example file is available here.

Now you can create an interactive version of this nice Treemap from the NY Times... with Excel !

Felices fiestas !

16 comments:

Unknown said...

This is a very informative article. I have fallen in love with treemaps as well, and continue to find more and more uses for them wherever I turn. If you’re not well versed in manipulating technical parameters (like myself) I’ve found an awesome service that automatically creates treemaps for you! It makes creating treemaps super quick. The service is available from www.hiveondemand.com and they offer a free 7 day trial!

Happy treemapping!

Mes Cours said...

@Daniel : Excellent link to a very comprehensive tool... BTW, Ben Shneiderman, who developed the treemap concept in 1991 is member of the board of the Hive Group
Maybe I could charge 10% of what they ask ? $199/user/year !

Unknown said...

@Fabrice Rimlinger : I noticed that as well! They have a lot of slick applications on their website in the “Gallery”...iTunes, Digg, World Population, etc. The World Population treemap is my favorite...I never knew Monaco was so dense! http://www.hivegroup.com/gallery/worldpop/

Patrick Leunissen said...

Hi Fabrice,

Great tool. What I would find usefull if you were able to highlight not only one row of data but several; for example the top 10 in CO2 emission in your sample

Patrick Leunissen

Mes Cours said...

@ Patrick : This can be done with a 2 colors color scale, as shown in the following post .

Patrick Leunissen said...

Hello Frabrice,

It's me again. Is it possible to adjust the size of the labels printed in the middle of each square. Maybe a nice addition.
I make a treemap on 30x40cm paper and in this case the text is printed smaller and thus harder to read than possible.

Jim - Uber Driver said...

How long does it usually take to produce the tree maps.

Mes Cours said...

@ MutuelInvestor...
Ir depends... it is immediate with 20 values. Can take 10 seconds with 500 to 1000 values.

Rendering time also increases when adding tooltips and Hyperlinks.

Greg said...

Hello,
I'm not abble to générate the treemap with my own data.
Could you tell me how to do ?

I try to reduce nb of line with same data and the treemap get down
thanks

Anonymous said...

I have to say this is a great tool and you are very generous to share this with everyone.

Just a suggestion, but an excel file with examples of use would make things easier to pick up.

But, great work nevertheless.

Sly said...

Very usefull and incredibly fast...

I was wondering if it was possible to improve it with a second level, I mean, another treemap inside each boxes...

inventif said...

Hi Fabrice,

I still prefer (by far) your Sparklines & Family to the ones natively available in Excel 2010, excellent work!

Is it possible to use TreeMap to display hierarchical/nested values, as in the Market Treemap at SmartMoney?

Cheers,

--Mario

Humanidad sostenible said...

I applied the treemap and can not find the option or tooltiprange labelrange colorscale or not higlightposition. aplicola function when the graph goes all black. How I can make this graph with the add in?

thank you very much

Anonymous said...

How do I keep the shape colors from having a gradient fill? They all have the same starting gradient, which makes it difficult to distinguish the actual color of the shape.

Mes Cours said...

you probably have an old version of the add-in.
This known issue has been solved now.

Anonymous said...

Hi Fabrice,
A big thumbs up for your excellent add-in. It's great!
Particularly the Treemaps are such a strong visualisation tools.
Your Treemaps would be even closes to perfection if they would offer nesting possibility. Any chance of seeing that functionality in upcoming versions?

Thanks again!
H