*Authored by Steven Hall*

If you landed here, I imagine you have the same problem I had: getting CSV data formatted for those wonderful D3 charts like the sunburst partition or the treemap. Many of the examples on the D3 page use the flare.json file to illustrate the visualization, but how do you get your own data into that format? If you are new to JavaScript this can seem pretty daunting, but this tutorial should get you on your way. We are going to use the fantastic Underscore.js to make quick work of the problem.

**In this tutorial we create a function that formats a CSV file to load into D3. You can take a look at the example and "view source" to follow along:**

## View the example in your browser

**A Nesting Function**

To solve our problem we create a function that takes CSV data and reformats it for the D3 visualizations that require hierarchical JSON input. Essentially, we run a recursive algorithm over the rows to get a nested version, with each grouping getting a set of child elements, as we see in the flare.json file. The function takes the CSV data and an array of the fields you want to use to construct your hierarchy (you can go as deep as you need to). If you are familiar with JavaScript you can just take this function and integrate it into your project. The rest of the tutorial explains how to use the function in a visualization and then provides an example of using the code in a D3 chart to produce the graphic above. So here is our example function (requires Underscore.js):

    // The function takes two arguments:
    //   csvData - array of data rows
    //   groups  - array of field names, e.g. ['g1', 'g2'] or ['g1']
    function genJSON(csvData, groups) {

      // Turn a { groupName: rows } map into an array of { name, children } nodes.
      var genGroups = function (data) {
        return _.map(data, function (element, index) {
          return { name : index, children : element };
        });
      };

      // Regroup a node's children by the field at curIndex, then recurse.
      var nest = function (node, curIndex) {
        if (curIndex === 0) {
          // Top level: group the full set of rows by the first field.
          node.children = genGroups(_.groupBy(csvData, groups[0]));
          _.each(node.children, function (child) {
            nest(child, curIndex + 1);
          });
        }
        else {
          if (curIndex < groups.length) {
            // Deeper levels: regroup the rows already collected under this node.
            node.children = genGroups(
              _.groupBy(node.children, groups[curIndex])
            );
            _.each(node.children, function (child) {
              nest(child, curIndex + 1);
            });
          }
        }
        return node;
      };

      return nest({}, 0);
    }
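To make the output shape concrete, here is a minimal sketch of the same nesting idea in plain JavaScript, so it runs without Underscore.js. It is not the function above: `nestRows` is a hypothetical name, the sample rows are made up, and unlike `_.groupBy` it keeps groups in first-seen order.

```javascript
// Hypothetical helper: a dependency-free sketch of the nesting idea.
function nestRows(rows, keys) {
  // No keys left: the rows themselves become the leaves.
  if (keys.length === 0) {
    return rows;
  }
  var key = keys[0];
  var buckets = {};  // group name -> rows in that group
  var order = [];    // group names in first-seen order
  rows.forEach(function (row) {
    var name = row[key];
    if (!Object.prototype.hasOwnProperty.call(buckets, name)) {
      buckets[name] = [];
      order.push(name);
    }
    buckets[name].push(row);
  });
  // One { name, children } node per group, nested by the remaining keys.
  return order.map(function (name) {
    return { name: name, children: nestRows(buckets[name], keys.slice(1)) };
  });
}

// Made-up sample rows, for illustration only.
var sampleRows = [
  { g1: 'a', g2: 'x', value: '1' },
  { g1: 'a', g2: 'y', value: '2' },
  { g1: 'b', g2: 'x', value: '3' }
];

var tree = { children: nestRows(sampleRows, ['g1', 'g2']) };
console.log(JSON.stringify(tree, null, 2));
```

The result has two top-level nodes, `a` and `b`. Each holds one node per `g2` value, and the original rows sit at the bottom as leaves.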

**Using the Function with D3**

OK, great. So how do I get this function to work with D3? Assuming you are already familiar with loading CSV files with D3, this is relatively straightforward. We load the CSV as normal and then pipe the rows through our function before sending the data to our visualization. You need to make sure Underscore.js is loaded on the page for this to work. In a typical page it would look something like this:

    d3.csv("data/theDataFile.csv", function(error, data) {

      // Stop early if the file could not be loaded or parsed.
      if (error) throw error;

      //*************************************************
      // FUNCTION
      //*************************************************
      function genJSON(csvData, groups) {

        var genGroups = function(data) {
          return _.map(data, function(element, index) {
            return { name : index, children : element };
          });
        };

        var nest = function(node, curIndex) {
          if (curIndex === 0) {
            node.children = genGroups(_.groupBy(csvData, groups[0]));
            _.each(node.children, function (child) {
              nest(child, curIndex + 1);
            });
          }
          else {
            if (curIndex < groups.length) {
              node.children = genGroups(
                _.groupBy(node.children, groups[curIndex])
              );
              _.each(node.children, function (child) {
                nest(child, curIndex + 1);
              });
            }
          }
          return node;
        };

        return nest({}, 0);
      }

      //*************************************************
      // CALL FUNCTION WITH ARRAY OF GROUPS
      //*************************************************
      var preppedData = genJSON(data, ['group1', 'group2']);

      //*************************************************
      // YOUR DATA VISUALIZATION CODE HERE
      //*************************************************
    });
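One thing to watch when wiring this up: if a name in the groups array does not match a column header in the CSV, `_.groupBy` finds no value for it and files every row under a single group named "undefined". A small check before calling genJSON can catch that early. The sketch below is plain JavaScript; `missingGroupFields` is a hypothetical helper name and the sample rows are made up.

```javascript
// Hypothetical helper: returns the group names that are not columns in the data.
// Assumes every row has the same fields, so only the first row is inspected.
function missingGroupFields(rows, groups) {
  if (rows.length === 0) {
    return [];
  }
  var firstRow = rows[0];
  return groups.filter(function (field) {
    return !Object.prototype.hasOwnProperty.call(firstRow, field);
  });
}

// Made-up rows standing in for what d3.csv would hand to the callback.
var parsedRows = [
  { group1: 'a', group2: 'x', value: '1' },
  { group1: 'b', group2: 'y', value: '2' }
];

// 'grp2' is a deliberate typo for 'group2'.
var missing = missingGroupFields(parsedRows, ['group1', 'grp2']);
if (missing.length > 0) {
  console.log('Not found in the CSV header: ' + missing.join(', '));
}
```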

**An Example**

Yeah, that's great, but no self-respecting tutorial would end without a full example. To that end, let's start with some real data. I have an example file with data on global religious populations at five-year intervals from 1945 to 2010. You can take a look at the data here. The fields in my file are:

- year - Years from 1945 to 2010 at five-year intervals (1945, 1950, 1955...)
- cat - Category: Christianity, Islam, Buddhism, or Judaism (only 4)
- type - Sub-category within each category (Protestant, Catholic, etc.)
- pop - The population
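
Note that d3.csv hands every field over as a string unless a row conversion function is supplied, so pop needs to be turned into a number before it can drive arc sizes. Here is a minimal sketch of that conversion in plain JavaScript. `toTypedRow` is a hypothetical name, and the sample values are invented rather than taken from the data file.

```javascript
// Hypothetical helper: copies a parsed CSV row and converts pop to a number.
function toTypedRow(row) {
  return {
    year: row.year,  // left as a string; it is only used as a group label
    cat: row.cat,
    type: row.type,
    pop: +row.pop    // unary plus turns the string "1200" into the number 1200
  };
}

// Invented sample rows in the shape described by the field list.
var rawRows = [
  { year: '1945', cat: 'Christianity', type: 'Protestant', pop: '1200' },
  { year: '1945', cat: 'Christianity', type: 'Catholic', pop: '3400' }
];

var typedRows = rawRows.map(toTypedRow);

// Adds as numbers instead of concatenating strings.
console.log(typedRows[0].pop + typedRows[1].pop);
```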

In this somewhat contrived example I want to produce a sunburst partition like the one above, with three levels of hierarchy: years in the inner ring, categories (Christianity, Buddhism, etc.) in the second ring, and sub-categories (Protestant, Catholic, etc.) in the outer ring. The size of each arc is determined by the population of each group. So the groupings I send to the function are 'year' and 'cat' (in that order); the individual rows, each carrying a type and a pop, become the leaves that form the outer ring. A full explanation of how to make a sunburst partition is outside the scope of this tutorial, but I just made some slight modifications to the example shown here. Putting everything together, we end up with the following page, which you can "view source" on to see the full code with some comments. The final example is located here.
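
After grouping by 'year' and 'cat', the leaves of the tree are still raw CSV rows, so the chart code has to read `type` and `pop` from them. One option, sketched here under assumptions, is to relabel the leaves with the `name` and `size` keys that flare.json uses. `toFlareLeaves` is a hypothetical helper, and the small tree is hand-built with invented numbers in the shape genJSON would return.

```javascript
// Hypothetical helper: walks the nested tree and rewrites leaf rows
// as { name, size } objects, the keys used by flare.json leaves.
function toFlareLeaves(node) {
  if (!Array.isArray(node.children)) {
    // A raw CSV row has no children array, so treat it as a leaf.
    return { name: node.type, size: +node.pop };
  }
  return {
    name: node.name,
    children: node.children.map(toFlareLeaves)
  };
}

// Hand-built stand-in for genJSON(data, ['year', 'cat']); numbers are invented.
var nested = {
  children: [{
    name: '1945',
    children: [{
      name: 'Christianity',
      children: [
        { year: '1945', cat: 'Christianity', type: 'Protestant', pop: '1200' },
        { year: '1945', cat: 'Christianity', type: 'Catholic', pop: '3400' }
      ]
    }]
  }]
};

var flareTree = toFlareLeaves(nested);
console.log(JSON.stringify(flareTree, null, 2));
```

Alternatively, the rows can be left as they are and the chart's accessors can read `type` and `pop` directly. Which approach is simpler depends on how closely the chart code follows the flare.json examples.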