OpenSpending is a database, analysis, and visualization tool for budget- and expenditure-related data. It acts as a data warehouse, giving you access to budgetary data from around the world.

One of OpenSpending’s most commonly used features is its ability to easily visualize budgets in two different formats. In this tutorial, we will use data from the World Bank’s BOOST portal and visualize it using OpenSpending.

What you’ll need

We’ll use these tools:

  • Open Refine – for converting the data
  • Mozilla Firefox or Google Chrome (Refine requires a browser, and it doesn’t work well with Internet Explorer)

The World Bank’s BOOST portal aims to provide insight into the way countries spend their money. They have done this for World Bank grants in Kenya with the portal kenya.wb-boost.org. In this tutorial, we’ll use this data and visualize it using OpenSpending.

The first step in the process is downloading the data from the World Bank’s BOOST platform.

Walkthrough: downloading data from the BOOST platform

1. Open kenya.wb-boost.org in your browser.

2. It defaults to a map view, which is not optimal for our purposes. Switch to the pivot table view of the tool by clicking on the “pivot table” tab.

BENYBVRThe table view is initially empty because the system does not know how to aggregate information.

3. On the left, there are the options for the data we want to have shown. Add more groups to the selection to allow for a finer picture. Do so by dragging them from the “Group Rows” box into the “Selected Groups” box.

We want to have:

  • Top-level spending Unit
  • Expenditure type
  • Sector
  • Economic Category
  • Sub-Economic Category

When you have these, click on the “Build Table” button in the middle of the gray area. This will take awhile.

4. Now download the data with the “Export to CSV” button on the center top. This will ask you to pick a folder in which to store the file.

Preparing the data for OpenSpending

Great — now we have the data, so let’s prepare it for OpenSpending!

OpenSpending needs the data to be in a specific format to be able to use it. Specifically, OpenSpending needs one column for the amount spent, one column for the date of the transaction (in our case the year), and then more columns for other properties (such as the spending unit, the sector, the economic category, and so on).

To be able to use the data we just downloaded in OpenSpending, we have to bring it into the necessary form. We will do this with a tool called Open Refine. Refine is made for cleaning data and converting data from one format to another.

Walkthrough: converting data with Refine

1. Start Refine by double-clicking on the Refine icon. Once it’s ready, a browser window will open and point to 127.0.0.1:3333.

2. First we need to create a project with our new data. Do so by clicking on the “Create Project” tab.

3. Now choose the file we just downloaded and click “next.” This will open the Preview tab. You’ll notice how the first row is empty and contains the years for the executed budget.

4. We want to have the years in the column headings. Let’s tell this to Refine. Enter “2″ next to the box saying “Parse the next ____ lines as column headers.”

parse

5. Now that this is fixed, let’s create the project. Click on the “Create Project” button on the top right.

6. OpenSpending wants a single column for all the amounts and a single column for all the years. Right now, we have a column for each year. No problem, let’s transpose it.

Select the column options for the first column (“Executed 2003″) and select “Transpose → Transpose cells across Columns into rows”.

7. Select “Two new Columns.” Call the first Year and the second Amount. Also make sure to select the “fill down in other columns” option.

transpose

8. Click on “Transpose” to do the transposition.

9. Now we only have to remove the “Executed” in the year column. We do this using a “Transform” on the column.

10. A menu pops up asking us for the “Expression” to transform. Expressions are like formulas in spreadsheets.

The expression we want to use is “value.replace(“Executed “,”")” — this will remove the “Executed” from the cells.

11. Next, we have to check whether all the years and all the amount fields actually have values in them. Let’s do this using a facet. Select “Facet → text facet” on the year column.
facet
This will open a Facet window on the left side.

12. There are 39 rows where the year is blank -– let’s delete them. Select the blank rows by clicking on “(blank).”

13. Now let’s remove the rows by selecting “edit rows → remove all matching rows” from the “ALL” column options.

14. Remove the facet by closing the facet window with the little “x” on the top left.

15. Now our dataset is ready for Open Spending!

Let’s export it to CSV (“Comma-separated value”):

Congratulations! You have successfully prepared a dataset for visualization and analysis in OpenSpending.