Using Google Fusion Tables
This brief tutorial will guide you through a simple exercise in using Google Fusion Tables to manipulate and visualise data.
Make sure that you have Google Fusion Tables available. Follow instructions here if you need to check.
This tutorial is inspired/abstracted from a tutorial shared by Dan Nguyen at: http://www.smalldatajournalism.com/projects/one-offs/mapping-with-fusion-tables/
The datafile used are:
To carry out this tutorial you will require a Google Account.
- The first step is to download the datafiles referenced above.
- Save these on your desktop for simplicity of access (right or cmd click and choose save to ..).
- Go to your Google Drive dashboard – it should be at: https://drive.google.com
- We will input these datafile onto your drive so that they can be inspected and edited. Do this by choosing import file from the ‘New’ button.
- Choose ‘restaurants.csv’ from your desktop and choose import. Google Drive will now recognise the file as a spreadsheet format and when it indicates the process has completed, click on the small window at the lower right corner to open the file.
- If you are not used to using Google Sheets, this should look a lot like Excel. Note that Google has automatically decided that the first row of the datafile represents the column headings. You should see something like this:
- Freeze the header row so that the column headings stay in place. Choose the first row if it is not already chosen by selecting 1. Then choose View > Freeze 1 Row. A thicker line should now appear under this first row and you can scroll through all the data while the headers remain static.
- You can Sort columns by choosing the small triangle that appears when you mouse over a column letter…Sort A-Z by zip code.
- Google Sheets can operate on data in more sophisticated ways than FusionTables, so we need to prepare some of the data in this spreadsheet for mapping in Fusion Tables. Specifically we need to create a richer (and less ambiguous address) for Fusion Tables to map. So we need to create a new address field combining: Building, Street and Zipcode. This is done by defining a formula and populating a new address column with the results.
- As these are text fields and not numeric we use a text function called CONCATENATE rather than addition functions. Click the next available column without data on our spreadsheet.
- In the header enter ‘address’.
- In row B click in the empty cell and type an = sign. Note that as you start typing letters Google Sheets will suggest available functions. We want Concatenate click on it from the list that appears
- Once you select this function Sheets leaves the cursor in the cell and waits for us to give the function arguments – i.e. what cells do you want to concatenate. To do this click on Row 2 cell B which contains the building number. This cell label will be entered into the formula. If you simply click return now the new cell will be a duplicate of B2. To concatenate we need to combine the 3 cells in this row – as well as adding spaces and some static text to them. While you can simply click and type, we can also manually enter this. In this instance: =CONCATENATE(B2&” “&C2&” NY “&D2). This will result in our address cell looking thus:
- As we want to perform the same calculation to produce addresses for all our records, we can simply fill this same calculation down the column. When you hover over the small larger square in the bottom right of the cell you can double click when you get crosshairs and the calculation will cascade down the sheet.
- Now we want to send this data that we have prepared into Fusion Tables (If you cannot see Fusion Tables under the New/Tools menu, please follow these steps to attach it). Since these products use the same platform, the file we have prepared in Sheets is available directly to Fusion Tables. Return to your Drive dashboard and click the New Button again. This time we need to navigate via the More > option to choose a Google Fusion Table.
- When you select this you will get a new creation screen asking where the data should come from. Choose Google Spreadsheets and you should see the sheet ‘restaurants’ that we have been working with as your first choice.
- Fusion Tables will confirm that the first row contains headers, choose Next.
- Note that you can also access sample datasets for playing with from here or proceed to tutorials in how to use Fusion Tables. In our case we are using our own data. In the next screen Fusion Tables asks you to provide some metadata. FT is about well structured and described data – and allows for you to share the data so indicating datasource and licensing is requested. In our case click Finish.
- Welcome to Fusion Tables. You will now see your dataset – in a not dissimilar fashion to Google Sheets. However there are some very important differences.
- Right now Fusion Tables doesn’t know what to do with our data and in fact even with its built in intelligence it has not determined that the address field is actually a location that might be mapped. If we mouse over that field and choose Change … from the popup menu, we will get a dialog box:
- Choose Type > Location from the popup menu. This will help Fusion Tables determine how it can visualise data. When you do this, the various values will be highlighted in yellow and we have a little red “+” sign to start to work with our data. When you click the plus sign you will get some choices:
- When you choose Add Map, Fusion Tables will take advantage of specific location services (it’s part of the whole Google Maps infrastructure) and begin to Geocode the location from the ‘address’ column. Although we can read the address column, to be able to map it we need specific georeferences that Google Mapping can understand in this case: Latitude and Longitude:
- This may take a while. This depends on server load and a variety of other limitations including restrictions on him many addresses can be geocoded for free on each Google account. I therefore have carried this out for you in advance.
- The Geocoded Datafile (that you have all been working with is at: GeocodedRestaurants)
- Either allow the geocoding to finish – the honest route or click the link above. Once the geocoding is complete (it will finish really) you will have a new tab on your Fusion Table screen. It should look thusly:
- We can now explore this visual representation. Clicking on any dot will present us with the data behind that dot:
- We can also change the map styles that we use and customise the display:
- We can also start to filter the data based on criteria. In this case I can search for specific restaurants. If I start to type Dunk… I can have it present only Dunkin Donuts:
- But where Fusion Tables really starts to shine is by augmenting our datasets and matching them to create new richer, and deeper one. let’s merge our Restaurants with Inspections. To do this wee need to import the csv file references above and create a new fusion table with it. Choose New > More > Google Fusion Table and from your desktop select that csv file ‘inspections.csv’.
- This is a far more visually boring file, but the value lies in being able to combine this data with our mapped restaurants and learning more about how safe each of those restaurants might be. The ‘camis’ column is a unique identifier that appears in each so we can merge these datafile using that key field.
- With your map of restaurants open, choose File > Merge and you will be asked to choose another dataset (in this case the Inspections file that you just imported) to merge – join with the Restaurants file.
- When you choose that file, Fusion Tables will ask you to tell it which field is common between the two files – it will suggest the first column it comes to which is clearly not correct:
- From the ‘This Table’ drop down menu find the ‘camis’ heading and select it. Things will look a lot more uniform at this point:
- Now select Next to tell Fusion Tables to create a new combined dataset by matching records based on this field – Fusion Tables will ask you to select what columns this new table should contain. In this case let’s leave it to the default of all columns from both tables. Click the Merge button and watch the fun:
- The resulting table will now be available for viewing and stored in your Drive. A link to the new table will be offered:
- Go ahead (you know you want to…click on the View table button. What young have is a table containing all the data from both the restaurants and the inspections tables combined based on the ‘camis’ filed which is the unique identifier for a restaurant used by the New York Department of Health.
- What we want to do is see if we can ‘see’ where problem restaurants might be based on their health scores. To do this we need to qualify our quantitative data. On our Map of Address we choose ‘Change feature styles >’. Here we can define ‘Buckets’ into which we can aggregate the data results and colour code them for viewing:
- When we divide the data into these custom buckets … with a clean score being up to 14, warnings of 14-28 and above 28 as under immediate order (over 50 is just plain avoid at all costs). Click save and let’s see what the map looks like:
- In fact if we zoom out we can even see that there are serious problems at a McDonald’s on Staten Island:
- At this point we can start to ask questions of area rather than of specific restaurants. If I change the map type from point features to a Heat Map based on Score, I can see that there are a great concentration of problem scores in lower Manhattan:
- In fact we can start to get even more creative and start to chart the data we have. If I choose the plus sign I can use the data to create a network diagram:
- I may also be interested in charting the correlation between score and zip code:
- Finally, it comes time to publish your work. You can share your completed visualisations in a variety of ways allowing for interactivity:
- Beyond simply sending a link to your friends, the ability to embed the finished work in your own web presence (based on data that can continue to change) involves copying the HTML embed code into something such as your own blog