Mapping Datasets with Palladio and Google Fusion Tables

Today’s world is driven by data. From large corporations like Google, which collects user data both to make browsing the internet more user friendly and to target its users, to small-scale data collection done by a college student for a project, data is more than a spreadsheet filled with numbers and letters. Often, for a dataset to have an effect on the viewer, or to give the viewer additional context, it needs some form of visual representation. Many types of visual representation are available, and graphs and charts can both be used to convey the data to the end viewer. However, for datasets that focus on specific locations and how they relate to each other, another visual tool can also be used.

If you have a dataset that is based on location, such as the one I will use in my example, a map may be a better choice of visual representation than a simple chart or graph, because it gives the project a sense of place. For example, if your data is organized by location, it is more useful for the viewer to see its distribution on a map of the area than to read a list of locations that is neither precise nor visually interesting. In addition, data mapped in this way can contribute to a better understanding of what is now called spatial history. Spatial history, as Zephyr Frank explains in his article, is assisted by visualizations because they allow patterns and trends to be observed. Both the ability to see a trend and the ability to interpret some form of spatial history will be apparent in my upcoming visualization example.

Visualization can be done with a variety of tools. One such tool is Palladio, which maps the location of each item and can even provide a timeline if your dataset has the necessary information. For example, using the dataset of the Cushman Collection, a collection of photographs taken around the contiguous United States between 1938 and 1955, you can map the geolocations where the photographs were taken, giving the viewer a sense of the scale of the collection and of which locations were most popular. An example of a Palladio mapping of this dataset can be seen in the screenshot above. In this visualization, the viewer can clearly see that many of the photographs were taken on both coasts and in the central and southern United States, with only a small portion taken in the northern states. This ability to see where the photos were taken, and how concentrated they are across the United States, contributes to the spatial history of those particular areas.
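Before a dataset like this can be mapped, each row needs usable coordinates. Below is a minimal Python sketch of that preparation step. It assumes, hypothetically, that the spreadsheet has separate `latitude` and `longitude` columns and that the mapping tool wants them combined into one "latitude,longitude" text field; that is my understanding of what Palladio expects, so check the tool's own documentation. The sample rows are made up for illustration and are not from the Cushman Collection.

```python
import csv
import io

def add_coordinates(rows, lat_key="latitude", lon_key="longitude"):
    """Return copies of rows with a combined 'coordinates' field.

    Rows with missing, non-numeric, or out-of-range values are skipped,
    since they cannot be placed on a map.
    The column names are assumptions, not the real dataset's headers.
    """
    out = []
    for row in rows:
        try:
            lat = float(row[lat_key])
            lon = float(row[lon_key])
        except (KeyError, ValueError, TypeError):
            continue  # no usable coordinates in this row
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue  # not a valid point on the globe
        new_row = dict(row)
        new_row["coordinates"] = f"{lat},{lon}"
        out.append(new_row)
    return out

# Made-up sample data standing in for an exported spreadsheet.
SAMPLE = """title,latitude,longitude
Photo A,37.7749,-122.4194
Photo B,,
Photo C,29.7604,-95.3698
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
cleaned = add_coordinates(rows)
for row in cleaned:
    print(row["title"], row["coordinates"])
```

Rows that are dropped here would simply never appear on the map, so it is worth counting how many were skipped before drawing conclusions from the visualization.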

Another option, if you want to use visualization software, is Google Fusion Tables. As seen here on the right, the basic mapping of the same collection in Google Fusion Tables looks very similar to Palladio’s. However, unlike Palladio, Google Fusion Tables offers a few alternate visualizations that are interesting to experiment with. For example, it allows you to create a heat map based on the locations of the data, as shown in the heat map screenshot. This is the exact same information as shown on the other two maps, only presented in a different way. However, as interesting as the heat map is, it clearly has a problem when used as the only visualization of the data. Because it maps the data based on the concentration of data in certain locations, locations with fewer data points are sometimes left out. This can be seen in this particular example: both the Pacific Northwest and Texas show no data in the heat map view, while the other views show data points in those areas. It is important to take this into consideration when creating a visualization. Failing to pick software that accurately models your data can produce a visual representation that misrepresents the data, rendering the visualization unreliable and inaccurate.
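To make the heat map problem concrete, here is a rough Python sketch of how a density-based view can hide sparse locations. This is not how Google Fusion Tables actually computes its heat map. It is a simplified stand-in that assumes points are counted in grid cells and that a cell is only drawn when its count is at least some fraction of the busiest cell. The function names, the cell size, the cutoff, and the sample coordinates are all hypothetical.

```python
from collections import Counter

def grid_counts(points, cell_size=5.0):
    """Count points per grid cell of cell_size degrees on each side."""
    counts = Counter()
    for lat, lon in points:
        cell = (int(lat // cell_size), int(lon // cell_size))
        counts[cell] += 1
    return counts

def visible_cells(counts, min_fraction=0.2):
    """Return the cells dense enough to be drawn.

    A cell is kept only if it holds at least min_fraction of the
    busiest cell's count. This cutoff is an assumption made for
    illustration, not a documented Fusion Tables setting.
    """
    if not counts:
        return set()
    peak = max(counts.values())
    return {cell for cell, n in counts.items() if n / peak >= min_fraction}

# Made-up data: ten photos in one place, a single photo far to the north.
points = [(37.77, -122.42)] * 10 + [(47.61, -122.33)]

counts = grid_counts(points)
shown = visible_cells(counts)
hidden = set(counts) - shown

print("cells with data:", len(counts))
print("cells drawn on the heat map:", len(shown))
print("cells with data that disappear:", len(hidden))
```

In this sketch the single northern photo is real data, yet its cell falls below the cutoff and is not drawn, which is the same kind of omission described above. A point map, by contrast, would draw every row regardless of how many neighbors it has.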

Kyle C.


Sources:

Zephyr Frank, “Spatial History as Scholarly Practice”.

 
