For our first Data Representation project, we chose from publicly available data sets on the Guardian’s website. We then had to generate two different visualizations from the imported data. The first visualization could be straightforward, highlighting some aspect of the data. The second had to highlight some unique characteristic of that particular data set.
After trying out a few data sets, I ended up going with a breakdown of U.S. military casualties from Iraq sorted by the home state of the soldiers wounded or killed (original data set here). This is what I produced:
The top visualization shows total casualties by state, with the color, size and positioning of the graphics changing based on the totals. California and Texas represent the majority of the casualties, which was a bit of a surprise.
In the second visualization, the number of wounded is represented by the width of the red bars on the right, with the states listed alphabetically. On the top left, in blue, the stars change in size based on the number of soldiers killed (also sorted by state and listed alphabetically).
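The scaling behind those bars and stars can be sketched as a simple linear mapping from a casualty count to a pixel dimension, much like Processing's built-in map() function. This is only an illustration, not the actual sketch code; the function and parameter names here are my own assumptions.

```java
public class CasualtyScale {
    // Linearly map value from the range [inMin, inMax] to [outMin, outMax],
    // mirroring Processing's map() function.
    static float map(float value, float inMin, float inMax,
                     float outMin, float outMax) {
        return outMin + (value - inMin) * (outMax - outMin) / (inMax - inMin);
    }

    // Width in pixels of a red bar for a wounded count, scaled so the
    // largest count (e.g. California's) fills the maximum bar width.
    static float barWidth(int wounded, int maxWounded, float maxWidthPx) {
        return map(wounded, 0, maxWounded, 0, maxWidthPx);
    }
}
```

With this kind of mapping, a state with half the wounded of the largest state gets a bar half as wide; the same function can drive star diameters from the killed counts.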
Looking past any political or other social implications of the data, I had to deal with a few details to finish the second visualization. The data set included the 50 states plus 6 other locations (District of Columbia, Puerto Rico, Samoa, Virgin Islands, Guam and the Northern Mariana Islands). The extra territories actually worked to my benefit with the red stripes: with 56 locations, I could place 8 within each of the 7 stripes. But with the stars, the pattern was already set at 50. I addressed this by first building the pattern for the 50 states, then adding stars in between at the positions where the territories fit alphabetically. I also dropped their opacity to 50% (although that seems not to have held up when I generated a PDF file). For 5 of the 6 places, the casualties were quite low, so the stars are quite small. The exception was Puerto Rico, which accounted for 36 soldiers killed. This made for some rather tedious coding, but I think the solution works.
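The stripe placement above reduces to simple integer arithmetic: with 56 alphabetically ordered locations and 7 stripes, each stripe holds 8, so an index can be split into a stripe row and a slot within it. A minimal sketch of that idea, with names of my own invention:

```java
public class StripeLayout {
    static final int PER_STRIPE = 8;   // 56 locations across 7 stripes

    // Which of the 7 red stripes a location sits on,
    // given its alphabetical index 0-55.
    static int stripeRow(int index) {
        return index / PER_STRIPE;
    }

    // Position of the location within that stripe (0-7).
    static int stripeSlot(int index) {
        return index % PER_STRIPE;
    }
}
```

So the first 8 locations fall on stripe 0, the next 8 on stripe 1, and so on, which is one way to avoid hand-placing each of the 56 labels.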
For scale, I added a guide along the bottom for the number of wounded. I also added the totals for California and Texas in both the killed and wounded sections, giving some perspective on the highest totals.
The assignment’s goal was to get our feet wet importing a .csv data file, parsing it and generating the type of visualization we wanted. It was a bit of a challenge to incorporate Jer’s methodology for coding, although it was definitely worth it. As a result, the code is rather easy to sort through, which makes it simple to identify problems or see what to change to get different effects. Setting these good work habits will pay off down the road for sure.
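The .csv import step can be sketched in plain Java: read the lines, skip the header, and split each row on commas into a state name and its counts. The column layout (State, Killed, Wounded) is my assumption here, not necessarily the layout of the Guardian file; in an actual Processing sketch, loadStrings() would supply the lines.

```java
import java.util.HashMap;
import java.util.Map;

public class CasualtyCsv {
    // Parse rows like "California,566,3000" into a map from
    // state name to {killed, wounded}. Column order is assumed.
    static Map<String, int[]> parse(String[] lines) {
        Map<String, int[]> totals = new HashMap<>();
        for (int i = 1; i < lines.length; i++) {   // i = 1 skips the header row
            String[] cols = lines[i].split(",");
            totals.put(cols[0].trim(), new int[] {
                Integer.parseInt(cols[1].trim()),  // killed
                Integer.parseInt(cols[2].trim())   // wounded
            });
        }
        return totals;
    }
}
```

Once parsed into a structure like this, the drawing code only ever looks up counts by state name, which keeps the parsing and the visualization cleanly separated.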
I was a bit nervous before tackling this assignment, partly because data visualization is something I am very excited about, yet I do not consider myself all that well versed in Processing, the program we use for most of the work in class. I started off re-reading Dan Shiffman’s “Learning Processing,” which was my textbook from last term. After a review of my notes from this term’s class and a few helpful suggestions from classmates, I was able to get things moving faster than I expected, which is exciting.
I expect that some of the code I wrote is inefficient or inelegant, but I suppose that is part of the learning process. I hope to review my sketches with some of the ITP residents in the coming days and get some advice from them.