Monday, February 20, 2017

Tech Briefing: Communicating Data (Data-Driven Storytelling)



The amount of data collected roughly doubles every two years, mimicking Moore's Law, and is now measured in petabytes and zettabytes. As a result, big data, and data in general, has become a trend in business and everyday life. This is because data is an underutilized resource. Everyone has it, but few are capable of capitalizing on it.

To fully realize the potential of data, it must tell a good story. Humans understand and process information best in narrative form, which is why storytelling has been around as long as humans have. When working with data, this can seem difficult as few people associate spreadsheets with a good story. Hopefully after this tech briefing that will change.

As part of my senior thesis I am integrating data-driven storytelling into college media by doing data journalism for the Daily Wildcat. This combines two of my passions and academic activities: working with data (MIS) and story (journalism). Traditional journalism focuses on human sources, while data journalism uses data as the primary source and humans as secondary sources. Below are links to some of the stories I've published this year:  

FOIA-ing for police reports to find where parties are taking place around campus by creating a red tags heat map

7 Data Visualizations that explain the Sean Miller Era of UA basketball

Analyzing 22,000 on-campus parking citations to find the best place to park illegally

Fact checking the proponents and opponents of Prop 205 (Recreational Marijuana) by digging into the data

Pac-12 Undergrad Ethnicities Viz

Telling a story with data requires a process that mimics data science:



  1. Ask an interesting question
  2. Get the data
  3. Explore the data
  4. Communicate and visualize the findings

Communicating Your First Data-Driven Story:
  1. Find your subject
    1. Pick a topic that has the potential for a good story
    2. Ask an interesting question that people want to know the answer to
  2. Find your data
    1. Define your terms and find the associated data
      1. E.g. How to get away with parking on campus → parking citations
      2. Partying at UA → Red tags
      3. Sean Miller is a good coach but what makes him elite → KenPom, team stats vs D1 average
    2. Living in the age of data means an abundance of publicly available data. FOIA-ing public institutions is also an option
  3. Clean your data
    1. Get your data into a format that can be analyzed and related back to your story subject
      1. Normalizing the data
      2. Pareto Principle: 80% of the time will be preparing the data and 20% will be analyzing and visualizing it
    2. Find the answer to your question
  4. Present your data
    1. VISUALIZE
      1. Humans are visual creatures, and this helps transform raw data into intuitive information
  5. Check your story for the following:
    1. Audience
    2. Lede
    3. Nut graf
    4. Reference point
    5. New understanding from data
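
The clean, normalize, and answer steps above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the analysis behind any of the published stories: the lot names, lot capacities, and column names are all made up, and real citation data would need its own cleaning rules.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; real citation records would have their own columns.
RAW_CSV = """lot,violation
 Cherry Garage ,No Permit
cherry garage,Expired Meter
Sixth Street,No Permit
,No Permit
Sixth Street,No Permit
Tyndall,No Permit
"""

# Hypothetical lot capacities, used to normalize the raw counts.
SPACES = {"Cherry Garage": 200, "Sixth Street": 50, "Tyndall": 100}


def clean_lot(name):
    """Trim whitespace and standardize capitalization; return None if blank."""
    name = name.strip()
    return name.title() if name else None


def citations_per_100_spaces(raw_csv, spaces):
    """Count citations per lot, then divide by lot size so lots are comparable."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(raw_csv)):
        lot = clean_lot(row["lot"])
        if lot in spaces:  # skip blank or unknown lots
            counts[lot] += 1
    return {lot: 100 * n / spaces[lot] for lot, n in counts.items()}


rates = citations_per_100_spaces(RAW_CSV, SPACES)
for lot, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{lot}: {rate:.1f} citations per 100 spaces")
```

In this made-up sample, two lots have the same raw citation count but very different rates once lot size is taken into account, which is the reason to normalize before drawing a conclusion.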

Following this process will help you communicate a data-driven story effectively. If you're interested in learning more about this and data journalism, here are some more resources to check out:

Sites to Follow:
  • 538 - The standard of data journalism at the moment
  • /r/dataisbeautiful - Lots of interesting content from around the web aggregated here.
  • The Upshot - The NYT data section
  • QZ - International site that does high quality work
  • Flowing Data - Aggregate of different data stories
  • The Economist - Usually has good visualizations that aid reader understanding
  • Polygraph - Really in-depth data stories on pop culture

Stories/Visualizations:


Tools to Use:
  • Google Sheets or Excel
    • This is what I’ve used for more detailed mapping
  • R Programming
    • Useful statistical programming language. If you’re very interested in this stuff, R is worth your while
  • Python
    • Python is the second best option for everything
    • Pandas, NumPy, Matplotlib, ggplot, scikit-learn

Sunday, February 5, 2017

Team URLs and First Post

Please comment below with your team name, members, and your team's URLs from the group blogs you created last week, so I can post them on the blog roll for this class.

As you begin to redesign your blog, and search for a client, create a first post on your group blog that is a brief description of your client and some idea of the project you plan to investigate.  You can add more detail about the client and what you will investigate, but keep it to a paragraph for now.
________________

Please remember to add Dr. Suzie and TA, Rhythm Vij to your blog.   To do so, go to Design/Settings/Basic.  Then Add author.  In the blank, add spweisband@gmail.com and rhythm@email.arizona.edu (separated by commas).

Thursday, February 2, 2017