According to Li (2020), data visualization is the presentation of data in a manner that allows for pattern recognition and problem identification. Data in its raw state cannot be interpreted, so it is processed to create a data visualization. Data can be thought of as fuel, with data visualization being the engine that turns it into valuable information. Kosaro (2007, p. 2) simplifies the process of data visualization into three steps:
- It is based on data.
- It produces an image.
- It results in information that is recognizable and readable.
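These three steps can be illustrated with a minimal sketch. It assumes a small set of made-up monthly visitor counts and a hypothetical helper, `render_bar_chart`, neither of which comes from the sources cited here. The sketch starts from data, produces an image (a plain-text bar chart), and labels each bar so that the result is readable.

```python
# Step 1: based on data.
# Hypothetical raw data: monthly visitor counts (illustrative values only).
raw_data = {"Jan": 120, "Feb": 90, "Mar": 150, "Apr": 60}


def render_bar_chart(data, width=30, mark="#"):
    """Return a text bar chart, one line per label, scaled to `width` characters."""
    if not data:
        return ""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Step 2: produce an image. Each value is scaled against the
        # largest value so that the bars can be compared at a glance.
        bar_length = round(value / largest * width) if largest > 0 else 0
        # Step 3: keep the result readable by printing the label and
        # the exact value next to every bar.
        lines.append(f"{label.ljust(label_width)} | {mark * bar_length} {value}")
    return "\n".join(lines)


chart = render_bar_chart(raw_data)
print(chart)
```

The same numbers printed as a bare list would be harder to compare. The scaled bars make it visible without any calculation that March has the most visitors and April the fewest.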
Data visualization is by no means a new concept. It is as old as human communication and is embedded in our ability to interact with the world around us. It has contributed greatly to human advancement since primitive times and remains a crucial aspect of understanding our world. It was only to be expected that the introduction of technology would broadly impact this field. Technological advances have allowed for greater availability and transparency of data, which means that we currently have more data than at any point in human history (Kirk 2012, p. 18). With technology present in almost every step of modern life, each of us creates a digital footprint, which consists of data from all our digital interactions. The value to organizations of tracking user trends has led to the large-scale quantification of our data.
There is a bounty of data, but Hal Varian (2009), former chief economist of Google, highlights that there is a “scarcity” of digestible data. While technology has allowed for the automation of data handling, humans are expected to remain the end-users of this data for years to come (Chen et al. 2007, pp. 13-14).
In order to create understandable information, it is crucial to know what data is. Data refers to 'unprocessed information'. This means that data has no substance beyond itself and must therefore be refined to give it form, which is often visual. There are two distinct types of data: primary and secondary. Primary data refers to numbers and people, while secondary data refers to the derivative information that can then be analyzed. This means that it is the secondary data that allows for the creation of data visualization.
As stated earlier, the form given to processed data is largely visual. Cognitive science indicates that visual perception is the most effective channel, as the brain retains 8 times more information when data is presented visually than when it is presented through the other senses (Kirk 2012). Li states that there are two primary types of data visualization, each of which includes a variety of methods to display information: scientific visualization and information visualization. These types are context specific, as the choice of a visualization method relies heavily on the intended use of the raw data.
Scientific visualization
Scientific visualization aims to create processed data for the purpose of scientific research. The key characteristics of scientific visualization are recognizability and clarity of meaning, so that predictions can be made on the basis of the data. The main forms of scientific visualization are:
Waveforms, simulations, and volumes.
Information visualization
Information visualization refers to the efficient communication of information. Often, this means more commercial or widespread data visualization. The key characteristic of information visualization is readability. Because information visualization reaches a wider audience, it must be accessible, which highlights the implications of visual literacy. The main techniques used in information visualization are:
Tables, charts, trees, maps, scatter plots, diagrams, and graphs.
The need for information visualization has created the field of information design. Because creating visual structure requires an understanding of the cognitive implications of visual communication, organizations such as the HCi have been created. Historically, the aesthetic value of data visualization has been shaped by an improved understanding of the human psyche and bodily functions, while technology has transformed the methods of presenting data. Research into the best approach to data visualization is still in its infancy, as data perception is fundamentally subjective. The pursuit of objective data visualization thus remains a hope for the future. The key studies regarding data visualization consist of the similarity and guided search theories, the Texton theory, and the feature integration theory.
All of this is an attempt to bridge the gap between the amount of data available and the need to understand it. It is crucial to understand data as an intricate system with many influences. Kirk states that:
The number of technological advances in the last decade has resulted in a drastic need to understand the new information available to us. This requires nuance when working with raw data. When approaching the field of data visualization, the key factor to remember is that data with context is crucial. The context of the data determines the type of visualization, which also reinforces the importance of the data's end goal in creating a visualization. That end goal includes the end-user: the person receiving the information shapes the data visualization. The context of the end-user is further highlighted by the user’s literacy level, which brings socio-economic factors into data visualization. In conclusion, it is crucial for technologists, scientists, and creatives working with data to understand that it does not exist in isolation and that data is political at its core!