Quantitative notes from Tufte
I am finally reading his books after attending his conference a while back.
P13 Graphical displays should
- Show the data
- Induce the viewer to think about substance, not graph design
- Avoid distorting the data
- Present many numbers in a small space
- Make large data sets coherent
- Encourage the eye to compare different pieces of data
- Reveal the data at several levels of detail
- Serve a clear purpose
- Be closely integrated with the statistical and verbal descriptions of the data
P42 Small multiples are economical: once viewers understand the design of one slice, they have immediate access to data in all the other slices…the viewer focuses on changes in the data.
P51 Principles of graphical excellence
- …well-designed presentation of interesting data – a matter of substance, of statistics, and of design
- …complex ideas communicated with clarity, precision, and efficiency
- …gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space
- …is nearly always multivariate
- …requires telling the truth about the data
P61 Show data variation, not design variation.
P68 with displays of money use deflated and standardized units
P71 the number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data.
P74 Graphics must not quote data out of context. At the heart of quantitative thinking: “Compared to what?”
P77 Graphical integrity:
- The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented
- Clear, detailed, and thorough labeling…write out explanations of the data on the graphic itself. Label important events
- Show data variation, not design variation
- Use deflated and standardized units of money
- The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data.
- Graphics must not quote data out of context
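The "deflated and standardized units" rule is easy to sketch in Python. This is my own illustration, not something from the book: the helper names `deflate` and `per_capita` are hypothetical, and the spending, price index, and population figures are made up.

```python
def deflate(nominal, price_index, base_index=100.0):
    """Convert a nominal money amount into constant (base-period) units."""
    if price_index <= 0:
        raise ValueError("price_index must be positive")
    return nominal * base_index / price_index


def per_capita(total, population):
    """Standardize a total by the size of the population it covers."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total / population


# Hypothetical series: (label, nominal spending, price index, population)
series = [
    ("early", 100.0, 50.0, 10.0),
    ("late", 180.0, 100.0, 12.0),
]

# Deflate first, then standardize by population.
real_per_capita = [
    per_capita(deflate(spending, index), population)
    for _, spending, index, population in series
]

for (label, spending, _, _), real in zip(series, real_per_capita):
    print(f"{label}: nominal {spending:.0f}, real per capita {real:.1f}")
```

With these made-up figures, nominal spending rises from 100 to 180 while real spending per person falls from 20.0 to 15.0, which is the kind of distortion the rule guards against.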
P90 “everyone spoke of an information overload, but what there was in fact was a non-information overload.” Wurman, What-If, Could-Be (Philadelphia, 1976)
What was the Tufte quote at the Museum of Science and Industry in Chicago?
Principles in graphical design:
- P92 The fundamental principle of good statistical graphics: Above all else show the data.
- P96 Maximize the data-ink ratio, within reason; P93 Data-Ink ratio = data-ink / total ink used to print the graphic. 1.0 is max.
- P96 Erase non-data-ink, within reason. E.g. redundant data-ink
- Revise and edit
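A minimal Python sketch of the data-ink ratio from P93. The function name `data_ink_ratio` is my own, and the ink amounts are invented numbers in arbitrary units; in practice the hard part is deciding what counts as data-ink, not the division.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of a graphic's ink that presents data; 1.0 is the maximum."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical measurements for one chart, before and after erasing non-data-ink.
before = data_ink_ratio(data_ink=30.0, total_ink=120.0)  # heavy grid and frame
after = data_ink_ratio(data_ink=30.0, total_ink=40.0)    # grid and frame erased

print(f"before: {before:.2f}, after: {after:.2f}")  # before: 0.25, after: 0.75
```

The data-ink stays the same in both cases; only the non-data-ink shrinks, so the ratio rises.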
P112 The optical illusion caused by many closely spaced lines, as in hatching or chart lines, is called moiré vibration and has no place in data graphical design. The grid should usually be muted or completely suppressed…lest it compete with the data.
P116 Avoid turning charts into ‘ducks,’ where the decorative design takes over.
P125 I really like the conversion of various types of charts into improved versions.
The box plot turned into line dot line plots, e.g. ______ . ____
The bar chart lost the frame, the grid, the vertical axis…. white lines on the columns showed the grid
In the scatter plot he transformed the frame of the graphic into data, showing the data range, a ‘range-frame,’ and further improvements turned the scatter plot into a dot-dash-plot where the frame is turned into data showing the marginal distribution of each variable.
P135 A rug plot puts the axes to work connecting one chart to another by the shared axes.
P136 more information per unit of space and per unit of ink is displayed….the history of devices for communicating information is written in terms of increases in the efficiency of communication and production.
It is a frequent mistake in thinking about statistical graphics to underestimate the audience….why not assume that if you understand it, most other readers will, too?
P139 Mobilize every graphical element, perhaps several times over, to show the data.
P140 Tukey – “If we are going to make a mark, it may as well be a meaningful one. The simplest – and most useful – meaningful mark is a digit.”
P149 Data-based labels…with a coordinate line…erase part of the line to show the data range, leave the ticks on the line, but put the actual numerical max and min at the two ends of the new data-range coordinate line.
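Here is a rough Python sketch of how I read the data-based label idea: the axis line spans only the data range, round ticks survive only inside that range, and the two ends are labeled with the actual minimum and maximum. The function `range_frame_axis` and the sample values are hypothetical, and this only computes what to draw, not the drawing itself.

```python
def range_frame_axis(values, round_ticks):
    """Describe an axis whose line covers only the observed data range."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    return {
        "line": (lo, hi),                                    # where the axis line is drawn
        "ticks": [t for t in round_ticks if lo < t < hi],    # ticks kept inside the range
        "end_labels": (lo, hi),                              # actual min and max at the ends
    }


# Hypothetical measurements and candidate round-number ticks.
axis = range_frame_axis([2.3, 4.1, 7.8, 5.5], round_ticks=[0, 2, 4, 6, 8, 10])
print(axis)
```

The ticks at 0, 2, 8, and 10 fall outside the data and are dropped, so the axis itself tells the reader the range is 2.3 to 7.8.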
P151 He transformed an XY plot, with the typical dots and two labeled, ticked axes, into a plot that uses the actual X and Y values as the X and Y axes and the Y values as the dots on the plot.
P162,6 The data density of a graphic = number of entries in a data matrix / area of data graphic. The current record is 17,000 numbers per square centimeter.
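The density formula is simple enough to check by hand, but a small Python sketch helps me compare graphics. `data_density` is my own hypothetical helper and the 12 x 8 matrix is invented; the only figure from the book is the record of 17,000 numbers per square centimeter, which I convert to square inches here.

```python
SQ_CM_PER_SQ_IN = 2.54 ** 2  # square centimeters in one square inch


def data_density(entries, area):
    """Numbers shown per unit of graphic area."""
    if area <= 0:
        raise ValueError("area must be positive")
    return entries / area


# Hypothetical graphic: a 12 x 8 data matrix drawn in 48 square centimeters.
density = data_density(entries=12 * 8, area=48.0)
print(f"{density:.1f} numbers per square centimeter")

# The record quoted above, expressed per square inch.
record_per_sq_in = 17000 * SQ_CM_PER_SQ_IN
print(f"about {round(record_per_sq_in):,} numbers per square inch")
```

The hypothetical graphic works out to 2 numbers per square centimeter, several thousand times sparser than the record.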
P168 Maximize data density and the size of the data matrix, within reason.
P169 Graphics can be shrunk way down. Repeated applications of the shrink principle lead to a powerful and effective graphical design, the small multiple.
P175 Well-designed small multiples are
- Inevitably comparative
- Deftly multivariate
- Shrunken, high-density graphics
- Usually based on a large data matrix
- Drawn almost entirely with data ink
- Efficient in interpretation
- Often narrative in content, showing shifts in the relationship between variables as the index variable changes.
For non-data-ink, less is more; for data-ink, less is a bore.
P177 Graphical elegance is often found in simplicity of design and complexity of data. Attractive displays of statistical information:
- Have a properly chosen format and design
- Use words, numbers, and drawings together
- Reflect a balance, a proportion, a sense of relevant scale
- Display an accessible complexity of detail
- Often have a narrative quality, a story to tell about the data
- Are drawn in a professional manner, with the technical details of production done with care
- Avoid content-free decoration, including chart-junk
P178 The conventional sentence is a poor way to show more than two numbers because it prevents comparisons within the data. There are nearly always better sequences than alphabetical (mindtrail to Everything Is Miscellaneous). Use a text-table; text-tables are clearly the best way to show exact numerical values. Tables are preferable to graphics for many small data sets. Given their low data density and failure to order numbers along a visual dimension, pie charts should never be used.
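A small Python sketch of a text-table that orders rows by value rather than alphabetically, so the eye can compare down the column. The `text_table` function and the fruit figures are made up for illustration; sorting largest-first is just one reasonable ordering, not a rule from the book.

```python
def text_table(rows, label_header="Item", value_header="Value"):
    """Format (label, value) pairs as an aligned table, largest value first."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    cells = [f"{value:.1f}" for _, value in ordered]
    label_w = max([len(label_header)] + [len(label) for label, _ in ordered])
    value_w = max([len(value_header)] + [len(cell) for cell in cells])
    # Labels are left-aligned; numbers are right-aligned so digits line up.
    lines = [f"{label_header:<{label_w}}  {value_header:>{value_w}}"]
    for (label, _), cell in zip(ordered, cells):
        lines.append(f"{label:<{label_w}}  {cell:>{value_w}}")
    return "\n".join(lines)


# Hypothetical data, deliberately supplied in alphabetical order.
rows = [("Apples", 3.2), ("Bananas", 12.5), ("Cherries", 7.0)]
table = text_table(rows, "Fruit", "Tons")
print(table)
```

The input arrives alphabetically, but the table comes out ranked by tonnage, which makes the comparison the sentence form would have hidden.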
P181 Data graphics are paragraphs about data and should be treated as such.
P183 In friendly data graphics:
- Words are spelled out…elaborate encoding avoided
- Words run left to right
- Little messages help explain the data
- Elaborate shadings, cross-hatchings, and colors avoided; labels are placed on the graphic itself; no legend is required
- Graphic attracts viewer, provokes curiosity
- Colors, if used, are chosen sensitive to color-blindness
- Type is clear, upper and lower case, with serifs.
P186 On a graphic, the contrast in line weight represents contrast in meaning. The greater meaning is given to the greater line weight; thus the data line should receive greater weight.
Graphics should tend toward the horizontal, greater in length than height.
P191 Epilogue What is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather the task of the designer is to give visual access to the subtle and the difficult – that is, the revelation of the complex.