Eric L. Nelson, Ph.D.
The 2013 Annual Census of Employees in State Civil Service (aka Annual Census Report, or ACR) contains a number of mistakes. The purpose of this blog is to illuminate some of these problems and to suggest how they could be corrected in future ACRs.
Problem #1: Improper Use of Trend Line Graphs
Figure 1.
Trend line graphs (aka figures) are used to show within-group changes over time. For example, over the last 10 years the state's hiring of people with disabilities has consistently been much lower than that of the four comparison groups, as seen in Figure 1 (left). Click here to see the data table used to construct this figure.
The essential point is that trend lines should only be used to compare measures of the same group taken at different times. Trend lines should never be used to join discrete (different) groups, because doing so improperly suggests that a relationship may exist between these groups.
Figure 2. Reproduction of Figure 24, ACR 2013
An example of improper use of a trend line graph is found in Figure 24, 2013 ACR, reproduced in Figure 2. As can be seen, a trend line joins discrete groups together, i.e., males and females, people with disabilities, and many races. This linkage could cause readers to mistakenly conclude that relationships exist where there may actually be none.
Figure 3. Reorganization of Order of Presentation, Figure 24, ACR 2013
A similar problem is created when discrete data is sorted into some type of order, such as smallest to largest, and then graphed using a trend line, as seen in Figure 3. This figure was created using the same data used to construct Figure 24. Notice the rising trend line, which seems to suggest a relationship between discrete groups where none may actually exist.
Figure 4. Bar Graph Using Figure 24 Data
Bar graphs are one way discrete data can be presented, as demonstrated in Figure 4. Bar graphs display individual measures using bars that are not connected by a trend line. This figure was created using Figure 24 data.
Recommendation: The state should consider using bar graphs to display discrete data.
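As a rough illustration of this recommendation, the sketch below draws discrete groups as separate, unconnected bars in plain text. The group names and percentages are placeholders invented for the example, not values from the ACR, and `render_bar_chart` is a hypothetical helper.

```python
def render_bar_chart(values, width=40):
    """Return a text bar chart with one unconnected bar per discrete group."""
    if not values:
        return ""
    largest = max(values.values())
    label_width = max(len(name) for name in values)
    lines = []
    for name, value in values.items():
        # Each bar is scaled against the largest value on its own line;
        # nothing links one group's bar to the next, so no trend is implied.
        bar_length = round(width * value / largest) if largest > 0 else 0
        lines.append(f"{name.ljust(label_width)} | {'#' * bar_length} {value:.1f}%")
    return "\n".join(lines)


# Placeholder data (hypothetical, NOT taken from the ACR).
example = {"Group A": 12.0, "Group B": 30.0, "Group C": 6.0}
chart = render_bar_chart(example, width=20)
print(chart)
```

Because each group gets its own bar, the order in which groups are listed does not create the appearance of a rising or falling line.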
Problem #2: Failure To Provide Data Tables Behind Some Figures
It is helpful to provide the data used to construct a figure for at least two reasons. First, the figure may draw the attention of a reader who wants more detail, as provided by actual measures in a table. Second, a reader may not believe a graph has been created accurately. When tabled data is provided, (s)he can either become convinced of the accuracy of the graph, or find a mistake and bring it to the attention of the graph's author.
Figure 24 (ACR, 2013) is an example of a figure for which data is neither cited nor provided. Deduction shows it to have been created using data from Tables B and J of ACR 2013. Table 1 (above left) is an example of how these data could have been provided.
Recommendation: The state should consider citing data used to create figures in ACRs. If data from multiple sources are used, the state should consider providing a table which demonstrates how graphed values were calculated.
Problem #3: Reversal of Graphing Coordinates
Figure 5. Reproduction of Figure 22, ACR 2013
Traditionally, categories of data are plotted along the abscissa (x axis) of a graph (aka figure). Categories can include groups such as disability status, race, and sex. The abscissa is the horizontal line running along the bottom of a graph. Outcomes are listed on the ordinate (y axis), which is the vertical line on the left side of the graph.
An example of the customary way to organize a graph can be seen in Figure 1 (top of first page). The data category is years, plotted along the abscissa, and the outcome category is percentages, plotted along the ordinate.
In some ACR figures the graphing coordinates are reversed. For example, see Figure 22, reproduced as Figure 5 here. Notice the data categories (sex, disability status, veteran status, and race) are placed on the ordinate, and outcomes (amount earned per year) are placed on the abscissa.
Recommendation: The state should consider plotting data categories on the abscissa and outcomes on the ordinate.
Problem #4: Color Coding & Labeling Deficiencies
Using Figure 22 again, notice that 22 data categories are color coded. Women, Native American, and Hawaiian people are plotted with an almost identical shade of dark brown. Similarly, veterans, Guamanian, Japanese, Laotian people and Other Race or Ethnicity are plotted using similar shades of black or dark gray. Consequently, it is difficult to distinguish between many groups. This can lead to interpretive mistakes due to misidentification.
Recommendation: The state should consider adding labels to each plot line, and using varied patterns to create trend lines, e.g., small dashes, large dashes, stars, asterisks, and so forth.
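One way to guarantee that no two plot lines share a style is to pair each group with a unique combination of dash pattern and marker. The following is a minimal sketch, assuming matplotlib-style format codes for the dash and marker strings; the helper name `assign_styles` and the group names are hypothetical.

```python
from itertools import product

# Dash patterns and marker shapes written as matplotlib-style codes.
DASHES = ["-", "--", "-.", ":"]
MARKERS = ["o", "s", "^", "*", "x", "D"]


def assign_styles(groups):
    """Map each group to a unique (dash, marker) pair."""
    combos = list(product(DASHES, MARKERS))
    if len(groups) > len(combos):
        raise ValueError("more groups than distinct styles")
    return dict(zip(groups, combos))


# 22 placeholder group names, matching the number of categories in Figure 22.
groups = [f"Group {i}" for i in range(1, 23)]
styles = assign_styles(groups)
print(styles["Group 1"], styles["Group 22"])
```

Four dash patterns and six markers give 24 distinct pairs, enough for 22 categories without relying on color alone. Each pair could then be supplied as the `linestyle` and `marker` arguments when plotting.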
Problem #5: Too Many Data Elements In A Graph

California's rich tapestry of people, by race, is reflected in its workforce. Prior to 2013 the state reported eight (8) race groups. Beginning in 2013 the state expanded to 22. These are compared in Table 2. Unfortunately, when a large number of individual groups are graphed together, an almost uninterpretable display results. Figure 22 is an example of this problem. To understand trends between the races, groups that can be aggregated must be joined together. Thus, it is necessary to re-aggregate the subcategories of Asian in order to perform meaningful trend analysis of the type demonstrated in Figure 1 (top of first page).
Recommendation: The state should continue to provide expanded race group data in tables; however, when graphing, these data should be aggregated.
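Aggregation of this kind can be sketched as a lookup from detailed subgroup to broader group, applied before graphing. The mapping below covers only two subgroups named in this post (Japanese and Laotian) and treats them as subcategories of Asian; the counts are placeholders, and `aggregate` is a hypothetical helper.

```python
from collections import defaultdict

# Partial, illustrative mapping from detailed subgroup to broader group.
# A real mapping would need to cover all 22 categories in the ACR.
BROADER_GROUP = {
    "Japanese": "Asian",
    "Laotian": "Asian",
}


def aggregate(counts):
    """Sum subgroup counts into broader groups; unmapped groups pass through."""
    totals = defaultdict(int)
    for subgroup, count in counts.items():
        totals[BROADER_GROUP.get(subgroup, subgroup)] += count
    return dict(totals)


# Placeholder counts (hypothetical, NOT taken from the ACR).
detailed = {"Japanese": 100, "Laotian": 50, "Native American": 75}
combined = aggregate(detailed)
print(combined)
```

The detailed counts stay available for the tables, while only the combined totals are passed to the graph.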
Problem #6: Manner of Data Presentation
Tables in the Annual Census Report traditionally provide counts and percentages in each cell. An example is seen below in Table 3.
Table 3. Partial Reproduction of Table C, 2013 ACR
Presentation of two forms of data in a single cell may be a convenience for some readers, because it enables them to evaluate actual counts and their corresponding percentages side by side. However, this method of data presentation leads to substantial difficulty when one attempts to convert these data tables, which are provided in PDF reports, into spreadsheets using programs capable of doing so, such as Adobe Pro 11. When data is mixed, even powerful software is unable to parse it out. Adding to the difficulty of checking the state's work for sufficiency and accuracy, the state does not provide the underlying spreadsheets along with the ACR, even though they are likely to exist: the style of the figures presented is consistent with Microsoft Excel spreadsheets.
A further and very serious concern is that these data tables are complicated and may not be easily accessible to persons with low or no vision, who rely upon screen reader software to try to read state documents such as these.
Recommendation: The state should post spreadsheets containing data used to create the ACR. The state should provide one data value per cell in independent tables that may be more easily accessed by persons using screen reader software. The inaccessibility of the Annual Census Report, 5112 reports, 5102 reports, etc., is a serious issue. Friends of mine who have blindness are unable to read them using screen reader software because they are not tagged, are not provided in an acceptable font at an appropriate size, and most often lack alt text. To their credit, CalHR added alt text to parts of the 2013 Annual Census. However, it has been 25 years since the Americans with Disabilities Act became federal law, yet the state in general is still failing to make many of its documents accessible to persons with blindness or low vision. This is discussed further in another blog (click here TBA).
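The "one data value per cell" recommendation can be sketched with a regular expression that splits a mixed cell into separate count and percentage values. The cell layout assumed here, a count followed by a percentage in parentheses such as `1,234 (5.6%)`, is a guess at how such cells might look after PDF extraction, not the ACR's confirmed format; `split_cell` is a hypothetical helper.

```python
import re

# Assumed layout: "<count> (<percent>%)", e.g. "1,234 (5.6%)".
CELL_PATTERN = re.compile(r"^\s*([\d,]+)\s*\(\s*([\d.]+)\s*%\s*\)\s*$")


def split_cell(cell):
    """Split a mixed 'count (percent%)' cell into (count, percent)."""
    match = CELL_PATTERN.match(cell)
    if match is None:
        raise ValueError(f"unrecognized cell format: {cell!r}")
    # Drop thousands separators before converting the count to an integer.
    count = int(match.group(1).replace(",", ""))
    percent = float(match.group(2))
    return count, percent


count, percent = split_cell("1,234 (5.6%)")
print(count, percent)
```

Writing the two values to separate columns gives each cell a single meaning, which is easier both for spreadsheet conversion and for screen reader software.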
Problem #7: Unclear & Mistaken Table & Figure Descriptions
Table 4. Reproduction of Figure 1 (sic) 2013 ACR
Some table and figure titles in the ACR are not clearly stated. The example in Table 4 is taken from the 2013 ACR. It contains two errors.
First, it is a table, not a figure. Second, the title is ambiguous. It fails to describe the contents of the table, or provide needed temporal (length of time) context. A more appropriate title could be, "Table X: Changes in Workforce Race Representation, 2009 to 2013", or similar.
Recommendation: The state should consider properly distinguishing between tables and figures, and using accurate labels which describe tabled data and their relationships.
Please cite this blog as: Nelson, Eric L. (2015). Errors in the 2013 Annual Census Report, With Recommendations For Future Reports. Trends in State Work, http://trendsinstatework.blogspot.com/2015/05/errors-recommendations-for-acr.html