Yeah, I’m not done with the CFSAN adverse event report system data. It’s just too fun.

To recap, in 2016 FDA made all of the data from the CFSAN Adverse Event Reporting System (CAERS) available for anyone to see and analyze. These reports are all voluntarily provided by consumers, public health authorities, food companies, or physicians. They stay there forever, and represent a huge opportunity to see what people report as causing illness. As I noted in my last post, these can be a reliable indicator of consumer trust, since people attribute their symptoms to a particular food or brand regardless of whether those illnesses are ever confirmed to be associated with an outbreak.

When sorting the CAERS report data by product category, we got a glimpse into the kinds of foods (or food presentations) that gave consumers pause:


Now BEWARE: Because it’s fun, I’m going to abuse this data a little below and do some speculation. The cleaning up of the data below isn’t super in-depth, though I did my best not to change any overall trends from the data that was already entered. However, all of this should be taken with a heaping bowl of salt.

Looking at the graph above, some of these categories might end up higher than others based on their susceptibility to oxidation or spoilage, which may lead consumers to think their foods have made them ill. Others are associated with “healthy” marketing characteristics that may make consumers wary, such as meal replacement items or infant foods.

What I was curious about was how this data compares to the CDC National Outbreak Reporting System (NORS) data, which includes suspected or confirmed illnesses tied to a particular food. And hey, I can download that stuff too!

So, as before, I had to do a little data cleanup, since CDC is interested in different categories than FDA was and there were a lot of “other” categories. However, I was able to get to a breakdown representing more than 90,000 illnesses.

Already it appears that there are some significant differences in what’s represented here vs. the other data set, so a couple of clarifications about limitations are in order. First, while the CAERS data above includes some meat/poultry data, FSIS also has its own reporting system which would capture the majority of consumer complaints about meat and poultry, so a comparison of those categories doesn’t make sense. Second, there are going to be some folks who look at this CDC breakdown and try to draw conclusions about how risky a particular food category is.

This data reports the number of illnesses associated with each category, that is, the overall number of illnesses attributed to that food. However, this is not an accurate characterization of how risky it is to eat that food. Risk depends on two factors: how many people get sick AND how commonly that food is eaten. For example, many more people die in car crashes than in base-jumping accidents. That doesn’t mean driving is riskier than base jumping: the motor vehicle risk is 1.25 deaths per 100 million miles driven vs. a base-jumping risk of one death per 1,000 jumps.
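To make the "count vs. rate" point concrete, here is a minimal Python sketch using only the two figures quoted above. The function name `rate_per_exposure` is mine, and treating one mile driven and one jump as comparable units of exposure is a loud assumption made purely for illustration (they obviously aren't the same thing).

```python
import math


def rate_per_exposure(events: float, exposure_units: float) -> float:
    """Return events per single unit of exposure (e.g. deaths per mile)."""
    if exposure_units <= 0:
        raise ValueError("exposure_units must be positive")
    return events / exposure_units


# Figures quoted in the post: 1.25 deaths per 100 million miles driven,
# and one death per 1,000 base jumps.
driving_rate = rate_per_exposure(1.25, 100_000_000)  # deaths per mile
base_jump_rate = rate_per_exposure(1, 1_000)         # deaths per jump

# ASSUMPTION: a mile and a jump are treated as one "unit of exposure" each,
# only so the two rates can be put side by side.
ratio = base_jump_rate / driving_rate

print(f"driving:   {driving_rate:.2e} deaths per mile")
print(f"base jump: {base_jump_rate:.2e} deaths per jump")
print(f"per unit of exposure, base jumping is ~{ratio:,.0f}x deadlier")
```

The same arithmetic applies to food: illnesses divided by servings eaten, not illnesses alone. Neither data set in this post includes a servings figure, which is exactly why the raw counts can't be read as risk.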

Falsely equating the incidence of illness between foods without taking into account how much of each food is eaten often causes a lot of confusion for people who get defensive about a particular product, whether it’s a dietary supplement you like, a favored brand, a method of processing, or your own company. Ma and pa may think only 6 people getting sick from their catering business isn’t meaningful next to the 128,000 that will be hospitalized nationwide, but the entire nation isn’t eating at their restaurant.

The CAERS data and the CDC data use very different categories, and the CDC data set includes a lot of “multiple” product categories, as well as categories more specific than, say, “snacks” in the other data set. However, I did my best to group similar categories from each data set into various foods. I really wanted to see if this (again, not rigorously analyzed) data could tell us how our perceptions of foods that make us ill line up with foods that are actually making us ill.
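For anyone who wants to try the same kind of grouping, here is a rough sketch of one way to do it in Python. This is not the exact cleanup I did; the category labels, the shared group names, and the `regroup` helper are all hypothetical placeholders, so swap in whatever labels actually appear in your downloads.

```python
from collections import Counter

# HYPOTHETICAL label-to-group mappings. These strings are placeholders,
# not the real category names in the CAERS or NORS files.
CAERS_TO_GROUP = {
    "Vegetables/Vegetable Products": "produce",
    "Fruit/Fruit Products": "produce",
    "Snack Food Items": "snacks",
}
NORS_TO_GROUP = {
    "Leafy greens": "produce",
    "Fruits": "produce",
    "Chips": "snacks",
}


def regroup(counts: dict, mapping: dict, default: str = "other") -> Counter:
    """Sum counts per shared group; unmapped labels fall into `default`."""
    grouped = Counter()
    for label, n in counts.items():
        grouped[mapping.get(label, default)] += n
    return grouped


# Made-up counts, only to show the mechanics.
example_counts = {
    "Leafy greens": 4,
    "Fruits": 6,
    "Chips": 1,
    "Multiple": 9,  # no mapping, so it lands in "other"
}
print(regroup(example_counts, NORS_TO_GROUP))
```

Keeping the mapping in a plain dictionary makes every judgment call visible, and anything left unmapped ends up in an explicit "other" bucket instead of silently disappearing.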


This is interesting! Now again, we’re only comparing the proportions from each data set, so from a stats viewpoint I’m not sure we can really draw conclusions from this comparison. The general idea is that where the two bars are at similar heights, our perception of how often a food makes us sick may line up with the actual incidence of illness. Where they are drastically different, we either underestimate how likely that food is to make us sick, or we quickly assign blame when in fact that food isn’t responsible for as many illnesses as we give it credit for.
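The comparison itself is just "share of reports" next to "share of illnesses" for each food group. A minimal sketch of that calculation is below; the counts are invented placeholders (not numbers from CAERS or NORS), and `shares` and `share_gap` are names I made up for the example.

```python
def shares(counts: dict) -> dict:
    """Convert raw counts into each category's proportion of the total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {category: n / total for category, n in counts.items()}


def share_gap(reports: dict, illnesses: dict) -> dict:
    """Report share minus illness share per category.

    Positive: the food is blamed more often than it shows up in outbreaks.
    Negative: the food shows up in outbreaks more than it gets blamed.
    """
    report_share = shares(reports)
    illness_share = shares(illnesses)
    categories = sorted(set(report_share) | set(illness_share))
    return {
        c: report_share.get(c, 0.0) - illness_share.get(c, 0.0)
        for c in categories
    }


# PLACEHOLDER counts, made up purely to show the mechanics.
consumer_reports = {"produce": 30, "dairy": 20, "seafood": 50}
outbreak_illnesses = {"produce": 500, "dairy": 100, "seafood": 400}

for category, gap in share_gap(consumer_reports, outbreak_illnesses).items():
    print(f"{category:8s} {gap:+.2f}")
```

Because both sides are normalized to their own totals, this only says something about relative emphasis within each data set, which is the same caveat as above.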

I’d love to see this data analyzed with more rigor, and to include some recall statistics as a third data set to gauge how often companies struggle to produce each food category without issue. Even if we can’t necessarily take action from this data, it’s certainly interesting and good for both consumers and manufacturers to be aware of.