Natalie E. Dean, PhD (@nataliexdean) — Assistant Professor of Biostatistics at @UF specializing in emerging infectious diseases and vaccine study design. @HarvardBiostats PhD. Tweets my own. Apr. 18, 2020

A rapid, unsolicited peer review on emerging serosurvey data from Santa Clara County, and why I remain skeptical of claims that we are identifying only 1 out of every 50 to 85 confirmed cases. 1/10


The study recruited patients in the county through Facebook ads. To make the survey representative, enrollment was capped when quotas were reached in certain areas, and encouraged in areas with lower participation. A smart strategy for easy recruitment, in my opinion. 2/10

After completing an online demographic survey (zip code, age, sex, race/ethnicity, comorbidities, prior symptoms), consenting adults could bring a child with them for drive-thru testing at one of three locations. Their final sample size was 2,718 adults and 612 children. 3/10

The participation rate was not even across the region. This matters because the authors report both a crude % seropositive and a population-adjusted % seropositive. The adjustment is achieved through survey weights that account for undersampling in some areas. 4/10

The crude seroprevalence is 1.5% (1.11-1.97%). The population weighted seroprevalence is 2.81% (2.24-3.37%). These are different because under-represented combinations of zip code, sex, and race will receive more weight. 5/10
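The mechanics of that gap can be sketched with a toy post-stratification calculation. The strata and counts below are hypothetical (not the study's actual data); the point is that an under-sampled stratum with higher positivity gets extra weight, pulling the weighted estimate above the crude one.

```python
# Minimal post-stratification sketch (hypothetical strata and counts, NOT the
# study's data): each stratum's sample positivity is weighted by its share of
# the county population rather than its share of the sample.
strata = [
    # (sample_n, sample_positives, population_share)
    (1500, 15, 0.40),   # over-sampled stratum, low positivity
    (1000, 15, 0.30),
    (218,  11, 0.30),   # under-sampled stratum: its 5% positivity gets up-weighted
]

sample_n = sum(n for n, _, _ in strata)
crude = sum(pos for _, pos, _ in strata) / sample_n
weighted = sum(share * pos / n for n, pos, share in strata)

print(f"crude: {crude:.4f}, weighted: {weighted:.4f}")
```

With these illustrative numbers the weighted estimate is well above the crude one, driven almost entirely by the small, heavily up-weighted stratum.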

Having had experience with these types of weighted surveys, I am always a little skeptical when the weighted result is very different from the unweighted result. Here, nearly double. This can be due to a few highly influential observations. Weights can be wonky. 6/10
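One standard way to quantify "wonky" weights is Kish's effective sample size, which shrinks as weights become more unequal. The weight values below are hypothetical, chosen only to illustrate how a handful of heavily up-weighted respondents can dominate the estimate.

```python
# Kish's approximation for the effective sample size under unequal weights
# (hypothetical weights, not the study's): n_eff = (sum w)^2 / sum(w^2).
weights = [1.0] * 2000 + [5.0] * 500 + [20.0] * 218  # a few large weights

n = len(weights)
n_eff = sum(weights) ** 2 / sum(w * w for w in weights)
print(f"nominal n: {n}, effective n: {n_eff:.0f}")
```

Even though the nominal sample size is in the thousands, the effective sample size here is a small fraction of that, so a few influential observations carry outsized variance.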

In Table 2, they also show confidence intervals adjusted for clustering (since some adults brought children with them, and seropositivity is presumably correlated within households). These have wider bounds, especially for the adjusted estimate. 7/10
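The direction of that widening follows from the usual design-effect formula for clustered sampling, deff = 1 + (m − 1) × ICC. The cluster size and intraclass correlation below are hypothetical, just to show the standard error inflating once household correlation is acknowledged.

```python
import math

# Sketch of how household clustering widens a confidence interval
# (hypothetical inputs, not the study's): deff = 1 + (m - 1) * icc.
p, n = 0.015, 3330   # crude positivity; adults + children tested
m, icc = 1.2, 0.5    # hypothetical avg cluster size and within-household correlation

deff = 1 + (m - 1) * icc
se_srs = math.sqrt(p * (1 - p) / n)       # SE ignoring clustering
se_clustered = se_srs * math.sqrt(deff)   # SE inflated by the design effect

print(f"SE (SRS): {se_srs:.5f}, SE (clustered): {se_clustered:.5f}")
```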

Finally, they present several estimates adjusting for the qualities of the test (sensitivity, specificity). Notably, their internal validation returns very high specificity (99.5% and 100% by two methods). This means they anticipate few false positives. 8/10
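Specificity matters so much here because the prevalence is low. A quick back-of-the-envelope calculation (illustrative figures, not the authors' exact inputs) shows that even 99.5% specificity can generate a false-positive count comparable to the number of positives observed.

```python
# At low prevalence, a small false-positive rate can rival the signal.
# Illustrative: if specificity is 99.5%, about 0.5% of truly-negative
# samples test positive.
n_tested = 3330
specificity = 0.995

# Upper bound on expected false positives (if no one were truly positive):
expected_false_pos = (1 - specificity) * n_tested
print(f"expected false positives: {expected_false_pos:.1f}")
```

With only around 1.5% of samples testing positive, roughly 17 false positives out of ~50 raw positives would be a large share, which is why the specificity estimate deserves scrutiny.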

When they adjust for the characteristics of the test, their estimates increase to as high as 4.16% (2.58 to 5.70%). This is because the adjustment implies essentially no false positives but some false negatives (imperfect sensitivity). 9/10
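The standard correction for imperfect test characteristics is the Rogan–Gladen estimator. The input values below are illustrative, not the study's exact figures; the point is that with near-perfect specificity the correction mostly compensates for false negatives and inflates the estimate.

```python
def rogan_gladen(p_obs, sensitivity, specificity):
    """Rogan-Gladen estimator: correct an observed (apparent) prevalence
    for test sensitivity and specificity.

    true_p = (p_obs + specificity - 1) / (sensitivity + specificity - 1)
    """
    return (p_obs + specificity - 1) / (sensitivity + specificity - 1)

# Illustrative inputs only: weighted prevalence ~2.8%, modest sensitivity,
# very high specificity. High specificity subtracts little; low sensitivity
# divides by a number < 1, pushing the estimate up.
print(f"corrected prevalence: {rogan_gladen(0.028, 0.80, 0.995):.4f}")
```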

So, the major reasons why I remain skeptical:
- Unstable population weighting
- Wide bounds after adjusting for clustering
- Is test specificity really that high?
- Unavoidable potential for consent bias
- Is this consistent with other emerging serosurvey data?
Fin 10/10

Addendum: Folks are commenting that people self-selected into the survey because they had been sick and wanted confirmation. Agreed, this is what I meant by “consent bias.” This is expected with volunteer studies. How to adjust for or avoid it? Household samples are great but take longer.
