The Fundamental Flaw in the 23andMe Methodology

Theo Pavlidis, ©2018

Last May I received an e-mail from 23andMe that pointed to several reports, including one that claimed that I am a "morning person".

That was quite a surprise because all my life I have had trouble getting up early in the morning. I recall that when I had finished taking courses in graduate school, I would get up around 10 am or even 11 am and have breakfast while others had lunch. Now that I am over 80 years old I am getting up even later. How could 23andMe have reached such an erroneous conclusion?

One possible explanation is insufficient data. This is illustrated below. (For simplicity, I assume a single scalar variable used in the classification, but the conclusions are equally valid in a high-dimensional space. Such misclassifications are also more likely when the separating hyper-surface is not planar.) When many samples are available, I am classified as a night person. When only a few samples are available (those pointed to by the arrows), I am classified as a day person.
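The same effect can be reproduced with a toy calculation. The sketch below is not 23andMe's method: it assumes a simple classifier that places the decision threshold midway between the means of the two classes, and every sample value, as well as the names `midpoint_threshold` and `classify`, is made up for illustration.

```python
from statistics import mean


def midpoint_threshold(samples):
    """Return the point midway between the mean of the 'day' samples
    and the mean of the 'night' samples.

    samples is a list of (value, label) pairs on a single scalar variable.
    """
    day = [x for x, label in samples if label == "day"]
    night = [x for x, label in samples if label == "night"]
    return (mean(day) + mean(night)) / 2


def classify(x, samples):
    """Label x as 'night' if it lies above the threshold, else 'day'."""
    return "night" if x > midpoint_threshold(samples) else "day"


# Hypothetical data: larger values mean a later natural wake-up time.
many = [(x, "day") for x in (1, 2, 3, 4, 5)] + \
       [(x, "night") for x in (6, 7, 8, 9, 10)]

# A sparse subset of the same data: one sample from each class.
few = [(5, "day"), (10, "night")]

me = 6.5  # hypothetical position of the person being classified

print(classify(me, many))  # threshold is 5.5, so: night
print(classify(me, few))   # threshold is 7.5, so: day
```

With all ten samples the threshold falls at 5.5 and the point at 6.5 lands on the night side. With only the two sparse samples the threshold moves to 7.5 and the same point is labeled a day person, even though nothing about the person has changed.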

23andMe treats the set of all its customers as representative of the world population, clearly an incorrect assumption.

An even more striking example of the failure of such a strategy is shown next. (The table below is a "screen dump" where I have erased the labels in my browser.)

What does this table mean? It shows the counter-intuitive result that I have many more relatives in the United States than in Greece. (Although 23andMe classifies me as mainly Italian, Italy does not appear in the table.) The countries (other than Greece) in the table have had significant immigration from Greece. The most obvious explanation is that Greek-Americans or Greek-Canadians are far more likely to be customers of 23andMe than Greeks living in Greece.
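The arithmetic behind this explanation is simple enough to sketch. The numbers below are entirely hypothetical (they are not taken from the table or from any 23andMe data), and `expected_matches` is an illustrative helper: the count of relatives a customer sees in a country is roughly the true number of relatives there multiplied by the fraction of that country's residents who are customers.

```python
# All figures are hypothetical and chosen only to illustrate the effect.
true_relatives = {"Greece": 1000, "United States": 200}

# Assumed number of customers per 1000 residents in each country.
customers_per_thousand = {"Greece": 1, "United States": 50}


def expected_matches(relatives, rate_per_thousand):
    """Expected number of relatives who show up as matches, per country."""
    return {
        country: relatives[country] * rate_per_thousand[country] / 1000
        for country in relatives
    }


observed = expected_matches(true_relatives, customers_per_thousand)
for country, count in observed.items():
    print(f"{country}: {true_relatives[country]} true relatives, "
          f"{count:g} expected matches")
```

Under these assumed rates, five times as many true relatives live in Greece, yet the United States shows ten times as many matches. The ranking in the table then reflects who buys the test rather than where the relatives are.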

23andMe offers a valuable service that, unfortunately, is too often marred by erroneous application of statistics. It would be in everyone's best interest if they limited their conclusions to cases where their set of customers is truly representative of the world population.