PyCon Israel 2018

Monday 10:20 a.m.–11 a.m. in PyData

A tale of DNA, Numpy, and 3 types of Jews

Daniel Levy, Luis Voloch

Audience level:
Novice

Abstract

We all know that our DNA is what makes us us. So if we have our DNA data, what can it tell us about ourselves? Our presentation will focus on one of the problems that we’ve been working on at MyHeritage: telling you, based only on your DNA, what your ethnic background is.

In the talk we’ll discuss how to treat this question as a machine learning problem (using standard tools like scikit-learn and NumPy for a distinctive set of problems), what sets it apart from classical classification problems, and our approach to answering it using thousands of models, not too many samples, but way too many features.
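
To give a feel for the "few samples, way too many features" setting, here is a minimal sketch in scikit-learn and NumPy. It is not the method presented in the talk: the data is synthetic, the three groups are made up, and the choice of univariate feature selection followed by logistic regression is only an assumption picked for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Far more features than samples (all sizes are illustrative assumptions).
n_samples, n_features, n_groups = 90, 5000, 3

# Hypothetical labels: each sample belongs to one of three made-up groups.
y = np.repeat(np.arange(n_groups), n_samples // n_groups)

# Synthetic genotypes: a count (0, 1, or 2) of a variant allele per position.
# All groups share the same allele frequencies except at the first 50
# positions, which are the only informative features.
freq = np.tile(rng.uniform(0.1, 0.9, size=n_features), (n_groups, 1))
freq[:, :50] = rng.uniform(0.05, 0.95, size=(n_groups, 50))
X = rng.binomial(2, freq[y])

# Feature selection sits inside the pipeline so that it is refit on each
# training fold and never sees the held-out samples.
model = make_pipeline(
    SelectKBest(f_classif, k=100),
    LogisticRegression(C=0.1, max_iter=1000),
)

scores = cross_val_score(model, X, y, cv=5)
print("cross-validated accuracy per fold:", scores)
```

Keeping the selection step inside the pipeline matters in this regime: with thousands of features and only dozens of samples, selecting features on the full data set before cross-validating would leak information and inflate the accuracy estimate.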

We will also talk about how Jewish genealogy introduces additional difficulties.

This is joint work by Professor Yaniv Erlich, Dr Daniel Levy, and Luis Voloch.