Using SNP profiles of more than 10,000 individuals from the Alzheimer's Disease Sequencing Project (ADSP), we have developed a computational framework for predicting the likelihood that a person will develop Alzheimer's Disease (AD). A key feature of this framework is a neural network that has been trained to classify individuals as AD patients or non-AD controls with high accuracy. Importantly, these predictions were made on individuals never seen by the classifier during training, suggesting that high-accuracy diagnoses could transfer to the general population. Moreover, only a few hundred genomic loci are needed, and the learning algorithms have already identified them. The neural net outputs a ‘confidence’ level for each prediction; for individuals with high-confidence predictions, the classifier is over 90% accurate. Because the network weights have already been trained and only a relatively small number of key variant loci are needed, this system could aid in clinical diagnostics, and as new genomes and their clinical status are added, its performance should continue to improve over time.
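To make the "high-confidence" accuracy figure concrete, here is a minimal sketch of how accuracy restricted to confident predictions could be computed on held-out individuals. It is an illustration, not the project's actual code: the function name `accuracy_by_confidence`, the 0.9 threshold, and the definition of confidence as `max(p, 1 - p)` for a single predicted AD probability `p` are all assumptions, and the toy numbers are made up and are not ADSP results.

```python
def accuracy_by_confidence(probs, labels, threshold=0.9):
    """Accuracy among predictions whose confidence meets a threshold.

    probs  : predicted probability of AD (class 1) for each held-out individual
    labels : true status for each individual (1 = AD, 0 = non-AD control)

    Assumption: confidence is max(p, 1 - p); the original text does not
    specify how the network's confidence level is defined.
    Returns (accuracy on the kept predictions, fraction of individuals kept).
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")

    kept = 0
    correct = 0
    for p, y in zip(probs, labels):
        confidence = max(p, 1.0 - p)
        if confidence < threshold:
            continue  # low-confidence prediction: excluded from the tally
        kept += 1
        predicted = 1 if p >= 0.5 else 0
        correct += int(predicted == y)

    coverage = kept / len(probs) if probs else 0.0
    accuracy = correct / kept if kept else float("nan")
    return accuracy, coverage


# Made-up toy values for illustration only (not ADSP data).
toy_probs = [0.97, 0.08, 0.55, 0.91, 0.40, 0.03]
toy_labels = [1, 0, 0, 0, 1, 0]

high_conf_accuracy, coverage = accuracy_by_confidence(toy_probs, toy_labels)
print(f"accuracy={high_conf_accuracy:.2f} coverage={coverage:.2f}")
# prints: accuracy=0.75 coverage=0.67
```

Reporting coverage alongside accuracy matters here: a high accuracy on confident predictions is only useful clinically if a reasonable fraction of individuals actually receives a confident prediction.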
Here I provide a step-by-step walkthrough of the analysis, working toward a platform for Alzheimer's Disease diagnosis based on machine learning techniques. Some entry pages: Intro, More Neural Nets, PCA, t-SNE.