Electronic health records-driven phenotyping: challenges, recent advances, and perspectives (paper)
Abstract
With the completion of the Human Genome Project [1] and recent advances in genomic science and comparative biological studies, a new era of individualized medicine is emerging in which novel biomedical discoveries are leading to more effective prevention, treatment, and diagnosis of disease. Although altered phenotypes are among the most reliable manifestations of altered gene function, research on extracting, representing, and analyzing phenotype–genotype relationships is still evolving. This has led to the emergence of a transdisciplinary field, called 'Phenomics' [2], that aims to capitalize on high-throughput computation and informatics technologies for the systematic study of phenotypes and how they might influence personal genomics [3]. Several recent comparative phenomics studies [4,5] have demonstrated the power of positively correlating phenotypes with several measures of gene function. Despite these advances, research in phenomics faces several challenges, including (i) developing approaches for high-throughput extraction and representation of phenotypes, (ii) building techniques for storing, integrating, and querying phenotype data, and (iii) advancing phenotype-driven analysis to derive phenotype–genotype associations. A significant barrier to the discovery of new genetic variants is the large sample size needed for an effective study (since variants may be rare within a population), which leads to time-consuming and onerous sample collection efforts. Electronic health records (EHRs) can accelerate clinical research and genomic medicine, but their use is hindered by the limited number of validated processes and tools that enable accurate and rapid phenotype extraction [6]. EHRs are increasing in ubiquity, functionality, and comprehensiveness across the USA, in part due to the Meaningful Use standards [7] implemented as part of the Health Information Technology for Economic and Clinical Health (HITECH) Act.
One recent advance has been the coupling of DNA biorepositories to EHR data [8–13], combined with advances in informatics techniques, …

Correspondence to Dr Jyotishman Pathak, Department of Biomedical Statistics and Informatics, Mayo Clinic, 200 1st Street SW, Rochester, MN 55905, USA; pathak.jyotishman{at}mayo.edu