Jiliang Tang and colleagues awarded NSF grant
Abstract:
Genotype-by-environment (G×E) interactions are a major source of variation in plant phenotypes. This makes breeding for target environments exceptionally challenging because future environmental conditions are largely uncertain. To achieve fast genetic gains, breeding programs need to advance genotypes after only a few years of testing. Thus, in the early stages of breeding, many cultivars are selected without being tested under weather conditions (e.g., drought, cold, or heat stress) that may critically affect their performance. To address the longstanding problem of predicting cultivars' phenotypes under largely uncertain weather conditions, this project will develop a computer simulation platform that integrates field trial data, DNA sequences, and historical weather records into models that enable researchers and breeders to predict plant phenotypes under likely weather conditions.
An interdisciplinary team of data scientists, maize geneticists, and breeders will use reaction-norm models and deep learning, a methodology that has proven very effective at capturing complex patterns in high-dimensional data, to learn G×E patterns from DNA sequences, environmental covariates, and phenotype data from the Genomes to Fields (G2F) project, the largest publicly initiated and led G×E research initiative. The models will be used together with historical weather data to simulate and predict the performance of a diverse set of maize hybrids at many locations within the U.S. The simulated data will then be used to uncover genetic regions underlying G×E, with findings validated using data generated in trials with controlled weather conditions. With respect to broader impacts, the project will provide interdisciplinary training in data science and genomics to students and postdoctoral fellows and will leverage existing programs to provide outreach for high school students and the general public. All project outcomes will be accessible through a dedicated project database and publicly available long-term repositories. All software will be released as open source.
(Date Posted: 2021-06-04)