Page 265 - 53rd Annual Drosophila Research Conference

Poster Full Abstracts - Evolution and Quantitative Genetics

Poster board number is above title. The first author is the presenter


linearly with evolutionary distance, and CTCF binding profiles are diverging rapidly, at a rate of 2.22% per million years (Myr). At least 89 new CTCF

binding events have originated in the Drosophila melanogaster genome since the most recent common ancestor with Drosophila simulans. Comparing these

data to genome sequence data from 37 different strains of Drosophila melanogaster, we detected signatures of selection in both newly gained and

evolutionarily conserved binding sites. Newly evolved CTCF binding sites show a significantly stronger signature for positive selection than older sites.

Comparative gene expression profiling revealed that expression divergence of genes adjacent to CTCF binding sites is significantly associated with the gain

and loss of CTCF binding. Further, the birth of new genes is associated with the birth of new CTCF binding sites. Our data indicate that binding of

Drosophila CTCF protein has evolved under natural selection, and CTCF binding evolution has shaped both the evolution of gene expression and genome

evolution during the birth of new genes.

490A

Strong evidence of biased gene conversion in Drosophila melanogaster.

Matthew C. Robinson, Eric A. Stone, Nadia D. Singh. Genetics Dept, NCSU, Raleigh, NC.

Gene conversion is the non-reciprocal exchange of genetic information between homologous chromosomes during meiosis. Biased gene conversion (BGC)

reflects the favoring of certain alleles over others during this process. In particular, gene conversion appears to be GC-biased across a wide variety of

eukaryotic genomes. Preliminary evidence in Drosophila from a subset of coding and noncoding loci is consistent with BGC, but it remains unknown

whether BGC is a general feature of Drosophila genome evolution. Here we systematically test for BGC at a genomic scale in D. melanogaster using newly

available population genomic data. We take advantage of the Drosophila Genetic Reference Panel, a set of 162 fully sequenced lines derived from a natural

population from Raleigh, North Carolina. We test for BGC in four types of sequences: short introns, long introns, intergenic regions, and four-fold

degenerate synonymous sites. In addition, we explore the effects of genomic context including local recombination rate, GC-content, and chromosome, on

the degree of GC-bias in gene conversion. To test whether patterns of polymorphism are consistent with BGC, we examined the site frequency spectra of

biallelic single nucleotide polymorphisms (SNPs) (polarized to D. simulans) from different sequence types and from varying genomic contexts. BGC should

result in an excess of high-frequency AT->GC polymorphisms relative to neutral expectation; the degree of this skew indicates the magnitude of the BGC.

We quantify the skew of the right-hand tail of the site frequency spectrum using a summary statistic “Q” and test for differences in the strength of BGC by

comparing the Q statistic among our four sequence types and across genomic contexts. Our results are consistent with pervasive BGC in D. melanogaster.

The degree of bias appears exaggerated on the X chromosome relative to the autosomes, as well as in regions of high recombination versus low

recombination. BGC is thus likely to be a significant contributor to genome evolution in D. melanogaster.
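
The site-frequency-spectrum comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not define the “Q” statistic, so the code below uses a stand-in measure (the fraction of SNPs whose derived allele is at high frequency) that is an assumption made for illustration and should not be read as the authors' statistic. The function names, the input layout (ancestral base, derived base, derived-allele count), and the 0.8 frequency threshold are likewise hypothetical.

```python
WEAK = {"A", "T"}    # bases forming weak (two hydrogen bond) pairs
STRONG = {"G", "C"}  # bases forming strong (three hydrogen bond) pairs


def mutation_class(ancestral, derived):
    """Classify a polarized SNP by the direction of the change."""
    if ancestral in WEAK and derived in STRONG:
        return "AT->GC"
    if ancestral in STRONG and derived in WEAK:
        return "GC->AT"
    return "other"  # A<->T or G<->C changes do not alter GC content


def unfolded_sfs(snps, n):
    """Build one unfolded site frequency spectrum per mutation class.

    snps: iterable of (ancestral_base, derived_base, derived_count)
    n:    number of sampled chromosomes
    Entry i-1 of each spectrum counts SNPs whose derived allele is seen
    i times in the sample (1 <= i <= n-1).
    """
    spectra = {name: [0] * (n - 1) for name in ("AT->GC", "GC->AT", "other")}
    for ancestral, derived, count in snps:
        if 0 < count < n:  # skip sites that are not segregating
            spectra[mutation_class(ancestral, derived)][count - 1] += 1
    return spectra


def high_frequency_fraction(sfs, n, threshold=0.8):
    """Hypothetical skew measure (NOT the abstract's Q statistic):
    the fraction of SNPs with derived-allele frequency >= threshold."""
    total = sum(sfs)
    if total == 0:
        return float("nan")
    high = sum(c for i, c in enumerate(sfs, start=1) if i / n >= threshold)
    return high / total
```

Under GC-biased gene conversion, one would expect high_frequency_fraction to be larger for the AT->GC spectrum than for the GC->AT spectrum computed from the same sequence class.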

491B

Mutation accumulation reveals a large duplication bias and substantial variation in substitution rates in Drosophila melanogaster.

Daniel R. Schrider (1,2), Michael Lynch (1), David Houle (3), Matthew W. Hahn (1,2). 1) Department of Biology, Indiana University, Bloomington, IN; 2) School of Informatics

and Computing, Indiana University, Bloomington, IN; 3) Department of Biological Science, Florida State University, Tallahassee, FL.

Because all genetic variation on which natural selection operates originates via spontaneous mutation, the rates at which mutations appear in natural

populations have important evolutionary consequences. Genome-wide mutation accumulation (MA) experiments are a powerful method for estimating

mutation rates, and can therefore improve our understanding of the rate at which adaptive alleles arise and illuminate how natural selection shapes variation

within and among species. Unfortunately, the small number of generations captured by most MA experiments limits their statistical power. In addition, most

of these studies used sequencing methods that do not allow for comprehensive detection of large genomic duplications and deletions that result in genomic

copy number variants (CNVs). We present results from an MA experiment in Drosophila that does not suffer from these shortcomings. More generations are

captured in this experiment (~1160) than in all other eukaryotic MA studies combined. This wealth of data reveals >2-fold variation in substitution rates

between MA lines derived from different isofemale ancestors, suggesting that mutation rates can vary greatly among individuals within natural populations,

and that the mutation rate of a species cannot be represented by a single estimate. We also present the first accurate estimates of the rates of large

duplications and deletions. We confirm the previously observed bias of small deletions over small insertions; however, at scales larger than ~150 bp, we find

that the rate of duplication is much higher than the rate of deletion, reversing the current view that Drosophila exhibits a deletion bias. While this result

implies that mutational forces alone would cause the Drosophila genome to grow rapidly, we show that selection against large duplications has prevented

this growth from occurring.
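
As a rough illustration of the quantities discussed in this abstract, a per-line mutation rate can be computed as the number of observed mutations divided by the product of callable sites and generations, and rate heterogeneity can be summarized as the ratio of the highest to the lowest rate. This is a minimal sketch; the function names are hypothetical, and the abstract does not describe the estimation procedure the authors actually used.

```python
def per_site_rate(n_mutations, n_callable_sites, n_generations):
    """Mutations per site per generation for a single MA line."""
    if n_callable_sites <= 0 or n_generations <= 0:
        raise ValueError("sites and generations must be positive")
    return n_mutations / (n_callable_sites * n_generations)


def fold_variation(rates):
    """Ratio of the highest to the lowest rate among MA lines."""
    rates = list(rates)
    if not rates or min(rates) <= 0:
        raise ValueError("need at least one strictly positive rate")
    return max(rates) / min(rates)


def duplication_to_deletion_ratio(n_duplications, n_deletions):
    """Values above 1 indicate a duplication bias, below 1 a deletion bias."""
    if n_deletions <= 0:
        raise ValueError("need at least one observed deletion")
    return n_duplications / n_deletions
```

A fold_variation above 2 among lines from different ancestors would correspond to the ">2-fold variation" reported here, although a formal comparison would also need to account for sampling error in each line's mutation count.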

492C

Classifying the evolutionary causes of nucleotide fixation.

Alexander Shanku (1), Andrew Kern (2). 1) BioMaPS Institute, Rutgers University, Piscataway, NJ; 2) Department of Genetics, Rutgers University, Piscataway, NJ.

A long-term goal of population genetics has been to determine to what extent natural selection impacts patterns of genomic variation within and between

populations. In an attempt to localize the effect of selection within genomes, much attention has been paid to detecting the tell-tale signatures of

selective sweeps, whereby a newly arising beneficial mutation rapidly increases in frequency to fixation. Such efforts have classically used population

genetic summary statistics (Tajima 1983, Fu and Li 1993, Fay and Wu 2000) or, more recently, likelihood techniques (Kim and Stephan 2002, Kim and

Nielsen 2004, Nielsen et al. 2005). Formally there are at least three possible routes (in the evolutionary sense) by which a novel mutation may fix: 1) it may

be a neutral mutation, or nearly so, and drift to fixation, 2) it may be an unconditionally beneficial mutation and rapidly sweep to fixation (hard sweep) or 3)

it may be initially neutral, or nearly so, but at some later point in time (e.g. after an environmental change) become beneficial and then fix rapidly (soft

sweep). Here we describe the use of supervised machine learning algorithms for the classification of nucleotide fixations into these three classes based on

combinations of population genetic summary statistics. We examine the efficacy of three classes of algorithms (logistic regression, support vector machines, and

relevance vector machines), first by training and testing these algorithms on simulated data generated from the coalescent, then by applying these techniques to

first-generation DPGP data. In training our classifiers by integrating over model parameters and altering demographic histories, we demonstrate that we have

power to classify hard sweeps vs. neutral fixations at 90% accuracy and hard vs. soft sweeps at >80% accuracy out to 0.4 units of time since the sweep fixed in the

population (time measured in units of 2N generations, where N is the population size). This work represents a novel application of supervised classifiers in population genetics and a

potential new tool in searching for selection across the genome.
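
The abstract does not list which summary statistics were supplied to the classifiers. As a minimal sketch of how a feature vector of this kind might be assembled, the code below computes a few standard statistics from an unfolded site frequency spectrum: the number of segregating sites, Watterson's estimator, nucleotide diversity (Tajima 1983), and Fay and Wu's H (Fay and Wu 2000). The choice of features and the function name are assumptions made for illustration.

```python
from math import comb


def sfs_summary_statistics(sfs, n):
    """Summary statistics from an unfolded site frequency spectrum.

    sfs: sequence of length n-1; sfs[i-1] is the number of sites at which
         the derived allele appears i times among n sampled chromosomes.
    """
    if n < 2 or len(sfs) != n - 1:
        raise ValueError("sfs must have exactly n-1 entries")
    pairs = comb(n, 2)                        # number of pairwise comparisons
    a_n = sum(1.0 / i for i in range(1, n))   # harmonic number used by Watterson's estimator
    segregating = sum(sfs)
    theta_w = segregating / a_n
    # Mean pairwise differences: a site with derived count i differs in i*(n-i) pairs.
    pi = sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / pairs
    # Fay and Wu's theta_H weights high-frequency derived alleles heavily.
    theta_h = sum(i * i * x for i, x in enumerate(sfs, start=1)) / pairs
    return {
        "S": segregating,
        "theta_w": theta_w,
        "pi": pi,
        "theta_h": theta_h,
        "fay_wu_h": pi - theta_h,
    }
```

Under the standard neutral expectation, where the number of sites with derived count i is proportional to 1/i, the three theta estimators agree and H is zero. A recent sweep tends to leave an excess of high-frequency derived alleles and hence a negative H, which is the kind of contrast a supervised classifier could exploit.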