Functional Annotation of ANimal Genomes (FAANG) Project
A coordinated international action to accelerate Genome to Phenome
1. Action plan for the international community: current financial support and plans for garnering additional funding
The EU-US Animal Biotechnology Working Group (ABWG) promotes trans-national discussions on the need for coordinated functional annotation of the genomes of domesticated animals. It has identified the improvement of animal reference genome sequences and the comprehensive annotation of their functional elements and variants as priorities for understanding the link between genotype and phenotype. Most recently, the ABWG convened a workshop of more than 170 scientists in San Diego to promote the drafting of a White Paper and to establish an international, coordinated programme of research to identify and annotate the functional elements of farm animal genomes. We propose an integrated set of actions to turn this vision into a reality for animal genomics research.
Financial support is currently available through the USDA (US) and several European countries (France, UK and NL) to fund the early creation of datasets that will demonstrate the value of ENCODE-type data for animal genomics. These early projects (FR-AgENCODE, WUR-pigENCODE and Farm Animal Encode) have just been launched and are being developed in close collaboration with the ABWG and current FAANG members. BBSRC (UK) funding provides support for the bioinformatics and infrastructure aspects of functional annotation in farmed animals. For example, BBSRC have recently confirmed funding to The Roslin Institute and the European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI) to support the ‘Ensembl genome portal for farm and companion animals’ until summer 2019. In addition, we will seek the funding that is critical for coordinating data creation and analysis. Important early goals of these efforts will be to build support for this project within the broader animal genome community and to generate enthusiasm for FAANG within funding agencies. During this stage we will further develop project milestones for the coordination and implementation of FAANG. All these projects will require significant new funding from interested governmental agencies such as USDA, NSF, NIH and DOD in the US, and EC (H2020), ANR (French National Research Agency), BBSRC and several others in EU countries.
Institutions in 12 European countries (France, Denmark, Germany, Greece, Ireland, Italy, Norway, Portugal, Slovenia, Spain, The Netherlands, UK) and in Australia, New Zealand and the USA will submit a proposal for a Trans-Domain COST Action for “A collaborative network to facilitate the functional annotation of domesticated animals - FUNCAGEN”. If awarded, this Action will support networking and coordination functions.
A consortium led by The Roslin Institute together with The Genome Analysis Centre (TGAC) and EMBL-EBI has applied for two major grants from the BBSRC: i) for the capital to establish the necessary data infrastructure and staff to develop the necessary data management systems and ii) to generate and analyse a substantial body of experimental data for functional annotation. An announcement about the infrastructure grant will be made in February 2015. The review of the large data generation proposal should be completed by summer 2015.
Several additional countries, including Korea and Canada, have indicated a willingness to consider funding research on the functional annotation of animal genomes. The US chicken, cattle and swine coordinators for the USDA NRSP-8 project have also committed support for the FAANG project through the above-mentioned USDA-funded project to work in these three species. NRSP-8 representatives have also begun organizing a meeting with USDA-ARS representatives to initiate a coordinated educational and fund-raising effort within the USA.
2. Biological targets and resources
The first phase of the FAANG project will focus on sampling biological replicates representing a limited number of specific biological states to maximize comparisons across species. Where possible, animals with minimal genetic diversity within a species will be sampled. FAANG members are committed to collecting, storing and sharing tissues, both for initial data collection and as a reserve for future assays. Like ENCODE and modENCODE in their recent phases, FAANG will focus mostly on tissue samples.
3. Choice of assays for identifying genomic elements
Key genomic elements will be studied by applying a set of well-established, highly informative assays to all samples to create the basal reference catalogue. These assays will be performed using common protocols and standard operating procedures so that the data generated can be shared effectively and subjected to meta-analyses, increasing experimental power. Both the ENCODE consortium and the International Human Epigenome Consortium (IHEC) have defined experimental and metadata protocols. We will use these standards as the starting point to define appropriate operating procedures for this project, adapting them where necessary to reflect the complexities of animal breeds and the different tissues available for animal-based experiments.
3.1 Core Assays
RNA-seq experiments provide exhaustive catalogues of gene expression of the whole sample (primary cells or tissues).
Chromatin Accessibility and Architecture:
DNase I footprinting and ATAC-seq are being considered.
The following four histone modification marks are being considered:
3.2 Additional Assays Being Considered
4. Development of a common data infrastructure
Effective coordination, data management and robust quality control are essential to converting data generated across multiple laboratories into knowledge. The FAANG consortium will promote standardization of experimental protocols and procedures in computational analysis. A FAANG Data Coordination Centre (DCC) and a Data Analysis Centre (DAC) will be established to ensure high quality and standardized data generation, analysis and accessibility of the data to the wider community.
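As an illustration of the kind of automated check a Data Coordination Centre might run on incoming submissions, the following is a minimal sketch in Python. It is not a FAANG specification: the field names (sample_id, species, tissue, assay, protocol_id), the list of accepted assay labels and the function validate_record are all hypothetical and were chosen only for this example.

```python
# Hypothetical required fields and assay labels, for illustration only;
# the actual FAANG metadata standard is not defined in this document.
REQUIRED_FIELDS = ("sample_id", "species", "tissue", "assay", "protocol_id")
ALLOWED_ASSAYS = {"RNA-seq", "DNase I", "ATAC-seq", "histone ChIP-seq"}


def validate_record(raw: dict) -> list:
    """Return a list of problems found in one submitted metadata record."""
    problems = []
    # Every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if not str(raw.get(name, "")).strip():
            problems.append(f"missing field: {name}")
    # The assay label, if given, must come from the shared vocabulary.
    assay = raw.get("assay")
    if assay and assay not in ALLOWED_ASSAYS:
        problems.append(f"unknown assay: {assay}")
    return problems


if __name__ == "__main__":
    record = {
        "sample_id": "S1",
        "species": "Sus scrofa",
        "tissue": "liver",
        "assay": "RNA-seq",
        "protocol_id": "P1",
    }
    print(validate_record(record))
```

Rejecting or flagging records at submission time, rather than during analysis, is one way a shared vocabulary could keep datasets from different laboratories comparable.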