UGLI (UMCG Genetics Lifelines Initiative) is one of Lifelines' additional assessments. UGLI aims to facilitate and accelerate the generation and analysis of genetic data, and thereby the scientific output based on the Lifelines genomics data.
Genome-wide association study (GWAS) data are highly valuable for biobanks such as Lifelines for identifying disease/trait associations, predicting future disease development and personalizing treatment.
To facilitate the generation, analysis and study of genetic data in Lifelines, the UGLI consortium was founded. UGLI brings together many groups and PIs within the UMCG, RUG and beyond that are interested in performing such research with Lifelines data. Together they raised the funding for the initial genotyping of a total of 38,030 Lifelines participants, including children, as part of the HUGE consortium in Rotterdam on the Infinium Global Screening Array® (GSA) MultiEthnic Disease Version 1.0. Together with 15,400 samples already genotyped on the Cytochip (GWAS), this yields two quality-controlled GWAS datasets with a combined sample size of n~50,000 subjects. These datasets will be made available to members of the UGLI consortium based on specific proposals approved by the UGLI steering committee and by Lifelines.
The UGLI consortium is actively raising funding for the genotyping of additional samples. These additionally genotyped samples are generally referred to as UGLI2. With additional funding from new UGLI members, the consortium will increase the number of genotyped Lifelines participants. These efforts will make Lifelines a more interesting partner for national and international collaborations, and for non-academic partners that work on healthy ageing.
38,030 Lifelines participants were selected for UGLI1 using the following criteria:
The genotypes of 38,030 participants were assessed using the Infinium Global Screening Array® (GSA) MultiEthnic Disease Version 1.0. All genotyped samples were included in the QC screening, and the QC of genetic markers focused on the autosomes and the X chromosome (N=691,072 markers). A final set of 36,339 samples and 548,029 markers on the autosomes and the X chromosome passed the QC steps described in qc_report_ugli1_release_2_-v1.pdf.
UGLI1 - GSA cohort - samples that passed QC | |
---|---|
Subgroup | N |
Total | 36,339 |
Male | 15,098 |
Female | 21,241 |
Age* 8-17 | 3,522* |
Age* 18-64 | 30,416 |
Age* >64 | 2,401 |
Table 1: UGLI1 - GSA cohort information. These are samples that passed QC. Age is the age at the first visit of the Baseline assessment. *One participant did not visit during Baseline, but did visit during the 2nd screening. Since this participant was under 18 years of age at visit 1 of the 2nd screening, this participant has been added to the children 8-17 group.
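The QC figures above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it copies the counts from the text and Table 1, confirms that the sex and age breakdowns both add up to the total, and computes the share of samples and markers that passed QC. The variable and function names are made up for this example.

```python
# Counts copied from the text and Table 1 (UGLI1, GSA array).
genotyped_samples = 38_030
passed_samples = 36_339
markers_before_qc = 691_072
markers_after_qc = 548_029

by_sex = {"Male": 15_098, "Female": 21_241}
by_age = {"8-17": 3_522, "18-64": 30_416, ">64": 2_401}


def retention(kept: int, total: int) -> float:
    """Return the percentage of items that passed QC."""
    return 100.0 * kept / total


# Both breakdowns in Table 1 should add up to the reported total.
assert sum(by_sex.values()) == passed_samples
assert sum(by_age.values()) == passed_samples

sample_retention = retention(passed_samples, genotyped_samples)
marker_retention = retention(markers_after_qc, markers_before_qc)

print(f"Samples passing QC: {sample_retention:.1f}%")
print(f"Markers passing QC: {marker_retention:.1f}%")
```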
A UGLI1 - GSA (release 2.0) Quality Control Report is available. It describes in detail the steps taken during the quality control (QC) of the first release of UGLI, which comprises the genotypes of 38,030 participants assessed using the Infinium Global Screening Array® (GSA) MultiEthnic Disease Version 1.0: qc_report_ugli1_release_2_-v1.pdf
A final set of 36,339 samples and 548,029 markers on autosomal and X chromosomes passing all QC steps described in qc_report_ugli1_release_2_-v1.pdf were used for genetic imputation. Genetic imputation was done through the Sanger imputation service using the Haplotype Reference Consortium (http://www.haplotype-reference-consortium.org) panel. The dataset was formatted following the instructions from the Sanger webpage (https://www.sanger.ac.uk/science/tools/sanger-imputation-service).
Raw intensity data from the GSA will be made available to the researchers.
As of March 2023, data of an additional 28,149 genotyped participants has been made available. Samples in this release, called UGLI2, were genotyped using the FinnGen Thermo Fisher Axiom® custom array.
29,366 participants were selected for the UGLI2 release and genotyped using the aforementioned array. All genotypes were included in the QC screening, but the QC focused on the autosomes and the X chromosome, for which N=617,715 and N=22,346 markers are available, respectively. A final set of 28,149 samples, 441,596 autosomal markers and 18,450 X-chromosome markers passed the QC steps described in qc_report_ugli2_release_1_-v1.pdf.
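As a rough illustration of what these counts imply, the sketch below derives how many samples and markers were removed during the UGLI2 QC and the percentage retained for each category. The numbers are copied from the paragraph above; the dictionary and its keys are made up for this example.

```python
# (passed QC, entered QC) pairs copied from the paragraph above (UGLI2).
ugli2_counts = {
    "samples": (28_149, 29_366),
    "autosomal markers": (441_596, 617_715),
    "X-chromosome markers": (18_450, 22_346),
}

ugli2_summary = {}
for name, (passed, entered) in ugli2_counts.items():
    ugli2_summary[name] = {
        "removed": entered - passed,
        "retained_pct": 100.0 * passed / entered,
    }

for name, stats in ugli2_summary.items():
    print(f"{name}: {stats['removed']:,} removed, "
          f"{stats['retained_pct']:.1f}% retained")
```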
UGLI2 - Affymetrix cohort - samples that passed QC | |
---|---|
Subgroup | N |
Total | 28,149 |
Male | TBA |
Female | TBA |
Age* 8-17 | TBA |
Age* 18-64 | TBA |
Age* >64 | TBA |
Table 3: UGLI2 - Affymetrix cohort information. These are samples that passed QC.
Please note that the array used for UGLI2 differs from the one used for UGLI1. The overlap in SNPs between the two arrays (the Illumina GSA chip for UGLI1 and the Affymetrix/Thermo Fisher FinnGen array for UGLI2) is small: between 1,000 and 10,000 SNPs.
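Researchers who want to see which directly genotyped variants the two releases share could compare the marker lists themselves. The sketch below is one possible approach, not the procedure used by UGLI. It assumes the markers are available as PLINK .bim files (six whitespace-separated columns: chromosome, variant ID, genetic distance, base-pair position, allele 1, allele 2) and that both files use the same genome build. Variants are matched on chromosome and position rather than on ID, because the two vendors may name the same variant differently. The example lines and variant IDs are invented.

```python
from typing import Iterable, Set, Tuple

Variant = Tuple[str, int]  # (chromosome, base-pair position)


def read_bim_positions(lines: Iterable[str]) -> Set[Variant]:
    """Collect (chromosome, position) pairs from PLINK .bim lines."""
    positions: Set[Variant] = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, _variant_id, _genetic_distance, bp = fields[:4]
        positions.add((chrom, int(bp)))
    return positions


def shared_variants(bim_a: Iterable[str], bim_b: Iterable[str]) -> Set[Variant]:
    """Return the positions present in both marker lists."""
    return read_bim_positions(bim_a) & read_bim_positions(bim_b)


# Invented example lines standing in for the two arrays.
ugli1_bim = [
    "1 rs0001 0 1000 A G",
    "1 rs0002 0 2000 C T",
    "X rs0003 0 3000 G A",
]
ugli2_bim = [
    "1 AX-0001 0 2000 C T",
    "X AX-0002 0 3000 G A",
    "2 AX-0003 0 4000 T C",
]

overlap = shared_variants(ugli1_bim, ugli2_bim)
print(len(overlap), sorted(overlap))
```

With real data, open file handles can be passed instead of lists, for example `shared_variants(open("ugli1.bim"), open("ugli2.bim"))`, where both file names are hypothetical. Matching on position alone ignores allele and strand differences, so the result is an upper bound on the usable overlap.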
A UGLI2 - Affymetrix (release 2.0) Quality Control Report is available. It describes in detail the steps taken during the quality control (QC) of the second release of UGLI, which comprises the genotypes of 29,366 participants assessed using the FinnGen Thermo Fisher Axiom® custom array: qc_report_ugli2_release_1_-v1.pdf
A final set of 28,149 samples and 460,136 markers on autosomal and X chromosomes passing all QC steps described above were used for genetic imputation. Genetic imputation was done through the Sanger imputation service using the Haplotype Reference Consortium (http://www.haplotype-reference-consortium.org) panel.
Raw intensity data from the FinnGen Thermo Fisher Axiom® custom array will be made available to the researchers.
TBA
Study name | N in UGLI1 | N in UGLI2 |
---|---|---|
DEEP (DAG1) | ~500 | |
DAG3 | ~9000 | |
GoNL | 143 | |
GWAS4 | 938 | |
Table 2: A number of participants in UGLI1 also participated in other studies, namely DAG1, DAG3, DAG2/GoNL and GWAS4. The second column gives the number of samples that overlap between each study and UGLI1. For DAG1 and DAG3 these are approximations.
UGLI data is available on the HPC (Linux environment) of the UMCG. The data will not be accessible through the Lifelines workspace. The applicant’s proposal will be reviewed by both Lifelines and the UGLI steering committee (UGLI SC).
Requesting UGLI data: The applicant applies via the regular Lifelines application procedure. This means the applicant submits the proposal together with the dataset order using our online catalogue (https://data-catalogue.lifelines.nl/). UGLI data cannot be selected through the online catalogue. Instead, the applicant requests UGLI data by stating this in the application form (Appendix: Request for Source Data (Not in catalogue)).
GWAS | Genome Wide Association Study |
UGLI | UMCG Genetics Lifelines Initiative |
UGLI SC | UGLI steering committee |
GSA | Global Screening Array |
SNP | Single-nucleotide polymorphism |
HW | Hardy-Weinberg Equilibrium |
WGS | Whole Genome Sequencing |
MAF | Minor allele frequency |
PCA | Principal Component Analysis |
HPC | High Performance Computing |
PLINK | Open-source command-line toolset for whole-genome association analysis, written in C/C++ |