Multi-Ancestry Analysis
Online Multi-Ancestry PRS Training

Quick-Start Working Example
In this section, we provide a working example of the PennPRS multi-ancestry data analysis pipeline. More detailed instructions can be found in the section below.
Suppose a user wants to train PRS models for human height in four populations: AFR, AMR, EAS, and EUR. The user can navigate to our GWAS Queryable Database and search for 'Height':

Four studies can be used for this multi-ancestry PRS training: GCST90095033 (AMR, 59,771 subjects), GCST90018739 (EAS, 165,056 subjects), GCST90029008 (EUR, 673,878 subjects), and GCST90013468 (AFR, 25,369 subjects). Users can view more details about each study by clicking its 'View Study' link, which redirects to the corresponding GWAS Catalog page. Here is an example for GCST90013468.

To begin PRS training using these datasets from the GWAS Catalog, the user can navigate to our multi-ancestry analysis page.

Enter '4' as the number of ancestries.

Next, enter the relevant information for each of the four ancestries, one at a time. This information can be obtained from the corresponding GWAS Catalog page. Then click 'Save & Continue'.

Next, the user selects the PRS training method. PennPRS currently supports the PROSPER-pseudo method. We strongly recommend using the default settings for this method, although users can modify them if needed.

Next, the user names the job. We recommend enabling email notifications and double-checking the job details before submission. Then click 'Submit'.

The user will then be directed to a page confirming successful job submission. If email notifications are enabled, the user will also receive updates on the job status.

A typical job takes approximately 5 to 20 hours to complete, depending on server load as well as the nature of the datasets selected. The user will receive an email notification once the job is finished. If no errors occur, the user can then download the trained PRS models (for four ancestries in this case) and the log file in a zip file from the 'Job Center'.


If the job fails, the user can check the returned log files (available from the "Download Error Log"), browse the FAQ Section, or contact the PennPRS team directly for support.
Detailed Steps of Job Submission
Our cloud-based, end-to-end multi-ancestry PRS model training job consists of four steps:
Step 1. Upload or query one GWAS summary-level data file for each ancestry population.
Step 2. Select the multi-ancestry PRS method and specify the model parameter setting.
Step 3. Configure and submit the job.
Step 4. Monitor job status and download results.

Below we provide details for each step.
Step 1. Build input GWAS summary data files from multiple ancestries.
As in single-ancestry analysis, users can either upload their local GWAS summary data files or query summary data from our public GWAS summary database, which is built from over 27,000 harmonized datasets from the GWAS Catalog.
The required data format is the same as that for single-ancestry analysis. Users may upload local data for a subset of ancestries and query data for the remaining ancestries. Please upload one file at a time (maximum file size allowed: 800 MB). For detailed instructions, please refer to Step 1 on single-ancestry analysis.
Note: each multi-ancestry analysis job accepts two to five GWAS summary data files, whether uploaded or queried, and all files must come from different ancestries.
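The input rules above can be checked locally before starting a submission. The following is a minimal sketch in Python, not part of PennPRS itself: the `GwasInput` class, the `check_job_inputs` function, and the "upload"/"query" labels are hypothetical names introduced only for illustration. It assumes the limits stated in this step (two to five files, one file per ancestry, uploads of at most 800 MB).

```python
from dataclasses import dataclass
from typing import List

MAX_FILE_MB = 800            # upload size limit stated in Step 1
MIN_FILES, MAX_FILES = 2, 5  # allowed number of GWAS files per job


@dataclass
class GwasInput:
    """One GWAS summary data file for a job (hypothetical helper type)."""
    ancestry: str         # e.g. "AFR", "AMR", "EAS", "EUR"
    source: str           # "upload" or "query" (illustrative labels)
    size_mb: float = 0.0  # only relevant for uploaded files


def check_job_inputs(inputs: List[GwasInput]) -> List[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    if not MIN_FILES <= len(inputs) <= MAX_FILES:
        problems.append(
            f"expected {MIN_FILES} to {MAX_FILES} files, got {len(inputs)}"
        )
    ancestries = [g.ancestry.upper() for g in inputs]
    if len(set(ancestries)) != len(ancestries):
        problems.append("files must come from different ancestries")
    for g in inputs:
        if g.source == "upload" and g.size_mb > MAX_FILE_MB:
            problems.append(
                f"{g.ancestry} upload is {g.size_mb} MB (limit {MAX_FILE_MB} MB)"
            )
    return problems


# Usage: the four queried height studies from the quick-start example.
height_inputs = [
    GwasInput("AFR", "query"),
    GwasInput("AMR", "query"),
    GwasInput("EAS", "query"),
    GwasInput("EUR", "query"),
]
print(check_job_inputs(height_inputs) or "inputs look valid")
```

Returning a list of problems rather than raising on the first one lets a user fix every issue in a single pass.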

Update April 06, 2025: PennPRS now supports direct querying of GWAS summary statistics for over 2,400 disease phenotypes from the FinnGen database (R12) (https://pennprs.org/data).
Step 2. Select the multi-ancestry PRS method and specify the model parameter setting.
We currently support one multi-ancestry method, PROSPER, using the pseudo-training version developed and tested by the PennPRS team. The user can either use the default setting (highly recommended) or customize the settings.

Step 3. Configure and submit the job.
In Step 3, the user can provide a job name and enable email notifications. The user will then review the input data and method information before submitting the job.


Step 4. Monitor job status and download results.
Once a job is successfully submitted, the user will see the following page and can monitor the job status by clicking "View Job Status".

If email notifications are enabled in Step 3, the user will receive a separate status update from nonreply.pennprs@gmail.com at each stage:
(i) when the job is successfully submitted
(ii) when the job starts running
(iii) when the job finishes (either completed successfully or failed)
The user can view the job status and download the results from the "Job Center".
If the job is completed successfully, the user will be able to obtain the PRS weights by clicking "Download Results".
If the job fails, the user can check the returned log files (available from the "Download Error Log"), browse the FAQ Section, or contact the PennPRS team directly for support.
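Results are downloaded from the "Job Center" as a zip file containing the trained PRS models and a log file. The following is a minimal Python sketch for unpacking such an archive and listing its contents using only the standard library. The function name `extract_results` and the file and directory names in the usage lines are hypothetical, and no assumption is made about how the files inside the archive are named.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_results(zip_path: str, out_dir: str) -> List[str]:
    """Extract a downloaded results archive and return its member file names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)
        # Skip directory entries so only actual files are reported.
        return [info.filename for info in archive.infolist() if not info.is_dir()]


# Usage: "pennprs_results.zip" is a placeholder for the downloaded file.
downloaded = Path("pennprs_results.zip")
if downloaded.exists():
    for name in extract_results(str(downloaded), "pennprs_results"):
        print(name)
```

Listing the members first is a quick way to confirm that the archive contains one set of PRS weights per ancestry before using them downstream.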
