Predicting Regulatory Effect of Genomic Variants

ZS97 Model

Variant Effector

(1) Upload a VCF file

Please select the reference genome corresponding to the VCF file

Please upload one file for prediction.

Sequence Profiler

(2) Upload a BED file

Please select the reference genome corresponding to the BED file

Please upload one file for prediction.

(3) Select a Chromosome and input a Position

Please select the reference genome corresponding to the position

Don't know the position? You can look it up on the BLAST page.

(4) Input a sequence (>20-bp is allowed, ≥1-kb is recommended)


This model was constructed from ATAC-seq data from multiple tissues of the rice (Oryza sativa L.) variety Zhenshan 97 (reference genome version RS2) using the deep learning-based algorithmic framework DeepSEA (Zhou et al., Nature Methods, 2015), and was implemented using the Selene SDK (Chen et al., Nature Methods, 2019).

We provide two services based on this model.

  1. Variant Effector predicts the effects of sequence variants on chromatin accessibility. The accepted input is a VCF file describing the sequence variants. The results report the effect of each variant on chromatin accessibility in each tissue.
  2. Sequence Profiler performs "in silico saturated mutagenesis" analysis to discover high-impact sites within a 1-kb sequence. Specifically, it computationally mutates every base of the input sequence and predicts the effect of each mutation on chromatin accessibility. The accepted inputs are a chromosome and a position, a BED file containing the coordinates of multiple genomic regions, or a custom sequence.

Because these analyses are computationally intensive, the service only runs the top 2000 SNPs of a VCF file and the first 5 regions of a BED file. For a custom sequence, any sequence with an effective length greater than 20 bp is accepted. However, the input of the DNN model is a 1-kb sequence, so if the input sequence is shorter than 1 kb, N will be added to both ends until its length equals 1 kb, which may bias the prediction. When the input sequence is longer than 200 bp, PlantDeepSEA performs in silico saturated mutagenesis analysis on the middle 200 bp of the input sequence only.
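The padding and mutagenesis steps described above can be illustrated with a minimal Python sketch. It is not the PlantDeepSEA implementation: the function names are hypothetical, the even left/right split of the N padding is an assumption (the text only says N is added to both ends), and the model prediction itself is left out. Sequences longer than 1 kb are not handled because the text does not say how they are treated.

```python
BASES = "ACGT"
WINDOW = 1000  # model input length stated in the text (1 kb)
SCAN = 200     # size of the central region that is mutated (200 bp)


def pad_to_window(seq, window=WINDOW):
    """Pad a short sequence with N on both ends until it is `window` long.

    Assumption: the padding is split as evenly as possible between the ends.
    """
    seq = seq.upper()
    missing = window - len(seq)
    if missing <= 0:
        return seq
    left = missing // 2
    return "N" * left + seq + "N" * (missing - left)


def central_span(length, scan=SCAN):
    """Return (start, end), 0-based half-open, of the middle `scan` bases."""
    if length <= scan:
        return 0, length
    start = (length - scan) // 2
    return start, start + scan


def saturated_mutants(seq):
    """Yield (position, ref, alt, mutated_sequence) for every single-base
    substitution inside the central span of `seq`."""
    seq = seq.upper()
    start, end = central_span(len(seq))
    for pos in range(start, end):
        ref = seq[pos]
        if ref not in BASES:
            continue  # skip N or other ambiguity codes
        for alt in BASES:
            if alt != ref:
                yield pos, ref, alt, seq[:pos] + alt + seq[pos + 1:]


if __name__ == "__main__":
    example = "ACGT" * 75  # 300-bp toy sequence
    mutants = list(saturated_mutants(example))
    model_inputs = [pad_to_window(m[3]) for m in mutants]
    print(len(mutants), len(model_inputs[0]))
```

Each padded mutant would then be scored by the model and compared with the score of the padded reference sequence. The more N padding a short input needs, the less real sequence context the model sees, which is the source of the bias mentioned above.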

Please remember your task ID and check the result page in a few minutes. In addition, your prediction results will be stored on our server for 7 days only.

Examples of Upload Files (Based on Minghui63 Genome of Rice): VCF, BED, FASTA
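Before uploading, it can help to preview which records fall within the limits stated on this page (2000 SNPs for a VCF file, 5 regions for a BED file). The Python sketch below is a local convenience check, not part of the service. It assumes standard tab-separated VCF columns (CHROM, POS, ID, REF, ALT, ...) and standard BED columns (chrom, start, end), and it assumes that "top" means the first records in file order. The function names and the chromosome names in the usage example are hypothetical.

```python
MAX_VCF_SNPS = 2000   # limit stated on this page
MAX_BED_REGIONS = 5   # limit stated on this page
BASES = {"A", "C", "G", "T"}


def preview_vcf_snps(vcf_text, limit=MAX_VCF_SNPS):
    """Return the first `limit` single-nucleotide records of a VCF file
    as (chrom, pos, ref, alt) tuples; header lines and indels are skipped."""
    snps = []
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        chrom, pos, _id, ref, alt = fields[:5]
        alts = alt.upper().split(",")
        if ref.upper() in BASES and all(a in BASES for a in alts):
            snps.append((chrom, int(pos), ref, alt))
            if len(snps) == limit:
                break
    return snps


def preview_bed_regions(bed_text, limit=MAX_BED_REGIONS):
    """Return the first `limit` regions of a BED file as
    (chrom, start, end) tuples; comment and track lines are skipped."""
    regions = []
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
        if len(regions) == limit:
            break
    return regions


if __name__ == "__main__":
    bed = "\n".join(f"Chr1\t{i * 1000}\t{i * 1000 + 1000}" for i in range(7))
    for region in preview_bed_regions(bed):
        print(region)
```

Records beyond these limits would need to be split across several submissions.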