This model was built from ATAC-seq data from multiple tissues of Arabidopsis (Arabidopsis thaliana) accession Col-0 (reference genome version TAIR10.1) using the deep learning framework DeepSEA (Zhou et al., Nature Methods, 2015) and was implemented with the Selene SDK (Chen et al., Nature Methods, 2019).
We provide two services based on this model:
- Variant Effector predicts the effects of sequence variants on chromatin accessibility. It accepts a VCF file describing the sequence variants and reports the effect of each variant on chromatin accessibility in each tissue.
- Sequence Profiler performs "in silico saturation mutagenesis" to discover high-impact sites within a 1-kb sequence. Specifically, it computationally mutates every base of the input sequence and predicts the effect of each mutation on chromatin accessibility. It accepts a chromosome and a position, a BED file containing the coordinates of multiple genomic regions, or a custom sequence. Because these analyses are computationally intensive, the service processes only the top 2000 SNPs of a VCF file and only the first 5 regions of a BED file. A custom sequence must have an effective length greater than 20 bp. Because the DNN model takes a 1-kb sequence as input, a sequence shorter than 1 kb is padded with N at both ends until it reaches 1 kb, which may bias the prediction. When the sequence is longer than 200 bp, PlantDeepSEA performs in silico saturation mutagenesis only on the middle 200 bp of the input sequence.
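
The Sequence Profiler preprocessing described above can be illustrated with a minimal Python sketch. It is not the service's actual code: the function names, the way an odd number of padding bases is split between the two ends, and the choice to skip positions that are not A, C, G or T are all assumptions made for illustration. Inputs longer than 1 kb are not handled here because the text does not say how the service treats them.

```python
MODEL_INPUT_LEN = 1000  # the model takes a 1-kb sequence, as stated above
MUTATION_WINDOW = 200   # only the middle 200 bp of longer inputs are mutated
BASES = "ACGT"


def pad_to_model_length(seq, target=MODEL_INPUT_LEN):
    """Pad a short sequence with N on both ends until it reaches `target`.

    Assumption: when the padding is odd, the extra N goes on the right.
    """
    seq = seq.upper()
    missing = target - len(seq)
    if missing <= 0:
        return seq
    left = missing // 2
    right = missing - left
    return "N" * left + seq + "N" * right


def mutation_window(seq_len, window=MUTATION_WINDOW):
    """Return the 0-based, half-open (start, end) range of positions to mutate."""
    if seq_len <= window:
        return 0, seq_len
    start = (seq_len - window) // 2
    return start, start + window


def saturation_mutants(seq):
    """Yield (position, ref, alt, mutated_sequence) for every single-base change.

    Assumption: positions whose reference base is not A/C/G/T are skipped.
    """
    seq = seq.upper()
    start, end = mutation_window(len(seq))
    for pos in range(start, end):
        ref = seq[pos]
        if ref not in BASES:
            continue
        for alt in BASES:
            if alt != ref:
                yield pos, ref, alt, seq[:pos] + alt + seq[pos + 1:]


if __name__ == "__main__":
    example = "ACGT" * 75  # hypothetical 300-bp custom sequence
    mutants = list(saturation_mutants(example))
    print(len(mutants), "mutated sequences")
    # Each mutant would be padded to 1 kb before being scored by the model.
    print(len(pad_to_model_length(mutants[0][3])), "bp after padding")
```

Under these assumptions, a 300-bp input yields 200 mutated positions with 3 alternative bases each, so 600 sequences, each padded to 1 kb before scoring. The count grows linearly with the window size, which is consistent with the service capping the analysis at the middle 200 bp.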
Please note your task ID and check the results page after a few minutes. Your prediction results will be stored on our server for only 7 days.