Frequently Asked Questions

Answers to common questions about EpiNexus, the analyses it runs, and the assays it supports.

Getting Started

What is EpiNexus?

EpiNexus is a web-based platform for analysing epigenomic data. You upload your sequencing files (FASTQ, BAM, or peak files), choose a genome, and click run. EpiNexus handles quality control, peak annotation, differential analysis, multi-mark integration, and super-enhancer detection — all without writing any code.

Do I need bioinformatics experience?

No. EpiNexus is designed for biologists who generate ChIP-seq or related data but don't want to learn command-line tools. Everything is done through a point-and-click web interface.

How do I get access to EpiNexus?

EpiNexus is a commercial platform. Contact us at info@epinexus.io to learn about licensing options and request a demo.

How do I open EpiNexus once I have a licence?

EpiNexus runs as a web application. Once your institution has a licence, you will receive a URL to access it. Simply open the link in any modern browser (Chrome, Firefox, Safari, or Edge) and create a project to get started.

Can multiple people use the same EpiNexus instance?

Yes. Each user can create their own projects. Projects are independent, so your data and results stay separate from other users on the same server.

Data & File Types

What file types does EpiNexus accept?

EpiNexus accepts three types of input, listed from earliest to latest in the processing pipeline:

FASTQ files — Raw sequencing reads straight from the sequencer. If you have these, EpiNexus can handle the entire workflow: alignment, peak calling, and downstream analysis.

BAM files — Aligned sequencing reads (already mapped to a genome). This is the most common starting point. Your core facility usually provides BAM files after sequencing.

Peak files — Already-called peaks in .narrowPeak, .broadPeak, or .bed format. Upload these alongside BAM files if your facility already ran peak calling. If you don't have them, EpiNexus calls peaks for you.

My core facility gave me FASTQ files, not BAM files. What do I do?

No problem! Choose the Full pipeline when creating your run. EpiNexus will align the reads to the genome, call peaks, and run the full analysis automatically. You don't need to do any preprocessing yourself.

What is the difference between FASTQ, BAM, and peak files?

FASTQ files contain the raw sequence reads exactly as they came off the sequencer — each read is a short DNA fragment (~75–150 bp) with quality scores. BAM files are the same reads but mapped to a reference genome, so each read now has a genomic coordinate. Peak files are the result of peak calling — they list the specific genomic regions where your mark is enriched, condensing millions of reads into thousands of peaks.
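
To make the peak-file end of that chain concrete, here is a minimal sketch that reads one line of a .narrowPeak file. It assumes the standard tab-separated ENCODE narrowPeak layout (ten columns); the `Peak` type and the `parse_narrowpeak_line` helper are illustrative names, not part of EpiNexus.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int     # 0-based start (BED convention)
    end: int       # end coordinate, exclusive
    signal: float  # signalValue column (enrichment)


def parse_narrowpeak_line(line: str) -> Peak:
    """Parse one tab-separated narrowPeak line into a Peak."""
    fields = line.rstrip("\n").split("\t")
    # Columns: chrom, start, end, name, score, strand,
    #          signalValue, pValue, qValue, summit offset
    return Peak(fields[0], int(fields[1]), int(fields[2]), float(fields[6]))


# A made-up example line, for illustration only.
example = "chr1\t10000\t10450\tpeak_1\t850\t.\t12.5\t30.2\t27.9\t210"
peak = parse_narrowpeak_line(example)
print(peak.chrom, peak.end - peak.start, peak.signal)
```

Each line is one peak, which is why a peak file stays in the megabyte range even when the BAM it came from is several gigabytes.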

How big are the files I need to upload?

FASTQ: 5–30 GB per sample (paired-end generates two files). BAM: 1–10 GB per sample. Peak files: 0.5–20 MB (very small). We recommend a stable internet connection, especially for FASTQ and BAM uploads. If an upload is interrupted, you can re-upload the same file.

Do I need an input/control sample?

An input or IgG control improves peak calling accuracy, but it is not strictly required. If you have one, upload it and EpiNexus will use it automatically. If you don't, EpiNexus will still call peaks using a background model.

How many replicates do I need?

For basic QC and annotation, one sample is enough. For differential analysis (comparing conditions), you need at least two replicates per condition — ideally three or more. More replicates give you better statistical power to detect real changes.
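
The replicate rule can be checked before you upload anything. The sketch below assumes a simple mapping from sample name to condition; the function name and the sample names are hypothetical and do not reflect how EpiNexus stores a project.

```python
from collections import Counter


def differential_ready(sample_conditions: dict[str, str],
                       min_replicates: int = 2) -> list[str]:
    """Return a list of problems; an empty list means the design is usable."""
    counts = Counter(sample_conditions.values())
    problems = []
    if len(counts) < 2:
        problems.append("need at least two conditions to compare")
    for condition, n in sorted(counts.items()):
        if n < min_replicates:
            problems.append(f"{condition}: {n} replicate(s), need {min_replicates}")
    return problems


# Hypothetical design: two controls but only one treated sample.
samples = {"ctrl_1": "control", "ctrl_2": "control", "trt_1": "treated"}
print(differential_ready(samples))
```

With this design the check reports that the treated condition is one replicate short.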

Supported Assays

Which assay types does EpiNexus support?

EpiNexus works with any assay that produces peaks from sequencing data, including:

ChIP-seq — The standard method for profiling histone modifications and transcription factor binding. This is what EpiNexus was built for.

CUT&Tag — A newer, low-input alternative to ChIP-seq. Works well for histone marks, especially in samples with limited cell numbers.

CUT&Run — Another low-input chromatin profiling method. EpiNexus routes CUT&Run and CUT&Tag data through a dedicated nf-core/cutandrun pipeline with SEACR peak calling by default.

ATAC-seq — Measures open chromatin accessibility rather than histone marks. The peak-based analysis modules (QC, annotation, differential, super-enhancers) all work with ATAC-seq data.

DNA methylation — Bisulfite sequencing (WGBS, RRBS) and enzymatic methyl-seq (EM-seq). EpiNexus identifies differentially methylated regions (DMRs) and annotates them with nearest genes and genomic features.

Which histone marks can I analyse?

Any histone mark that produces peaks. Commonly used marks include:

H3K27ac — Active enhancers and promoters. Ideal for super-enhancer analysis.

H3K4me1 — Primed and active enhancers.

H3K4me3 — Active promoters.

H3K27me3 — Polycomb-repressed regions. Use broad peak mode.

H3K9me3 — Heterochromatin / constitutive silencing. Use broad peak mode.

H3K36me3 — Gene bodies of actively transcribed genes.

You can also analyse transcription factor ChIP-seq (e.g. p53, CTCF, ERα) using narrow peak mode.

Can I combine multiple histone marks in one analysis?

Yes! This is what the Multi-Mark Integration step does. Upload samples for each mark (e.g. H3K27ac and H3K27me3) and EpiNexus will classify each region as active, poised, bivalent, or repressed based on the combination of marks present.

Does EpiNexus support RNA-seq or whole-genome sequencing?

Not currently. EpiNexus is specialised for epigenomic assays — histone modifications, chromatin accessibility, and DNA methylation. RNA-seq and WGS require different analysis pipelines.

Analysis & Results

How long does an analysis take?

A typical analysis with 4–6 BAM samples completes in 5–15 minutes. Starting from FASTQ files adds several hours for alignment and peak calling (typically 4–8 hours for ChIP-seq, 2–4 hours for ATAC-seq, depending on sequencing depth). You can close your browser and come back later — the analysis runs on the server.

What are the analysis steps?

If you start from FASTQ files (Full pipeline), EpiNexus first runs alignment and peak calling, then proceeds to the downstream analysis. If you start from BAM files (Analysis pipeline), EpiNexus skips alignment and goes straight to the downstream steps:

1. Quality Control — Checks mapping rates, duplication, signal enrichment (FRiP), and fragment sizes.

2. Annotation — Labels each peak by genomic feature (promoter, intron, intergenic) and finds the nearest gene.

3. Peak–Gene Linking — Connects enhancers/peaks to the genes they likely regulate.

4. Differential Analysis — Compares two conditions to find peaks that gain or lose signal.

5. Multi-Mark Integration — Classifies chromatin states by combining multiple marks.

6. Super-Enhancer Detection — Identifies large, high-signal regulatory regions.

When starting from FASTQ, alignment and peak calling run one after the other. Once peaks are available, the downstream analysis steps (QC, annotation, differential, etc.) run independently of each other. Steps 5 and 6 are optional and can be enabled in the run settings.

I only have one condition (no control). Can I still use EpiNexus?

Yes. The QC, annotation, peak–gene linking, and super-enhancer steps all work with a single condition. Only the differential analysis step requires two or more conditions to compare.

What does "differential" mean in this context?

A differential peak is one whose signal changes significantly between two conditions (e.g. treated vs. untreated). "Gained" peaks have more signal in the treatment, "lost" peaks have less. This helps you identify the genomic regions affected by your experimental condition.
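
As a rough illustration of the gained/lost labels, the sketch below assumes each peak comes with a log2 fold change (treatment over control) and an FDR value. Those column choices, the function name, and the treatment of peaks at exactly the threshold are assumptions for illustration; the 0.1 default mirrors the EpiNexus default FDR threshold.

```python
def classify_peak(log2_fold_change: float, fdr: float,
                  fdr_threshold: float = 0.1) -> str:
    """Label a peak as gained, lost, or unchanged between two conditions."""
    if fdr > fdr_threshold or log2_fold_change == 0:
        return "unchanged"  # not significant, or no change in signal
    return "gained" if log2_fold_change > 0 else "lost"


print(classify_peak(1.8, 0.01))   # more signal in treatment, significant
print(classify_peak(-2.1, 0.03))  # less signal in treatment, significant
print(classify_peak(2.5, 0.40))   # large change, but not significant
```

The third example shows why both numbers matter: a large fold change with a high FDR is not reported as differential.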

What is a super-enhancer?

Super-enhancers are clusters of nearby enhancers with exceptionally high levels of an active histone mark (typically H3K27ac). They drive the expression of cell-identity genes and are frequently disrupted in disease. EpiNexus identifies them using the ROSE algorithm, which ranks enhancers by signal strength and finds the inflection point separating super-enhancers from typical enhancers.

What is a FRiP score?

FRiP stands for Fraction of Reads in Peaks: the share of your sequencing reads that fall within called peaks, usually reported as a percentage. It is a key quality metric: a FRiP above 1% meets the ENCODE minimum, above 5% is good, and above 20% is excellent. A low FRiP may indicate weak enrichment, high background, or a failed experiment.
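
The calculation itself is a single division. The sketch below applies the three thresholds quoted above; the function names and the read counts are made up for illustration, and EpiNexus may count reads differently (for example after duplicate removal).

```python
def frip(reads_in_peaks: int, total_reads: int) -> float:
    """Fraction of reads that fall inside called peaks (0.0 to 1.0)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_peaks / total_reads


def frip_label(score: float) -> str:
    """Map a FRiP fraction onto the quality bands quoted in this FAQ."""
    if score > 0.20:
        return "excellent"
    if score > 0.05:
        return "good"
    if score > 0.01:
        return "meets ENCODE minimum"
    return "low"


# Hypothetical sample: 1.2 million of 20 million reads fall in peaks.
score = frip(1_200_000, 20_000_000)
print(f"{score:.1%}", frip_label(score))
```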

What does FDR mean?

FDR stands for False Discovery Rate: the expected proportion of false positives among all results called significant. At an FDR threshold of 0.05, roughly 5% of the peaks reported as differential are expected to be false positives. EpiNexus defaults to an FDR threshold of 0.1 (10%). You can make this stricter (e.g. 0.05) in the run settings for higher confidence.
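
One widely used way to turn raw p-values into FDR values is the Benjamini-Hochberg procedure. The sketch below implements it in plain Python as an illustration; whether EpiNexus uses exactly this correction is an assumption, and the p-values are invented.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Four hypothetical peaks with raw p-values.
raw = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(raw)
significant = [q <= 0.1 for q in fdr]
print([round(q, 4) for q in fdr], significant)
```

At the default threshold of 0.1, three of the four example peaks pass; tightening the threshold to 0.05 leaves only the first.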

What are bivalent domains?

Bivalent domains are regions carrying both an activating mark (e.g. H3K4me3) and a repressive mark (e.g. H3K27me3) at the same time. Genes in these regions are "poised": ready to be rapidly activated or silenced. Bivalent domains are especially important in stem cells and during development.

Can I download my results?

Yes. Each results section has a download button that exports the data as a CSV file, which you can open in Excel or Google Sheets for further analysis, custom figures, or inclusion in publications.

Genomes & Species

Which organisms does EpiNexus support?

EpiNexus ships with 17 pre-configured genomes covering the most widely used model organisms:

Human — hg38 (GRCh38), hg19 (GRCh37)

Mouse — mm39 (GRCm39), mm10 (GRCm38)

Rat — rn7 (mRatBN7.2), rn6 (Rnor_6.0)

Zebrafish — danRer11 (GRCz11)

Fruit fly — dm6 (BDGP6)

C. elegans — ce11 (WBcel235)

Yeast — sacCer3 (R64)

Chicken — galGal6 (GRCg6a)

Macaque — rheMac10 (Mmul_10)

Pig — susScr11 (Sscrofa11.1)

Cow — bosTau9 (ARS-UCD1.2)

Dog — canFam6 (UU_Cfam_GSD_1.0)

Frog — xenLae2 (v10.1)

Arabidopsis — tair10 (TAIR10)

Any other organism can be added via NCBI accession or by providing your own FASTA and GTF files. See the Pipelines page for details.

How do I know which genome build to choose?

Check with whoever aligned your data (usually your core facility). For recent human data, hg38 (GRCh38) is the standard. For mouse, mm39 (GRCm39) is the latest. If you're unsure, look at the BAM file header or the aligner output — it will mention the genome reference used.
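
If you can get the BAM header as text (for example from `samtools view -H`), the length of chromosome 1 is often enough to tell builds apart. The sketch below is illustrative only: the lookup table covers just three builds, and the helper name is hypothetical.

```python
# chr1 lengths for a few common builds (deliberately incomplete).
CHR1_LENGTHS = {
    248_956_422: "hg38 (GRCh38)",
    249_250_621: "hg19 (GRCh37)",
    195_471_971: "mm10 (GRCm38)",
}


def guess_build(header_text: str) -> str:
    """Guess the genome build from the @SQ line for chromosome 1."""
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields look like TAG:VALUE, separated by tabs.
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
        if tags.get("SN") in ("chr1", "1"):
            return CHR1_LENGTHS.get(int(tags["LN"]), "unknown")
    return "unknown"


header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:chr2\tLN:242193529\n"
)
print(guess_build(header))
```

If the result is "unknown", fall back to asking whoever ran the alignment.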

My organism isn't listed. Can I still use EpiNexus?

If your organism has a reference genome and gene annotation (GTF), your system administrator can add it to EpiNexus. Contact us at info@epinexus.io for guidance on adding custom genomes.

Troubleshooting

My analysis failed. What should I do?

Check the run details for error messages. The most common issues are:

Wrong genome selected — Make sure the genome in your project matches the genome your BAM files were aligned to or, if you start from FASTQ, the organism that was sequenced.

Corrupted or partial upload — Try re-uploading the file. A green checkmark confirms a successful upload.

Very low quality data — If very few peaks are detected, the downstream steps may have nothing to work with. Check the QC results for low FRiP scores.

Missing replicates for differential — Differential analysis requires at least two samples per condition.

If you're stuck, email us at info@epinexus.io with the run details and we'll help troubleshoot.

One analysis step failed. What happens to the rest?

It depends on which step failed. The upstream steps run in sequence, and every downstream step depends on their output: if alignment or peak calling fails, the downstream steps (annotation, differential, super-enhancers) cannot run. The downstream steps themselves are independent of each other. For example, if differential analysis fails because you only have one condition, annotation and super-enhancer detection can still complete normally. Check the error details, fix the issue, and re-run.
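
The behaviour described above can be pictured as a small dependency graph. The sketch below is an illustration only: the step names and the dependency table are assumptions drawn from this answer, not the actual EpiNexus scheduler.

```python
# Assumed dependencies: each step lists the steps whose output it needs.
DEPENDS_ON = {
    "alignment": [],
    "peak_calling": ["alignment"],
    "qc": ["peak_calling"],
    "annotation": ["peak_calling"],
    "differential": ["peak_calling"],
    "super_enhancers": ["peak_calling"],
}


def blocked_by(failed_step: str) -> set[str]:
    """Return every step that cannot run once failed_step has failed."""
    blocked: set[str] = set()
    changed = True
    while changed:  # repeat until no new step becomes blocked
        changed = False
        for step, deps in DEPENDS_ON.items():
            if step in blocked or step == failed_step:
                continue
            if any(d == failed_step or d in blocked for d in deps):
                blocked.add(step)
                changed = True
    return blocked


print(sorted(blocked_by("alignment")))     # everything downstream is blocked
print(sorted(blocked_by("differential")))  # nothing else is affected
```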

Can I delete a project?

Yes. Go to the Projects page, find the project you want to remove, and click the delete button (trash icon). This permanently deletes the project, all uploaded files, and all analysis results. You will be asked to confirm before anything is deleted.

Can I re-run an analysis with different settings?

Yes. Within the same project, you can create multiple runs with different settings (e.g. different FDR thresholds or different pipelines). Your uploaded files are reused automatically.

My upload is very slow. Any tips?

FASTQ and BAM files can be several gigabytes. Use a wired connection if possible, and avoid uploading during peak network hours. If an upload is interrupted, you can re-upload the same file and it will resume or replace the partial upload.

Concepts & Terminology

What is a "peak"?

A peak is a region of the genome where your histone mark or factor is enriched above background. Think of it as a "hotspot" of epigenetic activity. Each peak has a location (chromosome, start, end) and a signal strength. A typical ChIP-seq experiment yields thousands to tens of thousands of peaks.

What is the difference between narrow and broad peaks?

Narrow peaks are sharp, well-defined regions (typically a few hundred base pairs). They are characteristic of transcription factors and marks like H3K4me3 and H3K27ac. Broad peaks are wider domains (tens of kilobases) produced by marks like H3K27me3, H3K9me3, and H3K36me3. EpiNexus handles both types automatically.

What does "distance to TSS" mean?

TSS = Transcription Start Site, the position where a gene begins to be transcribed. The distance from a peak to the nearest TSS tells you whether the peak is at a promoter (within ~3 kb) or a distal enhancer (often 10–1000 kb away).
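
A minimal sketch of the idea, assuming the distance is measured from the peak midpoint and that all TSS positions are on the same chromosome as the peak. Tools differ on both points, and the gene names, coordinates, and function names here are invented.

```python
def nearest_tss(peak_start: int, peak_end: int,
                tss_positions: dict[str, int]) -> tuple[str, int]:
    """Return (gene, distance in bp) for the TSS closest to the peak midpoint."""
    midpoint = (peak_start + peak_end) // 2
    gene = min(tss_positions, key=lambda g: abs(tss_positions[g] - midpoint))
    return gene, abs(tss_positions[gene] - midpoint)


def region_type(distance: int, promoter_window: int = 3_000) -> str:
    """Call a peak 'promoter' if it lies within ~3 kb of a TSS, else 'distal'."""
    return "promoter" if distance <= promoter_window else "distal"


tss = {"GENE_A": 11_000, "GENE_B": 60_000}  # hypothetical genes
for start, end in [(10_000, 10_400), (200_000, 200_600)]:
    gene, dist = nearest_tss(start, end, tss)
    print(gene, dist, region_type(dist))
```

The first peak sits 800 bp from GENE_A and is called a promoter peak; the second is about 140 kb from its nearest TSS and is treated as distal.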

What is chromatin state classification?

When you profile multiple histone marks, EpiNexus assigns each region a chromatin state based on the combination of marks present:

Active — Marked by activating modifications (e.g. H3K27ac, H3K4me3). These genes are being expressed.

Poised — Marked by H3K4me1 but not H3K27ac. These enhancers are ready but not yet fully active.

Bivalent — Carry both activating and repressive marks simultaneously. Important in stem cells.

Repressed — Marked by repressive modifications (e.g. H3K27me3) without activating marks.
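
The four states above can be expressed as a simple rule set. This is a sketch, not the EpiNexus classifier: which marks count as activating or repressive, the order in which the rules are checked, and the "unclassified" fallback are all assumptions made for illustration.

```python
ACTIVATING = {"H3K27ac", "H3K4me3"}
REPRESSIVE = {"H3K27me3", "H3K9me3"}


def chromatin_state(marks: set[str]) -> str:
    """Assign a chromatin state from the set of marks present at a region."""
    has_activating = bool(marks & ACTIVATING)
    has_repressive = bool(marks & REPRESSIVE)
    if has_activating and has_repressive:
        return "bivalent"
    if has_activating:
        return "active"
    if has_repressive:
        return "repressed"
    if "H3K4me1" in marks:  # H3K4me1 without H3K27ac
        return "poised"
    return "unclassified"


for marks in [{"H3K27ac", "H3K4me1"}, {"H3K4me1"},
              {"H3K4me3", "H3K27me3"}, {"H3K27me3"}]:
    print(sorted(marks), "->", chromatin_state(marks))
```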

What is the ROSE algorithm?

ROSE (Rank Ordering of Super-Enhancers) is the standard method for identifying super-enhancers. It works by: (1) stitching together nearby enhancer peaks within 12.5 kb, (2) ranking all stitched enhancers by their total signal, and (3) finding the inflection point in the ranking curve. Regions above the inflection point are classified as super-enhancers.
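
The three steps can be sketched in a few lines. This is a simplified illustration, not the reference ROSE implementation: it approximates the inflection point as the place where the rank-versus-signal curve, with both axes scaled to 0 to 1, has slope 1 (the point that minimises scaled signal minus scaled rank). The peak coordinates and signals are invented.

```python
Region = tuple[str, int, int, float]  # chrom, start, end, signal


def stitch(peaks: list[Region], max_gap: int = 12_500) -> list[Region]:
    """Step 1: merge peaks on the same chromosome that lie within max_gap bp."""
    stitched: list[Region] = []
    for chrom, start, end, signal in sorted(peaks):
        if stitched:
            p_chrom, p_start, p_end, p_signal = stitched[-1]
            if chrom == p_chrom and start - p_end <= max_gap:
                stitched[-1] = (p_chrom, p_start, max(p_end, end), p_signal + signal)
                continue
        stitched.append((chrom, start, end, signal))
    return stitched


def super_enhancers(regions: list[Region]) -> list[Region]:
    """Steps 2 and 3: rank by total signal and keep regions above the cutoff."""
    ranked = sorted(regions, key=lambda r: r[3])
    n = len(ranked)
    if n < 2 or ranked[-1][3] <= 0:
        return []
    top = ranked[-1][3]
    # Scaled curve: x = rank / (n - 1), y = signal / top.
    cut = min(range(n), key=lambda i: ranked[i][3] / top - i / (n - 1))
    return ranked[cut + 1:]


peaks = [
    ("chr1", 1_000, 2_000, 50.0), ("chr1", 10_000, 11_000, 50.0),
    ("chr1", 100_000, 101_000, 5.0), ("chr2", 5_000, 6_000, 4.0),
    ("chr2", 50_000, 51_000, 3.0), ("chr3", 1_000, 2_000, 2.0),
    ("chr3", 40_000, 41_000, 1.0),
]
regions = stitch(peaks)
print(super_enhancers(regions))
```

In this toy example the two nearby chr1 peaks are stitched into one region whose combined signal places it above the cutoff; every other region is a typical enhancer.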

What is the ABC model for peak–gene linking?

The Activity-by-Contact (ABC) model predicts which gene each enhancer regulates based on two factors: the enhancer's activity level (signal strength) and its 3D contact frequency with gene promoters. It's more accurate than simple distance-based linking, especially for distal enhancers.
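
In the published ABC formulation, the score of an enhancer for a gene is its activity multiplied by its contact frequency, divided by the sum of that product over all candidate enhancers near the gene. The sketch below shows the arithmetic with invented numbers; the function name and inputs are illustrative, and how EpiNexus estimates activity and contact is not shown here.

```python
def abc_scores(activity: dict[str, float],
               contact: dict[str, float]) -> dict[str, float]:
    """ABC score per enhancer for one gene: A*C divided by the sum of all A*C."""
    products = {e: activity[e] * contact[e] for e in activity}
    total = sum(products.values())
    if total == 0:
        return {e: 0.0 for e in products}
    return {e: p / total for e, p in products.items()}


# Three hypothetical enhancers near one gene.
activity = {"E1": 10.0, "E2": 40.0, "E3": 5.0}  # signal strength
contact = {"E1": 0.8, "E2": 0.1, "E3": 0.2}     # 3D contact with the promoter
scores = abc_scores(activity, contact)
print({e: round(s, 3) for e, s in scores.items()})
```

E2 has the strongest signal, but E1 receives the highest score because it contacts the promoter far more often. This is the case that simple distance-based or signal-based linking tends to get wrong.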

Still have questions?

We're happy to help. Reach out and we'll get back to you.

Email info@epinexus.io