What do REF and ALT mean in VCF?
REF and ALT are the reference allele and the alternative allele(s) observed in a sample, a set of samples, or a population in general (depending on how the VCF was generated). The REF and ALT alleles are the only required elements of a VCF record that tell us whether the variant is a SNP or an indel (or, in complex cases, a mixed-type variant).
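As a rough illustration of how REF and ALT alone indicate the variant type, here is a minimal Python sketch. The function names and the category labels ("SNP", "MNP", "indel", "symbolic", "mixed") are assumptions made for this example and are not part of the VCF specification or of any particular tool.

```python
def classify_allele(ref: str, alt: str) -> str:
    """Classify a single ALT allele against REF using allele lengths only."""
    if alt == ".":
        return "no_variant"          # "." means no alternate allele was called
    if alt.startswith("<") or alt == "*":
        return "symbolic"            # e.g. <NON_REF>, <DEL>, or a spanning deletion
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"                 # single base substituted
    if len(ref) == len(alt):
        return "MNP"                 # same length, more than one base
    return "indel"                   # lengths differ: insertion or deletion


def classify_record(ref: str, alt_field: str) -> str:
    """Classify a record whose ALT column may list several comma-separated alleles."""
    kinds = {classify_allele(ref, alt) for alt in alt_field.split(",")}
    # One shared type keeps that type; differing types make a mixed-type variant.
    return kinds.pop() if len(kinds) == 1 else "mixed"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("AT", "A"), ("A", "G,AT")]:
        print(ref, alt, classify_record(ref, alt))
```

This simple length-based check ignores allele normalization (for example, padded or left-aligned representations), so treat it as a starting point rather than a complete classifier.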
How to find the total allelic depth in VCF?
Figure 2. Alt Allele Freq = AO for each alternate allele / (sum of all AO entries + RO). Next, we will look for the observed alternate allele counts and the total allelic depth fields. The alternate allele counts will once again come from either the AO or FAO field, and the total allelic depth will be found in the DP or FDP field, respectively. Figure 3.
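The two calculations described above can be sketched in Python as follows. This is only an illustration of the arithmetic, not VarSeq's actual implementation; the function names are hypothetical, and the inputs are assumed to be the raw string values of the AO (or FAO), RO, and DP (or FDP) fields for one sample.

```python
def alt_freqs_from_ao_ro(ao: str, ro: str) -> list[float]:
    """Alt Allele Freq = AO per alternate allele / (sum of all AO entries + RO)."""
    ao_counts = [int(x) for x in ao.split(",")]   # one count per alternate allele
    total = sum(ao_counts) + int(ro)
    if total == 0:
        return [0.0] * len(ao_counts)             # no reads observed: avoid dividing by zero
    return [count / total for count in ao_counts]


def alt_freqs_from_ao_dp(ao: str, dp: str) -> list[float]:
    """Alt Allele Freq = AO per alternate allele / total allelic depth (DP or FDP)."""
    ao_counts = [int(x) for x in ao.split(",")]
    depth = int(dp)
    if depth == 0:
        return [0.0] * len(ao_counts)
    return [count / depth for count in ao_counts]


if __name__ == "__main__":
    # Hypothetical values: two alternate alleles with 10 and 5 reads, 35 reference reads.
    print(alt_freqs_from_ao_ro("10,5", "35"))
    print(alt_freqs_from_ao_dp("10,5", "50"))
```

The same functions apply to FAO and FDP values, since those fields follow the same comma-separated layout in this sketch's assumptions.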
Where does the alt allele frequency come from?
Depending on the variant caller that was used to produce your files, the allelic depth information can come from a variety of fields within the VCF file, and VarSeq can use them to compute the Alternate Allele Frequency (Alt Allele Freq).
What does NON_REF mean in gVCF (genomic variant call format)?
The first thing you'll notice, hopefully, is the <NON_REF> symbolic allele listed in every record's ALT field. This provides us with a way to represent the possibility of having a non-reference allele at this site, and to indicate our confidence either way.
What can a VCF file be used for?
The VCF file format comes with a lot of interesting quality assurance and statistics fields that can be used for filtering in VarSeq. Open your files in a text editor to see all the fields that are available. Each field has a header line with a description of its content.
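Instead of scanning the header by eye, the field descriptions can also be listed with a few lines of Python. The sketch below assumes the usual "##INFO=<ID=...,Description="...">" and "##FORMAT=<...>" header layout; the function name and the sample header lines are made up for illustration.

```python
import re

# Matches INFO/FORMAT meta lines and captures the kind, the ID, and the Description.
# Assumption: the description contains no embedded double quotes.
HEADER_RE = re.compile(r'^##(INFO|FORMAT)=<ID=([^,]+),.*Description="([^"]*)"')


def list_fields(lines):
    """Return (kind, id, description) for each INFO/FORMAT header line."""
    fields = []
    for line in lines:
        if not line.startswith("##"):
            break                      # the "#CHROM" column line ends the meta header
        match = HEADER_RE.match(line)
        if match:
            fields.append(match.groups())
    return fields


if __name__ == "__main__":
    # Hypothetical header excerpt, for illustration only.
    sample_header = [
        "##fileformat=VCFv4.2",
        '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total read depth">',
        '##FORMAT=<ID=AO,Number=A,Type=Integer,Description="Alternate allele observations">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1",
    ]
    for kind, field_id, description in list_fields(sample_header):
        print(kind, field_id, description)
```

To run it on a real uncompressed file, pass an open file handle instead of the sample list, since the function only iterates over lines.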
Where to find the DP4 field in VCF?
As a last resort, VarSeq will look for the DP4 field, which is commonly found in VCF files prepared by SAMtools. This field has four entries in the following order: forward reference count, reverse reference count, forward alternate count, and reverse alternate count.
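Given the entry order described above, an alternate allele frequency could be derived from DP4 roughly as follows. This is a sketch under the assumption that the frequency is the alternate reads divided by all four counts; the function name is hypothetical, and VarSeq's exact calculation may differ.

```python
def alt_freq_from_dp4(dp4: str) -> float:
    """Compute an alternate allele frequency from a DP4 value such as "12,8,3,2"."""
    # Order: forward reference, reverse reference, forward alternate, reverse alternate.
    fwd_ref, rev_ref, fwd_alt, rev_alt = (int(x) for x in dp4.split(","))
    alt_reads = fwd_alt + rev_alt
    total_reads = fwd_ref + rev_ref + alt_reads
    if total_reads == 0:
        return 0.0                     # no reads: avoid dividing by zero
    return alt_reads / total_reads


if __name__ == "__main__":
    # Hypothetical DP4 value: 20 reference reads and 5 alternate reads.
    print(alt_freq_from_dp4("12,8,3,2"))
```

Because DP4 pools all alternate reads into two counts, this gives a single frequency per site rather than one per alternate allele.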