The prior probability that a k-tuple composition appears in the j-th class can be calculated by:

$$q_j = \frac{m_j}{M}$$

where $m_j$ is the total number of k-tuple compositions that appear in the j-th class and $M$ is the total occurrence frequency of all k-tuple compositions in the benchmark dataset.

The probability of the i-th k-tuple composition occurring in the j-th class of location can be calculated by the following formula:

$$P_{ij} = \sum_{m=n_{ij}}^{N_i} \frac{N_i!}{m!\,(N_i-m)!}\, q_j^{m}\,(1-q_j)^{N_i-m}$$

where $N_i$ is the total count of the i-th k-tuple composition in the benchmark dataset, $n_{ij}$ is the number of occurrences of the i-th k-tuple composition in the j-th class of location, and the sum is taken from $n_{ij}$ to $N_i$. The confidence level of the i-th k-tuple composition in the j-th class can be given by:

$$CL_{ij} = 1 - P_{ij}$$
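
The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the code inside BinomialDistribution.exe. It assumes that the probability is the upper binomial tail summed from n_ij to N_i with the class prior m_j / M as the success probability, and that the confidence level is one minus that probability. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from math import comb


def count_kmers(sequences, k):
    """Count every overlapping k-tuple in a list of sequences."""
    counts = Counter()
    for seq in sequences:
        for start in range(len(seq) - k + 1):
            counts[seq[start:start + k]] += 1
    return counts


def binomial_tail(n_ij, N_i, q_j):
    """Probability of at least n_ij successes in N_i trials with success rate q_j."""
    return sum(
        comb(N_i, m) * q_j ** m * (1.0 - q_j) ** (N_i - m)
        for m in range(n_ij, N_i + 1)
    )


def confidence_levels(class_sequences, k):
    """Return {(kmer, class_label): CL}, assuming CL = 1 - P."""
    per_class = {
        label: count_kmers(seqs, k) for label, seqs in class_sequences.items()
    }
    # m_j: total k-tuple occurrences in class j; M: total over all classes.
    m = {label: sum(counts.values()) for label, counts in per_class.items()}
    M = sum(m.values())
    # N_i: total count of each k-tuple over the whole dataset.
    totals = Counter()
    for counts in per_class.values():
        totals.update(counts)

    result = {}
    for label, counts in per_class.items():
        q_j = m[label] / M  # prior probability of class j
        for kmer, N_i in totals.items():
            n_ij = counts.get(kmer, 0)
            result[(kmer, label)] = 1.0 - binomial_tail(n_ij, N_i, q_j)
    return result


# Hypothetical toy data, not taken from the benchmark dataset.
toy = {
    "class_A": ["ACGTAC", "ACGACG"],
    "class_B": ["TTTTGC", "GCTTTT"],
}
cl = confidence_levels(toy, k=2)
for (kmer, label), value in sorted(cl.items()):
    print(kmer, label, round(value, 4))
```

A k-tuple that occurs far more often in one class than the class prior would predict gets a confidence level close to 1 for that class, so ranking k-tuples by this value is one way to pick class-specific features.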
2. How to Use:
(1) Download the executable package BinomialDistribution.exe directly;
(2) Or download the files from GitHub;
(3) Example of use.
3. The Parameters description:
-k [the k-mer size, i.e. the value of k]
-t [type of sequence (required: DNA, RNA or protein)]
-f [the path of the sequence file]
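
As an illustration of how the three parameters fit together, the following Python sketch assembles a command line for the executable. It assumes that BinomialDistribution.exe accepts -k, -t and -f exactly as listed above and that all three are given together. The helper name `build_command` and the file name `sequences.fasta` are hypothetical, and the sketch only prints the command rather than running it.

```python
import shlex

# Sequence types named in the parameter description.
ALLOWED_TYPES = ("DNA", "RNA", "protein")


def build_command(k, seq_type, sequence_file, exe="BinomialDistribution.exe"):
    """Build the argument list for one run (hypothetical helper)."""
    if not isinstance(k, int) or k < 1:
        raise ValueError("k must be a positive integer")
    if seq_type not in ALLOWED_TYPES:
        raise ValueError("sequence type must be DNA, RNA or protein")
    return [exe, "-k", str(k), "-t", seq_type, "-f", sequence_file]


# Hypothetical input file; replace with the path to your own sequence file.
command = build_command(3, "DNA", "sequences.fasta")
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where the executable is available.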