JOKERCAKE is a software to generate the shortest sequnece to cover k-mers
using joker characters.
The algorithm is given:
It outputs two sequence files as textual files.
JOKERCAKE was developed by Yaron Orenstein
in Bonnie Berger's group
at Massachusetts Institute of Technology: MIT.
The algorithm is based on 2 methods. The first is a greedy heuristic that adds k-mers that
cover as many uncovered k-mers. The second is an ILP formulation solved by black-box solvers.
Input and Output
JOKERCAKE takes as input nine parameters:
1. joker template - string over 0's and 1's, where 1's tell where the joker is.
2. alphabet + joker character - for example, DNA: ACGTN, Amino acid: ARNDCQEGHILKMFPSTWYVX
3. multiplicity filename - in the file for each k-mer how many times it has to occur.
4. reverse complement - 0 for coering all k-mers, 1 for covering either a k-mer or its reverse complement
5. time_limit - time limit for the ILP run (in seconds). When set to 0, no ILP solution is provided.
6. heuristic of the first algorithm - 0 no heuristic, 1 for greedy, 2 for random.
7. first algorithm output filename.
8. ILP output filename.
Get the software
A binary is available here (requires java 1.7 or higher):
java -jar joker.jar <template> <alphabet+joker> <multi_file> <reverse_complement> <time_limit> <heuristic_to_use> <heuristic_filename> <ILP_filename>
Example run:
java -jar joker.jar 00001 ACGTN multi1_1024.txt 0 200 1 heuristic.txt ILP.txt
Without ILP:
java -jar joker.jar 00001 ACGTN multi1_1024.txt 0 0 2 heuristic.txt ILP.txt
The file multi1_1024.txt has 1 for each of the 1024 k-mers.