e-journal
Constructing a Gene Team Tree in Almost Oðn lg nÞ Time
An important model of a conserved gene cluster is called the gene team model, in which a chromosome is defined to be a permutation of distinct genes and a gene team is defined to be a set of genes that appear in two or more species, with the distance between adjacent genes in the team for each chromosome always no more than a certain threshold d. A gene team tree is a succinct way to represent all gene teams for every possible value of d. The previous fastest algorithm for constructing a gene team tree of two
chromosomes requires Oðn lg n lglg nÞ time, which was given by Wang and Lin. Its bottleneck is a problem called the maximum-gap problem. In this paper, by presenting an improved algorithm for the maximum-gap problem, we reduce the upper bound of the gene team tree problem to Oðn lg n aðnÞÞ. Since a grows extremely slowly, this result is almost as efficient as the current best upper bound, Oðn lg nÞ, for finding the gene teams of a fixed d value. Our new algorithm is very efficient from both the theoretical and practical points of view. Wang and Lin’s gene-team-tree algorithm can be extended to k chromosomes with complexity Oðkn lg n lglg nÞ. Similarly, our improved algorithm for the maximum-gap problem reduces this running time to Oðkn lg n aðnÞÞ. In addition, it also provides new upper bounds for the gene team tree problem on general sequences, in which multiple copies of the same gene are allowed.
Index Terms—Algorithms, data structures, gene teams, comparative genomics, conserved gene clusters
Tidak ada salinan data
Tidak tersedia versi lain