With the rapid accumulation of high-throughput metagenomic sequencing data, it has become possible to systematically infer the relations among microbial species in a community. In recent years, several approaches have been proposed for identifying microbial interaction networks. These methods often focus on a single dataset and do not exploit the advantages of data integration. In this study, we propose using a similarity network fusion (SNF) method to infer microbial relations. SNF efficiently integrates the species similarities derived from different datasets through a cross-network diffusion process. We also introduce a consensus k-nearest neighborhood (Ck-NN) method in place of the k-NN used in the original SNF (we call the resulting approach CSNF). The final network represents augmented species relationships with evidence aggregated from the various datasets, taking advantage of the complementarity in the data. We apply the method to genus profiles derived from three microbiome datasets and find that CSNF can discover modular structure in the microbial interaction network that cannot be identified by analyzing a single dataset.
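The cross-network diffusion described above can be illustrated with a minimal sketch. It assumes each dataset has already been reduced to a non-negative genus-by-genus similarity matrix, and it follows the general SNF update scheme (each network is diffused through the average of the others, using a sparse local kernel). The abstract does not define Ck-NN, so the consensus rule used here, keeping a neighbor only if it appears in the k-NN lists of at least `min_votes` datasets, is an assumption made purely for illustration, as are all function and parameter names.

```python
import numpy as np

def full_kernel(W):
    """Row-normalise so off-diagonal entries sum to 1/2 and the diagonal is 1/2."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    row = W.sum(axis=1, keepdims=True)
    row[row == 0] = 1.0  # guard: isolated nodes keep only their self-weight
    P = W / (2.0 * row)
    np.fill_diagonal(P, 0.5)
    return P

def knn_mask(W, k):
    """Boolean mask marking the k most similar neighbours of each row (self excluded)."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, -np.inf)
    idx = np.argsort(-W, axis=1)[:, :k]
    mask = np.zeros(W.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

def consensus_mask(Ws, k, min_votes):
    """ASSUMED consensus rule: keep a neighbour seen in >= min_votes k-NN lists."""
    votes = sum(knn_mask(W, k).astype(int) for W in Ws)
    return votes >= min_votes

def local_kernel(W, mask):
    """Sparse kernel: similarities restricted to the masked neighbours, rows sum to 1."""
    S = np.where(mask, W, 0.0)
    row = S.sum(axis=1, keepdims=True)
    row[row == 0] = 1.0  # rows without consensus neighbours stay all-zero
    return S / row

def fuse(Ws, k=3, min_votes=2, iters=20):
    """Fuse several similarity matrices by iterative cross-network diffusion."""
    mask = consensus_mask(Ws, k, min_votes)
    S = [local_kernel(W, mask) for W in Ws]
    P = [full_kernel(W) for W in Ws]
    for _ in range(iters):
        P_next = []
        for v in range(len(Ws)):
            # Diffuse network v through the average state of all other networks.
            others = np.mean([P[u] for u in range(len(Ws)) if u != v], axis=0)
            P_next.append(full_kernel(S[v] @ others @ S[v].T))
        P = P_next
    fused = np.mean(P, axis=0)
    return (fused + fused.T) / 2.0  # symmetrise the final network
```

Calling `fuse([W1, W2, W3])` on three similarity matrices of equal shape returns a single symmetric matrix. Modules could then be sought with any community detection or clustering method applied to it; which method the study used is not stated here.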
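Orthogonal forward selection based on leave-one-out criteria (OFS-LOO) is a recently proposed data modeling method that produces robust kernel regression models with tunable parameters. OFS-LOO follows a greedy strategy: it uses a global optimization algorithm to tune the parameters of each regressor one term at a time, incrementally adding terms to the model and reducing the value of the leave-one-out criterion. However, OFS-LOO keeps only the current best solution as the parameters of the new regressor and ignores the effect of the current choice on subsequent steps, which undermines the sparsity of the model. This paper proposes a novel tree-structured algorithm within the OFS-LOO framework. When selecting each term of the kernel model, it applies the repeated weighted boosting search (RWBS) algorithm and retains the multiple local optima found by RWBS as candidates for the kernel parameters. The new method seeks a compromise between conventional OFS-LOO and the globally optimal solution. Experiments show that, compared with the conventional method, the new method yields kernel models that are sparser and generalize better.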
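The idea of retaining several candidates per term, rather than only the greedy best, can be sketched as a beam-style tree search scored by the leave-one-out mean squared error. This is an illustration under stated assumptions and not the paper's algorithm: random draws of Gaussian centers and widths stand in for the local optima that RWBS would return, each model is refit with a pseudo-inverse instead of an orthogonal decomposition, and pruning the tree to a fixed beam width is a choice made here. All names and parameters are hypothetical.

```python
import numpy as np

def gaussian_column(X, center, width):
    """Response of one Gaussian kernel term on every training input."""
    return np.exp(-np.sum((X - center) ** 2, axis=1) / (2.0 * width ** 2))

def loo_mse(Phi, y):
    """Leave-one-out MSE of the least-squares fit, via residual / (1 - leverage)."""
    pinv = np.linalg.pinv(Phi)
    leverage = np.einsum("ij,ji->i", Phi, pinv)  # diagonal of the hat matrix
    resid = y - Phi @ (pinv @ y)
    return float(np.mean((resid / np.maximum(1.0 - leverage, 1e-12)) ** 2))

def tree_search(X, y, beam=3, n_candidates=20, max_terms=10, seed=0):
    """Grow kernel models term by term, keeping the `beam` best partial models."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # The root of the tree is the empty model, which predicts zero everywhere.
    frontier = [([], np.empty((n, 0)), float(np.mean(y ** 2)))]
    best = frontier[0]
    for _ in range(max_terms):
        children = []
        for terms, Phi, _ in frontier:
            for _ in range(n_candidates):
                # ASSUMPTION: random candidates replace the RWBS local optima.
                center = X[rng.integers(n)]
                width = rng.uniform(0.2, 2.0)
                Phi_new = np.column_stack([Phi, gaussian_column(X, center, width)])
                children.append((terms + [(center, width)], Phi_new,
                                 loo_mse(Phi_new, y)))
        children.sort(key=lambda child: child[2])
        frontier = children[:beam]
        if frontier[0][2] >= best[2]:
            break  # the leave-one-out criterion stopped decreasing
        best = frontier[0]
    return best[0], best[2]
```

With `beam=1` the search reduces to a purely greedy forward selection, which is the behaviour the abstract attributes to conventional OFS-LOO. Larger beams explore more of the tree at proportionally higher cost.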