CoreArray Library Framework (v1.0)

The CoreArray project develops portable and scalable storage technologies for bioinformatics data, allowing parallel computing at the multicore and cluster levels.


C++ library



R packages

gdsfmt          an R interface for the CoreArray library
                unit testing for gdsfmt
SNPRelate       a high-performance computing R package for relatedness and principal component analysis of SNP data
                unit testing for SNPRelate
SeqArray        an R/Bioconductor package for big data management of genome-wide sequencing variants


Install R packages

SeqArray        source(""); biocLite("SeqArray")



gdsfmt          online
SNPRelate       online, pdf; validation using PLINK v1.07 and EIGENSTRAT v3.0 (Validate.SNPRelate.r, Validate.SNPRelate.r.Rout)
SeqArray        online





Xiuwen Zheng





Copyright notice

CoreArray C/C++ library
Copyright (C) 2007-2014 Xiuwen Zheng
gdsfmt R package
Copyright (C) 2011-2014 Xiuwen Zheng
All rights reserved.
CoreArray/gdsfmt is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License Version 3 as published by the Free Software Foundation.
CoreArray is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with CoreArray. If not, see



Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS. A High-performance Computing Toolset for Relatedness and Principal Component Analysis of SNP Data. Bioinformatics. 2012 Dec 15; 28(24):3326-8. doi: 10.1093/bioinformatics/bts606.