Philadelphia University + Thomas Jefferson University


The following software, developed by the Jefferson Genomics scientists and their collaborators, is available.


Improvements in experimental technologies, as well as the increasing size and complexity of research data sets, have made computational methodologies critically important for successful research. The Jefferson Cancer Genomics Laboratory Shared Resources provide specialized computational expertise for the generation, management, analysis, and delivery of information derived from the research setting. Areas include nucleic acid, genomic, and functional genomic data analysis; large data set manipulation and management; data visualization; high performance computing; database design and implementation; and Internet-based data dissemination solutions. The Bioinformatics component of the Jefferson Cancer Genomics Laboratory Shared Resource maintains strong working relationships with the research and clinical departments, individual investigators, and other research entities to ensure the seamless transfer of information and to maximize efficiency and productivity.

The following computing resources for Bioinformatics purposes are available:

A 10-node/80-core Relion High Performance Computing Cluster running CentOS 4.5 Enterprise Linux
Each node is equipped with dual Xeon 2.53GHz quad core CPUs, 48GB of RAM, and 2TB hard drives. The cluster also includes a Master Node with dual Xeon 2.53GHz quad core CPUs, 48GB of RAM, and 500GB of storage, and a Storage Node with dual Xeon 2.4GHz quad core CPUs, 24GB of RAM, and 35TB of usable storage for long-term storage of Next Generation Sequencing data.
An 8-node Dell PowerEdge Computing Cluster running Red Hat Linux
Each node has two dual-core 64-bit 2.66 GHz CPUs and 4 GB of RAM. The cluster is also equipped with a 2 Terabyte file storage array.
Two Dell Precision workstations running Red Hat Enterprise Linux
The Linux workstations are equipped with 64-bit quad core 2.4GHz Xeon processors, 12 GB of RAM, and 500GB of storage.
Four Dell Precision workstations running Microsoft Windows
The Dell Precision desktops are equipped with 64-bit dual core 2GHz Xeon processors and 4 GB of RAM. These workstations also share a 2TB networked storage drive.
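
The aggregate capacity of the two clusters follows directly from the per-node figures listed above. The short Python sketch below is only an illustration: the NodeGroup type and its field names are hypothetical, it counts compute nodes only (the Master Node and Storage Node are left out), and it assumes every node in a cluster is configured identically, as the list describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """A set of identically configured compute nodes (hypothetical helper type)."""
    name: str
    nodes: int
    cpus_per_node: int
    cores_per_cpu: int
    ram_gb_per_node: int

    @property
    def total_cores(self) -> int:
        # nodes x CPUs per node x cores per CPU
        return self.nodes * self.cpus_per_node * self.cores_per_cpu

    @property
    def total_ram_gb(self) -> int:
        # nodes x RAM per node
        return self.nodes * self.ram_gb_per_node


# Figures copied from the hardware list; compute nodes only.
relion = NodeGroup("Relion HPC cluster", nodes=10, cpus_per_node=2,
                   cores_per_cpu=4, ram_gb_per_node=48)
poweredge = NodeGroup("Dell PowerEdge cluster", nodes=8, cpus_per_node=2,
                      cores_per_cpu=2, ram_gb_per_node=4)

for group in (relion, poweredge):
    print(f"{group.name}: {group.total_cores} cores, {group.total_ram_gb} GB RAM")
```

The first result reproduces the 80-core figure quoted for the Relion cluster, which is a quick consistency check on the per-node description.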

The Bioinformatics core maintains access to a wide range of software applications, including Affymetrix Expression Console, Affymetrix Genotyping Console 4, Sequencher 4.8, Nexus Copy Number, GeneSpring GX, and Ingenuity Pathways Analysis. GeneSpring GX and Ingenuity Pathways Analysis are efficient and powerful tools for comprehensive microarray and functional genomics data analysis. Next Generation Sequencing software packages include ABI LifeScope .5, GenePattern, and GeneSpring GX, as well as several other application-specific packages. The core also provides customizable analysis pipelines using the MATLAB Bioinformatics Toolbox, in addition to open source solutions such as Bioconductor for the R platform.

Data Sharing

Our facility manages a high performance computing cluster equipped with 35TB of usable storage for long-term data access and storage. This equipment is housed in a high-security Tier IV datacenter and connected to our facility over a redundant 10 Gb fiber ring network. Following each sequencing run performed on the SOLiD 4 platform, the data generated by primary analysis are transferred to the storage server for secondary analysis and long-term storage. While each flow cell slide run on the SOLiD 4 5500x generates several terabytes of raw data, primary processing steps reduce the amount of sequence data to approximately 100 GB per slide. This configuration keeps historical data readily available for retrospective analysis as each project generates new data. Additionally, secure backup services are provided by our offsite data center. Incremental backups are performed daily, and full backups are performed on a bi-weekly schedule. Together, these measures ensure uninterrupted service and a high level of data security for our collaborators.
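
As a rough illustration of what these figures imply for capacity, the Python sketch below estimates how many slides of processed data fit in the usable storage. It is a back-of-the-envelope estimate, not a statement of the facility's actual allocation: it assumes decimal units (1 TB = 1000 GB), treats the approximate 100 GB per slide as exact, and ignores space used by secondary analysis output. The function name is hypothetical.

```python
# Figures taken from the text above.
USABLE_STORAGE_TB = 35
PROCESSED_GB_PER_SLIDE = 100

# Assumption: decimal units, 1 TB = 1000 GB.
GB_PER_TB = 1000


def slides_that_fit(usable_tb: int, gb_per_slide: int) -> int:
    """Return how many slides of processed data fit in the usable storage."""
    return (usable_tb * GB_PER_TB) // gb_per_slide


if __name__ == "__main__":
    capacity = slides_that_fit(USABLE_STORAGE_TB, PROCESSED_GB_PER_SLIDE)
    print(f"Approximate capacity: {capacity} slides of processed data")
```

Under these assumptions the storage holds on the order of 350 slides of processed data; the real figure would be lower once secondary analysis results are stored alongside the sequence data.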