The SNP Detection System

Java
Healthcare

The customer

The customer helps scientists and laboratories conduct research and experiments in the life sciences. Its key services include next-generation sequencing, bioanalytical and mass spectrometry services, and DNA sequencing. The customer turned to Altoros to develop a solution that would make detecting single-nucleotide polymorphisms (SNPs) in digitized DNA sequences stored in the FASTA/FASTQ formats easier and less time-consuming.

The need

A common problem for researchers who work on genome analysis is the need to store and process terabytes of data quickly. The customer needed a way to detect SNPs in digitized DNA sequences that was easier and less time-consuming than its existing process.

The challenges

In addition to building an algorithm for detecting SNPs, we had to determine which hardware configuration could deliver the required data processing speed.

The solution

The team completed the following tasks for this project:

  • Implementation of the data analysis algorithm. Our team designed a web application that detects SNPs and brings all the tools required for genome analysis together in one user-friendly interface. The software used Bowtie and SAMtools to align short DNA reads to the human genome, and SOAPsnp to assemble consensus sequences and align raw sequencing reads on the known reference.
  • Assessment of computation capacities. Our customer wanted to analyze large sets of sequencing data, averaging 150 GB each, about 2-3 times a month. All computations had to finish within 24 hours. We deployed the system on the Amazon cloud to balance the cost of the solution against its throughput.
  • Feasibility study and system testing. Our team built a testing infrastructure using Amazon Web Services and Amazon Elastic MapReduce and provided a detailed report that indicated the cost of each option depending on frequency of use, processing time, and amount of processed data.
  • Building a private infrastructure. Although the company was delighted with the results, it faced a new issue: the amount of data continued to grow, and eventually AWS had to be used more frequently. It was therefore decided to build a private infrastructure inside the customer’s laboratory.
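To make the idea behind the data analysis step more concrete, here is a minimal sketch of SNP detection by majority vote over reads that have already been aligned to a reference. It is an illustration only, not the project's code and not the statistical model SOAPsnp uses. The class and method names (`SnpSketch`, `AlignedRead`, `Snp`, `pileup`, `callSnps`) and the `minDepth` and `minFraction` thresholds are hypothetical, and the sketch assumes substitution-only alignments with no indels and no base quality scores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: a naive majority-vote caller, not SOAPsnp's model.
class SnpSketch {
    static final String BASES = "ACGT";

    // Hypothetical type: a read already aligned to the reference at a 0-based offset.
    static final class AlignedRead {
        final int start;
        final String bases;
        AlignedRead(int start, String bases) { this.start = start; this.bases = bases; }
    }

    // Hypothetical type: a position where the read consensus differs from the reference.
    static final class Snp {
        final int position;
        final char refBase;
        final char altBase;
        Snp(int position, char refBase, char altBase) {
            this.position = position;
            this.refBase = refBase;
            this.altBase = altBase;
        }
        @Override public String toString() { return position + ":" + refBase + ">" + altBase; }
    }

    // "Map"-like step: count how often each base is observed at each reference position.
    static Map<Integer, int[]> pileup(List<AlignedRead> reads, int refLength) {
        Map<Integer, int[]> counts = new TreeMap<>();
        for (AlignedRead read : reads) {
            for (int i = 0; i < read.bases.length(); i++) {
                int pos = read.start + i;
                int idx = BASES.indexOf(read.bases.charAt(i));
                // Skip ambiguous bases (such as N) and positions outside the reference.
                if (pos < 0 || pos >= refLength || idx < 0) continue;
                counts.computeIfAbsent(pos, k -> new int[4])[idx]++;
            }
        }
        return counts;
    }

    // "Reduce"-like step: report positions where a well-supported majority base
    // differs from the reference base.
    static List<Snp> callSnps(String reference, List<AlignedRead> reads,
                              int minDepth, double minFraction) {
        List<Snp> snps = new ArrayList<>();
        for (Map.Entry<Integer, int[]> entry : pileup(reads, reference.length()).entrySet()) {
            int[] c = entry.getValue();
            int depth = c[0] + c[1] + c[2] + c[3];
            int best = 0;
            for (int i = 1; i < 4; i++) {
                if (c[i] > c[best]) best = i;
            }
            char consensus = BASES.charAt(best);
            char ref = reference.charAt(entry.getKey());
            boolean supported = depth >= minDepth && (double) c[best] / depth >= minFraction;
            if (supported && consensus != ref) {
                snps.add(new Snp(entry.getKey(), ref, consensus));
            }
        }
        return snps;
    }

    public static void main(String[] args) {
        String reference = "ACGTACGTAC";
        List<AlignedRead> reads = List.of(
            new AlignedRead(0, "ACGTTCGT"),
            new AlignedRead(2, "GTTCGTAC"),
            new AlignedRead(3, "TTCGTAC"),
            new AlignedRead(1, "CGTACGT"));
        // Three of the four reads covering position 4 show T where the reference has A.
        System.out.println(callSnps(reference, reads, 3, 0.7)); // prints [4:A>T]
    }
}
```

Splitting the work into a counting step and a calling step mirrors the map and reduce phases of a distributed job, which is one reason this kind of analysis lends itself to running on a cluster.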

The outcome

With the help of the automated SNP detection system, our customer’s biological laboratory managed to process 150 GB of genome sequence data within 24 hours at minimum cost. We started by developing a prototype to test the possible deployment options and make sure the functionality worked correctly. The SNP detection system was later installed on the customer’s private distributed infrastructure, where data processing was performed with Apache Hadoop.
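The target of 150 GB within 24 hours can be turned into a sustained throughput figure that candidate configurations are checked against. The sketch below shows that arithmetic, along with a simple monthly cost estimate of the kind a feasibility report might contain. Only the 150 GB, 24 hours, and 2-3 runs per month figures come from this case study. The per-node throughput and the price per node-hour are hypothetical placeholders, not measured or quoted values, and the class and method names are illustrative.

```java
// Illustrative capacity arithmetic; per-node throughput and prices are assumptions.
class ThroughputSketch {

    // Sustained end-to-end throughput in MB/s needed to process dataGb gigabytes
    // within the given number of hours (using 1 GB = 1000 MB).
    static double requiredMBps(double dataGb, double hours) {
        return dataGb * 1000.0 / (hours * 3600.0);
    }

    // Number of worker nodes needed if each node sustains perNodeMBps,
    // assuming the workload scales linearly across nodes.
    static int nodesNeeded(double dataGb, double hours, double perNodeMBps) {
        return (int) Math.ceil(requiredMBps(dataGb, hours) / perNodeMBps);
    }

    // Monthly cost when every run keeps all nodes busy for hoursPerRun.
    static double monthlyCost(int runsPerMonth, int nodes, double hoursPerRun,
                              double pricePerNodeHour) {
        return runsPerMonth * nodes * hoursPerRun * pricePerNodeHour;
    }

    public static void main(String[] args) {
        double required = requiredMBps(150, 24);      // about 1.74 MB/s
        int nodes = nodesNeeded(150, 24, 0.25);       // 0.25 MB/s per node is a placeholder
        double cost = monthlyCost(3, nodes, 24, 0.5); // 0.5 per node-hour is a placeholder
        System.out.printf("required=%.2f MB/s, nodes=%d, monthly cost=%.2f%n",
                          required, nodes, cost);
    }
}
```

The required rate is small by disk or network standards because alignment and SNP calling are compute-bound. That is why the per-node figure has to come from benchmarking the actual pipeline rather than from hardware specifications.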

Technology stack

Server Platform

Linux, Amazon Web Services

Client Platform/Application Server

Internet Explorer, Firefox, Safari, Chrome

Technologies

MapReduce, Java, HTML, Apache Hadoop, Amazon EMR

Programming languages

Perl, Java, Bash

Database, Storage

HDFS

Development Environment

Linux editors, Java IDE, Amazon AWS console
