Sunday, January 25, 2015

Platanus for assembly of a highly heterozygous genome

Prunus yedoensis whole genome assembly

Library:
- total RNA, Illumina's TruSeq library kit

raw data:
PE:
MiSeq 500bp (library name: Prunus-1) 
NextSeq 300bp (library name: Prunus-2)
MP:
NextSeq 3kb, 5kb, 10kb, 15kb

File List:
PE:

[goonbokim@R820 miseq_genome_raw_fastq-gunzip]$ ll -h
total 153G
-rw-r--r-- 1 root root 138K Sep 15 14:36 140704_FASTQ.pdf
-rw-r--r-- 1 root root  212 Sep 15 14:36 140704_FASTQ.pdf.md5sum.txt
-rw-r--r-- 1 root root  217 Sep 15 14:40 check_md5sum.txt
-rw-r--r-- 1 root root  657 Sep 15 14:40 get_raw_fastq.pl
-rw-r--r-- 1 root root  248 Sep 15 14:40 get_url.txt
-rw------- 1 root root  78M Sep 15 14:40 nohup.out
-rw-r--r-- 1 root root  35G Sep 15 14:37 Prunus-1_1.fastq   << MiSeq (500bp).1
-rw-r--r-- 1 root root  35G Sep 15 14:38 Prunus-1_2.fastq   << MiSeq (500bp).2
-rw-r--r-- 1 root root  42G Sep 15 14:38 Prunus-2_1.fastq   << NextSeq(300bp).1
-rw-r--r-- 1 root root  42G Sep 15 14:40 Prunus-2_2.fastq   << NextSeq(300bp).2

MP:

[goonbokim@R820 NextSeq_genome_MP_raw_fastq-gunzip]$ ll -h
total 204G
-rw-r--r-- 1 root root 145K Aug 18 10:12 140814_Prunus_MP_FASTQ.pdf
-rw-r--r-- 1 root root  226 Aug 18 09:57 get_file.md5sum
-rw-r--r-- 1 root root  658 Aug 18 10:12 get_raw_fastq.pl
-rw-r--r-- 1 root root   64 Aug 18 10:12 get_url.txt
-rw------- 1 root root  15M Aug 18 10:12 nohup.out
-rw-r--r-- 1  701 1000  13G Aug 14 15:23 Prun-MP-10kb-1_1.fastq
-rw-r--r-- 1  701 1000  13G Aug 14 15:23 Prun-MP-10kb-1_2.fastq
-rw-r--r-- 1  701 1000  13G Aug 14 15:24 Prun-MP-10kb-2_1.fastq
-rw-r--r-- 1  701 1000  13G Aug 14 15:25 Prun-MP-10kb-2_2.fastq
-rw-r--r-- 1  701 1000  17G Aug 14 15:16 Prun-MP-15kb-1_1.fastq
-rw-r--r-- 1  701 1000  17G Aug 14 15:17 Prun-MP-15kb-1_2.fastq
-rw-r--r-- 1  701 1000  11G Aug 14 15:18 Prun-MP-15kb-2_1.fastq
-rw-r--r-- 1  701 1000  11G Aug 14 15:18 Prun-MP-15kb-2_2.fastq
-rw-r--r-- 1  701 1000  12G Aug 14 15:25 Prun-MP-3kb-1_1.fastq
-rw-r--r-- 1  701 1000  12G Aug 14 15:26 Prun-MP-3kb-1_2.fastq
-rw-r--r-- 1  701 1000  14G Aug 14 15:22 Prun-MP-3kb-2_1.fastq
-rw-r--r-- 1  701 1000  14G Aug 14 15:22 Prun-MP-3kb-2_2.fastq
-rw-r--r-- 1  701 1000  14G Aug 14 15:19 Prun-MP-5kb-1_1.fastq
-rw-r--r-- 1  701 1000  14G Aug 14 15:20 Prun-MP-5kb-1_2.fastq
-rw-r--r-- 1  701 1000  13G Aug 14 15:20 Prun-MP-5kb-2_1.fastq
-rw-r--r-- 1  701 1000  13G Aug 14 15:21 Prun-MP-5kb-2_2.fastq





trimming:
java -jar /Bio/apps/Trimmomatic-0.32/trimmomatic-0.32.jar PE -threads 64 \
  -trimlog ./2.2-trimmomatic/fileindex2.log -phred33 \
  ./raw_split/xaa_1.fastq ./raw_split/xaa_2.fastq \
  ./trim_split/xaa_1-p-trim.fastq ./trim_split/xaa_1-u-trim.fastq \
  ./trim_split/xaa_2-p-trim.fastq ./trim_split/xaa_2-u-trim.fastq \
  ILLUMINACLIP:/Bio/apps/Trimmomatic-0.32/adapters/illumina-adaptor.fa:2:30:10 \
  LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:200
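
The command above handles a single chunk (xaa). A minimal sketch of how the same call could be repeated for several chunks is shown below. It is a dry run that only prints the commands. The chunk names xab and xac are hypothetical (assumed from the default naming of `split`), and the -trimlog option is left out because the log naming for the other chunks is not recorded here.

```shell
#!/bin/sh
# Dry-run sketch: print the Trimmomatic PE command for each chunk prefix.
# Paths and trimming steps are copied from the single command above;
# the chunk names xab and xac are hypothetical.
TRIMMOMATIC=/Bio/apps/Trimmomatic-0.32

trim_cmd() {
    chunk=$1
    echo "java -jar ${TRIMMOMATIC}/trimmomatic-0.32.jar PE -threads 64 -phred33" \
         "./raw_split/${chunk}_1.fastq ./raw_split/${chunk}_2.fastq" \
         "./trim_split/${chunk}_1-p-trim.fastq ./trim_split/${chunk}_1-u-trim.fastq" \
         "./trim_split/${chunk}_2-p-trim.fastq ./trim_split/${chunk}_2-u-trim.fastq" \
         "ILLUMINACLIP:${TRIMMOMATIC}/adapters/illumina-adaptor.fa:2:30:10" \
         "LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:200"
}

# Print one command per chunk; pipe the output to sh to actually run them.
for chunk in xaa xab xac; do
    trim_cmd "$chunk"
done
```

Printing the commands first makes it easy to check the file names before committing 64 threads to each run.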

Error correction with SOAPec:


[goonbokim@R820 2.2-trimmomatic]$ ls *.cor.* -lh
-rw-rw-r-- 1 goonbokim goonbokim  31G Nov 13 22:58 Prunus-1_1-paired-trimmed.fastq.cor.pair_1.fq
-rw-rw-r-- 1 goonbokim goonbokim   85 Nov 13 22:58 Prunus-1_1-paired-trimmed.fastq.cor.pair.single.stat
-rw-rw-r-- 1 goonbokim goonbokim  38M Nov 13 22:58 Prunus-1_1-paired-trimmed.fastq.cor.single.fq
-rw-rw-r-- 1 goonbokim goonbokim  719 Nov 13 18:09 Prunus-1_1-paired-trimmed.fastq.cor.stat
-rw-rw-r-- 1 goonbokim goonbokim 2.3G Nov 13 17:39 Prunus-1_1-unpaired-trimmed.fastq.cor.fq
-rw-rw-r-- 1 goonbokim goonbokim  704 Nov 13 17:39 Prunus-1_1-unpaired-trimmed.fastq.cor.stat
-rw-rw-r-- 1 goonbokim goonbokim  28G Nov 13 22:58 Prunus-1_2-paired-trimmed.fastq.cor.pair_2.fq
-rw-rw-r-- 1 goonbokim goonbokim  717 Nov 13 18:45 Prunus-1_2-paired-trimmed.fastq.cor.stat
-rw-rw-r-- 1 goonbokim goonbokim 240M Nov 13 17:40 Prunus-1_2-unpaired-trimmed.fastq.cor.fq
-rw-rw-r-- 1 goonbokim goonbokim  687 Nov 13 17:40 Prunus-1_2-unpaired-trimmed.fastq.cor.stat
-rw-rw-r-- 1 goonbokim goonbokim  39G Nov 13 23:01 Prunus-2_1-paired-trimmed.fastq.cor.pair_1.fq
-rw-rw-r-- 1 goonbokim goonbokim   86 Nov 13 23:01 Prunus-2_1-paired-trimmed.fastq.cor.pair.single.stat
-rw-rw-r-- 1 goonbokim goonbokim 175M Nov 13 23:01 Prunus-2_1-paired-trimmed.fastq.cor.single.fq
-rw-rw-r-- 1 goonbokim goonbokim  723 Nov 13 21:55 Prunus-2_1-paired-trimmed.fastq.cor.stat
-rw-rw-r-- 1 goonbokim goonbokim 2.4G Nov 13 17:46 Prunus-2_1-unpaired-trimmed.fastq.cor.fq
-rw-rw-r-- 1 goonbokim goonbokim  702 Nov 13 17:46 Prunus-2_1-unpaired-trimmed.fastq.cor.stat
-rw-rw-r-- 1 goonbokim goonbokim  38G Nov 13 23:01 Prunus-2_2-paired-trimmed.fastq.cor.pair_2.fq
-rw-rw-r-- 1 goonbokim goonbokim  723 Nov 13 22:37 Prunus-2_2-paired-trimmed.fastq.cor.stat
-rw-rw-r-- 1 goonbokim goonbokim 683M Nov 13 17:47 Prunus-2_2-unpaired-trimmed.fastq.cor.fq
-rw-rw-r-- 1 goonbokim goonbokim  695 Nov 13 17:47 Prunus-2_2-unpaired-trimmed.fastq.cor.stat

MP trimming
nextclip:
[goonbokim@R820 MP]$ nextclip -i /media/NAS/Prunus/NextSeq_genome_MP_raw_fastq-gunzip/Prun-MP-3kb-1_1.fastq -j /media/NAS/Prunus/NextSeq_genome_MP_raw_fastq-gunzip/Prun-MP-3kb-1_2.fastq -o PyMP3k1 -d
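
The same nextclip call would presumably be repeated for each of the eight mate-pair libraries in the file list. The sketch below is a dry run that prints one command per library. The output-prefix scheme (PyMP3k1, PyMP3k2, PyMP5k1, ...) is an assumption extrapolated from the single example above, not something recorded in these notes.

```shell
#!/bin/sh
# Dry-run sketch: print one nextclip command per mate-pair library.
# RAW_DIR and the -i/-j/-o/-d options are taken from the command above;
# the output prefixes other than PyMP3k1 are assumed.
RAW_DIR=/media/NAS/Prunus/NextSeq_genome_MP_raw_fastq-gunzip

nextclip_cmds() {
    for size in 3kb 5kb 10kb 15kb; do
        for rep in 1 2; do
            # ${size%b} strips the trailing "b": 3kb -> 3k
            echo "nextclip -i ${RAW_DIR}/Prun-MP-${size}-${rep}_1.fastq" \
                 "-j ${RAW_DIR}/Prun-MP-${size}-${rep}_2.fastq" \
                 "-o PyMP${size%b}${rep} -d"
        done
    done
}

nextclip_cmds
```

The first printed line reproduces the 3kb-1 command shown above, which is a quick check that the loop matches the original invocation.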