Split Fasta File by Max Allowed MB

Many programs are available today for analyzing data in fasta format. Some have attractive websites or downloadable software packages that you can run locally, while others are accessible only through a web interface. File size limits on uploads are a common problem when working with these web-only programs.
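As a rough illustration of the idea in the title, here is a minimal Python sketch that groups fasta records into chunks that stay under a size limit without cutting a record in half. It is not the script from the post; the function names (`read_fasta_records`, `split_by_size`, `split_fasta_file`) and the `chunk_N.fasta` output naming are assumptions made for this example.

```python
from typing import Iterable, Iterator, List


def read_fasta_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield each record (header line plus its sequence lines) as one string."""
    record: List[str] = []
    for line in lines:
        # A new header closes the record collected so far.
        if line.startswith(">") and record:
            yield "".join(record)
            record = []
        if line.strip():  # skip blank lines
            record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield "".join(record)


def split_by_size(records: Iterable[str], max_bytes: int) -> List[List[str]]:
    """Group records into chunks whose total size stays within max_bytes.

    Records are never split. A single record larger than max_bytes
    still gets a chunk of its own, so that chunk will exceed the limit.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for rec in records:
        rec_size = len(rec.encode("utf-8"))
        if current and current_size + rec_size > max_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(rec)
        current_size += rec_size
    if current:
        chunks.append(current)
    return chunks


def split_fasta_file(path: str, max_mb: float, prefix: str = "chunk") -> List[str]:
    """Write the chunks to prefix_1.fasta, prefix_2.fasta, ... (hypothetical naming)."""
    max_bytes = int(max_mb * 1024 * 1024)
    with open(path) as handle:
        chunks = split_by_size(read_fasta_records(handle), max_bytes)
    out_paths = []
    for i, chunk in enumerate(chunks, start=1):
        out_path = f"{prefix}_{i}.fasta"
        with open(out_path, "w") as out:
            out.writelines(chunk)
        out_paths.append(out_path)
    return out_paths
```

Calling `split_fasta_file("transcripts.fasta", 10)` would write pieces intended to fit a 10 MB upload limit; the input file name here is only a placeholder.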

Uniquely Merge Fasta Files – Get Basic Stats

A common task when analyzing assembly data is merging fasta files of transcripts from multiple assemblies. The bash script in this post combines a number of input fasta files while retaining only the sequences that are unique by the specified criterion. Files can be merged uniquely by sequence, by sequence name, or by both sequence and name.
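The three uniqueness criteria can be sketched in a few lines of Python. This is an illustration only, not the bash script from the post; `parse_fasta`, `merge_unique`, and the `by` values are names assumed for the example, and the first occurrence of each key is the one kept.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (name, sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (name, sequence) pairs; multi-line sequences are joined."""
    name = None
    seq_parts: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(seq_parts)
            # The name is the whole header line without the leading ">".
            name, seq_parts = line[1:], []
        elif name is not None:
            seq_parts.append(line)
    if name is not None:
        yield name, "".join(seq_parts)


def merge_unique(record_sets: Iterable[Iterable[Record]], by: str = "sequence") -> List[Record]:
    """Merge several sets of records, keeping the first record seen for each key.

    by="sequence": unique by sequence
    by="name":     unique by sequence name
    by="both":     unique by the (name, sequence) pair
    """
    key_funcs = {
        "sequence": lambda name, seq: seq,
        "name": lambda name, seq: name,
        "both": lambda name, seq: (name, seq),
    }
    if by not in key_funcs:
        raise ValueError(f"by must be one of {sorted(key_funcs)}")
    make_key = key_funcs[by]
    seen = set()
    merged: List[Record] = []
    for records in record_sets:
        for name, seq in records:
            key = make_key(name, seq)
            if key not in seen:
                seen.add(key)
                merged.append((name, seq))
    return merged
```

Comparison here is exact and case-sensitive, so `ACGT` and `acgt` count as different sequences; whether that matches the original script is not stated in the excerpt.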

Trinity Assembly Guide File

Input RNA-seq reads for Trinity assembly need to be in a comma-separated list, but a bash wildcard expands to a space-separated list. Instead, the --samples_file option can be used to supply a tab-delimited text file indicating biological replicate relationships for the reads. An example of the samples_file format is given under the Running Trinity section of the Trinity wiki.
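A small Python sketch of generating such a file is shown below. The column order used here (condition, replicate name, left reads, and optionally right reads) is an assumption based on the example on the Trinity wiki, so check it against the wiki before use; `build_samples_file` and all sample and file names are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# (condition, replicate name, left reads file, right reads file or None)
Sample = Tuple[str, str, str, Optional[str]]


def build_samples_file(samples: Iterable[Sample]) -> str:
    """Return tab-delimited text with one line per biological replicate."""
    lines = []
    for condition, replicate, left, right in samples:
        fields = [condition, replicate, left]
        if right:  # single-end reads have no right file
            fields.append(right)
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


# Hypothetical paired-end samples: two conditions, two replicates each.
samples = [
    ("cond_A", "cond_A_rep1", "A_rep1_left.fq", "A_rep1_right.fq"),
    ("cond_A", "cond_A_rep2", "A_rep2_left.fq", "A_rep2_right.fq"),
    ("cond_B", "cond_B_rep1", "B_rep1_left.fq", "B_rep1_right.fq"),
    ("cond_B", "cond_B_rep2", "B_rep2_left.fq", "B_rep2_right.fq"),
]

samples_text = build_samples_file(samples)
# To save it: open("samples.txt", "w").write(samples_text)
```

The resulting text file would then be passed to Trinity with the --samples_file option in place of the comma-separated read lists.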