The FASTA format is a text-based format for representing nucleotide sequences or amino acid (protein) sequences, in which each nucleotide or amino acid is represented by a single-letter code. The format also allows a description line, carrying the sequence name and comments, to precede each sequence. The description line begins with a '>' and gives a name or a unique identifier to the sequence. It may also contain additional information.
An example is shown below. It contains three sequences, each with an identifier and a description.
>sp|J7RUA5|CAS9_STAAU Start of CRISPR-associated endonuclease Cas9 OS=Staphylococcus aureus
MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRR
RHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHN
VNEVEEDTGNELS
>sp|Q99ZW2|CAS9_STRP1 Start of CRISPR-associated endonuclease Cas9/Csn1 OS=Streptococcus pyogenes
MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAE
ATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFG
NIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSD
VDKLFIQLVQT
>sp|G3ECR1|CAS9_STRTR Start of CRISPR-associated endonuclease Cas9 OS=Streptococcus thermophilus
MLFNKCIIISINLDFSNKEKCMTKPYSIGLDIGTNSVGWAVITDNYKVPSKKMKVLGNTS
KKYIKKNLLGVLLFDSGITAEGRRLKRTARRRYTRRRNRILYLQEIFSTEMATLDDAFFQ
RLDDSFLVPDDKRDSKYPIF
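Records like the ones above can be read with a few lines of code. The following is a minimal sketch, not a reference implementation: the function names `parse_fasta` and `_split_header` are hypothetical, and it assumes that the identifier is the text between '>' and the first whitespace character, with the rest of the line treated as the description. The short input in the usage lines is made up for illustration.

```python
def _split_header(header):
    """Split a description line (without '>') into (identifier, description).

    Assumption: the identifier ends at the first whitespace character.
    """
    parts = header.split(maxsplit=1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


def parse_fasta(text):
    """Yield (identifier, description, sequence) for each record in FASTA text."""
    header = None   # description line of the record being read
    chunks = []     # sequence lines collected for that record
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                yield (*_split_header(header), "".join(chunks))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if header is not None:
        yield (*_split_header(header), "".join(chunks))


# Usage with a small, made-up input (not real sequence data).
example = ">seq_1 example record\nMKRN\nYILG\n>seq_2\nACGT\n"
for identifier, description, sequence in parse_fasta(example):
    print(identifier, description, sequence)
```

Sequence lines are concatenated, so a sequence wrapped over several lines is returned as a single string.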
An identifier is composed of alphanumeric characters, underscores (_) and hyphens (-). Do not put spaces in an identifier: the identifier ends at the first space, and the text after it is read as the description.
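The identifier rule can be checked before a file is uploaded. The sketch below is one way to do it, assuming ASCII letters and digits only; the name `is_valid_identifier` is hypothetical. Note that the database-style identifiers in the example above contain '|' characters, which this strict check would reject.

```python
import re

# Letters, digits, underscores and hyphens only; at least one character.
# Assumption: "alphanumeric" means ASCII letters and digits.
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_identifier(identifier):
    """Return True if the whole identifier uses only the allowed characters."""
    return IDENTIFIER_PATTERN.fullmatch(identifier) is not None


print(is_valid_identifier("CAS9_STAAU"))   # True
print(is_valid_identifier("my sequence"))  # False: contains a space
```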