The FASTA format is a text-based format for representing nucleotide or amino acid (protein) sequences, in which each nucleotide or amino acid is represented by a single-letter code. The format also allows a sequence name and comments to precede each sequence, on a description line that begins with '>' and gives the sequence a name or a unique identifier. The description line may also contain additional information.
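
As a rough illustration of how a description line can be taken apart, the Python sketch below splits it into an identifier and the remaining text. The function name split_description_line is hypothetical, and the sketch assumes that the identifier is the first whitespace-delimited token after the '>'; individual tools may apply different rules.

```python
def split_description_line(line):
    """Split a FASTA description line into (identifier, description).

    Assumption of this sketch: the identifier is the first
    whitespace-delimited token after '>', and everything that
    follows it is free-text description.
    """
    if not line.startswith(">"):
        raise ValueError("a description line must begin with '>'")
    # Drop the '>' marker, then split once on the first run of whitespace.
    parts = line[1:].strip().split(maxsplit=1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


header = ">sp|J7RUA5|CAS9_STAAU Start of CRISPR-associated endonuclease Cas9 OS=Staphylococcus aureus"
identifier, description = split_description_line(header)
print(identifier)
print(description)
```

A description line with no additional information yields an empty description string rather than an error.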


 


A more complete example is shown below.  It contains identifiers, descriptions and multiple sequences.


>sp|J7RUA5|CAS9_STAAU Start of CRISPR-associated endonuclease Cas9 OS=Staphylococcus aureus
MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRR
RHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHN
VNEVEEDTGNELS
>sp|Q99ZW2|CAS9_STRP1 Start of CRISPR-associated endonuclease Cas9/Csn1 OS=Streptococcus pyogenes
MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAE
ATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFG
NIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSD
VDKLFIQLVQT
>sp|G3ECR1|CAS9_STRTR Start of CRISPR-associated endonuclease Cas9 OS=Streptococcus thermophilus
MLFNKCIIISINLDFSNKEKCMTKPYSIGLDIGTNSVGWAVITDNYKVPSKKMKVLGNTS
KKYIKKNLLGVLLFDSGITAEGRRLKRTARRRYTRRRNRILYLQEIFSTEMATLDDAFFQ
RLDDSFLVPDDKRDSKYPIF
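
A file with several records like the ones above can be read by starting a new record at every line that begins with '>' and joining the sequence lines that follow it. The Python sketch below shows one way to do this. The function name read_fasta and the two short toy records are made up for illustration, and skipping blank lines is a choice of this sketch rather than a rule of the format.

```python
def read_fasta(lines):
    """Yield (description, sequence) pairs from an iterable of text lines.

    The description is the text after '>'; the sequence is the
    concatenation of all lines up to the next '>' line.
    """
    description = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines are skipped (an assumption of this sketch)
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                yield description, "".join(chunks)
            description = line[1:]
            chunks = []
        elif description is not None:
            chunks.append(line)
        else:
            raise ValueError("sequence data found before the first '>' line")
    # Emit the last record once the input is exhausted.
    if description is not None:
        yield description, "".join(chunks)


# Toy input for illustration only (hypothetical identifiers, shortened sequences).
example = """\
>seq_1 first toy record
MKRNYILGLD
IGITSVGYGI
>seq_2 second toy record
MDKKYSIGLD
"""

for description, sequence in read_fasta(example.splitlines()):
    print(description, len(sequence))
```

Because the function is a generator, it can be passed an open file object directly and will hold only one record in memory at a time.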


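An identifier is composed of alphanumeric characters, underscores (_) and hyphens (-). Do not put spaces in an identifier, since text after the first space is normally read as part of the description rather than as part of the identifier.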
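
The identifier rule can be checked with a short regular expression. The Python sketch below is one possible check; the names IDENTIFIER_PATTERN and is_valid_identifier are hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits.
# Underscores and hyphens are also allowed; the identifier must not be empty.
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_identifier(identifier):
    """Return True if the whole identifier uses only the allowed characters."""
    return IDENTIFIER_PATTERN.fullmatch(identifier) is not None


for candidate in ["CAS9_STAAU", "cas9-variant-2", "cas9 variant", "sp|J7RUA5|CAS9_STAAU"]:
    print(candidate, is_valid_identifier(candidate))
```

Under this check, an identifier that contains a space is rejected, and so is one that contains any other character, such as the '|' separators used in the example headers.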