The FASTA format is a text-based format for representing nucleotide sequences or amino acid (protein) sequences, in which each nucleotide or amino acid is represented by a single-letter code. The format also allows a sequence name and comments to precede each sequence, on a description line. The description line begins with '>' and gives the sequence a name or a unique identifier. It may also contain additional information.
A more complete example is shown below. It contains identifiers, descriptions and multiple sequences.
>sp|J7RUA5|CAS9_STAAU Start of CRISPR-associated endonuclease Cas9 OS=Staphylococcus aureus
MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRR
RHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHN
VNEVEEDTGNELS
>sp|Q99ZW2|CAS9_STRP1 Start of CRISPR-associated endonuclease Cas9/Csn1 OS=Streptococcus pyogenes
MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAE
ATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFG
NIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSD
VDKLFIQLVQT
>sp|G3ECR1|CAS9_STRTR Start of CRISPR-associated endonuclease Cas9 OS=Streptococcus thermophilus
MLFNKCIIISINLDFSNKEKCMTKPYSIGLDIGTNSVGWAVITDNYKVPSKKMKVLGNTS
KKYIKKNLLGVLLFDSGITAEGRRLKRTARRRYTRRRNRILYLQEIFSTEMATLDDAFFQ
RLDDSFLVPDDKRDSKYPIF
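The record layout shown above (a '>' description line followed by one or more sequence lines) can be read with a few lines of Python. The following is a minimal sketch, not a reference parser. The function names `parse_fasta` and `_make_record` are hypothetical, and the sketch assumes that the identifier is the first whitespace-delimited token of the description line, that the rest of that line is free-text description, and that blank lines can be ignored.

```python
from typing import Iterator, List, Tuple


def _make_record(header: str, chunks: List[str]) -> Tuple[str, str, str]:
    """Split a description line into identifier and description, and join the sequence lines."""
    # Assumption: the identifier is the first whitespace-delimited token.
    parts = header.split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description, "".join(chunks)


def parse_fasta(text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, description, sequence) for each record in FASTA text."""
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                yield _make_record(header, chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated into one string per record.
            chunks.append(line)
        else:
            raise ValueError("sequence data found before the first '>' line")
    if header is not None:
        yield _make_record(header, chunks)


if __name__ == "__main__":
    example = ">seq1 first example\nACGT\nTTGA\n>seq2\nMKRN\n"
    for identifier, description, sequence in parse_fasta(example):
        print(identifier, description, len(sequence))
```

Because the sequence of a record may be wrapped over several lines, the sketch joins all lines up to the next '>' into a single string.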
An identifier may contain only alphanumeric characters, underscores (_) and hyphens (-). Do not put spaces in an identifier.
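The identifier rule above can be checked mechanically. The sketch below is one possible check, assuming that "alphanumeric" means the ASCII letters A-Z, a-z and the digits 0-9, and that an empty identifier is invalid. The names `IDENTIFIER_RE` and `is_valid_identifier` are hypothetical.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits only.
IDENTIFIER_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_identifier(identifier: str) -> bool:
    """Return True if the identifier uses only letters, digits, '_' and '-'."""
    # fullmatch requires the whole string to match, so spaces or other
    # punctuation anywhere in the identifier cause a rejection.
    return IDENTIFIER_RE.fullmatch(identifier) is not None


if __name__ == "__main__":
    for candidate in ("CAS9_STAAU", "my-seq_01", "CAS9 STAAU"):
        print(candidate, is_valid_identifier(candidate))
```

Read strictly, this rule also rejects identifiers that contain other punctuation, such as a '|' character.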