String
The string is one of the most fundamental objects in computer science. Strings are intimately familiar to us in the form of text and therefore it is no wonder that the theory of strings (which is to be distinguished from the field of theoretical physics known as "string theory") is one of the most extensively studied subdisciplines of computer science.
Definitions
An alphabet, usually denoted is a finite nonempty set (whose size is denoted ). It is often assumed that the alphabet is totally ordered, but this is not always necesary. An element of this the alphabet is known as a character.
The set of -tuples of is denoted . A string of length is an element of . The set is defined ; an element of is known simply as a string over . The empty string, denoted , is the unique element of . Strings must have finite length, but there is no limit on the length of a string.
Examples of alphabets include:
- The binary alphabet, . There are two characters, usually denoted 0 and 1. In virtually all computers ever built, data are strings over the binary alphabet, and are known as bit strings.
- The Latin alphabet, consisting of the characters a, A, b, B, ..., z, Z. English words may be considered strings over the Latin alphabet.
- The Unicode character set. Text files may be considered strings over this alphabet.
- The set of integers for some . Here the characters are actually integers. (Recall that the definition of character is broader than the common concept of the character as the smallest, indivisible element of writing.)