ASCII

From PEGWiki

The American Standard Code for Information Interchange, known universally by its acronym ASCII (pronounced /ˈæs.ki/), is a 7-bit character encoding supported by nearly all modern computers; that is, it is a standard assignment of natural numbers to the characters of the ASCII character set. Because it uses only seven bits, each character fits into an eight-bit byte, a unit that is also nearly universal. The limitation is that it supports only 2^7 = 128 code points. Nevertheless, these 128 code points are enough to encode all characters that can be produced by a standard US keyboard.
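The arithmetic above can be checked with a short Python sketch. The names `CODE_POINTS` and `is_ascii_code` are hypothetical helpers introduced only for illustration; `str.encode("ascii")` is the standard library call that maps each character to its code point.

```python
# Number of code points available with 7 bits.
CODE_POINTS = 2 ** 7


def is_ascii_code(code):
    """Return True if `code` fits in 7 bits, i.e. lies in 0..127."""
    return 0 <= code < CODE_POINTS


# Encoding text as ASCII yields one byte per character, and the
# most significant bit (0x80) of every such byte is zero.
data = "Hello, world!".encode("ascii")
high_bits_clear = all((byte & 0x80) == 0 for byte in data)

print(CODE_POINTS)       # 128
print(list(data[:5]))    # [72, 101, 108, 108, 111]
print(high_bits_clear)   # True
```

Text containing a character outside the ASCII character set cannot be encoded this way; in Python, `"é".encode("ascii")` raises `UnicodeEncodeError`.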

The first 32 characters, numbered from 0 to 31, as well as the last character, numbered 127, are control characters. They have no corresponding symbols, so nothing is drawn for them when a string containing them is printed or displayed.
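As a minimal sketch of this rule, the test for a control character is a simple range check. The function name `is_control` is an assumption made for this example, not a standard library function.

```python
def is_control(code):
    """Return True for ASCII control characters: 0..31 and 127."""
    return 0 <= code <= 31 or code == 127


# Collect every control code point in the 7-bit range.
control_codes = [code for code in range(128) if is_control(code)]

print(len(control_codes))   # 33
print(control_codes[-1])    # 127
print(is_control(65))       # False
```

The count is 33 because the 32 characters numbered 0 to 31 are joined by the single character numbered 127.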

The other 95 characters, numbered from 32 to 126, are printable characters; they have corresponding symbols or glyphs. For example, the uppercase letter A is assigned the code point 65, the space character is assigned the code point 32 (it is considered printable even though no ink actually appears on the page; consider its glyph to be empty), and the tilde character ~ is assigned the code point 126.
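The examples in this paragraph can be reproduced with Python's built-in `ord` and `chr`, which convert between a character and its code point. The helper `is_printable` is a hypothetical name used only in this sketch.

```python
def is_printable(code):
    """Return True for ASCII printable characters: 32..126."""
    return 32 <= code <= 126


# Character to code point, and back again.
print(ord("A"))    # 65
print(ord(" "))    # 32
print(chr(126))    # ~

# All printable characters in code point order, starting with the space.
printable = "".join(chr(code) for code in range(128) if is_printable(code))
print(len(printable))   # 95
```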

ASCII table

An ASCII table is a table that shows the mapping of code points to supported characters. Here is one such table:

Code point Character
0 NUL (null character)
1 SOH (start of header)
2 STX (start of text)
3 ETX (end of text)
4 EOT (end of transmission)
5 ENQ (enquiry)
6 ACK (acknowledgement)
7 BEL (bell)
8 BS (backspace)
9 HT (horizontal tab)
10 LF (line feed)
11 VT (vertical tab)
12 FF (form feed)
13 CR (carriage return)
14 SO (shift out)
15 SI (shift in)
16 DLE (data link escape)
17 DC1 (device control 1)
18 DC2 (device control 2)
19 DC3 (device control 3)
20 DC4 (device control 4)
21 NAK (negative acknowledgement)
22 SYN (synchronous idle)
23 ETB (end of transmission block)
24 CAN (cancel)
25 EM (end of medium)
26 SUB (substitute)
27 ESC (escape)
28 FS (file separator)
29 GS (group separator)
30 RS (record separator)
31 US (unit separator)
32 (space)
33 !
34 "
35 #
36 $
37 %
38 &
39 '
40 (
41 )
42 *
43 +
44 ,
45 -
46 .
47 /
48 0
49 1
50 2
51 3
52 4
53 5
54 6
55 7
56 8
57 9
58 :
59 ;
60 <
61 =
62 >
63 ?
64 @
65 A
66 B
67 C
68 D
69 E
70 F
71 G
72 H
73 I
74 J
75 K
76 L
77 M
78 N
79 O
80 P
81 Q
82 R
83 S
84 T
85 U
86 V
87 W
88 X
89 Y
90 Z
91 [
92 \
93 ]
94 ^
95 _
96 `
97 a
98 b
99 c
100 d
101 e
102 f
103 g
104 h
105 i
106 j
107 k
108 l
109 m
110 n
111 o
112 p
113 q
114 r
115 s
116 t
117 u
118 v
119 w
120 x
121 y
122 z
123 {
124 |
125 }
126 ~
127 DEL (delete)
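A table like the one above can also be generated rather than typed by hand. The following is a sketch under the assumption that only the control characters need explicit names; `CONTROL_NAMES`, `label`, and `ascii_table` are hypothetical names chosen for this example.

```python
# Abbreviations for code points 0..31, in order.
CONTROL_NAMES = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]


def label(code):
    """Return the text shown in the Character column for a code point."""
    if code < 32:
        return CONTROL_NAMES[code]
    if code == 32:
        return "(space)"
    if code == 127:
        return "DEL"
    return chr(code)  # printable characters 33..126 label themselves


def ascii_table():
    """Return one 'code label' row per code point, 0 through 127."""
    return [f"{code} {label(code)}" for code in range(128)]


rows = ascii_table()
print(rows[0])     # 0 NUL
print(rows[65])    # 65 A
print(rows[127])   # 127 DEL
```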

The control characters

Most of the control characters in the above table are unimportant. The following are the most important for programmers:

  • The null character, numbered 0. In the C programming language, this character is used to indicate the end of a string. (This means that C strings cannot contain null characters.)
  • Bell, or alarm, numbered 7. Sometimes, displaying this character will cause a beep or some other sound to be produced by the machine (a terminal bell).
  • Backspace, numbered 8. This is the character, if any, produced by the Backspace key.
  • The horizontal tab, numbered 9. It is the character, if any, produced by the "Tab" key.
  • The carriage return, numbered 13, and the line feed, numbered 10. The carriage return originally instructed teletypewriters to return the carriage to the beginning of the line, and the line feed originally instructed them to feed the paper to the next line (i.e., to begin the next line). The Windows operating system uses a carriage return followed by a line feed to separate a line of text in a text file from the next; UNIX-based systems use just a line feed. (These characters are generated by the Enter or Return key.)
  • Escape, numbered 27. This is the character, if any, produced by the "Escape" key. (It tends to be caught by the application rather than entered along with text.)
  • Delete, numbered 127. This is the character, if any, produced by the Delete key.
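The code points in the list above can be checked against Python's string escape sequences, and the two line-ending conventions and the C end-of-string rule can be illustrated in a few lines. This is a sketch only: `ESCAPES`, `normalize_newlines`, and `c_string` are hypothetical names, and `c_string` merely imitates how a C routine would read a null-terminated buffer.

```python
# Escape sequences for the control characters listed above,
# paired with the code points given in the text.
ESCAPES = {
    "\0": 0,       # null
    "\a": 7,       # bell
    "\b": 8,       # backspace
    "\t": 9,       # horizontal tab
    "\n": 10,      # line feed
    "\r": 13,      # carriage return
    "\x1b": 27,    # escape
    "\x7f": 127,   # delete
}
all_match = all(ord(char) == code for char, code in ESCAPES.items())


def normalize_newlines(text):
    """Convert Windows line endings (CR LF) to UNIX line endings (LF)."""
    return text.replace("\r\n", "\n")


def c_string(buffer):
    """Return the bytes before the first null, as a C string reader would."""
    return buffer.split(b"\0", 1)[0]


print(all_match)                                        # True
print(repr(normalize_newlines("first\r\nsecond\r\n")))  # 'first\nsecond\n'
print(c_string(b"abc\0def"))                            # b'abc'
```

The last line shows why a null character cannot appear inside a C string: everything after it is invisible to code that stops at the first null.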