Difference between revisions of "Character encoding"
(adding one reference, for support. more are needed.)
DavidB4-bot (→top: Spelling, grammar, and general cleanup, typos fixed: Therefore → Therefore,, a agreed-upon → an agreed-upon)
Latest revision as of 23:31, May 21, 2020
The purpose of character encoding schemes is to provide a means of storing, retrieving, comparing, inputting, and outputting characters such as letters, digits, and other symbols.
Computers represent data as a series of bits. Therefore, any other representation of data (such as alphabetic characters) must derive from an agreed-upon interpretation (i.e., a "standard") of those bits. Any such standard is essentially arbitrary, but certain standards for character encoding have emerged over the years. Since most modern computers deal with bytes (8-bit groupings) or multiples of bytes, most character encodings are defined at the byte level. The same byte value need not represent the same character in different encoding schemes. For instance, the byte value 01001110 (hexadecimal 4E) represents the letter "N" in ASCII,[1] but the symbol "+" in EBCDIC.
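As a minimal sketch of this point, the following Python snippet decodes the same byte under two encodings. It assumes Python's built-in "cp037" codec as a stand-in for EBCDIC; EBCDIC exists in several code pages, and cp037 is only one of them, chosen here for illustration.

```python
# The same 8-bit pattern decoded under two different encodings.
raw = bytes([0b01001110])  # binary 01001110 = hexadecimal 4E = decimal 78

as_ascii = raw.decode("ascii")   # ASCII interpretation of the byte
as_ebcdic = raw.decode("cp037")  # cp037: one EBCDIC code page (assumed variant)

print(f"byte 0x{raw[0]:02X} -> ASCII {as_ascii!r}, EBCDIC (cp037) {as_ebcdic!r}")
# prints: byte 0x4E -> ASCII 'N', EBCDIC (cp037) '+'

# Going the other way, the same character maps to different byte values.
n_ascii = "N".encode("ascii")
n_ebcdic = "N".encode("cp037")
print(n_ascii.hex(), n_ebcdic.hex())
```

The bits themselves never change; only the agreed-upon table used to interpret them does.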
The main character encoding standards today are ASCII (sometimes ambiguously referred to as "ANSI") and Unicode. Older standards include EBCDIC, Baudot, and Radix-50.

[1] https://www.computerhope.com/jargon/a/ascii.htm