February 11, 2009

Because You ASCIId Part II

Filed under: Main — admin @ 12:01 am

An 8-bit byte can hold 256 values. Yet the ASCII standard defines only 128 characters, codes 0 through 127, which is exactly half of 256. That meant a typical 8-bit PC had to come up with 128 more characters to assign to the “upper values,” codes 128 through 255. Contrary to popular belief, those are not ASCII characters; they are Extended ASCII characters.
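To see the split for yourself, here is a minimal Python sketch. Python is just a convenient choice for illustration, and the helper name `is_ascii` is made up for this example. It walks through every value a byte can hold and asks Python's built-in `ascii` codec whether that value has a definition.

```python
# Every possible value of an 8-bit byte: 0 through 255.
all_values = range(2 ** 8)

def is_ascii(value: int) -> bool:
    """Return True if the byte value is defined by ASCII (codes 0-127)."""
    try:
        bytes([value]).decode("ascii")
        return True
    except UnicodeDecodeError:
        return False

ascii_count = sum(is_ascii(v) for v in all_values)
upper_count = len(all_values) - ascii_count

print(ascii_count)  # 128
print(upper_count)  # 128
```

The second 128 are the “upper values” that ASCII leaves undefined.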

Before clearing up the Extended ASCII confusion, let me just remind some readers eager to understand technical nonsense that there are not always 8 bits in a byte. While the 8-bit byte is standard these days, early computers did not agree on how many bits were in a byte.

There were once 6-bit computers, 7-bit computers, and even 12-bit computers. The 8-bit byte took hold on mainframes in the 1960s, but only when microcomputers appeared in the 1970s did it become the norm. So there!

Now back to my continuing discussion of ASCII:

When 8-bit computers started to appear, computer manufacturers assigned the bonus 128 code values to various characters. Because ASCII didn’t define codes 128 through 255, each manufacturer often put their own characters up there. Depending on the computer you might find graphic blocks, foreign language characters, and other random symbols.

On the IBM PC, the character set used was called the Extended ASCII set. Because the PC became the most popular computer platform, the Extended ASCII character set became a de facto standard. Of course, the PC didn’t dominate all computerdom. The Macintosh, for example, sported a completely different character set for codes 128 through 255. And even though the PC pretty much had a standard Extended ASCII character set, not everyone was happy with it.
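The PC-versus-Mac difference is easy to demonstrate. The sketch below assumes Python's built-in `cp437` and `mac_roman` codecs as stand-ins for the two machines' character sets; the helper `show` is a name invented for this example.

```python
# Two byte values from the upper half of the 8-bit range.
codes = [0x80, 0x8E]

def show(code: int, charset: str) -> str:
    """Decode a single byte value using the named character set."""
    return bytes([code]).decode(charset)

for code in codes:
    pc = show(code, "cp437")       # IBM PC Extended ASCII
    mac = show(code, "mac_roman")  # classic Macintosh character set
    print(f"{code}: PC={pc} Mac={mac}")
```

Code 128 comes out as Ç on the PC but Ä on the Mac, while the PC keeps its Ä at code 142, where the Mac has é. Same byte, different character, which is exactly why a file moved between the two machines could turn to gibberish.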

To keep people happy, and probably to acknowledge that Extended ASCII was not a standard, IBM came up with the concept of the code page.

A code page is nothing more than a table that matches display characters to codes 128 through 255 in an 8-bit byte. By choosing a different code page, you could load in a new set of characters for those codes.
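Here is a rough sketch of that idea, again in Python and again only as an illustration. It assumes Python's `cp437`, `cp737` (Greek), and `cp866` (Cyrillic) codecs behave like the corresponding PC code pages, and `render` is a hypothetical helper name. The bytes never change; only the lookup table does.

```python
# The same three bytes, all taken from the "upper" half of the byte range.
raw = bytes([0x80, 0x81, 0x82])

# Python codec names for a few PC code pages
# (437 = original IBM PC, 737 = Greek, 866 = Cyrillic).
CODE_PAGES = ["cp437", "cp737", "cp866"]

def render(data: bytes, code_page: str) -> str:
    """Look up each byte in the chosen code page and return the text."""
    return data.decode(code_page)

for cp in CODE_PAGES:
    print(cp, render(raw, cp))

# Codes 0 through 127 stay plain ASCII in every one of these code pages.
lower_half = bytes(range(128))
same_lower = all(
    render(lower_half, cp) == lower_half.decode("ascii") for cp in CODE_PAGES
)
print(same_lower)  # True
```

Under Code Page 437 the three bytes read Çüé; under the Greek page they read ΑΒΓ; under the Cyrillic page, АБВ. The lower 128 codes are untouched by the swap.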

The original IBM PC Extended ASCII code page was dubbed Code Page 437. The Wikipedia entry for Code Page 437 shows the full character set.

By using a special DOS, and later Windows, command, you could change the code page used by your PC for displaying characters. In DOS (and the Windows Command Prompt) the command is:

mode con: cp select=xxx

Where xxx is the code page to load. You can use the command mode con: cp to view the current code page setup. (It’s probably good old Code Page 437.)

A couple dozen code pages shipped with DOS and Windows for the PC. There were code pages for Greek, Cyrillic, Hebrew, and so on. Before you get all excited, however, understand that all the code page nonsense has gone away. That’s because a newer standard called Unicode has replaced the code page as the method all computers use (or should use) for displaying text.

I’ll discuss Unicode in my next post.
