Friday, 20 February 2015

CHARACTER SET: ISO-LATIN-1 VERSUS UNICODE

HTML’s use of the ISO-Latin-1 character set allows it to display most accented characters on most platforms, but it has limitations. For example, common characters such as bullets, em dashes, and curly quotes simply aren’t available in ISO-Latin-1. (If they’re absolutely necessary, you can create images representing those characters and use them on your pages. I don’t recommend that option, though, because it can interfere with the layout of your page, and it can look off if the user’s browser is set to a nonstandard text size.) Also, many ISO-Latin-1 characters might be entirely unavailable in some browsers, depending on whether those characters exist on that platform and in the current font.
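As a rough illustration of that limitation, the following Python sketch checks whether a few characters can be encoded as ISO-Latin-1 at all. The helper name `in_latin1` and the sample list are assumptions made for this example; the check relies on Python's standard `iso-8859-1` codec and says nothing about fonts or browser support.

```python
# Sample characters: four the text names as missing, plus one accented letter.
candidates = {
    "bullet": "\u2022",
    "em dash": "\u2014",
    "left curly quote": "\u201c",
    "right curly quote": "\u201d",
    "e with acute accent": "\u00e9",
}


def in_latin1(ch):
    """Return True if ch can be encoded as ISO-8859-1 (ISO-Latin-1)."""
    try:
        ch.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


for name, ch in candidates.items():
    status = "available" if in_latin1(ch) else "not available"
    print(f"{name}: {status}")
```

Only the accented letter is reported as available; the bullet, the em dash, and both curly quotes fall outside the 256 code points that ISO-Latin-1 defines.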

HTML 4.01 takes things a huge leap further by proposing that Unicode should be available as a character set for HTML documents. Unicode is a standard character encoding system that, although backward-compatible with our familiar ASCII encoding, offers the capability to encode characters in almost any of the world’s languages, including Chinese and Japanese.
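The backward compatibility with ASCII and the multilingual coverage can be seen in a small Python sketch. It assumes UTF-8 as the concrete Unicode encoding (the text above does not name one), and the helper `code_points` and the sample strings are made up for this example.

```python
def code_points(text):
    """Return the Unicode code point of each character in U+XXXX form."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Backward compatibility: pure ASCII text produces identical bytes
# under the ASCII codec and under UTF-8 (the encoding assumed here).
ascii_text = "HTML 4.01"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# One string mixing English, Chinese, and Japanese.
mixed = "Hello 中文 日本語"
round_trip = mixed.encode("utf-8").decode("utf-8")

print(same_bytes)             # ASCII bytes are unchanged under UTF-8
print(code_points("A中日"))    # every character has its own code point
print(round_trip == mixed)    # the mixed-language text survives encoding
```

The letter A keeps its familiar ASCII value (U+0041), while the Chinese and Japanese characters get code points of their own in the same character set, which is what lets a single document mix languages.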

This means that documents can be created easily in any language, and they can also contain multiple languages. Both Internet Explorer and Netscape support Unicode, and they can render documents in many of the scripts provided by Unicode as long as the necessary fonts are available.

This is an important step because Unicode is emerging as a new de facto standard for character encoding. Java uses Unicode as its default character encoding, for example, and Windows supports Unicode character encoding.
