BYTE ORDER MARK

  

Copyright © Philip M. Parker, INSEAD. Terms of Use.

Specialty Definition: Byte Order Mark

(From Wikipedia, the free Encyclopedia)

A byte order mark (BOM) is the character at code point U+FEFF (ZERO WIDTH NO-BREAK SPACE) when that character is used to denote the endianness of an encoded string of UCS/Unicode characters.

A BOM can indicate that unlabeled text is encoded as UTF-16 or UTF-8, and it can indicate the byte order of UTF-16 text, whether that text is labeled or not.

In UTF-16, a BOM is expressed as the 8-bit byte sequence FE FF at the beginning of the encoded string, to indicate that the encoded characters that follow it use big-endian byte order; or it is expressed as the byte sequence FF FE to indicate little-endian order.
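The check described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete encoding detector: the function name utf16_byte_order is hypothetical, and the sketch assumes the input is already known or suspected to be UTF-16 (the bytes FF FE also begin the UTF-32 little-endian BOM, which this sketch does not distinguish).

```python
import codecs


def utf16_byte_order(data: bytes):
    """Return 'big', 'little', or None when no UTF-16 BOM is present.

    Hypothetical helper for illustration; assumes the data is UTF-16.
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little"
    return None


# Build sample data by hand: BOM followed by BOM-less encoded text.
sample_be = codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")
sample_le = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
print(utf16_byte_order(sample_be))
print(utf16_byte_order(sample_le))
```

Python's own "utf-16" codec performs the same check when decoding and removes the BOM from the result, so a hand-written test like this is mainly useful when the byte order has to be reported rather than just consumed.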

UTF-8 text can also begin with a BOM, although this is rare: UTF-8 is defined as a sequence of bytes in a fixed order, so there is no byte order to indicate, and UTF-8 is often assumed or implicit, so it doesn't need a signature. The UTF-8 representation of the BOM is the byte sequence EF BB BF.
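Because the UTF-8 BOM serves only as a signature, readers often want to detect and discard it. The following Python sketch shows one way to do that; the function name strip_utf8_bom is hypothetical and chosen only for this example.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present.

    Hypothetical helper for illustration.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Encoding U+FEFF as UTF-8 yields the three signature bytes.
signature = "\ufeff".encode("utf-8")
print(signature == codecs.BOM_UTF8)

# Text with and without the signature decodes to the same string.
with_bom = codecs.BOM_UTF8 + "hello".encode("utf-8")
print(strip_utf8_bom(with_bom).decode("utf-8"))
```

When the goal is simply to decode text, Python's "utf-8-sig" codec gives the same result without a helper: it skips a leading BOM if there is one and otherwise behaves like "utf-8".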

Source: the above text is adapted by the editor from Wikipedia, the free encyclopedia under a copyleft GNU Free Documentation License (GFDL) from the article "Byte Order Mark."
