What are GBK and UTF-8? What is the difference between them?

UTF-8 (8-bit Unicode Transformation Format) is a variable-length, multi-byte encoding designed to handle international characters. A BOM is permitted but usually omitted. It uses 8 bits (one byte) for English characters and 24 bits (three bytes) for most Chinese characters. UTF-8 can represent the characters needed by every country in the world, so it is a universal international encoding. UTF-8 encoded text can be displayed in any browser that supports the UTF-8 character set, regardless of country. For example, if a page is encoded in UTF-8, users abroad can view Chinese text in an English version of IE without downloading IE's Chinese language support package.

GBK is an extension of the national standard GB2312 and remains backward compatible with it. GBK text uses one byte for ASCII characters such as English letters and two bytes for Chinese characters (full-width forms of English letters and punctuation also take two bytes). To distinguish a double-byte character from ASCII, the highest bit of its first byte is set to 1. GBK covers more than 20,000 Chinese characters but is a national encoding, so it is less universal than UTF-8. On the other hand, Chinese text stored as UTF-8 takes more database space than the same text stored as GBK.

GBK, GB2312 and UTF-8 can only be converted to one another by way of Unicode: GBK, GB2312 -> Unicode -> UTF-8, and UTF-8 -> Unicode -> GBK, GB2312.

For a website or forum with a large share of English characters, UTF-8 is recommended: English takes one byte per character, so little space is lost to the three-byte Chinese characters. However, many forum plug-ins currently support only GBK.
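The size difference described above can be checked directly. The following is a minimal Python sketch, not part of the original answer; the helper names encoded_sizes and starts_double_byte are made up for illustration, and it assumes a Python 3 interpreter whose standard codecs include "gbk".

```python
def encoded_sizes(text):
    """Return (UTF-8 length, GBK length) in bytes for the given text."""
    return len(text.encode("utf-8")), len(text.encode("gbk"))


def starts_double_byte(data):
    """True if the first byte has its highest bit set, as a GBK double-byte lead does."""
    return bool(data) and (data[0] & 0x80) != 0


for text in ["A", "Hello", "中", "中文"]:
    utf8_len, gbk_len = encoded_sizes(text)
    # !a prints non-ASCII characters as escapes, so this works on any console.
    print(f"{text!a}: UTF-8 {utf8_len} bytes, GBK {gbk_len} bytes")

# ASCII stays single-byte in GBK; a Chinese character starts with a high-bit byte.
print(starts_double_byte("A".encode("gbk")))
print(starts_double_byte("中".encode("gbk")))
```

English text comes out the same size in both encodings, while each Chinese character costs three bytes in UTF-8 against two in GBK.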

A detailed explanation of the difference between the two encodings

Simply put, Unicode, GBK and Big5 are coded character sets that assign a numeric value to each character, while UTF-8, UTF-16 and the like are ways of representing Unicode values as bytes. The first three are not compatible with one another: the same Chinese character has a completely different code in each. For example, the Unicode value of "汉" (Han) differs from its GBK value: the Unicode code point is U+6C49, while the GBK code is 0xBABA. UTF-8 is defined only as a byte representation of Unicode values. To convert GBK to UTF-8, the text must therefore be converted to Unicode first and then encoded as UTF-8.
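As an illustration of the two-step conversion, here is a short Python sketch that uses the character 汉 ("Han") as its example. The function name gbk_to_utf8 is hypothetical. Python's str type holds Unicode code points, so decoding GBK bytes into a str is the "convert to Unicode first" step.

```python
def gbk_to_utf8(data: bytes) -> bytes:
    """Convert GBK-encoded bytes to UTF-8 by way of Unicode."""
    text = data.decode("gbk")    # step 1: GBK bytes -> Unicode (Python str)
    return text.encode("utf-8")  # step 2: Unicode -> UTF-8 bytes


ch = "汉"
code_point = ord(ch)            # the Unicode value of the character
gbk_bytes = ch.encode("gbk")    # the same character in GBK
utf8_bytes = ch.encode("utf-8") # a UTF-8 representation of the Unicode value

print(f"Unicode: U+{code_point:04X}")  # U+6C49
print("GBK:    ", gbk_bytes.hex())     # baba
print("UTF-8:  ", utf8_bytes.hex())    # e6b189

# Converting the GBK bytes gives the same UTF-8 bytes as encoding directly.
print(gbk_to_utf8(gbk_bytes) == utf8_bytes)  # True
```

The GBK bytes and the Unicode code point are unrelated numbers, which is why there is no direct byte-level mapping from GBK to UTF-8.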