Please, what is the default character encoding used for PHP strings? Is it UTF-8?

What is the default character encoding used for PHP strings? Is it UTF-8?

Posted 19-Jan-16 10:46am

Gbenbam

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Sergey Alexandrovich Kryukov · Answer 1 · 2016-01-19T11:58:00

Please see "Character Encoding" in the PHP Manual.

This is not as simple as it may seem; at the same time, it does not create any problems. What do you mean by "PHP"? It is not a standardized language. Everything about it is based on the de facto PHP authority, PHP: Hypertext Preprocessor. If you also download PHP from that same source, you can be certain about its behavior, but what prevents any other party from providing an alternative implementation of PHP?

Here is the thing: it is not really important. Unicode is Unicode; it is not UTF-8, UTF-16LE or UTF-32. Unicode is abstracted from encodings and from any computer representation of data. It simply defines a mapping between characters, understood as pure cultural entities, and code points, understood as abstract integer numbers in the mathematical sense. How those numbers are represented in computer memory, variables, and network or file streams is defined by the UTFs.

Now, you have to understand that correct programming should not rely on knowledge of how Unicode is represented in the memory accessed through program variables, members or objects. Text data can come from different sources, and the program may or may not be able to rely on metadata that comes with the data. For example, the XML encoding is declared in the XML prolog, and the HTML encoding is declared with the HTTP-EQUIV "content-type" declaration. (I always repeat that using HTTP-EQUIV is critically important even if the encoding is set as the HTTP server's default; think about this: what happens if the page is saved to a file?)

This data goes into PHP data, which you should process using the appropriate PHP functions, without needing to know how the code points are encoded. For example, if you search for a sub-string in a string, both the source string and the sub-string are represented in the same encoding; this is all that matters.
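The point about code points versus encodings can be illustrated with a short sketch. This is not part of the original answer; it assumes PHP 7.0 or later (for the `\u{...}` escapes) and that the mbstring extension is enabled. The sample text and variable names are made up for illustration. It shows that PHP's plain string functions work on bytes, that the `mb_*` functions count characters once they are told the encoding, and that a sub-string search succeeds only when both strings use the same encoding.

```php
<?php
// Illustrative sample text: 10 characters, two of them non-ASCII.
// The \u{...} escapes always produce UTF-8 bytes, whatever the file encoding.
$text = "na\u{00EF}ve caf\u{00E9}";

// strlen() counts bytes; the two accented letters take two bytes each in UTF-8.
$byteLength = strlen($text);              // 12

// mb_strlen() counts characters when given the encoding of the string.
$charLength = mb_strlen($text, 'UTF-8');  // 10

// The same characters in another encoding occupy a different number of bytes.
$latin1 = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
$latin1Length = strlen($latin1);          // 10

// Searching works when haystack and needle share an encoding.
$needle   = "caf\u{00E9}";
$bytePos  = strpos($text, $needle);                 // 7 (byte offset)
$charPos  = mb_strpos($text, $needle, 0, 'UTF-8');  // 6 (character offset)

// A UTF-8 needle is not found in a Latin-1 haystack: the byte patterns differ.
$mismatch = strpos($latin1, $needle);               // false

printf("%d %d %d %d %d\n", $byteLength, $charLength, $latin1Length, $bytePos, $charPos);
var_dump($mismatch);
```

With these assumptions the script prints `12 10 10 7 6` followed by `bool(false)`. The byte offset and the character offset differ because one two-byte letter precedes the match.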

—SA