Encode text file php

Converting the encoding of files

I have a PHP application whose files are encoded in Greek ISO (ISO-8859-7). I want to convert the files to UTF-8, but simply saving the files as UTF-8 isn't enough, since the Greek text gets garbled. Is there an "automatic" method to do this, so that I can completely convert my app's encoding without having to go through each file and rewrite the text?

4 Answers

On a Linux system, if you are sure all files are currently encoded in ISO-8859-7, you can do this:

bash> find /your/path -name "*.php" -type f -exec iconv -f ISO-8859-7 -t UTF-8 -o "{}.tmp" "{}" \; -exec mv "{}.tmp" "{}" \;

This converts all PHP script files located in /your/path, including those in all sub-directories. Remove -name "*.php" to convert all files.

Since you are under Windows, the easiest option would be a PHP script like this:

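<?php
// The iterator setup was lost from the original post and is reconstructed here; '/your/path' is a placeholder.
$files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator('/your/path'));
foreach ($files as $fileName => $file) {
    if ($file->isFile()) {
        file_put_contents($fileName, iconv('ISO-8859-7', 'UTF-8', file_get_contents($fileName)));
    }
}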
$new_string = iconv("ISO-8859-7", "UTF-8", $old_string); 

This only converts the contents of a string. I would like to convert the files entirely, including their contents.

Ah, I read your last sentence as asking how to automatically convert the data without having to manually retype it. You are going to have to write your own function to traverse your app and update the encoding of your files. If iconv doesn't work for you, try mb_convert_encoding (php.net/manual/en/function.mb-convert-encoding.php). Also, when you say the text gets garbled, is that when viewing the file in a text editor, or when you output the contents of the file within PHP?
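A function of the kind described in this comment could look roughly like the sketch below. It is only an illustration: the name convert_tree_to_utf8() and its parameters are made up for this example, and it assumes the mbstring extension is available. Files that already validate as UTF-8 (which includes pure ASCII files) are skipped, so running it twice should not double-encode anything.

```php
<?php
declare(strict_types=1);

/**
 * Rewrite every file with the given extension under $root as UTF-8, in place.
 * Hypothetical helper for illustration; returns the paths that were rewritten.
 */
function convert_tree_to_utf8(string $root, string $from = 'ISO-8859-7', string $ext = 'php'): array
{
    $converted = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        if (!$file->isFile() || strtolower($file->getExtension()) !== $ext) {
            continue;
        }

        $path = $file->getPathname();
        $old  = file_get_contents($path);
        if ($old === false) {
            continue; // unreadable file, leave it alone
        }

        // Already valid UTF-8 (or plain ASCII): nothing to do.
        // Assumption: a real ISO-8859-7 file with Greek text will not pass this check.
        if (mb_check_encoding($old, 'UTF-8')) {
            continue;
        }

        $new = mb_convert_encoding($old, 'UTF-8', $from);
        if (file_put_contents($path, $new) !== false) {
            $converted[] = $path;
        }
    }

    return $converted;
}
```

Calling convert_tree_to_utf8('/your/path') would then convert the .php files of the application. Working on a copy or under version control is advisable, since the files are overwritten.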


Did you send a UTF-8 Content-Type header with the output, and also set the charset to UTF-8 in the HTML?

Yes. The problem resides in the fact that the original app encoding was iso-8859-7, not only the data from the db but the files as well.


Converting text between UTF-8 and Windows-1251

Encoding problems often come up when writing parsers and when reading data from XML and CSV files. Below are some ways to solve them.

windows-1251 to UTF-8

$text = iconv('windows-1251', 'UTF-8//IGNORE', $text); echo $text;
$text = mb_convert_encoding($text, 'UTF-8', 'windows-1251'); echo $text;
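As a quick check that the two calls are interchangeable for this direction, here is a small sketch. The sample bytes (the word "Привет" in Windows-1251) are chosen just for this illustration, and the iconv and mbstring extensions are assumed to be loaded.

```php
<?php
// "Привет" as raw Windows-1251 bytes (sample data for this sketch).
$cp1251 = "\xCF\xF0\xE8\xE2\xE5\xF2";

// Note the argument order: iconv takes (from, to, text),
// mb_convert_encoding takes (text, to, from).
$viaIconv = iconv('Windows-1251', 'UTF-8', $cp1251);
$viaMb    = mb_convert_encoding($cp1251, 'UTF-8', 'Windows-1251');

var_dump($viaIconv === $viaMb); // bool(true)
echo $viaMb, "\n";              // Привет
```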

UTF-8 to windows-1251

$text = iconv('utf-8', 'windows-1251//IGNORE', $text); echo $text;
$text = mb_convert_encoding($text, 'windows-1251', 'utf-8'); echo $text;

When nothing else helps

$text = iconv('utf-8', 'cp1252//IGNORE', $text); $text = iconv('cp1251', 'utf-8//IGNORE', $text); echo $text;

Sometimes it gets absurd, but this works:

$text = iconv('utf-8', 'windows-1251//IGNORE', $text); $text = iconv('windows-1251', 'utf-8//IGNORE', $text); echo $text;
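The first trick in this section (UTF-8 to cp1252, then cp1251 to UTF-8) appears to target one specific kind of damage: Windows-1251 bytes that were mistakenly decoded as Windows-1252 and then saved as UTF-8. The sketch below simulates that damage and undoes it. The sample word is an assumption made for the illustration.

```php
<?php
// "Привет" as Windows-1251 bytes (sample data for this sketch).
$cp1251 = "\xCF\xF0\xE8\xE2\xE5\xF2";

// Simulate the damage: the bytes are wrongly treated as Windows-1252
// and stored as UTF-8, which yields "Ïðèâåò".
$garbled = iconv('CP1252', 'UTF-8', $cp1251);

// Repair: map the characters back to the original bytes,
// then decode those bytes with the charset they were really written in.
$bytes    = iconv('UTF-8', 'CP1252', $garbled);
$repaired = iconv('CP1251', 'UTF-8', $bytes);

echo $garbled, "\n";  // Ïðèâåò
echo $repaired, "\n"; // Привет
```

The second trick (a round trip through windows-1251 with //IGNORE) does something different: it drops whatever cannot be represented in windows-1251, including invalid byte sequences, so it cleans the text rather than repairing it.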

file_get_contents / cURL

Sometimes file_get_contents() or cURL returns gibberish (Алмазные борÑ). The cause here is not the encoding itself but a missing BOM.

$text = file_get_contents('https://example.com'); $text = "\xEF\xBB\xBF" . $text; echo $text;
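Prepending the BOM unconditionally would duplicate it when the remote content already starts with one. A minimal sketch that adds it only when missing follows; the helper name with_utf8_bom() is made up for this example.

```php
<?php
const UTF8_BOM = "\xEF\xBB\xBF";

/**
 * Return $text with a UTF-8 byte order mark at the start,
 * without adding a second one if it is already there.
 */
function with_utf8_bom(string $text): string
{
    return strncmp($text, UTF8_BOM, 3) === 0 ? $text : UTF8_BOM . $text;
}

echo bin2hex(with_utf8_bom('abc')), "\n"; // efbbbf616263
```

When the text is being sent to a browser, declaring the charset in the Content-Type header is usually a cleaner way to get the same effect than relying on a BOM.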

There are also cases where file_get_contents() returns unreadable, binary-looking output.

This is GZIP-compressed text, because the function does not send the right headers. The problem can be solved with cURL:

function getcontents($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip'); // request gzip and let cURL decompress the response
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0); // turns off TLS verification; avoid in production
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    $output = curl_exec($ch);
    curl_close($ch);
    return $output;
}
echo getcontents('https://example.com');
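If switching to cURL is not an option, the compressed body can also be unpacked after the fact. The sketch below is one possible approach, not part of the original article: it assumes the zlib extension is loaded, and gunzip_if_needed() is a made-up helper name. It looks for the two gzip magic bytes and leaves anything else untouched.

```php
<?php
/**
 * Decompress $body if it looks like a gzip stream, otherwise return it unchanged.
 * Hypothetical helper; requires the zlib extension for gzdecode().
 */
function gunzip_if_needed(string $body): string
{
    // Every gzip stream starts with the bytes 0x1F 0x8B.
    if (strncmp($body, "\x1f\x8b", 2) !== 0) {
        return $body;
    }

    $plain = @gzdecode($body);

    // If decoding fails, the data only looked like gzip; hand it back as is.
    return $plain === false ? $body : $plain;
}

echo gunzip_if_needed(gzencode('hello')), "\n"; // hello
```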


How to encode a PHP file with base64

🙂 I have one ridiculously silly question, and most of you would like to refer me to Google right away, but that didn't help me within the first hour. I suppose I didn't know what to look for. I have a PHP file and I'd like to have it in base64, yet I can't get it to work. 1) I encoded my PHP script to base64 (including the PHP tags). It looks as follows: JTNDJTNGcGhwJTIwVGhpcyUyMGlzJTIwdGhlJTIwUEhQJTIwY29kZSUyMCUzRiUzRQ== This kind of base64 won't execute, so I added the PHP tags to it, although the encoded file already had them. That still didn't work. I removed the tags from the base64 and tried again, but that didn't work either. Then I tried adding the PHP tags and, inside them, added: eval(gzinflate(base64_decode('base64 here'))); That still didn't work. Is anyone here kind enough to tell the kiddo how to run a base64-encoded PHP file properly? It would be really appreciated. 🙂

2 Answers

$source = "JTNDJTNGcGhwJTIwVGhpcyUyMGlzJTIwdGhlJTIwUEhQJTIwY29kZSUyMCUzRiUzRQ==";

[...] this one (http://eaccelerator.net/) or this one. They will encrypt your code and make it even faster!

answered Sep 9, 2013 at 17:36

If you are going to use base64 to encode your PHP file, the encoded text needs to sit between the PHP tags, wrapped in a base64_decode() call.

Example: If your code is: JTNDJTNGcGhwJTIwVGhpcyUyMGlzJTIwdGhlJTIwUEhQJTIwY29kZSUyMCUzRiUzRQ

Then your code should look like:

<?php eval("?>".base64_decode("JTNDJTNGcGhwJTIwVGhpcyUyMGlzJTIwdGhlJTIwUEhQJTIwY29kZSUyMCUzRiUzRQ")); ?>

In general, the code will look like this:

<?php eval("?>".base64_decode("Code Goes here")); ?>

There are also simpler tools that offer this option. Check this out: PHP Encoder & Decoder with Domain Lock
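It may help to look at what the string from the question actually decodes to. The sketch below decodes it step by step and suggests that the script was URL-encoded before it was base64-encoded, which would explain why eval() never saw valid PHP. The second half shows a plain round trip with a tiny stand-in script that is invented for this example.

```php
<?php
// The encoded string from the question.
$encoded = 'JTNDJTNGcGhwJTIwVGhpcyUyMGlzJTIwdGhlJTIwUEhQJTIwY29kZSUyMCUzRiUzRQ==';

$step1 = base64_decode($encoded, true); // still percent-encoded text
$step2 = rawurldecode($step1);          // the opening tag, "This is the PHP code", the closing tag

echo $step1, "\n";
echo $step2, "\n";

// A plain round trip, without URL encoding, using a stand-in script.
$payload = base64_encode('<?php echo 2 + 3;');

ob_start();
// The leading close tag switches eval() out of PHP mode, so the decoded
// text may begin with its own opening tag.
eval('?>' . base64_decode($payload));
$output = ob_get_clean();

echo $output, "\n"; // 5
```

Note that gzinflate() from the question only makes sense if the source was compressed with gzdeflate() before encoding. Base64 is also not encryption: anyone with the file can decode it the same way.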


Read ansi file and convert to UTF-8 string

Is there any way to do that with PHP? The data to be inserted looks fine when I print it out. But when I insert it in the database the field becomes empty.

Try using mysql_real_escape_string() (php.net/manual/en/function.mysql-real-escape-string.php); maybe the string to be inserted contains characters that have a special meaning in MySQL.

I read the strings from the txt file and found that, with mb_detect_encoding($data), some of them return ASCII and some return empty. Any solution?

3 Answers

$tmp = iconv('YOUR CURRENT CHARSET', 'UTF-8', $string); 

The strange thing is that you end up with an empty string in your DB. I could understand ending up with some garbage in your DB, but nothing at all (an empty string) is strange.

I just typed this in my console:

ANSI_X3.4-1968 ANSI_X3.4-1986 ANSI_X3.4 ANSI_X3.110-1983 ANSI_X3.110 MS-ANSI 

These are possible values for YOUR CURRENT CHARSET. As pointed out before, when your input string contains only characters that are already valid UTF-8 (plain ASCII, for example), you don't need to convert anything.

Change UTF-8 to UTF-8//TRANSLIT when you don't want to omit characters but replace them with a lookalike (when they cannot be represented in the target charset).

utf8_encode converts from ISO 8859-1 to UTF-8, so it can only be used if the input encoding is ISO 8859-1.

I tried $data = iconv('ASCII', 'UTF-8', $data); and it outputs: Message: iconv() [function.iconv]: Detected an illegal character in input string

ASCII is a subset of UTF-8. If the data were actually ASCII (which it is not, as the error message states), you wouldn't need to convert.

I read the strings from the txt file and found that, with mb_detect_encoding($data), some of them return ASCII and some return empty. Any solution?

When it returns false, simply open the file and look with your eyes for garbage. Remove it by hand and try again. If this works, you could write a filter function to run before detecting the encoding.

"ANSI" is not really a charset. It's a short way of saying "whatever charset is the default in the computer that creates the data". So you have a double task:

  1. Find out which charset the data is using.
  2. Use an appropriate function to convert into UTF-8.

For #2, I'm normally happy with iconv() but utf8_encode() can also do the job if source data happens to use ISO-8859-1.
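The two steps can be sketched as below. This is an illustration rather than a definitive recipe: to_utf8() is a made-up helper, the default source charset of Windows-1252 is only an assumption that happens to be common for "ANSI" files from Western European Windows systems, and the mbstring extension is required. Instead of guessing the charset, the function only checks whether the data is already valid UTF-8 and otherwise trusts the charset it is told.

```php
<?php
/**
 * Return $data as UTF-8.
 * Data that already validates as UTF-8 (including plain ASCII) is returned unchanged;
 * anything else is converted from $assumedCharset, which the caller must know.
 */
function to_utf8(string $data, string $assumedCharset = 'Windows-1252'): string
{
    if (mb_check_encoding($data, 'UTF-8')) {
        return $data;
    }
    return mb_convert_encoding($data, 'UTF-8', $assumedCharset);
}

echo to_utf8("caf\xE9"), "\n";                     // café
echo to_utf8("\xE1\xE2\xE3", 'ISO-8859-7'), "\n";  // αβγ
```

mb_convert_encoding() with 'ISO-8859-1' as the source covers the same case as utf8_encode(), which has been deprecated since PHP 8.2.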

Update

It looks like you don't know what charset your data is using. In some cases, you can figure it out if you know the country and language of the user (e.g., Spain/Spanish) through the default encoding used by Microsoft Windows in such territory.

I hate those editors that use the word “ANSI”. It’s similar to incorrectly using “Unicode” for UTF-16.

mb_detect_encoding() doesn't really do what most people think. In fact it's close to useless. At most, you can use it to distinguish between UTF-8 and UTF-16, but you need to configure it properly.

Be careful: iconv() can return false if the conversion fails.
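A false return value that is passed on unchecked is one way to end up with an empty field in the database. A minimal sketch of checking the result follows; iconv_or_fail() is a hypothetical wrapper written for this example, and the exact failure behaviour can differ between iconv implementations.

```php
<?php
/**
 * Convert $data with iconv() and throw instead of silently returning false.
 * Hypothetical wrapper for illustration.
 */
function iconv_or_fail(string $from, string $to, string $data): string
{
    // The @ hides the "illegal character" notice; the failure is reported below instead.
    $result = @iconv($from, $to, $data);
    if ($result === false) {
        throw new RuntimeException("Could not convert data from $from to $to");
    }
    return $result;
}

try {
    // 0xE9 is not an ASCII byte, so this conversion is expected to fail.
    iconv_or_fail('ASCII', 'UTF-8', "caf\xE9");
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```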

I am also having a somewhat similar problem: some Chinese characters are mistaken for \n if the file is encoded in UNICODE, but not if it is UTF-8.

To get back to your problem, make sure the encoding of your file is the same as that of your database. Also, using utf8_encode() on text that is already UTF-8 can have unpleasant results. Try using mb_detect_encoding() to see the encoding of the file, but unfortunately this doesn't always work. There is no easy fix for character encoding from what I can see 🙁
