JavaScript files and UTF-8

What charset should I use to store Russian text in JavaScript files as an array?

I am creating a ColdFusion page that takes language translation data stored in a table in my database and generates static .js files for each language pairing (English to ___, and so on). The other languages worked fine, but now that I am working on Russian, all the text in the saved file looks like question marks. Even when I run my translation app, the text for just that language looks like all question marks. I have tried writing the file via cffile as UTF-8 and as ISO-8859-1, but neither displays properly. Any suggestions?

Here is an example of what I see and what I get: "Запуск Мой календарь", which in English means "Launch My Calendar". When it is saved to the file it becomes "·ÐßãáÚ ¼ÞÙ ÚÐÛÕÝÔÐàì", so the charset is wrong.
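One way to diagnose this kind of garbling is to try to reproduce it. The Node.js sketch below suggests that the output quoted above is what appears when the text is written as ISO-8859-5 bytes and then read back as ISO-8859-1, and that question marks appear when the target charset has no Cyrillic letters at all. The helper `encodeIso88595` is hypothetical, written only for this illustration, and covers just ASCII and the basic Russian letters.

```javascript
// Hypothetical helper for illustration: encode ASCII plus the basic
// Russian letters (U+0410..U+044F) as ISO-8859-5 bytes.
function encodeIso88595(text) {
  const bytes = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {
      bytes.push(cp); // ASCII is identical in ISO-8859-5
    } else if (cp >= 0x0410 && cp <= 0x044f) {
      bytes.push(cp - 0x0360); // these letters occupy bytes 0xB0..0xEF
    } else {
      bytes.push(0x3f); // anything else becomes "?"
    }
  }
  return Buffer.from(bytes);
}

const original = "Запуск Мой календарь";

// ISO-8859-5 bytes misread as ISO-8859-1 (Node.js calls it "latin1").
const garbled = encodeIso88595(original).toString("latin1");

// Simulated lossy conversion to ISO-8859-1, which has no Cyrillic letters:
// every character outside U+0000..U+00FF is substituted with "?".
const lossy = original.replace(/[^\u0000-\u00ff]/g, "?");

console.log(garbled); // ·ÐßãáÚ ¼ÞÙ ÚÐÛÕÝÔÐàì
console.log(lossy);   // ?????? ??? ?????????
```

If the garbled output matches what you see on disk, the bytes themselves are probably intact and only the charset label used to read them is wrong.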

4 Answers

Have you tried ISO-8859-5? I believe it’s the encoding that "should" be used for Russian.

By all means do use UTF-8 over any other encoding type. You need to make sure that:

  • your .cfm templates were written to disk with UTF-8 encoding (Notepad++ handles that nicely, and so does Eclipse or the new ColdFusion Builder)
  • your database was created with the proper codepage for nvarchar (and varchar) datatypes
  • your database connection handles UTF-8

How to go about the last two items depends on your database back-end. ColdFusion is quite agnostic in that regard, as it will happily use any JDBC driver that you may need.

When working in a multi-character set environment, character set conversion issues can occur and it can be difficult to determine where the conversion issue occurred.

There are two categories into which conversion issues can be placed. The first involves sending data in the wrong format to the client API. Although this cannot happen with Unicode APIs, it is possible with all other client APIs and results in garbage data.

The second category of issue involves a character that does not have an equivalent in the final character set, or in one of the intermediate character sets. In this case, a substitution character is used. This is called lossy conversion and can happen with any client API. You can avoid lossy conversions by configuring the database to use UTF-8 for the database character set.

The advantage of UTF-8 over any other encoding is that you can handle any number of languages in the same database / client.
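The same idea applies to the generated .js files themselves. The question uses ColdFusion's cffile; the sketch below uses Node.js instead, purely as an illustration, and shows two options: writing the file explicitly as UTF-8, or escaping every non-ASCII character as `\uXXXX` so that the file is plain ASCII and survives any charset mislabeling. The translation data, the file name `translations_ru.js`, and the helper `toAsciiJs` are all assumptions made for this example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical translation data; in the original setup it comes from the database.
const translations = { "Launch My Calendar": "Запуск Мой календарь" };

// Option 1: write the generated script to disk explicitly as UTF-8.
const source = "var translations = " + JSON.stringify(translations) + ";\n";
const utf8Path = path.join(os.tmpdir(), "translations_ru.js"); // hypothetical name
fs.writeFileSync(utf8Path, source, { encoding: "utf8" });

// Option 2: escape every non-ASCII UTF-16 code unit as \uXXXX so the
// resulting script is pure ASCII and independent of the served charset.
function toAsciiJs(js) {
  return js.replace(/[\u0080-\uffff]/g, (c) =>
    "\\u" + c.charCodeAt(0).toString(16).padStart(4, "0"));
}
const asciiSource = toAsciiJs(source);

console.log(asciiSource);
```

Option 2 trades readability of the generated file for robustness: the JavaScript engine turns the escapes back into the original characters when it parses the string literals.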


How to write a file in Node.js using the UTF-8 encoding with BOM

Carlos Delgado

Learn how to write text in UTF-8 encoding with BOM in Node.js easily.

The UTF-8 BOM (Byte Order Mark) is a sequence of bytes placed at the start of a text stream that allows the reader to more reliably guess that a file is encoded in UTF-8. Normally, the BOM is used to signal the endianness of an encoding, but since endianness is irrelevant to UTF-8, the BOM is unnecessary. However, some clients still require the BOM so that text files open correctly in their default text editors. You can check whether a .txt file has a BOM by opening it in Visual Studio Code:

[Screenshot: Visual Studio Code status bar showing "UTF-8 with BOM"]

In the bottom area, you will see that the default encoding used to read the file is UTF-8 with BOM and the editor was able to determine the encoding from its content.
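The same check can be done programmatically. The sketch below tests whether a buffer begins with the three bytes EF BB BF, which is how U+FEFF is encoded in UTF-8. The function name `hasUtf8Bom` is an assumption for this example, not a Node.js API.

```javascript
// Hypothetical helper: true when the buffer starts with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom(buffer) {
  return (
    buffer.length >= 3 &&
    buffer[0] === 0xef &&
    buffer[1] === 0xbb &&
    buffer[2] === 0xbf
  );
}

const withBom = Buffer.from("\ufeffHello World!", "utf8");
const withoutBom = Buffer.from("Hello World!", "utf8");

console.log(hasUtf8Bom(withBom));    // true
console.log(hasUtf8Bom(withoutBom)); // false
```

For a file on disk, the buffer returned by `fs.readFileSync(path)` (called without an encoding) can be passed to the same function.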

In this tutorial, I will explain to you how to write a file with the mentioned encoding in Node.js.

File with UTF-8BOM encoding

All that you need to do to add a BOM to a file written with UTF-8 is to prepend \ufeff to the content. The following example writes two files using Node.js's built-in fs module: one with plain UTF-8 and the other with UTF-8 with BOM:

    // Import FileSystem
    const fs = require('fs');

    // Regular content of the file
    let fileContent = "Hello World!";

    // The paths of the new files, including their names
    let filepathUTF8 = "./utf8_file.txt";
    let filepathUTF8WithBOM = "./utf8bom_file.txt";

    // 1. Write a file with UTF-8 (the default encoding for string data)
    fs.writeFile(filepathUTF8, fileContent, (err) => {
        if (err) throw err;

        console.log("The file was successfully saved with UTF-8!");
    });

    // 2. Write a file with UTF-8 with BOM
    // Note that you prepend \ufeff to the content that will be written
    fs.writeFile(filepathUTF8WithBOM, "\ufeff" + fileContent, (err) => {
        if (err) throw err;

        console.log("The file was successfully saved with UTF-8 with BOM!");
    });
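One consequence worth keeping in mind when reading such a file back: Node.js does not remove the BOM when decoding as UTF-8, so the string starts with the character U+FEFF. The sketch below shows one way to strip it. The helper name `stripBom` and the temporary file name are assumptions made for this example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: remove a leading BOM character (U+FEFF) if present.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

// Write a small file with a BOM, then read it back.
const demoPath = path.join(os.tmpdir(), "utf8bom_demo.txt"); // hypothetical name
fs.writeFileSync(demoPath, "\ufeff" + "Hello World!", "utf8");

const raw = fs.readFileSync(demoPath, "utf8"); // the BOM is still the first character
const clean = stripBom(raw);

console.log(raw.length, clean.length); // 13 12
```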


International Language Support in JavaScript

JavaScript is built to support a wide variety of world languages and their characters – from the old US ASCII up to the rapidly spreading UTF-8. This page clears up some of the difficulties encountered when dealing with multiple languages and their related characters.

JavaScript and Character Sets

When working with non-European character sets ("charsets"), you may need to make changes to the way your page references external JavaScript (.js) files. Ideally, your .js files should be saved in the UTF-8 character set in order to maximize their multilingual features, though you can use a different charset that supports your language, at the potential expense of users who can’t support it. Once your files are saved as UTF-8, they must be "served" in the UTF-8 charset in order to display correctly. There are a few ways to ensure this:

Serve the Web Page as UTF-8

If your page is already served as UTF-8 (i.e. with the header Content-Type: text/html; charset=UTF-8), you don’t need to make any changes: external scripts referenced by an HTML document are decoded using the document’s charset unless you explicitly specify otherwise. You can serve the page as UTF-8 in any of these ways:

  • Use the Content-Type meta tag: place <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> at the TOP of your page’s <head> section
  • Edit your webserver configuration to serve all documents as UTF-8
  • Send the Content-Type header via your server-side scripts (e.g. PHP, ASP, JSP)
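As an illustration of the last option, here is a minimal sketch of a server-side script that sends the header itself. It uses Node.js rather than PHP, ASP or JSP, and the helper names `contentTypeFor` and `createStaticServer`, as well as the directory and port in the usage note, are assumptions made for this example.

```javascript
const http = require("http");
const fs = require("fs");
const path = require("path");

// Hypothetical helper: choose a Content-Type that names the charset explicitly.
function contentTypeFor(filePath) {
  const types = {
    ".js": "text/javascript; charset=utf-8",
    ".html": "text/html; charset=utf-8",
  };
  return types[path.extname(filePath).toLowerCase()] || "application/octet-stream";
}

// Hypothetical static server for the files directly inside one directory.
function createStaticServer(rootDir) {
  return http.createServer((req, res) => {
    // Use only the file name from the URL so requests cannot leave rootDir.
    const fileName = path.basename(req.url.split("?")[0]);
    const filePath = path.join(rootDir, fileName);
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end();
        return;
      }
      res.writeHead(200, { "Content-Type": contentTypeFor(filePath) });
      res.end(data);
    });
  });
}
```

A call such as `createStaticServer("./public").listen(8080)` would start it; both the directory and the port are placeholders.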

Use the charset attribute of the <script> tag

The easiest way to ensure your script is read as UTF-8 is to add a charset attribute (charset="utf-8") to the <script> tags in the parent page.

Modify your .htaccess files (Apache Only)

You can also configure your webserver to serve all .js files in the UTF-8 charset, or only the .js files in a single directory. In Apache, you can do the latter by adding a directive such as AddCharset UTF-8 .js to the .htaccess file in the directory where your scripts are stored.


Reading in a UTF-8 file (JavaScript XMLHttpRequest) gives bad European characters

Can anyone help? I have a small procedure that reads in a UTF-8 file with JavaScript using XMLHttpRequest. The file contains European characters such as miércoles and sábado (notice the accents), but when it is read in, the characters are all messed up. I have checked the file and it is perfect, so it must be the procedure for reading it in.

Here is an example. The file contains the line below. It happens to be JavaScript, but that doesn’t matter: any UTF-8 encoded file with special characters gives me the same issue.

    this.weekDays = new Array("Lunes", "Martes", "Miércoles", "Jueves", "Viernes", "Sábado", "Domingo");

But when it is returned and read by the procedure below, it looks like this (notice the funny characters in sábado and miércoles):

    this.weekDays = new Array("Lunes", "Martes", "Miércoles", "Jueves", "Viernes", "Sábado", "Domingo");

Here is my procedure; it’s very small.

    var contentType = "application/x-www-form-urlencoded; charset=utf-8";
    var request = new XMLHttpRequest();
    request.open("GET", path, false);
    request.setRequestHeader('Content-type', contentType);
    if (request.overrideMimeType) request.overrideMimeType(contentType);

    try {
        request.send(null);
    } catch (e) {
        return null;
    }

    if (request.status == 500 || request.status == 404 || request.status == 2 || (request.status == 0 && request.responseText == ''))
        return null;

    // PROBLEM HERE is with European characters that are read in
    print(request.responseText);
    return request.responseText;

Are you sure the file is in UTF-8? Did you set your text editor to save it with that encoding explicitly? Setting the request to UTF-8 is irrelevant; is the response really in UTF-8, and is the corresponding header set in the response?

This is old, but for anyone stumbling on this: use the .overrideMimeType('text/plain; charset=utf8') method of the XMLHttpRequest object (from MDN, Using XMLHttpRequest).
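A minimal sketch of that suggestion follows, assuming the file really is stored as UTF-8. The function name `loadUtf8Text` is hypothetical and the function is meant for a browser; the last few lines run in Node.js and only illustrate what happens when UTF-8 bytes are decoded with the wrong charset.

```javascript
// Browser-side sketch (hypothetical helper): force the response to be
// decoded as UTF-8, whatever charset the server declares.
function loadUtf8Text(path) {
  const request = new XMLHttpRequest();
  request.open("GET", path, false); // synchronous, as in the question
  request.overrideMimeType("text/plain; charset=utf-8"); // must be called before send()
  request.send(null);
  return request.status === 200 ? request.responseText : null;
}

// Node.js illustration of the symptom: the same UTF-8 bytes decoded two ways.
const utf8Bytes = Buffer.from("Mi\u00e9rcoles", "utf8");
const misread = utf8Bytes.toString("latin1");               // wrong charset
const correct = new TextDecoder("utf-8").decode(utf8Bytes); // right charset

console.log(misread); // MiÃ©rcoles
console.log(correct); // Miércoles
```

If the text still looks wrong after forcing UTF-8, the file itself is probably not stored as UTF-8, which is the point raised in the previous comment.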
