Php csv utf 8 with bom

BOM and CSV encoding

Depending on the software you are using to read your CSV you may need to adjust the CSV document outputting script.

To work as intended you CSV object needs to support the stream filter API

MS Excel on Windows

On Windows, MS Excel expects an UTF-8 encoded CSV with its corresponding BOM character. To fulfill this requirement, you simply need to add the UTF-8 BOM character if needed as explained below:

use League\Csv\Reader; $reader = Reader::createFromPath('/path/to/my/file.csv', 'r'); //let's set the output BOM $reader->setOutputBOM(Reader::BOM_UTF8); //let's convert the incoming data from iso-88959-15 to utf-8 $reader->addStreamFilter('convert.iconv.ISO-8859-15/UTF-8'); //BOM detected and adjusted for the output echo $reader->getContent(); 

The conversion is done with the iconv extension using its bundled stream filters.

MS Excel on MacOS

On a MacOS system, MS Excel requires a CSV encoded in UTF-16 LE using the tab character as delimiter. Here’s an example on how to meet those requirements using the League\Csv package.

use League\Csv\CharsetConverter; use League\Csv\Reader; use League\Csv\Writer; //the current CSV is ISO-8859-15 encoded with a ";" delimiter $origin = Reader::createFromPath('/path/to/french.csv', 'r'); $origin->setDelimiter(';'); //let's use stream resource $writer = Writer::createFromStream(fopen('php://temp', 'r+')); //let's set the output BOM $writer->setOutputBOM(Reader::BOM_UTF16_LE); //we set the tab as the delimiter character $writer->setDelimiter("\t"); //let's convert the incoming data from iso-88959-15 to utf-16 CharsetConverter::addTo($writer, 'ISO-8859-15', 'UTF-16'); //we insert csv data $writer->insertAll($origin); //all is good let's output the results $writer->output('mycsvfile.csv'); 

The conversion is done with the mbstring extension using the League\Csv\CharsetConverter.

Читайте также:  example-like- php mysql examples | w3resource

Skipping The BOM sequence with the Reader class

In order to ensure the correct removal of the sequence and avoid bugs while parsing the CSV, the filter can skip the BOM sequence completely when using the Reader class and convert the CSV content from the BOM sequence encoding charset to UTF-8. To work as intended call the Reader::includeInputBOM method to ensure the default BOM removal behaviour is disabled and add the stream filter to you reader instance using the static method CharsetConverter::skipBOM method;

 use League\Csv\Reader; use League\Csv\CharsetConverter; $input = Reader::BOM_UTF16_BE."john,doe,john.doe@example.com\njane,doe,jane.doe@example.com\n"; $document = Reader::createFromString($input); $document->includeInputBOM(); // de-activate the default skipping mechanism CharsetConverter::addBOMSkippingTo($document); var_dump([. $document]); // returns the document content without the skipped BOM sequence // [ // ['john', 'doe', 'john.doe@example.com'], // ['jane', 'doe', 'jane.doe@example.com'], // ] 

Once the filter is applied, the Reader instance looses all information regarding its own BOM sequence. The sequence is still be present but the instance is no longer able to detect it.

Источник

fputcsv

fputcsv() formats a line (passed as a fields array) as CSV and writes it (terminated by a newline) to the specified file stream .

Parameters

The file pointer must be valid, and must point to a file successfully opened by fopen() or fsockopen() (and not yet closed by fclose() ).

The optional separator parameter sets the field delimiter (one single-byte character only).

The optional enclosure parameter sets the field enclosure (one single-byte character only).

The optional escape parameter sets the escape character (at most one single-byte character). An empty string ( «» ) disables the proprietary escape mechanism.

The optional eol parameter sets a custom End of Line sequence.

Note:

If an enclosure character is contained in a field, it will be escaped by doubling it, unless it is immediately preceded by an escape .

Return Values

Returns the length of the written string or false on failure.

Changelog

Version Description
8.1.0 The optional eol parameter has been added.
7.4.0 The escape parameter now also accepts an empty string to disable the proprietary escape mechanism.

Examples

Example #1 fputcsv() example

$list = array (
array( ‘aaa’ , ‘bbb’ , ‘ccc’ , ‘dddd’ ),
array( ‘123’ , ‘456’ , ‘789’ ),
array( ‘»aaa»‘ , ‘»bbb»‘ )
);

foreach ( $list as $fields ) fputcsv ( $fp , $fields );
>

The above example will write the following to file.csv :

aaa,bbb,ccc,dddd 123,456,789 """aaa""","""bbb"""

Notes

Note: If PHP is not properly recognizing the line endings when reading files either on or created by a Macintosh computer, enabling the auto_detect_line_endings run-time configuration option may help resolve the problem.

See Also

User Contributed Notes 28 notes

If you need to send a CSV file directly to the browser, without writing in an external file, you can open the output and use fputcsv on it..

$out = fopen ( ‘php://output’ , ‘w’ );
fputcsv ( $out , array( ‘this’ , ‘is some’ , ‘csv «stuff», you know.’ ));
fclose ( $out );
?>

If you need to save the output to a variable (e.g. for use within a framework) you can write to a temporary memory-wrapper and retrieve it’s contents:

// output up to 5MB is kept in memory, if it becomes bigger it will automatically be written to a temporary file
$csv = fopen ( ‘php://temp/maxmemory:’ . ( 5 * 1024 * 1024 ), ‘r+’ );

fputcsv ( $csv , array( ‘blah’ , ‘blah’ ));

// put it all in a variable
$output = stream_get_contents ( $csv );
?>

Sometimes it’s useful to get CSV line as string. I.e. to store it somewhere, not in on a filesystem.

function csvstr (array $fields ) : string
$f = fopen ( ‘php://memory’ , ‘r+’ );
if ( fputcsv ( $f , $fields ) === false ) return false ;
>
rewind ( $f );
$csv_line = stream_get_contents ( $f );
return rtrim ( $csv_line );
>
?>

if you want make UTF-8 file for excel, use this:

$fp = fopen($filename, ‘w’);
//add BOM to fix UTF-8 in Excel
fputs($fp, $bom =( chr(0xEF) . chr(0xBB) . chr(0xBF) ));

Please note, that fputcsv ist not always enclosing strings with the enclosure character.

fclose ( $fh );
?>

results in a file containing the line:
«One 1» Two «Three 3»

It seems that only strings containing at least one of the following characters are enclosed:

— the delimiter character
— the enclosure character
— the escape character
— \n (new line)
— \r (line feed)
— \t (tab)
— blank

I hope this saves you the hour it took me to get to the bottom of this behaviour.

Using fputcsv to output a CSV with a tab delimiter is a little tricky since the delimiter field only takes one character.
The answer is to use the chr() function. The ascii code for tab is 9, so chr(9) returns a tab character.

fputcsv ( $fp , $foo , ‘\t’ ); //won’t work
fputcsv ( $fp , $foo , ‘ ‘ ); //won’t work

it should be:
fputcsv ( $fp , $foo , «\t» );
?>
you just forgot that single quotes are literal. meaning whatever you put there that’s what will come out so ‘\t’ would be same as ‘t’ because \ in that case would be only used for escaping but if you use double quotes then that would work.

the solution for how to solve the encoding problem while converting an array to csv file is below.

$fp = fopen(‘php://memory’, ‘w’);
//add BOM to fix UTF-8 in Excel
fputs($fp, $bom =( chr(0xEF) . chr(0xBB) . chr(0xBF) ));
// output the column headings
//fputcsv($fp, array(‘Topic’, ‘Title’, ‘URL’, ‘Keywords’, ‘Score’, ‘FB_count’, ‘TW_count’, ‘|’));
if(isset($trend)) foreach ( $trend as $myField ) fputcsv($fp, $myField, ‘|’);
>
>

Utility function to output a mysql query to csv with the option to write to file or send back to the browser as a csv attachment.

function query_to_csv ( $db_conn , $query , $filename , $attachment = false , $headers = true )

if( $attachment ) // send response headers to the browser
header ( ‘Content-Type: text/csv’ );
header ( ‘Content-Disposition: attachment;filename=’ . $filename );
$fp = fopen ( ‘php://output’ , ‘w’ );
> else $fp = fopen ( $filename , ‘w’ );
>

$result = mysql_query ( $query , $db_conn ) or die( mysql_error ( $db_conn ) );

if( $headers ) // output header row (if at least one row exists)
$row = mysql_fetch_assoc ( $result );
if( $row ) fputcsv ( $fp , array_keys ( $row ));
// reset pointer back to beginning
mysql_data_seek ( $result , 0 );
>
>

while( $row = mysql_fetch_assoc ( $result )) fputcsv ( $fp , $row );
>

// Using the function
$sql = «SELECT * FROM table» ;
// $db_conn should be a valid db handle

// output as an attachment
query_to_csv ( $db_conn , $sql , «test.csv» , true );

// output to file system
query_to_csv ( $db_conn , $sql , «test.csv» , false );
?>

I’ve created a function for quickly generating CSV files that work with Microsoft applications. In the field I learned a few things about generating CSVs that are not always obvious. First, since PHP is generally *nix-based, it makes sense that the line endings are always \n instead of \r\n. However, certain Microsoft programs (I’m looking at you, Access 97), will fail to recognize the CSV properly unless each line ends with \r\n. So this function changes the line endings accordingly. Secondly, if the first column heading / value of the CSV file begins with uppercase ID, certain Microsoft programs (ahem, Excel 2007) will interpret the file as being in the SYLK format rather than CSV, as described here: http://support.microsoft.com/kb/323626

This function accommodates for that as well, by forcibly enclosing that first value in quotes (when this doesn’t occur automatically). It would be fairly simple to modify this function to use another delimiter if need be and I leave that as an exercise to the reader. So quite simply, this function is used for outputting CSV data to a CSV file in a way that is safe for use with Windows applications. It takes two parameters + one optional parameter: the location of where the file should be saved, an array of data rows, and an optional array of column headings. (Technically you could omit the headings array and just include it as the first row of the data, but it is often useful to keep this data stored in different arrays in practice.)

function mssafe_csv ( $filepath , $data , $header = array())
<
if ( $fp = fopen ( $filepath , ‘w’ ) ) <
$show_header = true ;
if ( empty( $header ) ) <
$show_header = false ;
reset ( $data );
$line = current ( $data );
if ( !empty( $line ) ) <
reset ( $line );
$first = current ( $line );
if ( substr ( $first , 0 , 2 ) == ‘ID’ && ! preg_match ( ‘/[«\\s,]/’ , $first ) ) <
array_shift ( $data );
array_shift ( $line );
if ( empty( $line ) ) <
fwrite ( $fp , «\» < $first >\»\r\n» );
> else <
fwrite ( $fp , «\» < $first >\»,» );
fputcsv ( $fp , $line );
fseek ( $fp , — 1 , SEEK_CUR );
fwrite ( $fp , «\r\n» );
>
>
>
> else <
reset ( $header );
$first = current ( $header );
if ( substr ( $first , 0 , 2 ) == ‘ID’ && ! preg_match ( ‘/[«\\s,]/’ , $first ) ) <
array_shift ( $header );
if ( empty( $header ) ) <
$show_header = false ;
fwrite ( $fp , «\» < $first >\»\r\n» );
> else <
fwrite ( $fp , «\» < $first >\»,» );
>
>
>
if ( $show_header ) <
fputcsv ( $fp , $header );
fseek ( $fp , — 1 , SEEK_CUR );
fwrite ( $fp , «\r\n» );
>
foreach ( $data as $line ) <
fputcsv ( $fp , $line );
fseek ( $fp , — 1 , SEEK_CUR );
fwrite ( $fp , «\r\n» );
>
fclose ( $fp );
> else <
return false ;
>
return true ;
>

Alright, after playing a while, I’m confident the following replacement function works in all cases, including the ones for which the native fputcsv function fails. If fputcsv fails to work for you (particularly with mysql csv imports), try this function as a drop-in replacement instead.

Arguments to pass in are exactly the same as for fputcsv, though I have added an additional $mysql_null boolean which allows one to turn php null’s into mysql-insertable nulls (by default, this add-on is disabled, thus working identically to fputcsv [except this one works!]).

function fputcsv2 ( $fh , array $fields , $delimiter = ‘,’ , $enclosure = ‘»‘ , $mysql_null = false ) <
$delimiter_esc = preg_quote ( $delimiter , ‘/’ );
$enclosure_esc = preg_quote ( $enclosure , ‘/’ );

$output [] = preg_match ( «/(?: $ < delimiter_esc >| $ < enclosure_esc >|\s)/» , $field ) ? (
$enclosure . str_replace ( $enclosure , $enclosure . $enclosure , $field ) . $enclosure
) : $field ;
>

fwrite ( $fh , join ( $delimiter , $output ) . «\n» );
>

// the _EXACT_ LOAD DATA INFILE command to use
// (if you pass in something different for $delimiter
// and/or $enclosure above, change them here too;
// but _LEAVE ESCAPED BY EMPTY!_).
/*
LOAD DATA INFILE
‘/path/to/file.csv’

Источник

Оцените статью