Split string by new line characters
I have a string with new line characters. I want to convert that string into an array, and for every new line, jump one index place in the array. If the string is:
My text1
My text2
My text3

The result I want is:

Array
(
    [0] => My text1
    [1] => My text2
    [2] => My text3
)
You should normalize the newlines first, probably. The method s($yourString)->normalizeLineEndings() is available with github.com/delight-im/PHP-Str (library under MIT License) which has lots of other useful string helpers. You may want to take a look at the source code.
I’ve always used this with great success:
$array = preg_split("/\r\n|\n|\r/", $string);
(updated with the final \r, thanks @LobsterMan)
This is the answer. The validated one is WRONG. Well, hesselbom has it too. You could also use this near-equivalent: preg_split('/\n|\r/', $string, -1, PREG_SPLIT_NO_EMPTY); for the sake of beauty 🙂 (note that PREG_SPLIT_NO_EMPTY also drops blank lines). Why is this the only good answer? Because you cannot assume what type of end of line you will get: Mac (\r), Windows (\r\n), or Unix (\n).
This example is correct because you can't use a split operation that is based on a single character. If you did that, splitting on '\r' or '\n', you would end up with a superfluous empty line for every Windows "\r\n" ending. And it's important that the two-character Windows separator is tested first.
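To illustrate the point about ordering, here is a minimal sketch comparing the alternation with a naive single-character split. The sample string and variable names are made up for illustration.

```php
<?php
// Sample input mixing Windows (\r\n), Unix (\n) and old Mac (\r) endings.
$string = "one\r\ntwo\nthree\rfour";

// The two-character \r\n is tried first, so it is consumed as one separator.
$good = preg_split("/\r\n|\n|\r/", $string);

// Splitting on any single \r or \n treats \r\n as two separators,
// which leaves an empty element between "one" and "two".
$naive = preg_split("/[\r\n]/", $string);

var_export($good);
var_export($naive);
```

The first call yields four elements; the second yields five, with an empty string in the second position.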
You can use the explode function, using "\n" as separator:
$your_array = explode("\n", $your_string_from_db);
For instance, if you have this piece of code:
$str = "My text1\nMy text2\nMy text3";
$arr = explode("\n", $str);
var_dump($arr);

You'd get this output:

array
  0 => string 'My text1' (length=8)
  1 => string 'My text2' (length=8)
  2 => string 'My text3' (length=8)
Note that you have to use a double-quoted string, so \n is actually interpreted as a line-break.
(See the PHP manual page on double-quoted strings for more details.)
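A small sketch of why the quoting matters (the variable names are only for illustration): in single quotes, \n stays a backslash followed by the letter n, so explode finds nothing to split on.

```php
<?php
$str = "My text1\nMy text2";

// Double quotes: "\n" is a single line-feed character.
$split = explode("\n", $str);

// Single quotes: '\n' is a backslash plus the letter n, which does not
// occur in $str, so the whole string comes back as one element.
$unsplit = explode('\n', $str);

var_dump(strlen("\n"), strlen('\n'), count($split), count($unsplit));
```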
Everyone please be careful with this solution, as it does not work on all newlines. I’ve had the most success with David’s answer
You must split at either \n or \r to be able to handle all kinds of text. This will only work with Linux and Windows newlines; Mac newlines (\r) will be disregarded!
I guess Tim's answer/comment is not right, because this will only match the line break on YOUR system; when you get strings that have line breaks from other systems, it won't work! I had this problem with emails.
Nope, this answer and the comments on this answer are WRONG! This does not factor in the OS newline character, especially not PHP_EOL. You must use preg_split("/\\r\\n|\\r|\\n/", $value).
A line break is defined differently on different platforms, \r\n, \r or \n.
Using RegExp to split the string you can match all three with \R
$array = preg_split ('/$\R?^/m', $string);
That would match line breaks on Windows, Mac and Linux!
I used this as well instead of the accepted answer and any other answers on this thread. Just leaving this comment for information.
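A minimal sketch of splitting with \R, using a made-up sample string. It uses the plain \R form rather than the $\R?^ pattern, because how $ and ^ behave in multiline mode depends on PCRE's newline setting, which is an assumption I would rather not rely on here.

```php
<?php
// Sample input mixing \r\n, \n and \r line endings.
$string = "one\r\ntwo\nthree\rfour";

// \R matches any line break sequence and treats \r\n as a single unit.
$lines = preg_split('/\R/', $string);

var_export($lines);
```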
PHP already knows the current system's newline character(s). Just use the PHP_EOL constant.
Yes, but one can edit a file or as in this example a db entry in Windows then use it on a Linux system, for example. I think a general approach would suit better.
I’m not sure this is right. If a text area on a web page gets submitted, it could have different end-of-line characters depending on the user’s browser, not the server’s operating system. So no matter what OS you are using, you will need to be able to parse anything. This is assuming you’re doing web stuff of course.
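A sketch of the problem described in these comments, assuming a hypothetical form submission whose line breaks arrive as \r\n. The result of the PHP_EOL split depends on the server platform, so only the regex result is predictable.

```php
<?php
// Hypothetical submitted text; the \r\n endings are an assumption
// about what the client sent.
$submitted = "first\r\nsecond\r\nthird";

// Platform dependent: where PHP_EOL is "\n" (e.g. Linux), every line
// except the last keeps a trailing "\r".
$byEol = explode(PHP_EOL, $submitted);

// Independent of the server platform.
$byRegex = preg_split('/\r\n|\n|\r/', $submitted);

var_export($byEol);
var_export($byRegex);
```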
An alternative to David's answer which is faster (way faster) is to use str_replace and explode.
$arrayOfLines = explode("\n", str_replace(["\r\n","\n\r","\r"],"\n",$str) );
What’s happening is:
Since line breaks can come in different forms, I str_replace \r\n, \n\r, and \r with \n instead (and original \n are preserved).
Then explode on \n and you have all the lines in an array.
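The two steps can be wrapped in a small helper. This is only a sketch: split_lines is a hypothetical name, and it leaves "\n\r" out of the search list, so a stray "\n\r" would be treated as two line breaks.

```php
<?php
/**
 * Hypothetical helper: normalise line endings to \n, then explode.
 */
function split_lines(string $str): array
{
    // str_replace handles the search array in order, so "\r\n" is
    // replaced before the lone "\r" rule runs.
    $normalized = str_replace(["\r\n", "\r"], "\n", $str);

    return explode("\n", $normalized);
}

var_export(split_lines("a\r\nb\rc\nd"));
```

Blank lines are kept as empty strings, for example split_lines("x\r\n\r\ny") returns three elements.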
I did a benchmark on the src of this page and split the lines 1000 times in a for loop and:
preg_replace took an avg of 11 seconds
str_replace & explode took an avg of about 1 second
More detail and benchmark info on my forum
@mickmackusa, none that I know of. It was so long ago, I’m not sure why I did it that way. Thinking I should remove the \n\r ?
As far as I know, you only need to retain \r\n in the search array (and for that matter, it doesn’t need to be an array anymore). I am interested in the claim that preg_ was 11 times slower. You did not include the pattern that you used. Your above snippet makes 4 passes over the input. A good preg_ technique will make only one pass over the input. Regex is not known for its speed, but I think your claim requires substantiation. Please post your benchmark details if you are going to keep this performance claim in your answer.
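For anyone who wants to check the performance claim themselves, here is a rough timing sketch. The input is synthetic, time_it is a hypothetical helper, and the numbers will vary by machine and PHP version, so treat it as a starting point rather than a verdict.

```php
<?php
// Synthetic input (an assumption; the original benchmark data is not given).
$input = str_repeat("some line of text\r\nanother line\nthird\r", 1000);

// Hypothetical helper: run a callable $runs times and return elapsed seconds.
function time_it(callable $fn, int $runs): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Arrow functions (PHP 7.4+) capture $input by value.
$viaRegex = fn() => preg_split('/\r\n|\n|\r/', $input);
$viaReplace = fn() => explode("\n", str_replace(["\r\n", "\r"], "\n", $input));

printf("preg_split:          %.4f s\n", time_it($viaRegex, 100));
printf("str_replace+explode: %.4f s\n", time_it($viaReplace, 100));
```

Both callables return the same array for this input, so the comparison is at least like for like.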
$array = preg_split("/(\r\n|\n|\r)/", $string);
You don’t need preg_* functions, preg patterns, str_replace within, etc. in order to successfully break a string into array by newlines. In all scenarios, be it Linux, Mac, or Windows, this will do.
PHP_EOL is a constant holding the line break character(s) used by the server platform.
The file could come from another system with different newlines, particularly in a networked environment, which is what PHP is used for.
If you are using pure UTF-8 all the time, everywhere, UTF-8 files included, and you have nothing but PHP_EOL in your code for line break detection, it will match as described and no unpredictable behavior will occur. Keep in mind that I'm not the only one claiming this; the usability of PHP_EOL is well confirmed.
In your case, if the sources come from somewhere else and aren't well formed, it may be better to use str_replace (faster than regexp). All in all, be it regexp, str_replace or PHP_EOL, there's a good old saying: "If it works, don't touch it!" 🙂
Use: $array = preg_split('/\s*\R\s*/', trim($text), NULL, PREG_SPLIT_NO_EMPTY);
This worked best for me because it also eliminates leading (second \s*) and trailing (first \s*) whitespace automatically and also skips blank lines (the PREG_SPLIT_NO_EMPTY flag).
Options
If you want to keep leading whitespace, simply get rid of the second \s* and make it an rtrim() instead.
$array = preg_split('/\s*\R/', rtrim($text), NULL, PREG_SPLIT_NO_EMPTY);
If you need to keep empty lines, get rid of the NULL (it is only a placeholder) and PREG_SPLIT_NO_EMPTY flag, like so.
$array = preg_split('/\s*\R\s*/', trim($text));
Or keeping both leading whitespace and empty lines.
$array = preg_split('/\s*\R/', rtrim($text));
I don’t see any reason why you’d ever want to keep trailing whitespace, so I suggest leaving the first \s* in there. But, if all you want is to split by new line (as the title suggests), it is this simple (as mentioned by Jan Goyvaerts).
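A sketch comparing the variants on a made-up sample, including the plain split on \R alone. It passes -1 (the documented default limit of preg_split) instead of NULL, since passing null to a non-nullable built-in parameter is deprecated as of PHP 8.1.

```php
<?php
// Made-up sample: indented lines, a blank line, a trailing tab, mixed endings.
$text = "  alpha  \r\n\r\n  beta\t\nGamma  ";

// Trim both sides of every line and skip blank lines.
$trimmed = preg_split('/\s*\R\s*/', trim($text), -1, PREG_SPLIT_NO_EMPTY);

// Keep leading whitespace, drop trailing whitespace and blank lines.
$indented = preg_split('/\s*\R/', rtrim($text), -1, PREG_SPLIT_NO_EMPTY);

// Plain split on every line break: whitespace and the blank line survive.
$plain = preg_split('/\R/', $text);

var_export([$trimmed, $indented, $plain]);
```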
There is quite a mix of direct and indirect answers on this page and some good advice in comments, but there isn’t an answer that represents what I would write in my own project.
$string = '
My text1
My text2
My text3
';

var_export(
    preg_split('/\R+/', $string, 0, PREG_SPLIT_NO_EMPTY)
);

Output:

array (
  0 => 'My text1',
  1 => 'My text2',
  2 => 'My text3',
)
The OP makes no mention of trimming horizontal whitespace characters from the lines, so there is no expectation of removing \s or \h while exploding on variable (system agnostic) new lines.
While PHP_EOL is sensible advice, it lacks the flexibility to appropriately explode the string when the newline sequence is coming from another operating system.
Using a non-regex explode will tend to be less direct because it will require string preparations. Furthermore, there may be mopping up after the explosions if there are unwanted blank lines to remove.
Using \R+ (one or more consecutive newline sequences) and the PREG_SPLIT_NO_EMPTY function flag will deliver a gap-less, indexed array in a single, concise function call. Some people have a bias against regular expressions, but this is a perfect case for why regex should be used. If performance is a concern for valid reasons (e.g. you are processing hundreds of thousands of points of data), then go ahead and invest in benchmarking and micro-optimization. Beyond that, just use this one line of code so that your code is brief, robust, and direct.
str_split
If the optional length parameter is specified, the returned array will be broken down into chunks with each being length in length, except the final chunk which may be shorter if the string does not divide evenly. The default length is 1 , meaning every chunk will be one byte in size.
Errors/Exceptions
If length is less than 1 , a ValueError will be thrown.
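A short sketch of both behaviours, assuming PHP 8.0 or later (on older versions an invalid length produced a warning instead of an exception). The sample strings and variable names are only for illustration.

```php
<?php
// Chunks of three bytes; the final chunk is shorter because
// eight does not divide evenly by three.
$chunks = str_split("abcdefgh", 3);

// A length below 1 throws ValueError on PHP 8.0+.
try {
    str_split("abc", 0);
    $message = 'no exception';
} catch (ValueError $e) {
    $message = get_class($e);
}

var_export($chunks);
echo "\n", $message, "\n";
```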
Changelog
Version | Description |
---|---|
8.2.0 | If string is empty an empty array is now returned. Previously an array containing a single empty string was returned. |
8.0.0 | If length is less than 1 , a ValueError will be thrown now; previously, an error of level E_WARNING has been raised instead, and the function returned false . |
Examples
Example #1 Example uses of str_split()
$str = "Hello Friend";

$arr1 = str_split($str);
$arr2 = str_split($str, 3);

print_r($arr1);
print_r($arr2);
The above example will output:
Array
(
    [0] => H
    [1] => e
    [2] => l
    [3] => l
    [4] => o
    [5] =>  
    [6] => F
    [7] => r
    [8] => i
    [9] => e
    [10] => n
    [11] => d
)
Array
(
    [0] => Hel
    [1] => lo 
    [2] => Fri
    [3] => end
)
Notes
Note:
str_split() will split into bytes, rather than characters when dealing with a multi-byte encoded string. Use mb_str_split() to split the string into code points.
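A minimal sketch of the difference, assuming the mbstring extension is loaded and PHP 7.4 or later (when mb_str_split() was added). The sample word is made up; "\u{e9}" is the letter é, which takes two bytes in UTF-8.

```php
<?php
$word = "h\u{e9}llo"; // "héllo": 5 characters, 6 bytes in UTF-8

// Byte-wise split: the two bytes of "é" land in separate elements.
$bytes = str_split($word);

// Character-wise split with an explicit encoding.
$chars = mb_str_split($word, 1, 'UTF-8');

var_dump(count($bytes), count($chars));
```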
See Also
- mb_str_split() — Given a multibyte string, return an array of its characters
- chunk_split() — Split a string into smaller chunks
- preg_split() — Split string by a regular expression
- explode() — Split a string by a string
- count_chars() — Return information about characters used in a string
- str_word_count() — Return information about words used in a string
- for
User Contributed Notes
The function str_split() is not 'aware' of words. Here is an adaptation of str_split() that is 'word-aware'.
$array = str_split_word_aware(
    'In the beginning God created the heaven and the earth. And the earth was without form, and void; and darkness was upon the face of the deep.',
    32
);

var_dump($array);

/**
 * This function is similar to str_split() but this function keeps words intact; it never splits through a word.
 *
 * @return array
 */
function str_split_word_aware(string $string, int $maxLengthOfLine): array
{
    if ($maxLengthOfLine <= 0) {
        throw new RuntimeException(sprintf('The function %s() requires a max line length of at least one', __FUNCTION__));
    }

    $lines = [];
    $words = explode(' ', $string);
    $currentLine = '';
    $lineAccumulator = '';

    foreach ($words as $currentWord) {
        $currentWordWithSpace = sprintf('%s ', $currentWord);
        $lineAccumulator .= $currentWordWithSpace;

        // The word still fits: keep building the current line.
        if (strlen($lineAccumulator) < $maxLengthOfLine) {
            $currentLine = $lineAccumulator;
            continue;
        }

        // The word does not fit: store the finished line first.
        $lines[] = $currentLine;

        // Overwrite the current line and accumulator with the current word
        $currentLine = $currentWordWithSpace;
        $lineAccumulator = $currentWordWithSpace;
    }

    // Store the last, partially filled line.
    if ($currentLine !== '') {
        $lines[] = $currentLine;
    }

    return $lines;
}

/*
Output:
array(5) {
  [0]=> string(29) "In the beginning God created "
  [1]=> string(30) "the heaven and the earth. And "
  [2]=> string(28) "the earth was without form, "
  [3]=> string(27) "and void; and darkness was "
  [4]=> string(27) "upon the face of the deep. "
}
*/