Python list split lines

Split strings in Python (delimiter, line break, regex, etc.)

This article explains how to split strings by delimiters, line breaks, regular expressions, and the number of characters in Python.


Split by delimiter: split()

Use the split() method to split by delimiter.

If the argument is omitted, the string is split by whitespace (spaces, newlines \n, tabs \t, etc.), and consecutive whitespace characters are treated as a single separator.

A list of the words is returned.

s_blank = 'one two three\nfour\tfive'
print(s_blank)
# one two three
# four    five
print(s_blank.split())
# ['one', 'two', 'three', 'four', 'five']
print(type(s_blank.split()))
# <class 'list'>
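The difference between omitting the argument and passing a single space explicitly is easy to miss. The sketch below illustrates it with a made-up sample string (s_spaces is not part of the examples above).

```python
# Hypothetical sample with leading, trailing, and doubled spaces.
s_spaces = ' one  two '

# No argument: runs of whitespace count as one separator,
# and leading/trailing whitespace is ignored.
no_arg = s_spaces.split()
print(no_arg)
# ['one', 'two']

# Explicit ' ': every single space is a separator,
# so empty strings appear wherever two separators are adjacent.
with_space = s_spaces.split(' ')
print(with_space)
# ['', 'one', '', 'two', '']
```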

Use join(), described below, to concatenate a list into a string.

Specify the delimiter: sep

Pass the delimiter as the first parameter, sep.

s_comma = 'one,two,three,four,five'
print(s_comma.split(','))
# ['one', 'two', 'three', 'four', 'five']
print(s_comma.split('three'))
# ['one,two,', ',four,five']

To specify multiple delimiters, use regular expressions as described later.

Specify the maximum number of splits: maxsplit

Pass the maximum number of splits as the second parameter, maxsplit.

If maxsplit is given, at most maxsplit splits are done (thus, the returned list will have at most maxsplit + 1 elements).

s_comma = 'one,two,three,four,five'
print(s_comma.split(',', 2))
# ['one', 'two', 'three,four,five']
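A common use of maxsplit=1 is to separate a key from a value when the value may itself contain the delimiter. The following is a minimal sketch; the sample string and the names line, key, and value are made up for illustration.

```python
# Hypothetical 'key=value' line whose value also contains '='.
line = 'query=a=b'

# Only the first '=' is used as a split point,
# so the result always has exactly two elements here.
key, value = line.split('=', 1)
print(key)
# query
print(value)
# a=b
```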

For example, maxsplit is helpful for removing the first line from a string.

If you specify sep='\n' and maxsplit=1, you get a list in which the string is split at the first newline character \n only. The second element [1] of this list is the string without its first line. Since it is also the last element, it can be accessed as [-1].

s_lines = 'one\ntwo\nthree\nfour'
print(s_lines)
# one
# two
# three
# four
print(s_lines.split('\n', 1))
# ['one', 'two\nthree\nfour']
print(s_lines.split('\n', 1)[0])
# one
print(s_lines.split('\n', 1)[1])
# two
# three
# four
print(s_lines.split('\n', 1)[-1])
# two
# three
# four

Similarly, to delete the first two lines:

print(s_lines.split('\n', 2)[-1])
# three
# four
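One caveat with this idiom: if the string contains fewer newline characters than maxsplit, the last element is simply the last line, not an empty string. The sketch below shows one way to account for that; the helper name drop_first_lines is hypothetical and not part of Python.

```python
def drop_first_lines(text, n):
    """Return text without its first n lines (hypothetical helper)."""
    parts = text.split('\n', n)
    # All n splits happened only if there are n + 1 parts;
    # otherwise the text had n lines or fewer and nothing remains.
    return parts[-1] if len(parts) == n + 1 else ''

s_lines = 'one\ntwo\nthree\nfour'

print(repr(drop_first_lines(s_lines, 2)))
# 'three\nfour'

# The bare idiom returns the last line when n is too large...
print(repr(s_lines.split('\n', 5)[-1]))
# 'four'

# ...while the helper returns an empty string.
print(repr(drop_first_lines(s_lines, 5)))
# ''
```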

Split from right by delimiter: rsplit()

rsplit() splits from the right of the string.

The result differs from split() only when the maxsplit parameter is provided.

Just as split() can remove the first line, rsplit() can remove the last line.

s_lines = 'one\ntwo\nthree\nfour'
print(s_lines.rsplit('\n', 1))
# ['one\ntwo\nthree', 'four']
print(s_lines.rsplit('\n', 1)[0])
# one
# two
# three
print(s_lines.rsplit('\n', 1)[1])
# four

To delete the last two lines:

print(s_lines.rsplit('\n', 2)[0])
# one
# two
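To see that split() and rsplit() differ only when maxsplit is given, it can help to compare them side by side. This sketch uses a made-up file name as sample data.

```python
filename = 'archive.tar.gz'  # hypothetical sample value

# Without maxsplit, both methods return the same list.
same = filename.split('.') == filename.rsplit('.')
print(same)
# True

# With maxsplit=1, split() cuts at the first dot...
from_left = filename.split('.', 1)
print(from_left)
# ['archive', 'tar.gz']

# ...and rsplit() cuts at the last dot.
from_right = filename.rsplit('.', 1)
print(from_right)
# ['archive.tar', 'gz']
```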

Split by line break: splitlines()

There is also a splitlines() for splitting by line boundaries.

As shown in the previous examples, split() and rsplit() split the string by whitespace, including line breaks, by default. You can also specify line breaks explicitly using the sep parameter.

However, using splitlines() is often more suitable.

For example, consider a string that contains both \n (LF, used in Unix-like systems including macOS) and \r\n (CR + LF, used in Windows).

s_lines_multi = '1 one\n2 two\r\n3 three\n'
print(s_lines_multi)
# 1 one
# 2 two
# 3 three

By default, when split() is applied, it splits not only by line breaks but also by spaces.

print(s_lines_multi.split())
# ['1', 'one', '2', 'two', '3', 'three']

Because sep accepts only one separator string, split() may not work as expected if the string contains mixed newline characters. In addition, a trailing newline produces an empty string at the end of the list.

print(s_lines_multi.split('\n'))
# ['1 one', '2 two\r', '3 three', '']

splitlines() splits at various newline characters but not at other whitespace.

print(s_lines_multi.splitlines())
# ['1 one', '2 two', '3 three']

If the first argument, keepends, is set to True, each line in the result keeps its trailing newline character(s).

print(s_lines_multi.splitlines(True))
# ['1 one\n', '2 two\r\n', '3 three\n']
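A few edge cases make the difference between split('\n') and splitlines() more concrete. The sketch below uses short made-up strings; the variable names are only for illustration.

```python
# Empty string: split() returns one empty element, splitlines() returns none.
empty_split = ''.split('\n')
empty_lines = ''.splitlines()
print(empty_split)
# ['']
print(empty_lines)
# []

# Trailing newline: only split() yields a trailing empty string.
trailing_split = 'a\n'.split('\n')
trailing_lines = 'a\n'.splitlines()
print(trailing_split)
# ['a', '']
print(trailing_lines)
# ['a']

# Blank lines in the middle are preserved by splitlines().
blank_middle = 'a\n\nb'.splitlines()
print(blank_middle)
# ['a', '', 'b']

# Line boundaries other than \n and \r\n are recognized too,
# for example a lone carriage return and a form feed.
other_breaks = 'a\rb\x0cc'.splitlines()
print(other_breaks)
# ['a', 'b', 'c']
```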


Split by regex: re.split()

split() and rsplit() split only where sep matches exactly.

To split on a regular expression (regex) pattern instead of an exact match, use split() from the re module.

In re.split(), specify the regex pattern as the first parameter and the string to split as the second parameter.

Here’s an example of splitting a string by consecutive numbers:

import re

s_nums = 'one1two22three333four'
# A raw string (r'...') keeps the backslash in \d from being treated as a string escape.
print(re.split(r'\d+', s_nums))
# ['one', 'two', 'three', 'four']

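The maximum number of splits can be specified with the third parameter, maxsplit. Passing it as a keyword argument is preferable, since passing it positionally is deprecated as of Python 3.13.

print(re.split(r'\d+', s_nums, maxsplit=2))
# ['one', 'two', 'three333four']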
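Two behaviors of re.split() are worth knowing in addition to the above: a capturing group in the pattern keeps the matched delimiters in the result, and a match at the very start of the string produces a leading empty string. A minimal sketch, where the second sample string is made up:

```python
import re

s_nums = 'one1two22three333four'

# Parentheses around the pattern keep the delimiters in the list.
with_delims = re.split(r'(\d+)', s_nums)
print(with_delims)
# ['one', '1', 'two', '22', 'three', '333', 'four']

# A delimiter at the start of the string yields a leading empty string.
leading = re.split(r'\d+', '1one2two')
print(leading)
# ['', 'one', 'two']
```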

Split by multiple different delimiters

These two examples are helpful to remember, even if you are not familiar with regex:

Enclose characters in [] to match any single one of them. This lets you split a string by multiple different characters.

s_marks = 'one-two+three#four'
print(re.split('[-+#]', s_marks))
# ['one', 'two', 'three', 'four']

Patterns separated by | match if any one of them matches. Each pattern can use regex special characters, but plain strings also work as they are, so you can split by multiple different strings.

s_strs = 'oneXXXtwoYYYthreeZZZfour'
print(re.split('XXX|YYY|ZZZ', s_strs))
# ['one', 'two', 'three', 'four']
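When the delimiters contain characters that are special in regex (such as +, |, or .), they need escaping before being joined with |. One possible approach is sketched below using re.escape(); the helper name split_on_any and the sample data are hypothetical.

```python
import re

def split_on_any(text, delimiters):
    """Split text on any of the literal delimiter strings (hypothetical helper)."""
    # Try longer delimiters first so that '||' is matched before '|'.
    ordered = sorted(delimiters, key=len, reverse=True)
    # re.escape() makes each delimiter match literally.
    pattern = '|'.join(re.escape(d) for d in ordered)
    return re.split(pattern, text)

result = split_on_any('a+b||c.d|e', ['+', '|', '||', '.'])
print(result)
# ['a', 'b', 'c', 'd', 'e']
```

Sorting by length is a design choice for overlapping delimiters: without it, '||' would be split as two separate '|' matches and leave an empty string in the result.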

Concatenate a list of strings

The previous examples split a string into a list.

To concatenate a list of strings into a single string, use the string method join().

Call join() on the separator string and pass it the list of strings to be concatenated.

l = ['one', 'two', 'three']
print(','.join(l))
# one,two,three
print('\n'.join(l))
# one
# two
# three
print(''.join(l))
# onetwothree
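Note that join() accepts only strings, so a list of numbers has to be converted first. The sketch below also checks that splitting and joining with the same separator round-trips; the variable names are made up for illustration.

```python
nums = [1, 2, 3]

# Joining non-string items raises TypeError.
try:
    ','.join(nums)
    raised = False
except TypeError:
    raised = True
print(raised)
# True

# Convert each item to str before joining.
joined = ','.join(map(str, nums))
print(joined)
# 1,2,3

# split() followed by join() with the same separator restores the original.
s = 'one,two,three'
round_trip = ','.join(s.split(','))
print(round_trip == s)
# True
```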


Split based on the number of characters: slice

Use slicing to split strings based on the number of characters.

s = 'abcdefghij'
print(s[:5])
# abcde
print(s[5:])
# fghij

The split results can be obtained as a tuple or assigned to individual variables.

s_tuple = s[:5], s[5:]
print(s_tuple)
# ('abcde', 'fghij')
print(type(s_tuple))
# <class 'tuple'>
s_first, s_last = s[:5], s[5:]
print(s_first)
# abcde
print(s_last)
# fghij
s_first, s_second, s_last = s[:3], s[3:6], s[6:]
print(s_first)
# abc
print(s_second)
# def
print(s_last)
# ghij

The number of characters can be obtained with the built-in function len(). You can also use it to split a string into halves.

half = len(s) // 2
print(half)
# 5
s_first, s_last = s[:half], s[half:]
print(s_first)
# abcde
print(s_last)
# fghij

If you want to concatenate strings, use the + operator.

print(s_first + s_last)
# abcdefghij
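Slicing can also be combined with range() to split a string into chunks of a fixed length. This is a minimal sketch; the helper name split_every is hypothetical.

```python
def split_every(s, n):
    """Split s into consecutive chunks of n characters (hypothetical helper)."""
    # The last chunk is shorter when len(s) is not a multiple of n.
    return [s[i:i + n] for i in range(0, len(s), n)]

s = 'abcdefghij'
print(split_every(s, 3))
# ['abc', 'def', 'ghi', 'j']
print(split_every(s, 5))
# ['abcde', 'fghij']
```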


Python Split String – How to Split a String into a List or Array in Python

Shittu Olumide


In this article, we will walk through a comprehensive guide on how to split a string in Python and convert it into a list or array.

We’ll start by introducing the string data type in Python and explaining its properties. Then we’ll discuss the various ways in which you can split a string using built-in Python methods such as split(), splitlines(), and partition().

Overall, this article should be a useful resource for anyone looking to split a string into a list in Python, from beginners to experienced programmers.

What is a String in Python?

A string in Python is a sequence of characters enclosed in single quotes (' ') or double quotes (" "). This built-in data type is frequently used to represent textual data.

Since strings are immutable, they cannot be changed once they have been created. Any action that seems to modify a string actually produces a new string.

Concatenation, slicing, and formatting are just a few of the many operations that you can perform on strings in Python. You can also use strings with a number of built-in modules and functions, including re, str(), and len().

There’s also a wide range of string methods, including split(), replace(), and strip(), that you can use to manipulate strings in different ways.

Let’s now learn how to split a string into a list in Python.

How to Split a String into a List Using the split() Method

The split() method is the most common way to split a string into a list in Python. This method splits a string into substrings based on a delimiter and returns a list of these substrings.

myString = "Hello world"
myList = myString.split()
print(myList)
# ['Hello', 'world']

In this example, we split the string "Hello world" into a list of two elements, "Hello" and "world", using the split() method. Because no delimiter is passed, the string is split on whitespace.
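When a delimiter is passed explicitly, any whitespace around the items stays in the result. A short sketch with made-up sample data shows one way to clean it up with strip():

```python
# Hypothetical comma-separated input with uneven spacing.
raw = "apple, banana ,cherry"

parts = raw.split(",")
print(parts)
# ['apple', ' banana ', 'cherry']

# strip() removes leading and trailing whitespace from each item.
cleaned = [p.strip() for p in parts]
print(cleaned)
# ['apple', 'banana', 'cherry']
```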

How to Split a String into a List Using the splitlines() Method

The splitlines() method splits a string into a list of lines at line boundaries such as the newline character (\n) and the Windows line ending (\r\n).

myString = "hello\nworld"
myList = myString.splitlines()
print(myList)
# ['hello', 'world']

In this example, we split the string "hello\nworld" into a list of two elements, "hello" and "world", using the splitlines() method.

How to Split a String into a List Using Regular Expressions with the re Module

The re module in Python provides a powerful way to split strings based on regular expressions.

import re

myString = "hello world"
# A raw string (r'...') keeps the backslash in \s from being treated as a string escape.
myList = re.split(r'\s', myString)
print(myList)
# ['hello', 'world']

In this example, we split the string "hello world" into a list of two elements, "hello" and "world", using a regular expression that matches any whitespace character (\s).
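Note that \s matches a single whitespace character, so consecutive whitespace produces empty strings in the result. The following sketch, using a made-up input with two spaces and a tab, compares it with \s+, which matches a whole run of whitespace:

```python
import re

text = "hello  \tworld"  # two spaces followed by a tab

# \s splits at every individual whitespace character.
single = re.split(r'\s', text)
print(single)
# ['hello', '', '', 'world']

# \s+ treats the whole run of whitespace as one delimiter.
run = re.split(r'\s+', text)
print(run)
# ['hello', 'world']
```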

How to Split a String into a List Using the partition() Method

The partition() method splits a string into three parts at the first occurrence of a separator and returns a tuple containing these parts. The separator itself is also included in the tuple.

myString = "hello:world"
myList = myString.partition(':')  # note: the result is a tuple, not a list
print(myList)
# ('hello', ':', 'world')

In this example, we split the string "hello:world" into a tuple of three elements, "hello", ":", and "world", using the partition() method.
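It may help to see how partition() behaves when the separator occurs more than once or not at all, and how it compares with rpartition(), which searches from the right. The sample strings below are made up for illustration.

```python
# Only the first occurrence of the separator is used.
head, sep, tail = "key:value:more".partition(':')
print(head, sep, tail)
# key : value:more

# If the separator is missing, the whole string comes first,
# followed by two empty strings.
missing = "no separator".partition(':')
print(missing)
# ('no separator', '', '')

# rpartition() splits at the last occurrence instead.
from_right = "key:value:more".rpartition(':')
print(from_right)
# ('key:value', ':', 'more')
```

Because the result always has exactly three elements, unpacking into three names is safe even when the separator is absent.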

Note: The most common method for splitting a string into a list or array in Python is to use the split() method. This method is available for any string object in Python and splits the string into a list of substrings based on a specified delimiter.

When to Use Each Method

So here’s an overview of these methods and when to use each one for quick reference:

  1. split(): This is the most common method for splitting text into a list. Use it when you want to split the text into words or substrings based on a specific delimiter, such as a space, comma, or tab.
  2. partition(): This method splits text into three parts at the first occurrence of a delimiter: the part before it, the delimiter itself, and the part after it. Use it when you want to split the text in two and keep the delimiter. For example, you might use partition() to split a URL at '://' into its protocol and the rest of the address. The partition() method returns a tuple of three strings.
  3. splitlines(): This method splits text into a list of strings at line boundaries such as \n and \r\n. Use it when you want to split a multiline string into individual lines.
  4. Regular expressions: This is a more powerful way to split text into a list, as it allows you to split the text based on more complex patterns. For example, you might use regular expressions to split text into sentences, based on the presence of punctuation marks. The re module in Python provides a range of functions for working with regular expressions.
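As a rough illustration of the sentence-splitting case mentioned in the list above, the sketch below splits on whitespace that follows a period, exclamation mark, or question mark. The pattern is an assumption chosen for this example, not a complete sentence tokenizer: it will also split after abbreviations such as "Dr.".

```python
import re

text = "Hello there. How are you? Fine!"  # made-up sample text

# (?<=[.!?]) is a lookbehind: the whitespace is a split point only
# when it comes right after sentence-ending punctuation, and the
# punctuation itself stays attached to the sentence.
sentences = re.split(r'(?<=[.!?])\s+', text)
print(sentences)
# ['Hello there.', 'How are you?', 'Fine!']
```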

Conclusion

These are some of the most common methods to split a string into a list or array in Python. Depending on your specific use case, one method may be more appropriate than the others.
