Python splitting strings into lists

Python 3 Examples: How to Split a String into a List

In this post you can find useful information, for beginners and advanced users alike, on how to split strings into lists. It covers using a separator, splitting a dictionary-like string, splitting only on the first few separators, and treating consecutive separators as one. There is also an example of splitting strings with regular expressions.

Simple split of string into list

If you want to split a string into a list of substrings, you can simply use the split() method. It can be used:

  • without a parameter: any run of whitespace is used as the separator
  • with a parameter (comma, dot, etc.): see the next section

print("Python2 Python3 Python Numpy".split())
print("Python2, Python3, Python, Numpy".split())

['Python2', 'Python3', 'Python', 'Numpy']
['Python2,', 'Python3,', 'Python,', 'Numpy']
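The difference between calling split() with no argument and passing a single space explicitly is easy to miss. The following is a minimal sketch; the sample string is made up for illustration:

```python
# Sample text with irregular spacing (illustrative value)
text = "  Python2   Python3 Numpy "

# No argument: runs of whitespace count as one separator,
# and leading/trailing whitespace is ignored
by_default = text.split()

# Explicit single space: every space is a separator,
# so empty strings appear wherever spaces are adjacent
by_space = text.split(" ")

print(by_default)
print(by_space)
```

If the input may contain irregular spacing, the no-argument form is usually the more convenient of the two.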

Python split string by separator

To split a string by a comma or any other character, use the same split() method and pass the separator as a parameter (comma, dot, etc.). In the example below the string is split by comma and by semicolon (both are common in CSV files).

print("Python2, Python3, Python, Numpy".split(','))
print("Python2; Python3; Python; Numpy".split(';'))

['Python2', ' Python3', ' Python', ' Numpy']
['Python2', ' Python3', ' Python', ' Numpy']

Note that the separator is missing from the output list. If you want to keep the separator in the output, you can use re.split() with a capturing group, that is, wrap the separator pattern in parentheses:

import re

# Plain pattern: the separator is dropped
sep = re.split(',', 'Python2, Python3, Python, Numpy')
print(sep)
# Capturing group: the separator is returned as its own list item
sep = re.split('(,)', 'Python2, Python3, Python, Numpy')
print(sep)

['Python2', ' Python3', ' Python', ' Numpy']
['Python2', ',', ' Python3', ',', ' Python', ',', ' Numpy']

But if you want the separator to be part of the separated words, you can use a list comprehension (no regular expressions). Note that this appends the separator to every item, including the last one:

text = 'Python2, Python3, Python, Numpy'
sep = ','
result = [x+sep for x in text.split(sep)]
print(result)

['Python2,', ' Python3,', ' Python,', ' Numpy,']
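If the extra separator on the final item is unwanted, one possible variation is to re-attach the separator to every piece except the last. This is only a sketch of that idea, reusing the same sample text:

```python
text = 'Python2, Python3, Python, Numpy'
sep = ','

parts = text.split(sep)
# Re-attach the separator to every piece except the final one
result = [part + sep for part in parts[:-1]] + parts[-1:]
print(result)
```

Joining the resulting pieces with ''.join(result) gives back the original string unchanged.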

Split multi-line string into a list (per line)

We can use the same split() string method with the newline character '\n'. If the text contains extra spaces, we can remove them with strip() or lstrip():

# Named text/lines to avoid shadowing the built-ins str and list
text = """
Python is cool
Python is easy
Python is mighty
"""
lines = []
for line in text.split("\n"):
    # Skip empty or whitespace-only lines
    if not line.strip():
        continue
    lines.append(line.lstrip())
print(lines)

['Python is cool', 'Python is easy', 'Python is mighty']
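As an alternative to splitting on '\n' by hand, the built-in str.splitlines() method also recognises Windows-style '\r\n' line endings. A minimal sketch, with a made-up sample string that mixes both kinds of line ending:

```python
# Illustrative text with mixed line endings and a blank line
text = "Python is cool\n  Python is easy\r\n\nPython is mighty\n"

# splitlines() breaks on \n and \r\n alike; strip() trims each line,
# and the condition drops lines that are empty after trimming
lines = [line.strip() for line in text.splitlines() if line.strip()]
print(lines)
```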

Split string dictionary into lists (map)

Let's say we have a string formatted like a dictionary, with entries of the form key => value. We want to get these pairs into lists or a map. Here is a simple example:

dictionary = """\
key1 => value1
key2 => value2
key3 => value3
"""
mydict = {}
listKey = []
listValue = []
for line in dictionary.split("\n"):
    # Skip empty lines (such as the one after the final newline)
    if not line.strip():
        continue
    k, v = [word.strip() for word in line.split("=>")]
    mydict[k] = v
    listKey.append(k)
    listValue.append(v)
print(mydict)
print(listKey)
print(listValue)

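The result is one map and two lists:

{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
['key1', 'key2', 'key3']
['value1', 'value2', 'value3']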
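A slightly more defensive variant of the same idea uses str.partition(), which splits only on the first "=>" so that a value containing "=>" does not break the unpacking. This is a sketch under that assumption, and the variable names are illustrative:

```python
raw = """\
key1 => value1
key2 => value2
key3 => value3
"""

pairs = {}
for line in raw.splitlines():
    if not line.strip():
        continue
    # partition() returns (before, separator, after) for the first match only
    key, _, value = line.partition("=>")
    pairs[key.strip()] = value.strip()

keys = list(pairs)             # dict keys, in insertion order
values = list(pairs.values())  # matching values
print(pairs, keys, values)
```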

Python split string by first occurrence

If you need to split off only the first few items rather than all of them, you can use the maxsplit parameter. In this example we split at the first 3 separators and leave the rest of the string intact:

text = "Python2, Python3, Python, Numpy, Python2, Python3, Python, Numpy"
# At most 3 splits, so the result has 4 items
data = text.split(", ", 3)
for temp in data:
    print(temp)

Python2
Python3
Python
Numpy, Python2, Python3, Python, Numpy
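A common use of maxsplit is to split exactly once and unpack the two halves. A small sketch, with a made-up log line as input:

```python
# Illustrative input; the message itself contains the separator
record = "ERROR: disk full: /dev/sda1"

# maxsplit=1 guarantees at most two pieces, so unpacking is safe
# as long as the separator occurs at least once
level, message = record.split(": ", 1)
print(level)
print(message)
```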

Split string by consecutive separators (regex)

If you want to treat several consecutive separators as one (unlike the string split() method with an explicit separator), you need to use the re module:

Default split method vs. the re module:

import re

print('Hello1111World'.split('1'))
print(re.split('1+', 'Hello1111World'))

['Hello', '', '', '', 'World']
['Hello', 'World']

This is very useful when you want to skip several spaces or other characters.
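The same approach extends to several different separators at once. The sketch below, with a made-up input string, splits on commas or semicolons and swallows any whitespace around them:

```python
import re

# Illustrative input with mixed separators and uneven spacing
text = "Python2,  Python3;Python ,Numpy"

# \s* absorbs optional whitespace on either side of a comma or semicolon
items = re.split(r"\s*[,;]\s*", text)
print(items)
```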

Python String split() Method

Split a string into a list where each word is a list item:

txt = "welcome to the jungle"
print(txt.split())

Definition and Usage

The split() method splits a string into a list.

You can specify the separator; the default separator is any whitespace.

Note: When maxsplit is specified, the list will contain at most maxsplit plus one elements.

Syntax

string.split(separator, maxsplit)

Parameter Values

Parameter   Description
separator   Optional. Specifies the separator to use when splitting the string. By default, any whitespace is a separator.
maxsplit    Optional. Specifies how many splits to do. The default value is -1, which means "all occurrences".
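Both parameters can also be passed by keyword. In Python itself the first parameter is named sep (the table above calls it separator), so the keyword forms are sep=... and maxsplit=.... A minimal sketch with made-up strings:

```python
txt = "welcome to the jungle"

# maxsplit by keyword, keeping the default whitespace separator
first, rest = txt.split(maxsplit=1)

# Both arguments by keyword
by_kw = "a,b,c".split(sep=",", maxsplit=1)

print(first)
print(rest)
print(by_kw)
```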

More Examples

Example

Split the string, using comma, followed by a space, as a separator:

txt = "hello, my name is Peter, I am 26 years old"
print(txt.split(", "))

Example

Use a hash character as a separator:
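A minimal sketch of this, using a made-up sample string (the value of txt is an assumption for illustration):

```python
# Sample string, made up for illustration
txt = "apple#banana#cherry#orange"

# Every "#" is treated as a separator
x = txt.split("#")
print(x)
```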

Example

Split the string into a list with max 2 items:

txt = "apple#banana#cherry#orange"  # sample string, assumed here for illustration
# setting the maxsplit parameter to 1 will return a list with 2 elements
x = txt.split("#", 1)
print(x)

Python: Split String into List with split()

Data can take many shapes and forms — and it’s oftentimes represented as strings.

Be it from a CSV file or input text, we split strings oftentimes to obtain lists of features or elements.

In this guide, we’ll take a look at how to split a string into a list in Python, with the split() method.

Split String into List in Python

The split() method of the string class is fairly straightforward. It splits the string, given a delimiter, and returns a list consisting of the elements split out from the string.

By default, the delimiter is whitespace: if you omit the delimiter argument, your string will be split on runs of whitespace, and leading and trailing whitespace is ignored.

Let’s take a look at the behavior of the split() method:

string = "Age,University,Name,Grades"
lst = string.split(',')
print(lst)
print('Element types:', type(lst[0]))
print('Length:', len(lst))

Our string had elements delimited with a comma, as in a CSV (comma-separated values) file, so we’ve set the delimiter appropriately.

This results in a list of elements of type str, no matter what other type they could represent:

['Age', 'University', 'Name', 'Grades']
Element types: <class 'str'>
Length: 4
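One detail that matters for CSV-like data is how empty input and empty fields are handled, since the default and the explicit-delimiter forms differ here. A short sketch with made-up inputs:

```python
# Empty string, no delimiter: nothing to return
empty_default = "".split()

# Empty string, explicit delimiter: one empty field
empty_comma = "".split(",")

# Adjacent delimiters produce an empty field between them
missing_field = "a,,b".split(",")

print(empty_default)
print(empty_comma)
print(missing_field)
```

With an explicit delimiter, empty fields are preserved, which keeps column positions stable when a value is missing.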

Split String into List, Trim Whitespaces and Change Capitalization

Not all input strings are clean, so you won’t always have a perfectly formatted string to split. Sometimes, strings may contain whitespaces that shouldn’t be in the "final product" or have a mismatch of capitalized and non-capitalized letters.

Thankfully, it’s pretty easy to process this list and each element in it, after you’ve split it:

# Contains whitespaces after commas, which will stay after splitting
string = "age, uNiVeRsItY, naMe, gRaDeS"
lst = string.split(',')
print(lst)

['age', ' uNiVeRsItY', ' naMe', ' gRaDeS']

No good! Every element after the first starts with a whitespace and the elements aren’t properly capitalized at all. Applying a function to each element of a list can easily be done with a simple for loop or a list comprehension, so we’ll want to apply strip() (to get rid of the whitespaces) and a capitalization function.

Since we’re not only looking to capitalize the first letter but also keep the rest lowercase (to enforce conformity), let’s define a helper function for that:

def capitalize_word(string):
    return string[:1].capitalize() + string[1:].lower()

The function takes a string, slices off its first letter and capitalizes it. The rest of the string is converted to lowercase and the two changed strings are then concatenated.

We can now apply this function to each element with a list comprehension:

string = "age, uNiVeRsItY, naMe, gRaDeS"
lst = string.split(',')
lst = [s.strip() for s in lst]
lst = [capitalize_word(s) for s in lst]
print(lst)
print('Element types:', type(lst[0]))
print('Length:', len(lst))

['Age', 'University', 'Name', 'Grades']
Element types: <class 'str'>
Length: 4
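For this particular clean-up, the built-in str.capitalize() already uppercases the first character and lowercases the rest, so the split, trim and capitalize steps can be combined into one comprehension. A sketch of that shortcut, reusing the same input:

```python
string = "age, uNiVeRsItY, naMe, gRaDeS"

# Split on commas, trim each piece, then normalise its capitalization
lst = [s.strip().capitalize() for s in string.split(',')]
print(lst)
```

A custom helper remains useful when the capitalization rule is something the built-in methods do not cover.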

Split String into List and Convert to Integer

What happens if you’re working with a string-represented list of integers? After splitting, you won’t be able to perform integer operations on these, since they’re still strings.

Thankfully, we can use the same list comprehension approach as before to convert the elements into integers:

string = "1,2,3,4"
lst = string.split(',')
lst = [int(s) for s in lst]
print(lst)
print('Element types:', type(lst[0]))
print('Length:', len(lst))

[1, 2, 3, 4]
Element types: <class 'int'>
Length: 4
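Real input is often messier than "1,2,3,4": it may carry stray spaces or a trailing comma, and int() raises ValueError on anything non-numeric. The helper below is a hypothetical sketch (parse_ints is not a built-in) of one way to handle that:

```python
def parse_ints(text, sep=","):
    """Hypothetical helper: split text and convert each non-empty field to int."""
    values = []
    for field in text.split(sep):
        field = field.strip()
        if not field:
            continue  # skip empty fields, e.g. the one after a trailing comma
        values.append(int(field))  # raises ValueError for non-numeric text
    return values


numbers = parse_ints("1, 2,3 ,4,")
print(numbers)
```

Letting the ValueError propagate is a deliberate choice in this sketch; whether to skip, log or reject bad fields depends on the data being parsed.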

Split String into List with Limiter

Besides the delimiter, the split() method accepts a limiter, named maxsplit: the maximum number of splits that should occur.

It’s an integer and is passed after the delimiter:

string = "Age, University, Name, Grades"
lst = string.split(',', 2)
print(lst)

Here, two splits occur, on the first and second comma, and no splits happen after that:

['Age', ' University', ' Name, Grades'] 
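The limit counts from the left. When the splits should be taken from the right-hand end instead, the built-in str.rsplit() accepts the same arguments. A brief comparison on the same string:

```python
string = "Age, University, Name, Grades"

# One split, counted from the left
left = string.split(',', 1)

# One split, counted from the right
right = string.rsplit(',', 1)

print(left)
print(right)
```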

Conclusion

In this short guide, you’ve learned how to split a string into a list in Python.

You’ve also learned how to trim the whitespaces and fix capitalization as a simple processing step alongside splitting a string into a list.
