- Pandas Convert Column to String Type?
- 1. Quick Examples of Convert Column To String
- 2. Convert Column to String Type
- 3. Convert Specific Column to String
- 4. Convert Multiple Columns to String
- 5. Using Series.apply(str)
- 6. Using Series.map(str)
- 7. Convert All Columns to Strings
- Conclusion
- pandas.DataFrame.to_string
- How to Convert Pandas DataFrame Columns to Strings
- Example 1: Convert a Single DataFrame Column to a String
- Example 2: Convert Multiple DataFrame Columns to Strings
- Example 3: Convert an Entire DataFrame to Strings
Pandas Convert Column to String Type?
In this article, I will explain how to convert a single column or multiple columns to string type in a pandas DataFrame. I will demonstrate the DataFrame.astype(str), Series.astype(str), Series.apply(str), Series.map(str), and DataFrame.applymap(str) methods, which convert values of any type to strings.
1. Quick Examples of Convert Column To String
If you are in a hurry, below are some quick examples of how to convert a column to string type in a pandas DataFrame. The same astype() pattern also converts between other types.
Note that map(str) and apply(str) can take less time than the other techniques, depending on the pandas version and the data, so measure on your own data if speed matters.
# Below are the quick examples

# Convert "Fee" from int to string
df = df.astype({'Fee': 'string'})

# Using Series.astype() to convert to string
df["Fee"] = df["Fee"].astype('string')

# Multiple columns string conversion using a dict
df = pd.DataFrame(technologies)
df = df.astype({'Fee': str, 'Discount': str})

# Multiple columns string conversion using a column list
df = pd.DataFrame(technologies)
df[['Fee', 'Discount']] = df[['Fee', 'Discount']].astype(str)

# Multiple columns string conversion, one column at a time
df["Fee"] = df["Fee"].astype(str)
df["Discount"] = df["Discount"].astype(str)

# Using apply(str) method
df["Fee"] = df["Fee"].apply(str)

# Using apply() with a lambda function
df["Fee"] = df["Fee"].apply(lambda x: str(x))

# Using map(str) method
df['Fee'] = df["Fee"].map(str)

# Convert entire DataFrame to string
df = df.applymap(str)

# Convert entire DataFrame to string
df = df.astype(str)
Now, let's see a detailed example. First, create a pandas DataFrame with a few rows and columns, then run each conversion and validate the result. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.
import pandas as pd
import numpy as np

# Sample data used throughout the article
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "Hadoop", "Spark"],
    'Fee': [22000, 25000, 23000, 24000, 26000, 25000, 25000],
    'Duration': ['30day', '50days', '55days', '40days', '60days', '35day', '55days'],
    'Discount': [1000, 2300, 1000, 1200, 2500, 1300, 1400]
}
df = pd.DataFrame(technologies)
print(df)
print(df.dtypes)
# Output:
   Courses    Fee  Duration  Discount
0    Spark  22000     30day      1000
1  PySpark  25000    50days      2300
2   Hadoop  23000    55days      1000
3   Python  24000    40days      1200
4   Pandas  26000    60days      2500
5   Hadoop  25000     35day      1300
6    Spark  25000    55days      1400

Courses     object
Fee          int64
Duration    object
Discount     int64
dtype: object
You can identify the data type of each column by using the dtypes attribute.
2. Convert Column to String Type
Use the pandas DataFrame.astype() function to convert a column from int to string. You can apply it to a specific column or to an entire DataFrame.
The example below converts the Fee column from int to the string dtype. You can also pass numpy.str_ or 'str' to specify a string type; with those, the column's dtype is reported as object (as in the outputs of the later sections) rather than string.
# Convert "Fee" from int to string
df = df.astype({'Fee': 'string'})
print(df.dtypes)

# Output:
Courses     object
Fee         string
Duration    object
Discount     int64
dtype: object
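The choice between 'string' and str as the target type can be illustrated with a minimal sketch. The Series name fees and its values are illustrative assumptions rather than the article's DataFrame; the point is only that both calls produce text values while reporting different dtypes.

```python
import pandas as pd

# Hypothetical sample data standing in for the "Fee" column
fees = pd.Series([22000, 25000, 23000], name="Fee")

# Target type str: every element becomes a Python str
as_str = fees.astype(str)

# Target type 'string': pandas' dedicated string dtype (StringDtype)
as_string = fees.astype("string")

print(as_str.dtype)      # the reported name depends on the pandas version
print(as_string.dtype)
print(as_str.tolist())   # ['22000', '25000', '23000']
```

Either result supports the .str accessor, so the choice mostly matters when you later inspect dtypes or handle missing values.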
3. Convert Specific Column to String
You can also use Series.astype() to convert a specific column. Since each column of a DataFrame is a pandas Series, I select the column from the DataFrame as a Series and call astype() on it. In the example below, df.Fee or df['Fee'] returns a Series object.
# Using Series.astype() to convert column to string
df["Fee"] = df["Fee"].astype('string')
print(df.dtypes)

# Output:
Courses     object
Fee         string
Duration    object
Discount     int64
dtype: object
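One common reason to convert a single column is to combine it with text from another column. The following is a small sketch under that assumption; the column name Label and the trimmed three-row DataFrame are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [22000, 25000, 23000],
})

# Concatenating text with an integer column raises an error, so convert first
df["Fee"] = df["Fee"].astype(str)

# "Label" is a hypothetical column name used only for this illustration
df["Label"] = df["Courses"] + "-" + df["Fee"]

print(df["Label"].tolist())  # ['Spark-22000', 'PySpark-25000', 'Hadoop-23000']
```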
4. Convert Multiple Columns to String
You can also convert multiple columns to string by passing a dict of column name -> data type to the astype() method. The example below converts the Fee and Discount columns from int to string.
# Multiple columns string conversion using a dict
df = pd.DataFrame(technologies)
df = df.astype({'Fee': str, 'Discount': str})
print(df.dtypes)

# Multiple columns string conversion using a column list
df = pd.DataFrame(technologies)
df[['Fee', 'Discount']] = df[['Fee', 'Discount']].astype(str)
print(df.dtypes)

# Multiple columns string conversion, one column at a time
df["Fee"] = df["Fee"].astype(str)
df["Discount"] = df["Discount"].astype(str)
print(df.dtypes)

# Output:
Courses     object
Fee         object
Duration    object
Discount    object
dtype: object
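When the list of columns is not known in advance, the dict passed to astype() can be built programmatically. The sketch below assumes the goal is to convert every numeric column; select_dtypes() is used to find them, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [22000, 25000],
    "Duration": ["30day", "50days"],
    "Discount": [1000, 2300],
})

# Collect the names of all numeric columns
numeric_cols = list(df.select_dtypes(include="number").columns)
print(numeric_cols)  # ['Fee', 'Discount']

# Build the column name -> type dict and convert in one call
df = df.astype({col: str for col in numeric_cols})
print(df.dtypes)
```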
5. Using Series.apply(str)
You can convert the column "Fee" to a string by calling apply(str) on it, for example df["Fee"] = df["Fee"].apply(str).
# Using apply(str) method
df["Fee"] = df["Fee"].apply(str)
print(df.dtypes)

# Output:
Courses     object
Fee         object
Duration    object
Discount     int64
dtype: object
Using apply() with a lambda expression also works in this case, for example df["Fee"] = df["Fee"].apply(lambda x: str(x)).

# Using apply() with a lambda function
df["Fee"] = df["Fee"].apply(lambda x: str(x))
print(df.dtypes)

This yields the same output as above.
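A lambda becomes more useful than plain str when the text needs a particular format. As a hedged illustration (the Fee_text column name and the thousands-separator format are assumptions, not part of the article's example):

```python
import pandas as pd

df = pd.DataFrame({"Fee": [22000, 25000, 23000]})

# "Fee_text" is a hypothetical column name for the formatted result
df["Fee_text"] = df["Fee"].apply(lambda x: f"{x:,}")

print(df["Fee_text"].tolist())  # ['22,000', '25,000', '23,000']
```

The original Fee column stays numeric here, which keeps it available for arithmetic while Fee_text is used for display.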
6. Using Series.map(str)
You can also convert the column "Fee" to a string by using Series.map(str), for example df['Fee'] = df["Fee"].map(str).

# Convert columns to string using map(str) method
df['Fee'] = df["Fee"].map(str)
print(df.dtypes)

This yields the same output as above.
Note: map(str) and apply(str) can take less time than the other techniques, depending on the pandas version and the data.
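Relative speed is easiest to check by measuring. The following is a rough sketch using the standard timeit module; the Series size and the number of runs are arbitrary assumptions, and the timings will differ between machines and pandas versions.

```python
import timeit

import pandas as pd

# Hypothetical test column; the size is chosen arbitrarily for this sketch
fees = pd.Series(range(50_000), name="Fee")

candidates = {
    "astype(str)": lambda: fees.astype(str),
    "map(str)": lambda: fees.map(str),
    "apply(str)": lambda: fees.apply(str),
}

# Time each technique over the same data
for label, func in candidates.items():
    seconds = timeit.timeit(func, number=5)
    print(f"{label:12s} {seconds:.4f} s for 5 runs")
```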
7. Convert All Columns to Strings
If you want to change the data type of all columns in the DataFrame to the string type, you can use df.applymap(str) or df.astype(str). Note that DataFrame.applymap() is deprecated as of pandas 2.1 in favor of DataFrame.map().
# Convert entire DataFrame to string
df = df.applymap(str)
print(df.dtypes)

# Convert entire DataFrame to string
df = df.astype(str)
print(df.dtypes)

# Output:
Courses     object
Fee         object
Duration    object
Discount    object
dtype: object
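Because the element-wise method is named applymap() in older pandas releases and map() from pandas 2.1 onward, code that must run on both can pick whichever is available. This is one possible sketch, not an official recommendation; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# Prefer DataFrame.map when the installed pandas provides it,
# otherwise fall back to the older DataFrame.applymap
if hasattr(pd.DataFrame, "map"):
    df_str = df.map(str)
else:
    df_str = df.applymap(str)

print(df_str.values.tolist())  # [['Spark', '22000'], ['PySpark', '25000']]
```

If version compatibility is not a concern, df.astype(str) avoids the question entirely.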
Conclusion
In this article, you have learned how to convert columns to string type in pandas using the DataFrame.astype(str), Series.astype(str), and Series.apply(str) methods. You have also learned how to convert to string using the Series.map(str) and DataFrame.applymap(str) methods.
pandas.DataFrame.to_string
DataFrame.to_string(buf=None, columns=None, col_space=None, header=True, index=True, na_rep='NaN', formatters=None, float_format=None, sparsify=None, index_names=True, justify=None, max_rows=None, max_cols=None, show_dimensions=False, decimal='.', line_width=None, min_rows=None, max_colwidth=None, encoding=None)
Render a DataFrame to a console-friendly tabular output.
Parameters
buf : str, Path or StringIO-like, optional, default None
    Buffer to write to. If None, the output is returned as a string.
columns : sequence, optional, default None
    The subset of columns to write. Writes all columns by default.
col_space : int, list or dict of int, optional
    The minimum width of each column. If a list of ints is given, every integer corresponds to one column. If a dict is given, the key references the column, while the value defines the space to use.
header : bool or sequence of str, optional
    Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.
index : bool, optional, default True
    Whether to print index (row) labels.
na_rep : str, optional, default 'NaN'
    String representation of NaN to use.
formatters : list, tuple or dict of one-param. functions, optional
    Formatter functions to apply to columns' elements by position or name. The result of each function must be a unicode string. List/tuple must be of length equal to the number of columns.
float_format : one-parameter function, optional, default None
    Formatter function to apply to columns' elements if they are floats. This function must return a unicode string and will be applied only to the non-NaN elements, with NaN being handled by na_rep.
sparsify : bool, optional, default True
    Set to False for a DataFrame with a hierarchical index to print every multiindex key at each row.
index_names : bool, optional, default True
    Prints the names of the indexes.
justify : str, default None
    How to justify the column labels. If None, uses the option from the print configuration (controlled by set_option), 'right' out of the box. Valid values are
    - left
    - right
    - center
    - justify
    - justify-all
    - start
    - end
    - inherit
    - match-parent
    - initial
    - unset
max_rows : int, optional
    Maximum number of rows to display in the console.
max_cols : int, optional
    Maximum number of columns to display in the console.
show_dimensions : bool, default False
    Display DataFrame dimensions (number of rows by number of columns).
decimal : str, default '.'
    Character recognized as decimal separator, e.g. ',' in Europe.
line_width : int, optional
    Width to wrap a line in characters.
min_rows : int, optional
    The number of rows to display in the console in a truncated repr (when number of rows is above max_rows).
max_colwidth : int, optional
    Max width to truncate each column in characters. By default, no limit.
encoding : str, default "utf-8"
    Set character encoding.
Returns
str or None
    If buf is None, returns the result as a string. Otherwise returns None.
See also
to_html : Convert DataFrame to HTML.
Examples
>>> d = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
>>> df = pd.DataFrame(d)
>>> print(df.to_string())
   col1  col2
0     1     4
1     2     5
2     3     6
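A few of the parameters described above can be combined as in the following sketch. The data, the "-" placeholder, and the two-decimal format are illustrative assumptions. Note that to_string() renders the whole DataFrame as one block of text; it does not change any column's dtype.

```python
import io

import pandas as pd

# Hypothetical data with a missing value to exercise na_rep
df = pd.DataFrame({"col1": [1.5, None, 2.0], "col2": [4, 5, 6]})

# buf is None here, so the rendered table is returned as a string
text = df.to_string(index=False, na_rep="-", float_format=lambda v: f"{v:.2f}")
print(text)

# With a buffer, the table is written to it and the call returns None
buffer = io.StringIO()
result = df.to_string(buf=buffer, columns=["col2"])
print(result is None)  # True
print(buffer.getvalue())
```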
How to Convert Pandas DataFrame Columns to Strings
Often you may want to convert one or more columns in a pandas DataFrame to strings. Fortunately, this is easy to do with the built-in pandas function astype(str).
This tutorial shows several examples of how to use this function.
Example 1: Convert a Single DataFrame Column to a String
Suppose we have the following pandas DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'player': ['A', 'B', 'C', 'D', 'E'],
                   'points': [25, 20, 14, 16, 27],
                   'assists': [5, 7, 7, 8, 11]})

#view DataFrame
df

  player  points  assists
0      A      25        5
1      B      20        7
2      C      14        7
3      D      16        8
4      E      27       11
We can identify the data type of each column by using dtypes:

df.dtypes

player     object
points      int64
assists     int64
dtype: object

We can see that the "player" column is a string, while the other two columns, "points" and "assists", are integers.
We can convert the "points" column to a string by simply using astype(str) as follows:

df['points'] = df['points'].astype(str)

We can verify that this column is now a string by using dtypes once again:

df.dtypes

player     object
points     object
assists     int64
dtype: object
Example 2: Convert Multiple DataFrame Columns to Strings
We can convert both the "points" and "assists" columns to strings by using the following syntax:

df[['points', 'assists']] = df[['points', 'assists']].astype(str)

Once again, we can verify that they are strings by using dtypes:

df.dtypes

player     object
points     object
assists    object
dtype: object
Example 3: Convert an Entire DataFrame to Strings
Lastly, we can convert every column in a DataFrame to strings by using the following syntax:

#convert every column to strings
df = df.astype(str)

#check data type of each column
df.dtypes

player     object
points     object
assists    object
dtype: object

You can find the complete documentation for the astype() function in the official pandas documentation.