- How to convert Excel to CSV Python Pandas
- Install Pandas in Python
- Reading Excel file using Python Pandas
- Convert Excel to CSV Python
- How to Convert Excel to CSV in Python
- Method 1: Converting Excel to CSV using Pandas Module
- Algorithm (Steps)
- Example
- Output
- Method 2: Converting Excel to CSV using openpyxl and CSV Modules
- Algorithm (Steps)
- Example
- Output
- Conclusion
- Convert Excel to CSV using Python (example included)
- Steps to Convert Excel to CSV using Python
- Step 1: Install the Pandas Package
- Step 2: Capture the Path where the Excel File is Stored
- Step 3: Specify the Path where the New CSV File will be Stored
- Step 4: Convert the Excel to CSV using Python
How to convert Excel to CSV Python Pandas
In this article, you will learn how to convert Excel to CSV using Python Pandas.
Pandas is an open-source, fast, flexible, and easy-to-use library that is widely used for data manipulation and data analysis. It can read data from various file formats, such as CSV and Microsoft Excel. Excel is a widely used spreadsheet application that makes it easy to organize large amounts of data, and it is rich in features such as calculations, graphing tools, pivot tables, and charts. CSV is a widely used plain-text format that stores tabular data, and most popular programming languages have tools or libraries that support it.
CSV files have some advantages over Excel files. Because they are plain text, CSV files are generally quicker to read and take up less space, whereas Excel workbooks tend to use more memory when their data is imported. That is why converting Excel to CSV is often useful.
Install Pandas in Python
Before working with Pandas, we need to install it. Open your command prompt, activate your virtual environment, and run the following command. Note that reading .xlsx files with Pandas also requires an Excel engine such as openpyxl (pip install openpyxl).
pip install pandas
On successful installation, it prints something like this:
Installing collected packages: pandas
Successfully installed pandas-1.0.1
Suppose we have an Excel file, school_event.xlsx, containing school program participant data.
Reading Excel file using Python Pandas
Here, we first import the Pandas module and pass the Excel file to the read_excel() method, which reads the data into a Pandas DataFrame, the key data structure of Pandas. The method accepts the file name as its first parameter and the sheet name as its second (sheet_name); if no sheet is given, the first sheet is read.
import pandas as pd

df = pd.read_excel("school_event.xlsx")
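To make the sheet name parameter concrete, here is a minimal sketch that reads one named sheet. The sample rows, the column names, and the sheet name "participants" are assumptions made up for illustration; the workbook is written to a temporary folder first so the snippet runs on its own (this step needs an Excel engine such as openpyxl installed).

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data standing in for the school event workbook.
sample = pd.DataFrame({"name": ["Asha", "Ben"], "grade": [7, 8]})

with tempfile.TemporaryDirectory() as tmp:
    workbook = Path(tmp) / "school_event.xlsx"
    # Writing .xlsx needs an Excel engine such as openpyxl to be installed.
    sample.to_excel(workbook, sheet_name="participants", index=False)

    # Select the sheet by name; sheet_name=0 (the default) picks the first sheet.
    df = pd.read_excel(workbook, sheet_name="participants")

# Quick checks that the data arrived as expected.
print(df.shape)
print(df.head())
```

Printing the shape and the first few rows is a quick way to confirm the right sheet was loaded before converting it.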
Convert Excel to CSV Python
A Pandas DataFrame provides the to_csv() method to convert its data to CSV. If we provide a file location as an argument, the data is written to that file; otherwise, a CSV string is returned. The method offers many options: we can provide a custom delimiter, specify the columns to write, omit the index column, and much more. Here is the complete code to convert Excel to CSV.
import pandas as pd

df = pd.read_excel("school_event.xlsx")
df.to_csv("school.csv", index=None, header=True)
As you can see, the whole conversion takes just 2-3 lines of code. This is one of the big advantages of Python Pandas.
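The to_csv() options mentioned above can be tried without any files. The following is a small sketch using a made-up DataFrame (the column names and values are assumptions for illustration only). It shows the method returning CSV text when no path is given, a custom delimiter through sep, and a column subset through columns.

```python
import pandas as pd

# Made-up data purely for illustration.
df = pd.DataFrame(
    {"name": ["Asha", "Ben"], "grade": [7, 8], "house": ["Red", "Blue"]}
)

# With no path, to_csv() returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)

# sep sets the delimiter; columns limits and orders the exported columns.
subset_text = df.to_csv(index=False, sep=";", columns=["name", "house"])

print(csv_text)
print(subset_text)
```

Here index=False is used to leave out the index column, which is the more common spelling of that option.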
How to Convert Excel to CSV in Python
In this article, we will show you how to convert an Excel file to a CSV (Comma Separated Values) file using Python.
Assume we have an Excel file named sampleTutorialsPoint.xlsx containing some sample data. We will convert it and produce a CSV file.
sampleTutorialsPoint.xlsx
Player Name | Age | Type | Country | Team | Runs | Wickets |
---|---|---|---|---|---|---|
Virat Kohli | 33 | Batsman | India | Royal Challengers Bangalore | 6300 | 20 |
Bhuvaneshwar Kumar | 34 | Batsman | India | Sun Risers Hyderabad | 333 | 140 |
Mahendra Singh Dhoni | 39 | Batsman | India | Chennai Super Kings | 4500 | 0 |
Rashid Khan | 28 | Bowler | Afghanistan | Gujarat Titans | 500 | 130 |
Hardik Pandya | 29 | All rounder | India | Gujarat Titans | 2400 | 85 |
David Warner | 34 | Batsman | Australia | Delhi Capitals | 5500 | 12 |
Kieron Pollard | 35 | All rounder | West Indies | Mumbai Indians | 3000 | 67 |
Rohit Sharma | 33 | Batsman | India | Mumbai Indians | 5456 | 20 |
Kane Williamson | 33 | Batsman | New Zealand | Sun Risers Hyderabad | 3222 | 5 |
Kagiso Rabada | 29 | Bowler | South Africa | Lucknow Capitals | 335 | 111 |
Method 1: Converting Excel to CSV using Pandas Module
Algorithm (Steps)
The following steps perform the conversion:
- Import the pandas module (pandas is an open-source Python package for data manipulation and analysis).
- Create a variable to store the path of the input Excel file.
- Read the content of the Excel file using the pandas read_excel() function (reads an Excel file into a data frame object).
- Convert the data frame into a CSV file using the to_csv() function (writes the object to a CSV file), passing the output CSV file name, index as None, and header as True as arguments.
- Read the output CSV file with the read_csv() function (loads a CSV file as a pandas data frame). The example also passes the result to the pandas DataFrame() function, although read_csv() already returns a data frame.
- Display the data frame object.
Example
The following program converts an Excel file into a CSV file and creates a new CSV file:
# Import the pandas module
import pandas as pd

# Input Excel file path
inputExcelFile = "sampleTutorialsPoint.xlsx"

# Read the Excel file
excelFile = pd.read_excel(inputExcelFile)

# Convert the Excel data into a CSV file
excelFile.to_csv("ResultCsvFile.csv", index=None, header=True)

# Read the output CSV file back into a dataframe object
dataframeObject = pd.DataFrame(pd.read_csv("ResultCsvFile.csv"))

# Display the dataframe object (print() is needed outside a notebook)
print(dataframeObject)
Output
On executing, the above program will generate the following output:
index | Player Name | Age | Type | Country | Team | Runs | Wickets |
---|---|---|---|---|---|---|---|
0 | Virat Kohli | 33 | Batsman | India | Royal Challengers Bangalore | 6300 | 20 |
1 | Bhuvaneshwar Kumar | 34 | Batsman | India | Sun Risers Hyderabad | 333 | 140 |
2 | Mahendra Singh Dhoni | 39 | Batsman | India | Chennai Super Kings | 4500 | 0 |
3 | Rashid Khan | 28 | Bowler | Afghanistan | Gujarat Titans | 500 | 130 |
4 | Hardik Pandya | 29 | All rounder | India | Gujarat Titans | 2400 | 85 |
5 | David Warner | 34 | Batsman | Australia | Delhi Capitals | 5500 | 12 |
6 | Kieron Pollard | 35 | All rounder | West Indies | Mumbai Indians | 3000 | 67 |
7 | Rohit Sharma | 33 | Batsman | India | Mumbai Indians | 5456 | 20 |
8 | Kane Williamson | 33 | Batsman | New Zealand | Sun Risers Hyderabad | 3222 | 5 |
9 | Kagiso Rabada | 29 | Bowler | South Africa | Lucknow Capitals | 335 | 111 |
In this program, we use the pandas read_excel() function to read an Excel file containing some random dummy data, and then we use the to_csv() function to convert it to CSV. Because we pass index=None (index=False has the same effect and is the more common spelling), the final CSV file does not contain the index column at the beginning. We then read the CSV back into a data frame to check that the values from the Excel file were copied into the CSV file.
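The check described above can also be done programmatically instead of by eye. Below is a minimal sketch, assuming a tiny workbook built from a few rows of the sample table and written to a temporary folder (the file names match the example, but the temporary location is an assumption). It compares the data frame read from Excel with the one read back from the CSV.

```python
import tempfile
from pathlib import Path

import pandas as pd

# A few rows borrowed from the sample table above.
players = pd.DataFrame(
    {
        "Player Name": ["Virat Kohli", "Rashid Khan"],
        "Runs": [6300, 500],
        "Wickets": [20, 130],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    excel_path = Path(tmp) / "sampleTutorialsPoint.xlsx"
    csv_path = Path(tmp) / "ResultCsvFile.csv"
    players.to_excel(excel_path, index=False)  # needs openpyxl installed

    from_excel = pd.read_excel(excel_path)
    from_excel.to_csv(csv_path, index=False)
    from_csv = pd.read_csv(csv_path)

# DataFrame.equals compares shape and elements and is strict about dtypes.
round_trip_ok = from_excel.equals(from_csv)
print(round_trip_ok)
```

A mismatch here usually points to a type change during the round trip, such as dates or leading zeros being read back differently from the CSV.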
Method 2: Converting Excel to CSV using openpyxl and CSV Modules
Algorithm (Steps)
The following steps perform the conversion:
- Use the import keyword to import the openpyxl and csv modules. (openpyxl is a Python package for reading and writing Excel files; it supports Excel 2010 and later files with the xlsx/xlsm/xltx/xltm extensions and is used for tasks such as copying data, drawing charts, styling sheets, and adding formulas.)
- Create a variable to store the path of the input Excel file.
- To load a workbook object, pass the input Excel file to the openpyxl module’s load_workbook() function (loads a workbook).
- Open an output CSV file in write mode with open() and wrap it with the csv.writer() function, which will write the rows of the input Excel file to the CSV file.
- Using a for loop, traverse each row of the worksheet.
- Use the writerow() function to write the cell data of the Excel file into the result CSV file row by row.
Example
The following program converts an Excel file into a CSV file and creates a new CSV file:
# Import the csv and openpyxl modules
import csv

import openpyxl

# Input Excel file path
inputExcelFile = 'sampleTutorialsPoint.xlsx'

# Load the Excel workbook
newWorkbook = openpyxl.load_workbook(inputExcelFile)

# Get the active worksheet (Sheet1 by default)
firstWorksheet = newWorkbook.active

# Open the output CSV file in write mode; newline='' avoids blank lines
# between rows on Windows, and the with block closes the file afterwards
with open("ResultCsvFile.csv", 'w', newline='') as csvFile:
    OutputCsvFile = csv.writer(csvFile, delimiter=",")
    # Traverse each row of the worksheet
    for eachrow in firstWorksheet.rows:
        # Write the row's cell values to the result CSV file
        OutputCsvFile.writerow([cell.value for cell in eachrow])
Output
On executing the above program, a new CSV file (ResultCsvFile.csv) will be created with the data from the Excel file.
In this program, we have an Excel file with some random dummy data, which we load as an openpyxl workbook and whose active worksheet we select using the active attribute. We then create a new CSV file, open it in write mode, and go through the Excel worksheet row by row, copying the data into the newly created CSV file.
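For larger workbooks, a variation on the same row-by-row idea is to open the workbook in openpyxl's read-only mode and ask for plain cell values. The sketch below is one possible way to do that, not the article's method: the helper name worksheet_to_csv and the demo file names are assumptions, and a tiny workbook is created in a temporary folder so the snippet is self-contained.

```python
import csv
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook


def worksheet_to_csv(excel_path, csv_path):
    """Copy the active worksheet of excel_path into csv_path, row by row."""
    # read_only=True streams rows rather than loading the whole sheet.
    workbook = load_workbook(excel_path, read_only=True)
    try:
        sheet = workbook.active
        # newline="" lets the csv module control line endings itself.
        with open(csv_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            # values_only=True yields plain tuples of cell values.
            for row in sheet.iter_rows(values_only=True):
                writer.writerow(row)
    finally:
        # Read-only workbooks keep the file open until closed explicitly.
        workbook.close()


# Build a tiny workbook so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "sample.xlsx"
    target = Path(tmp) / "sample.csv"
    demo = Workbook()
    demo.active.append(["Player Name", "Runs"])
    demo.active.append(["Virat Kohli", 6300])
    demo.save(source)

    worksheet_to_csv(source, target)
    with open(target, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))

print(rows)
```

Note that the csv module reads every value back as text, so numbers such as 6300 come back as strings.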
Conclusion
In this tutorial, we learned how to read an Excel file with pandas and convert it to a CSV file without the index column, how to read the resulting CSV file back into a pandas data frame, and how to load an Excel file as an openpyxl workbook and write its rows to a CSV file.
Convert Excel to CSV using Python (example included)
Need to convert an Excel file to a CSV file using Python?
If so, you may use the following template to convert your file:
import pandas as pd

read_file = pd.read_excel(r'Path where the Excel file is stored\File name.xlsx')
read_file.to_csv(r'Path to store the CSV file\File name.csv', index=None, header=True)
And if you have a specific Excel sheet that you’d like to convert, you may then use this template:
import pandas as pd

read_file = pd.read_excel(r'Path where the Excel file is stored\File name.xlsx', sheet_name='Your Excel sheet name')
read_file.to_csv(r'Path to store the CSV file\File name.csv', index=None, header=True)
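If a workbook has several sheets and you want all of them, one possible extension of the template is to pass sheet_name=None, which makes read_excel() return a dictionary of data frames keyed by sheet name. The sketch below is an illustration under assumptions: the helper name export_all_sheets, the sheet names, and the demo workbook are all made up, and the files live in a temporary folder.

```python
import tempfile
from pathlib import Path

import pandas as pd


def export_all_sheets(excel_path, output_dir):
    """Write each worksheet in excel_path to output_dir/<sheet name>.csv."""
    output_dir = Path(output_dir)
    # sheet_name=None asks read_excel for a dict of {sheet name: DataFrame}.
    sheets = pd.read_excel(excel_path, sheet_name=None)
    written = []
    for name, frame in sheets.items():
        csv_path = output_dir / f"{name}.csv"
        frame.to_csv(csv_path, index=False)
        written.append(csv_path.name)
    return written


# Build a two-sheet demo workbook (needs openpyxl installed).
with tempfile.TemporaryDirectory() as tmp:
    workbook = Path(tmp) / "demo.xlsx"
    with pd.ExcelWriter(workbook) as writer:
        pd.DataFrame({"a": [1]}).to_excel(writer, sheet_name="first", index=False)
        pd.DataFrame({"b": [2]}).to_excel(writer, sheet_name="second", index=False)
    created = export_all_sheets(workbook, tmp)

print(created)
```

Sheet names can contain characters that are not valid in file names on some systems, so a real script may need to clean the names before using them as file names.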
In the next section, you’ll see the complete steps to convert your Excel file to a CSV file using Python.
Steps to Convert Excel to CSV using Python
Step 1: Install the Pandas Package
If you haven’t already done so, install the Pandas package. You may use the following command to install Pandas (under Windows):
pip install pandas
Step 2: Capture the Path where the Excel File is Stored
Next, capture the path where the Excel file is stored on your computer.
Here is an example of a path where an Excel file is stored:
C:\Users\Ron\Desktop\Test\Product_List.xlsx
Where 'Product_List' is the Excel file name, and 'xlsx' is the file extension.
Step 3: Specify the Path where the New CSV File will be Stored
Now you’ll need to specify the path where the new CSV file will be stored. For example:
C:\Users\Ron\Desktop\Test\New_Products.csv
Where 'New_Products' is the new file name, and 'csv' is the file extension.
Step 4: Convert the Excel to CSV using Python
For the final part, use the following template to assist you in the conversion of Excel to CSV:
import pandas as pd

read_file = pd.read_excel(r'Path where the Excel file is stored\File name.xlsx')
read_file.to_csv(r'Path to store the CSV file\File name.csv', index=None, header=True)
This is how the code would look in the context of our example (you’ll need to modify the paths to reflect the location where the files will be stored on your computer):
import pandas as pd

read_file = pd.read_excel(r'C:\Users\Ron\Desktop\Test\Product_List.xlsx')
read_file.to_csv(r'C:\Users\Ron\Desktop\Test\New_Products.csv', index=None, header=True)
Once you run the code (adjusted to your paths), you’ll get the new CSV file at your specified location.
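If the CSV file should simply sit next to the Excel file, the second path does not have to be typed by hand. The following is a small sketch of that idea using pathlib; the helper names csv_path_for and convert are assumptions introduced for illustration, and the CSV keeps the Excel file's name rather than a new one such as New_Products.

```python
from pathlib import Path

import pandas as pd


def csv_path_for(excel_path):
    """Return a path next to excel_path with the extension changed to .csv."""
    return Path(excel_path).with_suffix(".csv")


def convert(excel_path):
    """Convert one Excel file to a CSV file stored alongside it."""
    target = csv_path_for(excel_path)
    # index=False leaves the row index out of the CSV file.
    pd.read_excel(excel_path).to_csv(target, index=False)
    return target


# Only the path handling is exercised here; call convert() on a real file.
print(csv_path_for("Product_List.xlsx"))
```

Calling convert() with the path from Step 2 would then write Product_List.csv into the same folder.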