In this tutorial, we will learn about the process of reading an Excel file using Python by using various modules and libraries. Excel is a spreadsheet application developed by Microsoft. It is an easily accessible tool that is used to organize, analyze, and store data in tables. It is widely used in various places such as Offices, schools, and colleges to manage the data.
Reading from an Excel file
First, we need to install the xlrd module using the following command:
pip install xlrd
Creating a Workbook
A workbook contains all the data in the Excel file. You can create a new workbook from scratch, or you can easily create a workbook from the Excel file that already exists.
Reading an Excel file using the Xlrd module
Note: The Xlrd module only supports .xls files. Earlier, it used to support .xlsx files too, but the support was terminated from version 2.0.0, although we have shown an example for a better understanding.
Similarly, we have libraries like xlwt and xlutils that only support .xls files, which aren’t relevant currently.
#importing the xlrd module
import xlrd
# writing the path of the file
file = r"C:\Users\mk\Desktop\hello_Python_tech.xls"
# Open workbook
wb = xlrd.open_workbook(file)
# Select first sheet
sheet = wb.sheet_by_index(0)
# Get value at row 0, column 1
cell_value = sheet.cell_value(0, 1)
#printing the value from the Excel sheet
print(cell_value)
Output:
Python
Explanation
In the above example, we have read the .xls file using the xlrd module. We printed the value located at the 0th row and 1st column in the Excel file.
What if the XLRD module is used with the .xlsx extension
When you try to run the XLRD module with the extension .xlsx, you will get an error as shown in the section below:
Output:
XLRDError Traceback (most recent call last)
Cell In[2], line 6
4 file = r"C:\Users\mk\Desktop\hello_tpoint_tech.xlsx"
5 # Open workbook
----> 6 wb = xlrd.open_workbook(file)
8 # Select first sheet
9 sheet = wb.sheet_by_index(0)
File c:\Users\mk\AppData\Local\Programs\Python\Python313\Lib\site-packages\xlrd\_init_.py:170, in open_workbook(filename, logfile, verbosity, use_mmap, file_contents, encoding_override, formatting_info, on_demand, ragged_rows, ignore_workbook_corruption)
167 # We have to let unknown file formats pass through here, as some ancient
168 # files that xlrd can parse don't start with the expected signature.
169 if file_format and file_format != 'xls':
--> 170 raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported')
172 bk = open_workbook_xls(
173 filename=filename,
174 logfile=logfile,
(...) 182 ignore_workbook_corruption=ignore_workbook_corruption,
183 )
185 return bk
XLRDError: Excel xlsx file; not supported
1. Reading from the Pandas
Pandas is defined as an open-source library that is built on top of the NumPy library. It provides fast analysis, data cleaning, and preparation of the data for the user, and supports both xls and xlsx extensions from the URL.
We can also read the CSV file using Pandas.
It is a Python package that provides a beneficial data structure called a data frame.
Example: Reading the .xlsx file using Pandas
Let us take an example to demonstrate how to read the .xlsx file using Pandas.
import pandas as pd
# Specify the correct file path
file_path = r"C:\Users\mk\Desktop\hello_Learn_Python.xlsx"
tpointtech = pd.read_excel(file_path, engine='openpyxl')
print(Learn Python)
Explanation:
In the above example, we have read the contents of the Excel file using the Pandas library. We imported the pandas library as pd.
Now, we will read the CSV file using the Pandas library.
Example: Reading the CSV file using Pandas
Let’s take an example to demonstrate how to read the CSV file using Pandas.
import pandas as pd
# specify the path of the CSV file
file_path = r"C:\Users\mk\Desktop\hello_Learn_Python.csv"
# Read the file
data = pd.read_csv(file_path, low_memory=False)
# Output the number of rows
print("Total rows: {0}".format(len(data)))
# Print the list of column headers
print(list(data.columns))
Output:
Total rows: 0
['Hello..', 'Learn', 'Python', 'Pvt Ltd']
Explanation:
In the above example, we have read the contents of the CSV file using the Pandas library. We imported the pandas library as pd. It concludes that using Pandas, we can read both Excel files as well as CSV files.
2. Reading from the openpyxl
We can also read data from the existing spreadsheet using openpyxl. It also allows the user to perform calculations and add content that was not part of the original dataset.
Installing the openpyxl library
First, we need to install the openpyxl module using pip from the command line.
pip install openpyxl
Python Example to Read Data from the openpyxl
Let us take an example to demonstrate how to read data from the openpyxl in Python.
#importing the openpyxl library
import openpyxl
# writing the full raw path of our file
file_path = r"C:\Users\mk\Desktop\hello_tpoint_tech.xlsx"
# Here we are Loading workbook
workbook = openpyxl.load_workbook(file_path)
# Select active sheet
sheet = workbook.active
# Printing all rows of the Excel file
for row in sheet.iter_rows(values_only=True):
print(row)
Explanation
In the above example, we have used the openpyxl library to read and print the content of the Excel file named “hello_Learn_Python.xlsx”.
3. Reading from the Xlwings
The Xlwings library is another library that supports the .xlsx files. It allows us to enter the data as well as read the data. Let’s see an example of using Xlwings to enter the data.
#importing the xlwings
import xlwings as xw
# Open the Excel file
wb = xw.Book(r"C:\Users\mk\Desktop\hello_tpoint_tech.xlsx")
# Select a sheet
sheet = wb.sheets['Sheet1']
# Insert a list of lists into multiple rows into Name, role, and attendance format
sheet.range("A3:D5").value = [
['Prashant Y.','SEO', 85],
['Sam A.','Graphic Designer',92],
['David J','Java Developer', 83]
]
# Save and close
wb.save()
wb.close()
Explanation:
In the above example, we used the xlwings library to change or modify the data inside the Excel file. We have changed the data from A3 to D5. We have shown the excel file before the execution of the code and after the execution of the code.
Leave a Reply