Most frequent words in a text file in Python
Hello Python learners! In this session, we will learn how to find the most frequent words in a text. Instead of working on a string typed into the program, we will work on text read from a file. To follow along, we need to be familiar with files and the operations we can perform on them. So, let's learn about files first.
Handling files in python
Data is often stored in files, which keep it organized. There are many kinds of files. Text files, music files, videos, and various word processor and presentation documents are the ones we are most familiar with.
Text files contain only characters, whereas all the other file formats include formatting information that is specific to that format. The operations performed on the data in files are the read and write operations. To perform any operation, the program must first open the file. The syntax to open a file is given below:
with open(«filename», «mode») as «variable»:
    «block»
Though there are several ways of opening a file, I prefer this one because we do not need to write a close statement at the end: the file is closed automatically when the block finishes.
For more on files, go through this link: handling files
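To see the automatic closing in action, here is a minimal sketch. The file name demo.txt and the temporary directory are assumptions made only so the example can run on its own.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
# The name "demo.txt" is only an illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("hello files\n")

with open(path, "r") as f:
    contents = f.read()
    inside = f.closed   # False: the file is still open inside the block

outside = f.closed      # True: the with statement closed the file for us
print(repr(contents), inside, outside)
```

The variable f still exists after the block, but the file behind it is closed, so no separate close call is needed.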
Reading a file:
There are several techniques for reading files. One way is to read the entire contents of the file into a single string. There are also iterative techniques, in which one line of text is read in each iteration. We can also read every line of text and store them all in a list. The syntax for each technique is given below:
# to read the entire contents of text into a single string
with open('file1.txt', 'r') as f:
    contents = f.read()

# to read each line and store them as a list
with open('file1.txt', 'r') as f:
    lines = f.readlines()

# for the iterative method of reading text in files
with open('planets.txt', 'r') as f:
    for line in f:
        print(len(line))
As our job is just to read the contents of the file and then find the most frequent word in it, we have no need for the write operation here. In case you want to learn about it, go through this link: text file in Python
Now let's get to our job of finding the most frequent words in text read from a file.
Most frequent words in a text file with Python
First, create a text file and save it in the same directory as your Python program. When you give only a file name to open, Python looks for the file relative to the current working directory, which is the program's directory when you run the program from there. Make sure you have created and saved the file in the proper directory.
The algorithm we are going to follow is quite simple. First we open the file, then we read its contents. For each word we count how many times it is repeated and store the result in a variable called count. Then we compare count with the maximum count, which is initialized to zero at the beginning. If count is less than the maximum count, we ignore the word. If it is equal, we place the word in a list. Otherwise, if it is greater, we clear the list, place this word in the list, and update the maximum count.
Let us start by initializing the variables and opening the file:
fname = input("enter file name")
count = 0       # count of a specific word
maxcount = 0    # maximum among the counts of all words
l = []          # list to store the words with the maximum count
with open(fname, 'r') as f:
    pass        # the code that reads the file is added in the next step
We have opened the file as f, and we will use f whenever we have to refer to the file.
Now we have to read the contents. We have several techniques for that, as discussed previously, and we should pick the one best suited to our task. As we are concerned with the words of the file, it is simplest to read the entire contents into one string and then split that string into a list of words using the split method.
Reading contents:
with open(fname, 'r') as f:
    contents = f.read()
    words = contents.split()
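To make the behaviour of split concrete, here is a small sketch on a made-up string (the text is an assumption, not the contents of any real file). Called without arguments, split breaks the string on any run of whitespace, including newlines, but it leaves punctuation attached to the words.

```python
# A made-up string with a newline and repeated spaces in it.
text = "Hi, friends this program\nis found   in codespeedy."

words = text.split()            # split on any run of whitespace
space_words = text.split(" ")   # split on single spaces only

print(words)
print(space_words)
```

With split(" ") the newline stays inside a word and the repeated spaces produce empty strings, which is why the plain split() call is the safer choice for text read from a file.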
Finding the most frequent word:
Now that we have all the words in a list, we will implement the algorithm discussed earlier:
for i in range(len(words)):
    for j in range(len(words)):
        if words[i] == words[j]:    # finding count of each word
            count += 1
    if count == maxcount:           # comparing with maximum count
        if words[i] not in l:       # do not store the same word twice
            l.append(words[i])
    elif count > maxcount:          # if count greater than maxcount
        l.clear()
        l.append(words[i])
        maxcount = count
    count = 0                       # reset the count for the next word
print(l)    # printing contents of l
The list l now holds the most frequent words, and it is printed at the end.
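The nested loops compare every word with every other word, which gets slow on large files. As an alternative (a swapped-in technique, not the approach used above), the counts can be collected in a dictionary in a single pass. The function name most_frequent is hypothetical.

```python
def most_frequent(words):
    """Return (list of the most frequent words, their count)."""
    counts = {}
    for word in words:
        # Add 1 to this word's count, starting from 0 if it is new.
        counts[word] = counts.get(word, 0) + 1
    if not counts:
        return [], 0                 # empty input: nothing to report
    maxcount = max(counts.values())
    # Keep every word that reaches the maximum, so ties are preserved.
    return [w for w, c in counts.items() if c == maxcount], maxcount


sample = "a b a c b a".split()
print(most_frequent(sample))   # (['a'], 3)
```

Each word is visited once, and each word appears in the result at most once because dictionary keys are unique.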
Output:
Suppose you have a text file with contents like this:
Hi, friends this program is found in codespeedy. This program works perfectly
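With a plain split(), "this" and "This" count as different words and "codespeedy." keeps its full stop, so only "program" reaches a count of 2 in this text. The sketch below compares that with an optional clean-up step (lowercasing and stripping punctuation), which is an assumption of this example and not part of the program above. The helper name top_words is hypothetical.

```python
import string

text = ("Hi, friends this program is found in codespeedy. "
        "This program works perfectly")


def top_words(words):
    """Return (list of the most frequent words, their count)."""
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    best = max(counts.values())
    return [w for w, c in counts.items() if c == best], best


raw = text.split()
# Hypothetical clean-up: drop surrounding punctuation and ignore case.
clean = [w.strip(string.punctuation).lower() for w in raw]

print(top_words(raw))     # (['program'], 2)
print(top_words(clean))   # (['this', 'program'], 2)
```

Whether to normalise the words this way depends on whether "This" and "this" should be treated as the same word for your purpose.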
Hope you like this session guys.
3 responses to “Most frequent words in a text file in Python”
Post is quite good for the pure fundamental concept of counting. An alternative way to write this program is to use the Counter class from Python's built-in collections module. The whole program then takes just 3 to 4 lines to find the most frequent word. Program:
from collections import Counter
given_string = "Hi, friends this program is found in codespeedy. This program works perfectly"
words = given_string.split(" ")
words_count = Counter(words).most_common()
print("Most frequent word in the given sentence is : " + words_count[0][0] + "\nNumber of occurrence is:", words_count[0][1])
Output:
Most frequent word in the given sentence is : program
Number of occurrence is: 2
Actually, the comment lines in the Python code were written with a double slash, like: //this is a comment. Python comments start with #.
I have made the necessary changes. The code should work properly now. Thanks for your comment buddy.
Python Program to Count Most Frequent Words in a File
Counting how often words occur in a file is something you need to know as a coder, and finding the most frequent words in a file is a question you can get in a coding interview. So, if you want to learn how to find the most common words in a file, this article is for you. In this article, I'll walk you through how to write a Python program to count the most frequent words in a file.
Python Program to Count Most Frequent Words in a File
Interview questions based on this logic can be asked in several ways. Typically you will be given a file and asked to find the most frequent words in it, along with the number of times each one occurs. So here's how you can write a Python program to count the most frequent words in a file:
[('the', 5), ('you', 5), ('Python', 4), ('is', 4), ('of', 3)]
In this program, I first read a text file from my computer, then split the text into words and store them in a Python list. Then I count the frequency of all the words in the list by using the Counter class of the collections module in Python. In the end, I print the top 5 most frequent words in the file.
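A minimal sketch of the steps just described, not necessarily the exact program that produced the output above. The file name sample.txt and its contents are assumptions; the file is created in a temporary directory so the example runs on its own.

```python
import os
import tempfile
from collections import Counter

# Create a small sample file; the name and the text are made up.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("the cat and the dog and the bird\n")

# Read the file and split its contents into a list of words.
with open(path, "r") as f:
    words = f.read().split()

# Count every word and keep the 5 most common (word, count) pairs.
top = Counter(words).most_common(5)
print(top)
```

The result is a list of (word, count) tuples sorted from the most to the least frequent, in the same shape as the output shown above.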
Summary
So this is how you can write a program to count the most frequent words in any file. It is a common coding interview question, and it can be asked in several ways. I hope you liked this article on how to write a Python program to count the most frequent words in a file. Feel free to ask your valuable questions in the comments section below.