Python read one word

Python: read one word per line of a text file

It's not proper code, but I want to know if there is a way to search for just one word without using .split(), as it builds a list and I don't want that. This is the snippet:

f = (i for i in fin.xreadlines())
for i in f:
    try:
        match = re.search(r"([A-Z]+\b) | ([A-Z\'w]+\b) | (\b[A-Z]+\b) | (\b[A-Z\'w]+\b) | (.\w+\b)", i)
        # | r"[A-Z\'w]+\b" | r"\b[A-Z]+\b" | r"\b[A-Z\'w]+\b" | r".\w+\b"

class LineReader:
    # Intended only to be used with a for loop
    def __init__(self, filename):
        self.fin = open(filename, 'r')

    def __getitem__(self, index):
        line = self.fin.xreadline()
        return line.split()

where, say, f = LineReader(filepath), and with for i in f.getitem(index=line number 25) the loop starts from there? I don't know how to do that. Any tips?

What do you want instead of a list? A generator? A function returning a new word every time it gets called?

@Lennart Let's say a simple text file is searched for a particular pattern, printing only one output per line. Can it be done through a class like the one given above?

What do you mean by "word"? In the first sample you search for blocks of uppercase letters, and in the second sample for all text that is not whitespace. And what do you mean by "search one word"? Do you want the first word from the line?


1 Answer

To get the first word of a line:

line[:max(line.find(' '), 0) or None] 

line.find(' ') searches for the first space and returns its index. If no space is found, it returns -1.

max(..., 0) makes sure the result is never negative, turning -1 into 0. This is useful because bool(-1) is True and bool(0) is False.

x or None evaluates to x if x != 0 else None

and finally line[:None] is equal to line[:], which returns a string identical to line.
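To see how the pieces of that expression fit together, here is a small sketch that evaluates it step by step on three sample strings. The helper name `first_word` and the sample strings are made up for illustration.

```python
def first_word(line):
    """Return the text before the first space, or the whole line if there is none."""
    index = line.find(' ')        # index of the first space, or -1 if absent
    end = max(index, 0) or None   # -1 and 0 both end up as None
    return line[:end]             # line[:None] is the whole line


for sample in ("hello world", "single", " leading space"):
    print(repr(sample), "->", repr(first_word(sample)))
```

Note the third sample: a space at index 0 is treated the same as "no space found", so the whole line comes back unchanged.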

First sample:

with open('file') as f:
    for line in f:
        word = line[:max(line.find(' '), 0) or None]
        if condition(word):
            do_something(word)

And the class (implemented as a generator here):

def words(stream):
    for line in stream:
        yield line[:max(line.find(' '), 0) or None]

Which you could use like:

gen = words(f)
for word in gen:
    if condition(word):
        print(word)

Or:

gen = words(f)
while True:
    try:
        word = next(gen)
        if condition(word):
            print(word)
    except StopIteration:
        break  # we reached the end

But you also wanted to start reading from a certain line number. This can't be done very efficiently if you don't know the lengths of the lines. The only way is to read lines and discard them until you reach the right line number.

def words(stream, start=-1):    # you could replace the -1 with 0 and remove the +1
    for i in range(start + 1):  # it depends on whether you start counting with 0 or 1
        try:
            next(stream)
        except StopIteration:
            break
    for line in stream:
        yield line[:max(line.find(' '), 0) or None]

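Be aware that you could get strange results if a line starts with a space: find() then returns 0, which is handled like "no space found", so the whole line is returned. To prevent that, you could insert line = line.strip() at the beginning of the loop, which also removes the trailing newline.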
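The skip-then-yield idea can also be sketched with itertools.islice, which discards the leading lines for you. This is only an illustration under a few assumptions: the function name `first_words` is invented, `start` counts from 0, blank lines are skipped, and only a plain space (not a tab) separates words, matching the find(' ') approach above.

```python
import io
from itertools import islice


def first_words(stream, start=0):
    """Yield the first word of each line, skipping the first `start` lines."""
    for line in islice(stream, start, None):
        stripped = line.strip()      # drop leading spaces and the newline
        if stripped:                 # ignore blank lines
            # partition() returns a 3-tuple, not a list of every word
            yield stripped.partition(' ')[0]


sample = io.StringIO("alpha beta\n  gamma delta\n\nepsilon\n")
print(list(first_words(sample, start=1)))
```

The lines are still read and thrown away one by one; islice only hides the loop, it does not make the skipping any cheaper.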

Disclaimer: None of this code is tested



How to read ONLY 1 word in python?

When you give a limit to split() , all the items from that limit to the end are combined. So if you do

lines = 'Saish ddd TestUser ForTestUse'
split = lines.split(None, 2)

the result is

['Saish', 'ddd', 'TestUser ForTestUse']

If you just want the third word, don't give a limit to split():

third = lines.split()[2]

You can call split() directly, without passing None.

I understand you're passing (None, 2) because you want to get None if there is no value at index 2. A simple way to check whether the index is available in the list (the first form works in Python 2, the second in Python 3):

2 in zip(*enumerate(lines.split()))[0] 
2 in list(zip(*enumerate(lines.split())))[0] 
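A more direct way to express the same check is to compare the index against the length of the split result. The sketch below wraps that in a helper; the name `nth_word` and its `default` parameter are hypothetical, not part of any library.

```python
def nth_word(line, n, default=None):
    """Return the word at zero-based position n, or default if the line is too short."""
    words = line.split()
    return words[n] if 0 <= n < len(words) else default


lines = 'Saish ddd TestUser ForTestUse'
print(nth_word(lines, 2))
print(nth_word(lines, 10))
```

Unlike the zip/enumerate check, this also behaves sensibly on an empty line, where there is no first element to index.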


How to read specific word of a text file in Python

You need to read the line first, then get the word from that line. Use the .readline() method (Docs).

Here is the correct way according to the example in the question:

fo = open("output.txt", "r+")
text = fo.readline()
text = text[7:11]
print("Read String is : ", text)
fo.close()

However, as best practice, use a with statement:

with open('myfile.txt', 'r') as fo:
    text = fo.readline()
    text = text[7:11]
    print("Read String is : ", text)

The with statement automatically closes the file when the block ends. If you are using Python 2.5, you have to include from __future__ import with_statement; from Python 2.6 onward it is built in.
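Slicing by character position only works when you know exactly where the word sits in the line. As a rough, self-contained sketch, the following compares that with picking a word by its position among the whitespace-separated words. The file contents are invented for the example, and a temporary file stands in for the real one.

```python
import os
import tempfile

# Create a throwaway file so the example can run anywhere.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("Read a word from this line\nSecond line\n")
    path = tmp.name

with open(path) as fo:
    first_line = fo.readline()       # only the first line is read

by_position = first_line[7:11]       # characters 7 up to (not including) 11
by_word = first_line.split()[2]      # third whitespace-separated word
print(by_position, by_word)

os.remove(path)
```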


How to get only words from the string using python

Probably the simplest way to do this is a simple loop testing against string.ascii_letters plus a specific subset of extra characters (e.g., the apostrophe, the hyphen, and the space):

>>> import string
>>> text = "Free Quote!\n \n Protecting your family is the best investment you\'ll eve=\nr \n"
>>> ''.join([x for x in text if x in string.ascii_letters + '\'- '])
"Free Quote  Protecting your family is the best investment you'll ever "

As you edit longer and more complex texts, excluding specific punctuation marks becomes less sustainable, and you'd need to use a more complex regex (for example, when is a ' an apostrophe and when is it a quote?), but for the scope of your problem above, this should suffice.

I found three solutions; they are all close, but none is exactly what you want.

import re

in_string = "Free Quote!\n \n Protecting your family is the best investment you\'ll eve=\nr \n"

# variant 1
# Free Quote Protecting your family is the best investment youll eve r
out_string = ""
array = in_string.split()
for word in array:
    out_string += re.sub(r'[\W]', '', word) + " "
print(out_string)

# variant 2
# Free Quote Protecting your family is the best investment you ll eve r
print(" ".join(re.findall("[a-zA-Z]+", in_string)))

# variant 3
# FreeQuoteProtectingyourfamilyisthebestinvestmentyoullever
print(re.sub(r'[\W]', '', in_string))
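One possible further variant, offered as a sketch and not as the answerers' method, keeps apostrophes and hyphens only when they sit between letters. It also assumes that the "=\n" sequence in the sample is a soft line break that splits a single word, which the question does not confirm.

```python
import re

text = "Free Quote!\n \n Protecting your family is the best investment you\'ll eve=\nr \n"

# Assumption: "=\n" is a soft line break inside a word, so rejoin the halves.
joined = text.replace("=\n", "")

# Runs of letters, optionally continued by an apostrophe or hyphen plus more letters.
words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", joined)
print(" ".join(words))
```

With this pattern a quote mark at the edge of a word is dropped, while the apostrophe in "you'll" survives because letters follow it.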


Reading a text file and splitting it into single words in python

I have this text file made up of numbers and words, for example: 09807754 18 n 03 aristocrat 0 blue_blood 0 patrician. I want to split it so that each word or number appears on a new line. A whitespace separator would be ideal, as I would like the words with the dashes to stay connected. This is what I have so far:

f = open('words.txt', 'r')
for word in f:
    print(word)

The desired output:

09807754
18
n
3
aristocrat
...

Does that data literally have quotes around it? Is it "09807754 18 n 03 aristocrat 0 blue_blood 0 patrician" or 09807754 18 n 03 aristocrat 0 blue_blood 0 patrician in the file?

6 Answers

Given this file:

$ cat words.txt
line1 word1 word2
line2 word3 word4
line3 word5 word6

If you just want one word at a time (ignoring the meaning of spaces vs line breaks in the file):

with open('words.txt', 'r') as f:
    for line in f:
        for word in line.split():
            print(word)

Prints:

line1
word1
word2
line2
...
word6

Similarly, if you want to flatten the file into a single flat list of words in the file, you might do something like this:

with open('words.txt') as f:
    flat_list = [word for line in f for word in line.split()]

>>> flat_list
['line1', 'word1', 'word2', 'line2', 'word3', 'word4', 'line3', 'word5', 'word6']

This can produce the same output as the first example with print('\n'.join(flat_list)).

Or, if you want a nested list of the words in each line of the file (for example, to create a matrix of rows and columns from a file):

with open('words.txt') as f:
    matrix = [line.split() for line in f]

>>> matrix
[['line1', 'word1', 'word2'], ['line2', 'word3', 'word4'], ['line3', 'word5', 'word6']]

If you want a regex solution, which would allow you to filter wordN vs lineN type words in the example file:

import re

with open("words.txt") as f:
    for line in f:
        for word in re.findall(r'\bword\d+', line):
            # wordN by wordN with no lineN
            print(word)

Or, if you want that to be a line by line generator with a regex:

with open("words.txt") as f:
    words = (word for line in f for word in re.findall(r'\w+', line))
    for word in words:  # consume the generator while the file is still open
        print(word)
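A generator expression built on an open file only produces words while that file is still open, so it has to be consumed inside the with block. One way around this, sketched here with an invented function name `iter_words`, is a generator function that opens the file itself and keeps it open for as long as it is being iterated. A temporary copy of the sample file is created so the example runs on its own.

```python
import os
import re
import tempfile


def iter_words(path, pattern=r'\w+'):
    """Lazily yield every regex match in the file, one line at a time."""
    with open(path) as f:
        for line in f:
            # The file stays open until the generator is exhausted or closed.
            yield from re.findall(pattern, line)


# Build a temporary copy of the sample file.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("line1 word1 word2\nline2 word3 word4\nline3 word5 word6\n")

found = list(iter_words(tmp.name, r'\bword\d+'))
os.remove(tmp.name)
print(found)
```

Because the function yields one word at a time, a caller can stop early (for example after the first match) without reading the rest of the file.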
