Write a function top5_words(text) that takes a single argument text (a non-empty string), tokenises text into words based on whitespace (once again, without any stripping of punctuation or case normalisation), and returns the top-5 words as a list of strings, in descending order of frequency. If there is a tie in frequency at any point, the words with the same frequency should be sub-sorted alphabetically (e.g. if 'turtle' and 'grok' both occur 5 times, 'grok' should come first). If there are fewer than five distinct words in text, the function should return all words in descending order of frequency (with the same tie-breaking mechanism). For example:

>>> top5_words("one one was a racehorse two two was one too")
["one", "two", "was", "a", "racehorse"]
>>> top5_words("buffalo buffalo buffalo chicken buffalo")
["buffalo", "chicken"]
>>> top5_words("the quick brown fox jumped over the lazy dog")
["the", "brown", "dog", "fox", "jumped"]
Solution
The following Python function meets these requirements:
def top5_words(text):
    # Split the text into words based on whitespace
    words = text.split()
    # Create a dictionary to count the frequency of each word
    word_counts = {}
    for word in words:
        if word in word_counts:
            word_counts[word] += 1
        else:
            word_counts[word] = 1
    # Sort the words first alphabetically
    sorted_words = sorted(word_counts.items())
    # Then sort the words by frequency, in descending order
    sorted_words.sort(key=lambda x: x[1], reverse=True)
    # Get the top 5 words
    top5 = [word[0] for word in sorted_words[:5]]
    return top5
The function splits the input text on whitespace and counts each word's occurrences in a dictionary. It then sorts the (word, count) pairs alphabetically and re-sorts them by count in descending order. Because Python's sort is stable, even with reverse=True, words with equal counts keep their alphabetical order, which gives the required tie-breaking. Finally, the function returns the first five words; if the text has fewer than five distinct words, the slice simply returns all of them.
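As an alternative to sorting twice, the same ordering can be expressed with a single composite sort key. The sketch below is not part of the original solution: the name top5_words_counter is hypothetical, and it assumes collections.Counter is acceptable for the counting step. It avoids Counter.most_common because that method does not break ties alphabetically.

```python
from collections import Counter


def top5_words_counter(text):
    """Alternative sketch: top-5 whitespace-separated words by frequency."""
    # Count each whitespace-delimited token exactly as it appears
    # (no punctuation stripping, no case normalisation).
    counts = Counter(text.split())
    # Single sort: negated count puts the most frequent words first,
    # and the word itself breaks ties alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    # Slicing returns every word when there are fewer than five.
    return [word for word, _ in ranked[:5]]
```

Negating the count lets one ascending sort handle both criteria, so the result does not rely on sort stability across two passes.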