Question

This task involves creating a lexicon that can store your collection of Word objects, and populating it with words loaded from two text files (in1.txt and in2.txt).

More details on each of these steps are provided below; in short, for this task you should:

- Create and initialize a variable that represents your lexicon.
- Define a function read_data() that can read words from a text file into your lexicon (keeping in mind the constraints from the assignment brief).
- Call the read_data() function to load all words from the text files in1.txt and in2.txt into your lexicon.

Step 1:

It is up to you to decide on how you want to represent your lexicon. It may be useful to refer back to the labs to see how we represented collections of Person objects for the sorting algorithms that you implemented.

Step 2:

In this step you should define a function read_data() that reads words from a text file into your lexicon.

This function should take two arguments: the lexicon that words should be loaded into, and the name of the file to load.

From the assignment brief, the criteria for a word are:

- A word is obtained as a sequence of characters separated by whitespace.
- Words should be stored in lowercase.
- Any numbers or punctuation characters should be removed.
- Unique words should only be stored once in the lexicon (you can use the frequency of the word to show that a word appears twice in the files).

Hint: Below is an example of how you can remove all punctuation and numeric characters from a string:

my_str = 'H1-ell.0o'
# Remove all punctuation and numeric characters
my_str = ''.join([ch for ch in my_str if ch.isalpha()])
print(my_str)
>>> Hello

Do not be concerned with initializing the neighbours per-word. This will be done in Task 4.

Step 3:

In this step, you should make two calls to your read_data() function to load both in1.txt and in2.txt into your lexicon. When calling the function, you can hardcode the names of these files.


Solution

Here is a step-by-step guide to completing the task:

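Step 1: Create and Initialize a Lexicon

You can use a dictionary to represent your lexicon. Each key in the dictionary is a word, and the value is the frequency of that word. Note that the task asks for a collection of Word objects; this answer simplifies by storing only a count per word, so the neighbours list from the brief has no place to live yet.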

lexicon = {}
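Because the question asks for a lexicon of Word objects, one possible variant keeps the dictionary but stores an object per spelling instead of a bare count. The sketch below is only an illustration under assumptions: the `Word` class, its attribute names, the `add_word()` helper, and the `word_lexicon` variable are all hypothetical and are not taken from the assignment brief.

```python
class Word:
    """Hypothetical Word record: spelling, frequency, and neighbours."""

    def __init__(self, spelling):
        self.spelling = spelling
        self.frequency = 1
        self.neighbours = []  # left empty here; the brief defers this to Task 4

    def __lt__(self, other):
        # Lets a sorting routine compare two words alphabetically.
        return self.spelling < other.spelling

    def __repr__(self):
        return f"Word({self.spelling!r}, freq={self.frequency})"


def add_word(lexicon, spelling):
    """Insert a new Word, or bump the frequency of an existing one."""
    if spelling in lexicon:
        lexicon[spelling].frequency += 1
    else:
        lexicon[spelling] = Word(spelling)


word_lexicon = {}  # assumed representation: spelling -> Word
for spelling in ["hello", "world", "hello"]:
    add_word(word_lexicon, spelling)

# A list of the stored objects, which a sorting function could reorder in place.
words = list(word_lexicon.values())
```

The dictionary makes the "store each unique word once" check cheap, while the list of values gives later tasks something that can be sorted in place. Whether that trade-off fits the lab conventions is a judgment call.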

Step 2: Define a Function to Read Data

This function will read words from a text file and store them in the lexicon. It will convert words to lowercase, remove punctuation and numbers, and count the frequency of each word.

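def read_data(lexicon, filename):
    # Read every whitespace-separated token in the file into the lexicon.
    with open(filename, 'r') as file:
        for line in file:
            for token in line.lower().split():
                # Keep letters only, as in the hint from the brief.
                # This drops digits, punctuation, and underscores.
                word = ''.join(ch for ch in token if ch.isalpha())
                if not word:
                    continue  # token had no letters (e.g. '123' or '--')
                # Count the word: start at 1, or add 1 to the existing count.
                lexicon[word] = lexicon.get(word, 0) + 1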
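One edge case worth checking when cleaning tokens: a token made up entirely of digits or punctuation cleans down to an empty string, which should not end up in the lexicon. The following is a minimal sketch of that check using the `isalpha()` filter from the hint; the helper name `clean_token()` and the sample tokens are made up for illustration.

```python
def clean_token(token):
    """Lowercase a token and keep only its alphabetic characters."""
    return ''.join(ch for ch in token.lower() if ch.isalpha())


samples = ["Hello,", "H1-ell.0o", "2024", "--", "snake_case"]
cleaned = [clean_token(token) for token in samples]
print(cleaned)  # ['hello', 'hello', '', '', 'snakecase']

# Tokens that cleaned down to nothing are dropped before counting.
kept = [word for word in cleaned if word]
print(kept)  # ['hello', 'hello', 'snakecase']
```

Filtering on `isalpha()` also removes underscores, which a regular-expression class such as `\W` would leave in place because `_` counts as a word character.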

Step 3: Load Words from Text Files

Call the read_data() function once per file to load the words from in1.txt and in2.txt into your lexicon.

read_data(lexicon, 'in1.txt')
read_data(lexicon, 'in2.txt')

Now, your lexicon is populated with words from the text files, and each word is associated with its frequency.
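To confirm that frequencies accumulate across both files, the whole flow can be run end to end on two small temporary files. This is a sketch under assumptions: the file contents are invented stand-ins for in1.txt and in2.txt, and a compact letters-only `read_data()` is repeated so that the example runs on its own.

```python
import os
import tempfile


def read_data(lexicon, filename):
    # Compact letters-only variant, repeated here so the sketch is standalone.
    with open(filename, 'r') as file:
        for token in file.read().lower().split():
            word = ''.join(ch for ch in token if ch.isalpha())
            if word:
                lexicon[word] = lexicon.get(word, 0) + 1


with tempfile.TemporaryDirectory() as folder:
    # Made-up stand-ins for in1.txt and in2.txt.
    path1 = os.path.join(folder, 'in1.txt')
    path2 = os.path.join(folder, 'in2.txt')
    with open(path1, 'w') as f:
        f.write("The cat sat.\nThe END 42")
    with open(path2, 'w') as f:
        f.write("the cat's hat")

    lexicon = {}
    read_data(lexicon, path1)
    read_data(lexicon, path2)

# Print the words alphabetically with their combined frequencies.
for word in sorted(lexicon):
    print(word, lexicon[word])
```

With this sample input, "the" is stored once with a frequency of 3, the token "42" is discarded, and "cat's" is stored as "cats", which follows directly from the rule that punctuation is removed rather than treated as a separator.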

