
Question

Using the date column created in Question 3A, write code to analyse the yearly change (if any) of the number of Scott Morrison favorite_count, which is the number of people who liked each tweet.

You should group your dataframe by date and calculate the median of the number of people who liked each tweet. Save this grouped dataframe as a new dataframe called "grouped".

Plot the number of people who liked each tweet versus date.

PLEASE LEAVE THE COMMENT ### edTest(test_q3b) ### in the code box below for auto-marking.

Hints
Import the required library/libraries
Remember to properly label your axis
Your plot should look like this: (example plot not included)


Solution

Here is a Python script that accomplishes the tasks you've outlined:

# Import required libraries
import matplotlib.pyplot as plt

# Group dataframe by date and calculate the median of the number of people who liked each tweet
grouped = tweet_df.groupby('date')['favorite_count'].median().reset_index()

# Plot the number of people who liked each tweet versus date
plt.figure(figsize=(10,6))
plt.plot(grouped['date'], grouped['favorite_count'])
plt.xlabel('Date')
plt.ylabel('Median Number of Likes')
plt.title("Yearly Change in Number of Likes on Scott Morrison's Tweets")
plt.grid(True)
plt.show()

### edTest(test_q3b) ###

This script groups 'tweet_df' by the 'date' column, which holds the year of each tweet (Question 3A), and calculates the median number of likes ('favorite_count') for each year. The call to reset_index() turns the result back into a dataframe with 'date' and 'favorite_count' columns, which is saved as 'grouped'. The script then plots these median values against the year. The x-axis is labeled 'Date', the y-axis is labeled 'Median Number of Likes', and the plot is titled 'Yearly Change in Number of Likes on Scott Morrison's Tweets'. The grid is enabled to make the values easier to read.
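To try the grouping step without the real data, the following is a minimal, self-contained sketch. The small 'tweet_df' here is made up purely for illustration, and the way the 'date' column is built (taking the year with .dt.year) is an assumption about what Question 3A produces, not the official answer to it.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for tweet_df: these values are invented,
# they are NOT taken from ScottMorrisonMP.json.
tweet_df = pd.DataFrame({
    "created_at": [
        "2018-03-01 09:00:00", "2018-07-15 12:30:00", "2018-11-20 18:45:00",
        "2019-02-10 08:00:00", "2019-09-05 21:10:00",
        "2020-01-01 10:00:00", "2020-04-12 14:20:00", "2020-06-30 16:00:00",
    ],
    "favorite_count": [10, 30, 20, 100, 300, 500, 700, 900],
})

# Assumed form of the Question 3A column: keep only the year of each tweet.
tweet_df["date"] = pd.to_datetime(tweet_df["created_at"]).dt.year

# One row per year, holding the median number of likes for that year.
grouped = tweet_df.groupby("date")["favorite_count"].median().reset_index()

# Plot the yearly medians, with one tick per year and labeled axes.
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(grouped["date"], grouped["favorite_count"], marker="o")
ax.set_xticks(grouped["date"].tolist())
ax.set_xlabel("Date")
ax.set_ylabel("Median Number of Likes")
ax.set_title("Yearly Change in Number of Likes on Scott Morrison's Tweets")
ax.grid(True)
# In an interactive session, call plt.show() here to display the figure.
```

With this toy data, 'grouped' has one row for each of 2018, 2019 and 2020. Setting the ticks explicitly is a design choice for integer years, so that the axis does not show fractional values such as 2018.5.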


Similar Questions

Create a new column named 'date' in the data frame "tweet_df" (from Question 1), using the "pandas.to_datetime()" function with only the years extracted from the 'created_at' column.

The file ScottMorrisonMP.json contains tweets from April 10th, 2015 until August 3rd, 2020. The data are organised as follows:

'created_at', 'favorite_count', 'followers_count', 'full_text', 'id', 'retweet_count', 'source'

created_at: this column contains information about the date and time the tweet was created
favorite_count: number of people who liked the tweet
followers_count: number of Twitter users that follow Scott Morrison
full_text: full text of the tweet
id: tweet ID
retweet_count: number of times a tweet has been retweeted
source: the device used to send the tweet

Question 1
Write a Python script to:
import the required libraries
load the data (ScottMorrisonMP.json)
convert to a data frame called tweet_df
find the 5 most retweeted tweets and order them by the most likes and display only the tweet text (full_text), number of times the tweet has been retweeted (retweet_count), the date and time of the tweet (created_at) and number of people who liked the tweet (favorite_count) in this order. Save this into a dataframe called top5.

The data are in your working directory.

PLEASE LEAVE THE COMMENT ### edTest(test_q1) ### in the code box below for auto-marking.

Hint
import json
use the "open("ScottMorrisonMP.json")" and "json.load()" functions
use the "pandas.DataFrame.from_records()" function

Search for those tweets that contain either the word "COVID" or "pandemic". Save the output to a new dataframe called covid_tweets. Calculate the percentage (2 decimal places) of tweets that contain either the word "COVID" or "pandemic" and save this as "perc_covid". Use this value to create a sentence that says: "(perc_covid) % of tweets from Scott Morrison were about COVID or the pandemic". Save this sentence as a variable called answer.

Write Python code to count the frequency of hashtags in a twitter feed.

Your code assumes a twitter feed variable tweets exists, which is a list of strings containing tweets. Each element of this list is a single tweet, stored as a string. For example, tweets may look like:

tweets = ["Happy #IlliniFriday!", "It is a pretty campus, isn't it, #illini?", "Diving into the last weekend of winter break like... #ILLINI #JoinTheFight", "Are you wearing your Orange and Blue today, #Illini Nation?"]

Your code should produce a sorted list of tuples stored in hashtag_counts, where each tuple looks like (hashtag, count), hashtag is a string and count is an integer. The list should be sorted by count in descending order, and if there are hashtags with identical counts, these should be sorted alphabetically, in ascending order, by hashtag.

From the above example, our unsorted hashtag_counts might look like:

[('#illini', 2), ('#jointhefight', 1), ('#illinifriday!', 1), ('#illini?', 1)]

The hashtag_counts sorted by the above specifications will look like:

[('#illini', 2), ('#illini?', 1), ('#illinifriday!', 1), ('#jointhefight', 1)]

You may use str.split() to split each tweet into a list of words. A hashtag is any word that starts with a hash mark (#). (That means that the hash mark # should be included in the hashtag value above.)

Steps/Hints:
Preprocessing: You will need to convert each hashtag to lower case before you count it. For example, for this question #UIUC and #Uiuc add to the count of the same hashtag (#uiuc).
Do not further process the tweets or hashtags beyond using .split(), such as attempting to remove punctuation. While in the 'real world' you would absolutely do this, in this problem the autograder will be unhappy with you if you do.
If using .split(), do not pass any arguments (when no arguments are passed, every kind of whitespace is treated as a separator).
You may find it helpful to use an intermediate data structure for this problem to count the frequency of each hashtag.
If you aren't sure how to sort or convert to lowercase, you may find the Python docs on how to sort and the Python docs for string methods useful.
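The counting and sorting rules described above can be sketched as follows. This is one possible approach, not the autograder's reference solution: using collections.Counter as the intermediate data structure is an assumption on our part, and a plain dict would work equally well.

```python
from collections import Counter

# Example feed copied from the question text.
tweets = [
    "Happy #IlliniFriday!",
    "It is a pretty campus, isn't it, #illini?",
    "Diving into the last weekend of winter break like... #ILLINI #JoinTheFight",
    "Are you wearing your Orange and Blue today, #Illini Nation?",
]

# Split on whitespace only (no arguments to split), keep words that start
# with '#', and lowercase them before counting. Punctuation is left intact.
counts = Counter(
    word.lower()
    for tweet in tweets
    for word in tweet.split()
    if word.startswith("#")
)

# Sort by count descending, then by hashtag ascending to break ties.
hashtag_counts = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
print(hashtag_counts)
```

Negating the count in the sort key lets a single ascending sort handle both criteria at once.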

1. In a survey of ages of people in a town, the following ages were recorded: 24, 25, 26, 26, 27, 27, 28, 29, 30. Calculate the mean, median, and mode. (5 points)
26.89; 27; 26 and 27 respectively
25.90; 28; 27 and 28 respectively
26.91; 29; 28 and 29 respectively
27.91; 30; 29 and 30 respectively
