Question
Code Challenge 1
DataFrames: beyond the basics

1. You have a DataFrame df with a column 'A' of integers. For example:

df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7]})

How do you filter out rows which contain the same integer as the row immediately above? You should be left with a column containing the following values:

1, 2, 3, 4, 5, 6, 7

2. Given a DataFrame of random numeric values:

df = pd.DataFrame(np.random.random(size=(5, 3))) # this is a 5x3 DataFrame of float values

how do you subtract the row mean from each element in the row?

3. A DataFrame has a column of groups 'grps' and a column of integer values 'vals':

df = pd.DataFrame({'grps': list('aaabbcaabcccbbc'), 'vals': [12,345,3,1,45,14,4,52,54,23,235,21,57,3,87]})

For each group, find the sum of the three greatest values. You should end up with the answer as follows:

grps
a    409
b    156
c    345
Solution
Sure, here are the solutions for your questions:
- To filter out rows which contain the same integer as the row immediately above, compare column 'A' with a copy of itself shifted down one row, using the shift method in pandas, and keep only the rows where the two differ:
df = df.loc[df['A'].shift() != df['A']]
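As a minimal, self-contained sketch of the shift comparison on the example data from the question (the variable name deduped is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7]})

# shift() moves every value down one row, so each row is compared
# with the row immediately above it. The first row is compared with
# NaN, which never compares equal, so the first row is always kept.
deduped = df.loc[df['A'].shift() != df['A']]

print(deduped['A'].tolist())  # [1, 2, 3, 4, 5, 6, 7]
```

This only drops consecutive repeats. A value that reappears later, after a different value, is kept, whereas drop_duplicates() would remove it.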
- To subtract the row mean from each element in the row, compute the row means with mean(axis=1) and subtract them row-wise with the sub method in pandas:
df = df.sub(df.mean(axis=1), axis=0)
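A short sketch of the same idea with a sanity check added. The names row_means and centered are illustrative, and the check assumes only that each centered row should average to roughly zero:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random(size=(5, 3)))

# mean(axis=1) gives one mean per row, as a Series indexed like the rows.
row_means = df.mean(axis=1)

# axis=0 tells sub() to align that Series with the row index,
# so each row has its own mean subtracted from every element.
centered = df.sub(row_means, axis=0)

# After centering, every row mean should be (numerically) zero.
print(np.allclose(centered.mean(axis=1), 0))
```

Without axis=0, sub() would try to align the Series with the column labels instead, which is not what the question asks for.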
- To find the sum of the three greatest values for each group, you can use the groupby and nlargest functions in pandas.
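One way to combine groupby and nlargest for this is sketched below. It is an illustrative approach, not necessarily the code the answer intended, and the name top3_sums is an assumption:

```python
import pandas as pd

df = pd.DataFrame({
    'grps': list('aaabbcaabcccbbc'),
    'vals': [12, 345, 3, 1, 45, 14, 4, 52, 54, 23, 235, 21, 57, 3, 87],
})

# For each group, take the three largest 'vals' and add them up.
top3_sums = df.groupby('grps')['vals'].apply(lambda s: s.nlargest(3).sum())

print(top3_sums.to_dict())  # {'a': 409, 'b': 156, 'c': 345}
```

The result is a Series indexed by 'grps', matching the expected answer in the question (a 409, b 156, c 345).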