`kailo_beewell_dashboard.synthesise_aggregate`

Functions that aggregate pupil-level data. This module is one of several files that provide functions for synthesis (creation and aggregation) of data for the dashboard.

Module Contents

Functions

`results_by_site_and_group`(data, agg_func, no_pupils[, ...])	Aggregate results for all possible sites (schools or areas) and groups.
`aggregate_scores`(df)	Aggregate the score columns in the provided dataset, finding the mean and count of non-NaN values.
`convert_boolean`(true_list, false_list, mask)	Conditionally replace values of boolean list from one list when True and another when False.
`aggregate_proportions`(data, response_col, labels[, ...])	Aggregates each of the columns provided by response_col, for the chosen dataset.
`aggregate_counts`(df)	Aggregates the provided dataframe by finding the total people in it.
`aggregate_demographic`(data, response_col, labels)	Aggregates the demographic data by school and group.

kailo_beewell_dashboard.synthesise_aggregate.results_by_site_and_group(data, agg_func, no_pupils, response_col=None, labels=None, group_type='standard', site_col='school_lab')

Aggregate results for all possible sites (schools or areas) and groups (setting result to 0 or NaN if no pupils from a particular group are present).

Parameters

data: pandas dataframe: Pupil-level survey responses, with their school and demographics
agg_func: function: Method for aggregating the dataset
no_pupils: pandas dataframe: Output of agg_func() where all counts are set to 0 and other results set to NaN, to be used in cases where there are no pupils of a particular group (e.g. no FSM / SEN / Year 8)
response_col: list: Optional argument used when agg_func is aggregate_proportions(). It is the list of columns that we want to aggregate.
labels: dictionary: Optional argument used when agg_func is aggregate_proportions(). It is a dictionary with all possible questions as keys, then values are another dictionary where keys are all the possible numeric (or nan) answers to the question, and values are the relevant label for each answer.
group_type: string: Links to the type of demographic groupings performed. Either ‘standard’, ‘symbol’ or ‘none’ (default is ‘standard’).
site_col: string: Name of column with site - e.g. ‘school_lab’ (default), ‘msoa’.

Returns

result: pandas DataFrame: Dataframe where each row has the aggregation results, along with the relevant school and pupil groups used in that calculation
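
The fallback behaviour described above (using no_pupils when a group has no pupils at a site) can be illustrated with a minimal sketch. This is not the package's implementation: the function name `results_by_site_sketch`, the parameters `group_col` and `group_values`, the helper `count_pupils` and the `fsm` column are all hypothetical, chosen only to show the idea of looping over every site and group combination.

```python
import pandas as pd


def count_pupils(df):
    """Hypothetical agg_func: one row holding the number of pupils."""
    return pd.DataFrame({"count": [len(df)]})


def results_by_site_sketch(data, agg_func, no_pupils, site_col,
                           group_col, group_values):
    """Aggregate every site/group combination, falling back to no_pupils.

    group_col and group_values are assumptions for this sketch; the real
    function selects groupings through its group_type argument instead.
    """
    rows = []
    for site in sorted(data[site_col].unique()):
        for value in group_values:
            subset = data[(data[site_col] == site)
                          & (data[group_col] == value)]
            # Use the placeholder result when the group is absent at this site
            res = agg_func(subset) if len(subset) > 0 else no_pupils.copy()
            res[site_col] = site
            res[group_col] = value
            rows.append(res)
    return pd.concat(rows, ignore_index=True)


# Hypothetical pupil-level data: school B has no FSM pupils
data = pd.DataFrame({
    "school_lab": ["A", "A", "B"],
    "fsm": ["FSM", "Non-FSM", "Non-FSM"],
})
no_pupils = pd.DataFrame({"count": [0]})
result = results_by_site_sketch(
    data, count_pupils, no_pupils, "school_lab", "fsm", ["FSM", "Non-FSM"])
print(result)
```

In this sketch school B still receives an FSM row, with a count of 0, so every site ends up with the same set of rows regardless of which groups are present.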

kailo_beewell_dashboard.synthesise_aggregate.aggregate_scores(df)

Aggregate the score columns in the provided dataset, finding the mean and count of non-NaN values.

Parameters

df: dataframe: Dataframe with a row for each pupil and containing the score columns

Returns

res: dataframe: Dataframe with mean and count for each score
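
As a rough illustration of "mean and count of non-NaN", the sketch below computes both for a set of score columns. It is an assumption-laden example rather than the package's code: the real function takes only df, so the `score_cols` argument, the output column naming and the example column names are hypothetical.

```python
import numpy as np
import pandas as pd


def aggregate_scores_sketch(df, score_cols):
    """Return one row with the mean and non-NaN count of each score column.

    score_cols is a hypothetical parameter; how the real function identifies
    score columns is not stated in this documentation.
    """
    res = {}
    for col in score_cols:
        res[f"{col}_mean"] = df[col].mean()    # NaN values are skipped
        res[f"{col}_count"] = df[col].count()  # counts non-NaN values only
    return pd.DataFrame([res])


# Hypothetical pupil-level scores with some missing values
pupils = pd.DataFrame({
    "wellbeing_score": [10.0, 14.0, np.nan],
    "autonomy_score": [3.0, np.nan, np.nan],
})
res = aggregate_scores_sketch(pupils, ["wellbeing_score", "autonomy_score"])
print(res)
```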

kailo_beewell_dashboard.synthesise_aggregate.convert_boolean(true_list, false_list, mask)

Conditionally replace values of boolean list from one list when True and another when False.

Parameters

true_list: list: Contains values to use if True
false_list: list: Contains values to use if False
mask: list: Boolean list
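
One plausible reading of this behaviour is an element-wise selection between the two lists. The sketch below is an illustration only: the name `convert_boolean_sketch`, the returned list and the equal-length check are assumptions, since the documentation does not state the return value or how mismatched lengths are handled.

```python
def convert_boolean_sketch(true_list, false_list, mask):
    """Pick from true_list where mask is True, otherwise from false_list."""
    # Assumption: all three lists are expected to have the same length
    if not (len(true_list) == len(false_list) == len(mask)):
        raise ValueError("true_list, false_list and mask must be the same length")
    return [t if m else f for t, f, m in zip(true_list, false_list, mask)]


result = convert_boolean_sketch(
    ["a", "b", "c"], ["x", "y", "z"], [True, False, True])
print(result)  # ['a', 'y', 'c']
```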

kailo_beewell_dashboard.synthesise_aggregate.aggregate_proportions(data, response_col, labels, hide_low_response=False)

Aggregates each of the columns provided by response_col, for the chosen dataset.

This function uses the known possible values for each column. It counts occurrences of each (including the number missing) and stores the answer as a single dataframe row, where counts, percentages and categories are held as lists within cells of that row. The function returns a dataframe containing all of those rows. It is designed to be based on all possible values rather than only on values present: otherwise, if for example no-one responded 3, the function would return counts only for responses 1, 2 and 4, which would create issues when we try to plot the data.

For the branching question (talking about feelings), the value counts are calculated from a subset of the data. The no-response count should come only from pupils who branched onto that question, and not from those who branched onto the other question or never answered the first branching question.

Parameters

data: dataframe: Dataframe with rows for each pupil and including all the response_col
response_col: list: List of columns that we want to aggregate
labels: dictionary: Dictionary with all possible questions as keys, then values are another dictionary where keys are all the possible numeric (or nan) answers to the question, and values are the relevant label for each answer.
hide_low_response: boolean: Whether to hide responses when a response option gets fewer than 10 responses (rather than the norm elsewhere, which requires only 10 responses to the entire item rather than to each response option)

Returns

pd.concat(rows): dataframe: Dataframe with the aggregate responses to each of the response_col
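
The idea of counting against all possible values, rather than only those present, can be sketched as follows. This is a simplified illustration, not the package's code: the helper `count_all_options`, the output column names (`cat`, `count`, `percentage`), the "Missing" label and the example answer labels are hypothetical, and the branching-question and hide_low_response logic is left out.

```python
import numpy as np
import pandas as pd


def count_all_options(series, options):
    """Count responses for every option, including options nobody chose.

    options maps each possible numeric answer to its label (hypothetical).
    Returns one row whose cells hold lists, as described above.
    """
    # Reindex so unchosen options appear with a count of 0
    counts = series.value_counts().reindex(list(options), fill_value=0)
    n_missing = int(series.isna().sum())
    cat = list(options.values()) + ["Missing"]
    count = [int(c) for c in counts] + [n_missing]
    total = sum(count)
    pct = [100 * c / total if total else float("nan") for c in count]
    return pd.DataFrame({"cat": [cat], "count": [count], "percentage": [pct]})


# Nobody answered 3, but it still appears in the output with a count of 0
answers = pd.Series([1, 2, 2, 4, np.nan])
options = {1: "Never", 2: "Sometimes", 3: "Often", 4: "Always"}
row = count_all_options(answers, options)
print(row.loc[0, "count"])  # [1, 2, 0, 1, 1]
```

Because every option is always present, plots built from different sites or groups share the same categories in the same order.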

kailo_beewell_dashboard.synthesise_aggregate.aggregate_counts(df)

Aggregates the provided dataframe by finding the total people in it.

Parameters

df: dataframe: Dataframe with row for each pupil and columns that include the school and groups needed by results_by_site_and_group()

Returns

res: dataframe: Dataframe with the count of pupils in each school and group

kailo_beewell_dashboard.synthesise_aggregate.aggregate_demographic(data, response_col, labels)

Aggregates the demographic data by school and group. This is separate from results_by_site_and_group() because we want to aggregate by school vs. all others rather than for each school, and because we don’t want to break down results any further by any demographic characteristics.

Parameters

data: dataframe: Dataframe containing pupil-level demographic data
response_col: array: List of demographic columns to be aggregated
labels: dictionary: Dictionary with response options for each variable

Returns

result: dataframe: Dataframe with % responses to demographic questions, for each school, compared with all other schools
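
A "school vs. all others" comparison could look roughly like the sketch below. It is a hypothetical illustration for a single demographic column: the function name `school_vs_others`, the output column names, the `gender` column and its labels are assumptions and do not reflect the package's actual output format.

```python
import pandas as pd


def school_vs_others(data, col, options, site_col="school_lab"):
    """For each school, give % per response option in that school and in
    all other schools combined (hypothetical output layout)."""
    rows = []
    for school in sorted(data[site_col].unique()):
        in_school = data[site_col] == school
        for label, subset in [("School", data[in_school]),
                              ("Other schools", data[~in_school])]:
            # Count every possible option, including those with no responses
            counts = subset[col].value_counts().reindex(
                list(options), fill_value=0).tolist()
            total = sum(counts)
            for cat, n in zip(options.values(), counts):
                rows.append({
                    site_col: school,
                    "school_or_rest": label,
                    "cat": cat,
                    "percentage": 100 * n / total if total else float("nan"),
                })
    return pd.DataFrame(rows)


# Hypothetical demographic data for two schools
data = pd.DataFrame({
    "school_lab": ["A", "A", "B", "B"],
    "gender": [1, 2, 2, 2],
})
result = school_vs_others(data, "gender", {1: "Girl", 2: "Boy"})
print(result)
```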
