# Applied Statistical Modelling Assignment

## Question 1

In Management Theory, there is interest in establishing an ideal manager-to-staff ratio. While it is acknowledged that there is no one-size-fits-all answer to this question, Human Resources (HR) departments may still want some guidance on how many managers, or supervisors, they will need to operate the business successfully for a given number of workers.

In a survey of 27 industrial establishments of varying size (from the same sector), the number of supervisors and the number of supervised workers were recorded. We want to model the relationship between the number of supervisors (the response) and the number of supervised workers (the explanatory variable).

The data are given in the data frame super, which is stored in the file super.RData. The variables are as follows:

- y: Number of supervisors in the industrial establishment
- x: Number of workers in the industrial establishment

(a) Carry out an exploratory data analysis of the data. In particular:

(i) Provide a visual summary of each variable, and explain why your chosen method of visualising the data is appropriate for this type of data. [2]

(ii) Comment on the results of your summaries. [1]

(iii) Provide a scatterplot of y versus x. Give meaningful labels to the axes. [1]

(iv) Do you think a simple linear regression model of y versus x would be a good fit to the data? Explain why, or why not. [2]
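As a numerical companion to the scatterplot in part (a)(iii), the sketch below fits an ordinary least-squares line and compares the typical size of the residuals for small and large values of x. It is only an illustration: the numbers in `x_demo` and `y_demo` are invented and are not the contents of super.RData, and the helper names (`fit_line`, `residuals`, `spread_by_half`) are hypothetical. The assignment data are supplied in R format, so this Python version shows the idea rather than a prescribed method.

```python
from statistics import mean

# Invented illustration data, NOT the values stored in super.RData.
x_demo = [300, 400, 500, 650, 800, 950, 1100, 1250, 1400, 1600]  # workers
y_demo = [30, 45, 50, 80, 75, 130, 110, 190, 140, 250]           # supervisors


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Observed y minus fitted y for the least-squares line."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def spread_by_half(x, y):
    """Mean absolute residual in the small-x half and in the large-x half."""
    pairs = sorted(zip(x, residuals(x, y)))  # order the residuals by x
    half = len(pairs) // 2
    low = mean([abs(r) for _, r in pairs[:half]])
    high = mean([abs(r) for _, r in pairs[half:]])
    return low, high


low_spread, high_spread = spread_by_half(x_demo, y_demo)
print("mean |residual|, small x:", round(low_spread, 2))
print("mean |residual|, large x:", round(high_spread, 2))
```

If the second figure is clearly larger than the first, the scatter around the line grows with x, which suggests that the constant-variance assumption of simple linear regression is doubtful. A plot of residuals against fitted values conveys the same information more fully.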

(b) Find appropriate transformations for the data. In particular:

(i) Given your results from part (a), suggest three possible transformations for y and explain why these would be reasonable to try. [1]

(ii) Provide scatterplots of each of your transformed responses versus x. Explain which, if any, of these look promising for fitting a simple linear regression model, possibly after transforming x. If none of them seem suitable, suggest one further transformation and try it, giving a reason for your choice. [3]

(iii) Select the two most promising-looking transformations of y from above and, if necessary, transform x appropriately. Provide scatterplots of the transformed responses versus the (possibly transformed) explanatory variable and comment on them. If necessary, try different transformations of x. [2]
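One way to organise the comparisons in part (b) is to apply each candidate transformation and tabulate how linear the resulting relationship is. The sketch below is a minimal illustration under several assumptions. The three transformations of y (square root, logarithm, reciprocal) are common choices from the ladder of powers and not necessarily the ones the question intends. The data are invented and are not the contents of super.RData. The names `pearson`, `Y_TRANSFORMS`, `X_TRANSFORMS` and `correlation_table` are hypothetical.

```python
import math
from statistics import mean

# Invented illustration data, NOT the values stored in super.RData.
x_demo = [300, 400, 500, 650, 800, 950, 1100, 1250, 1400, 1600]  # workers
y_demo = [30, 45, 50, 80, 75, 130, 110, 190, 140, 250]           # supervisors

# Candidate transformations (an assumption, chosen from the ladder of powers).
Y_TRANSFORMS = {
    "sqrt(y)": math.sqrt,
    "log(y)": math.log,
    "1/y": lambda v: 1.0 / v,
}
X_TRANSFORMS = {
    "x": lambda v: v,
    "log(x)": math.log,
}


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    u_bar, v_bar = mean(u), mean(v)
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def correlation_table(x, y):
    """Correlation of every transformed y with every transformed x."""
    table = {}
    for y_name, fy in Y_TRANSFORMS.items():
        ty = [fy(v) for v in y]
        for x_name, fx in X_TRANSFORMS.items():
            tx = [fx(v) for v in x]
            table[(y_name, x_name)] = pearson(tx, ty)
    return table


for (y_name, x_name), r in correlation_table(x_demo, y_demo).items():
    print(f"{y_name:8s} vs {x_name:7s} r = {r:+.3f}")
```

A correlation close to +1 or -1 only indicates a strong linear trend. It says nothing about whether the spread is constant, so the scatterplots requested in the question remain the primary evidence for choosing between transformations.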