The paper "MAE 356 Analytical Methods in Economics and Finance" is a good example of a statistics case study. Linear regression allows better estimation of trends, costs and other relationships between a dependent variable (Y) and independent variables (X), because the method represents the data with a best-fit line. The line is termed the best fit because it minimizes the SSE (sum of the squared errors of prediction), and a line obtained by minimizing this sum is likely to be more accurate than lines produced by other methods.
The error (Y − Y′) in this regression method is the difference between the actual value of the dependent variable (Y) and the value predicted by the regression (Y′) (Grob, 2003). This makes it comparatively easy to identify the trends between the dependent and independent variables from the fitted line. Other benefits of applying linear regression are the following (Grob, 2003):
- the method summarizes the trend with a single slope;
- the fitted line is unbiased if the deviations are assumed to be randomly distributed around the trend;
- the best fit minimizes the prediction errors; and
- the results are consistent: whenever linear regression is applied to the same values, it produces the same line.

The summary statistics

From the table below, it can be seen that the standard deviations are small relative to the means, so the observations cluster around the mean values of the variables.
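The best-fit calculation described above can be sketched with a small numerical example. The data here are invented for illustration and are not from the case study; the point is only to show the slope, intercept, errors (Y − Y′) and SSE.

```python
import numpy as np

# Hypothetical illustration (not the case-study data): fit Y = a + b*X
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# np.polyfit returns the least-squares slope and intercept,
# i.e. the (b, a) that minimise SSE = sum((Y - Y')^2)
b, a = np.polyfit(x, y, 1)

y_hat = a + b * x          # predicted values Y'
errors = y - y_hat         # prediction errors (Y - Y')
sse = np.sum(errors ** 2)

print(b, a, sse)           # b ≈ 1.96, a ≈ 0.14, SSE ≈ 0.092
```

No other straight line through these five points can achieve a smaller SSE, which is what "best fit" means in this context.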
The standard deviation of revenue is 37,532.51 on either side of the mean (343,965.7), while the standard deviation of people is 983.4761 on either side of its mean. For income, the standard deviation is 4,116.335 on either side of 41,522.96. The standard deviation of competitors is 1.0102 on either side of the mean (2.8), while the standard deviation of price is 0.3608 on either side of the mean of 5.68. Furthermore, the skewness of the variables suggests that the data are not exactly normally distributed, and the mean, median and mode values are not all very close to each other, which argues against a normal distribution of the data.

                      Revenue     People     Income   Competitors (COMPTORS)   Price
Mean                 343965.7    5970.26   41522.96   2.8                      5.68
Standard Error       5307.899   139.0845   582.1376   0.142857143              0.05103
Median               345166.5   6032       41339.5    3                        5.75
Mode                            5917                  3                        6
Standard Deviation   37532.51   983.4761   4116.335   1.010152545              0.360838
Sample Variance      1.41E+09   967225.3   16944212   1.020408163              0.130204
Kurtosis             -0.5805    0.965044   0.86667    -0.192246809             -0.77389
Skewness             -0.17644   -0.56688   0.380396   0.296984848              -0.67274
Range                152895     4827       21627      4                        1
Minimum              256640     3172       30999      1                        5
Maximum              409535     7999       52626      5                        6
Sum                  17198284   298513     2076148    140                      284
Count                50         50         50         50                       50

The regression model to estimate variables

The regression equation to be estimated is as follows:

Rt = a + b·Pt + c·It + d·Ct + e·PRt + ut

In the above regression equation, Rt is the revenue in period t; a is the constant/intercept term; Pt is the number of people in period t and b is its coefficient; It is the income earned in period t and c is its coefficient; Ct is the number of competitors in period t and d is its coefficient; PRt is the price in period t and e is its coefficient; and ut is the error term.
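A model of this form can be estimated by ordinary least squares. The sketch below uses synthetic data generated at roughly the scale of the summary table above; the coefficient values and noise level are assumptions for illustration, since the paper's actual 50-observation dataset is not reproduced here.

```python
import numpy as np

# Hypothetical data mimicking the scale of the summary statistics table
rng = np.random.default_rng(0)
n = 50
people = rng.normal(5970, 983, n)
income = rng.normal(41523, 4116, n)
competitors = rng.integers(1, 6, n).astype(float)
price = rng.uniform(5, 6, n)
u = rng.normal(0, 20000, n)  # unobserved error term ut

# Assumed "true" coefficients, used only to generate synthetic revenue
revenue = 50000 + 30 * people + 2 * income - 10000 * competitors + 5000 * price + u

# Design matrix [1, Pt, It, Ct, PRt]; least squares estimates (a, b, c, d, e)
X = np.column_stack([np.ones(n), people, income, competitors, price])
coef, *_ = np.linalg.lstsq(X, revenue, rcond=None)
a, b, c, d, e = coef
residuals = revenue - X @ coef  # sample estimates of the error term ut
```

With only 50 observations and this much noise, the estimated coefficients are close to, but not exactly equal to, the values used to generate the data, which is exactly the sampling variability the error term represents.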
The error term is included in the regression equation because, in practice, not all of the variation in the dependent variable can be explained by the independent variables. It is not possible to incorporate in a regression model every factor that affects the dependent variable, so the error term is included to capture the effect of those factors that are not represented as independent variables in the equation.
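The role of the error term can be illustrated with a small omitted-variable sketch. The data here are hypothetical; x2 stands in for any relevant factor that is left out of the fitted model.

```python
import numpy as np

# Hypothetical example: y depends on x1 and on an omitted factor x2
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(0.0, 1.0, n)
x2 = rng.normal(0.0, 1.0, n)           # omitted from the fitted model
y = 3.0 + 2.0 * x1 + 1.5 * x2          # true relationship

# Regress y on x1 alone (with an intercept)
X = np.column_stack([np.ones(n), x1])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

# The omitted factor's influence is absorbed by the error term:
# the residuals are almost perfectly correlated with x2
corr = np.corrcoef(residuals, x2)[0, 1]
```

The effect of an excluded factor does not disappear; it shows up in the residuals, which is precisely what ut represents in the regression equation above.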
The value of a is essentially an intercept, although a multiple regression equation with two or more independent variables defines a plane (or hyperplane) rather than a line.
Allison, P.D., 1999. Multiple Regression: A Primer. Pine Forge Press.
Berk, R.A., 2003. Regression Analysis: A Constructive Critique. London: Sage.
Freedman, D.A., 2009. Statistical models: Theory and practice. Cambridge University Press.
Freund, J.E., Williams, F.J. and Perles, B.M., 1993. Elementary Business Statistics: The Modern Approach. 6th ed. Englewood Cliffs: NJ: Prentice Hall.
Grob, J., 2003. Linear Regression. New York: Springer.
Lindley, D.V., 1957. A Statistical Paradox. Biometrika, 44: 187-192.
Neter, J., Wasserman, W. and Whitmore, G.A., 1993. Applied Statistics. 4th ed. Boston: Allyn & Bacon.
Stigler, S.M., 1981. Gauss and the Invention of Least Squares. The Annals of Statistics, 9(3): 465-474.
Upton, G. and Cook, I., 2008. Oxford Dictionary of Statistics. Oxford University Press.