356 Analytical Methods in Economics and Finance Case Study

Table of Contents

Introduction
The summary statistics
The regression model to estimate variables
Estimated model and how the independent variables affect the dependent variable
Independent variables that significantly explain the dependent variable
Predicting the gross annual revenue of a store for the mean values of PEOPLE, INCOME, COMPTORS and PRICE
The coefficient of determination adjusted for degrees of freedom (adjusted R2)
Current selection of the variables

Introduction

Linear regression estimates the relationship between a dependent variable (Y) and one or more independent variables (X) by summarizing the data with a best-fit line. The line is called "best fit" because it minimizes the sum of the squared errors of prediction (SSE), so it lies closer to the observed data, in the squared-error sense, than any other straight line. The error (Y – Y’) is the difference between the actual value of the dependent variable (Y) and the value predicted by the regression (Y’) (Grob, 2003). The fitted line therefore makes it comparatively easy to identify the trend between the dependent and independent variables, because it passes through the data points in the way that keeps these errors as small as possible.

Other benefits of linear regression are that the method summarizes the trend with a single slope; the fit is unbiased, provided the deviations are randomly distributed about the trend; the best fit minimizes the errors; and the result is reproducible, so applying linear regression to the same values always gives the same line (Grob, 2003).
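The least-squares idea described above can be sketched for a single independent variable. The data points below are invented purely for illustration, and the function names `fit_line` and `sse` are assumptions of this sketch, not part of the case study.

```python
# Toy data, invented for illustration only (not taken from the case study).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sse(intercept, slope, xs, ys):
    """Sum of squared errors of prediction: the sum of (Y - Y')^2."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


a, b = fit_line(xs, ys)
best = sse(a, b, xs, ys)

# Nudging the slope or the intercept away from the fitted values
# can only increase the SSE.
print(best, sse(a, b + 0.1, xs, ys), sse(a + 0.1, b, xs, ys))
```

Comparing the three printed values shows that the fitted line has the smallest SSE of the lines tried, which is what "best fit" means here.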
The summary statistics

The table below shows that the standard deviations are small relative to the means, so the observations lie fairly close to the mean of each variable. The standard deviation of revenue is 37,532.51 around a mean of 343,965.7, while the standard deviation of people is 983.4761 around a mean of 5,970.26. For income, the standard deviation is 4,116.335 around a mean of 41,522.96. The standard deviation of competitors is 1.010152545 around a mean of 2.8, while the standard deviation of price is 0.360838 around a mean of 5.68. Furthermore, the skewness of every variable is nonzero, so none of the variables is exactly normally distributed. The departures are modest, however: skewness lies between -0.67 and 0.38 and the mean and median of each variable are close to each other, which is consistent with an approximately symmetric distribution.

Statistic           Revenue     People      Income      Competitors (COMPTORS)  Price
Mean                343965.7    5970.26     41522.96    2.8                     5.68
Standard Error      5307.899    139.0845    582.1376    0.142857143             0.05103
Median              345166.5    6032        41339.5     3                       5.75
Mode                -           5917        -           3                       6
Standard Deviation  37532.51    983.4761    4116.335    1.010152545             0.360838
Sample Variance     1.41E+09    967225.3    16944212    1.020408163             0.130204
Kurtosis            -0.5805     0.965044    0.86667     -0.192246809            -0.77389
Skewness            -0.17644    -0.56688    0.380396    0.296984848             -0.67274
Range               152895      4827        21627       4                       1
Minimum             256640      3172        30999       1                       5
Maximum             409535      7999        52626       5                       6
Sum                 17198284    298513      2076148     140                     284
Count               50          50          50          50                      50

The regression model to estimate variables

The regression equation to be estimated is

$$R_t = a + b P_t + c I_t + d C_t + e \mathit{PR}_t + u_t$$

In this equation $R_t$ is the revenue in period t, a is the constant (intercept) term, $P_t$ is the people in period t, b is the coefficient of people, $I_t$ is the income in period t, c is the coefficient of income, $C_t$ is the competition in period t, d is the coefficient of competition, $\mathit{PR}_t$ is the price in period t, e is the coefficient of price and $u_t$ is the error term.
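As a rough illustration of how the coefficients a, b, c, d and e of such an equation can be estimated, the sketch below solves the least-squares normal equations in plain Python. The observations are hypothetical and noise-free (the error term is zero), chosen only so that the known coefficients can be recovered; the helper name `fit_ols` is an assumption of this sketch, and the case study itself used Excel rather than this code.

```python
def fit_ols(rows, y):
    """Least-squares coefficients [a, b1, ..., bk] from the normal equations (X'X) beta = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of ones for the intercept
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        known = sum(A[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (A[i][p] - known) / A[i][i]
    return beta


# Hypothetical coefficients a, b, c, d, e and made-up observations of
# (people, income, competitors, price); none of this is the case-study data.
true_beta = [10.0, 2.0, 0.5, -3.0, 4.0]
rows = [(1, 2, 0, 1), (2, 1, 1, 0), (3, 5, 2, 1), (4, 3, 1, 2),
        (5, 8, 0, 3), (6, 4, 3, 1), (7, 9, 2, 0), (8, 6, 1, 2)]
y = [true_beta[0] + sum(b * x for b, x in zip(true_beta[1:], r)) for r in rows]

beta = fit_ols(rows, y)
print([round(v, 6) for v in beta])
```

With real data the error term is not zero, so the estimates would only approximate the underlying coefficients; a statistics package would normally be used instead of hand-written elimination.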
The error term is included in the regression equation because, in practice, the independent variables cannot explain all of the variation in the dependent variable. A regression model cannot incorporate every factor that affects the dependent variable, so the error term captures the effect of those factors that are not included as independent variables in the regression equation. The value of a is the intercept, although a multiple regression equation with two or more independent variables describes a plane (or hyperplane) rather than a line.

Estimated model and how the independent variables affect the dependent variable

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.766795
R Square            0.587974
Adjusted R Square   0.55135
Standard Error      25139.79
Observations        50

ANOVA
            df   SS         MS         F          Significance F
Regression  4    4.06E+10   1.01E+10   16.05411   3.08E-08
Residual    45   2.84E+10   6.32E+08
Total       49   6.9E+10

           Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept  -68363.2      78524.73        -0.87059  0.388597  -226520    89793.76   -226520      89793.76
PEOPLE     6.439423      3.705117        1.737981  0.089054  -1.02307   13.90191   -1.02307     13.90191
INCOME     7.272305      0.935795        7.771261  7.43E-10  5.387518   9.157092   5.387518     9.157092
COMPTORS   -6709.43      3818.543        -1.75707  0.085709  -14400.4   981.5075   -14400.4     981.5075
PRICE      15968.76      10219.03        1.56265   0.125141  -4613.41   36550.94   -4613.41     36550.94

From the Excel regression output above, the estimated equation is

$$R_t = 6.44 P_t + 7.27 I_t - 6709.4 C_t + 15968.8 \mathit{PR}_t - 68363.2$$

The equation is reasonable because it shows that revenue increases with PEOPLE, INCOME and PRICE and decreases as the number of competitors (COMPTORS) increases. The intercept is the value of the dependent variable when all of the independent variables are zero. It means that, with every independent variable set to zero, the predicted revenue would be -68,363.2.
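The reading of the intercept and the slope coefficients can be illustrated with a small sketch that evaluates the estimated equation. The function name `predict_revenue` and the baseline store used for the comparison are hypothetical; only the rounded coefficients come from the equation above.

```python
# Rounded coefficients from the estimated equation.
INTERCEPT = -68363.2
COEF = {"PEOPLE": 6.44, "INCOME": 7.27, "COMPTORS": -6709.4, "PRICE": 15968.8}


def predict_revenue(people, income, comptors, price):
    """Predicted revenue from the fitted equation (illustrative helper)."""
    return (INTERCEPT
            + COEF["PEOPLE"] * people
            + COEF["INCOME"] * income
            + COEF["COMPTORS"] * comptors
            + COEF["PRICE"] * price)


# With every independent variable at zero, the prediction is just the intercept.
at_zero = predict_revenue(0, 0, 0, 0)

# Holding the other variables fixed, one additional competitor changes the
# predicted revenue by the COMPTORS coefficient. The baseline is hypothetical.
base = predict_revenue(6000, 40000, 3, 5.5)
one_more_competitor = predict_revenue(6000, 40000, 4, 5.5)

print(at_zero, one_more_competitor - base)
```

Each coefficient is therefore the change in predicted revenue for a one-unit change in its variable with the others held constant, which is how the signs in the equation are interpreted.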
Independent variables that significantly explain the dependent variable

The significance of the relationship between each independent variable and the dependent variable can be determined from the p value of that variable in the regression analysis. The t ratios, and the significance levels of the t ratios captured by the p values, are what determine the significance of the independent variables in the regression equation estimated above. If the t ratio of a variable, with n - k - 1 = 45 degrees of freedom, falls inside the non-rejection region at the 95 percent confidence level, so that its p value is greater than 0.05, the independent variable does not significantly affect the dependent variable; if the p value is below 0.05, it does (Lindley, 1957; Stigler, 1981). On this basis, the p values for PEOPLE (0.089), COMPTORS (0.086) and PRICE (0.125) are greater than 0.05, so these relationships are not significant at the 5 percent level. The p value for INCOME (7.43E-10), on the other hand, is far below 0.05, and therefore a significant relationship between INCOME and REVENUE exists.

Predicting the gross annual revenue of a store for the mean values of PEOPLE, INCOME, COMPTORS and PRICE

Evaluating the regression equation found above at the mean values of the variables gives

$$R_t = 6.44(5970.26) + 7.27(41522.96) - 6709.4(2.8) + 15968.8(5.68) - 68363.2 = 343873.66$$

The coefficient of determination adjusted for degrees of freedom (adjusted $R^2$)

Since the value of $R^2$ depends on the number of data pairs (n) and the number of variables (k), statisticians also calculate an adjusted $R^2$, denoted by $R^2_{adj}$, which is based on the number of degrees of freedom. The adjusted $R^2$ takes into account the fact that when n and k are approximately equal, the value of R may be artificially high because of sampling error rather than a true relationship among the variables.
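The adjustment for degrees of freedom is commonly written as $R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}$. The sketch below applies this standard formula to the figures in the regression output (R Square 0.587974, 50 observations, four independent variables); the function name `adjusted_r2` is an assumption of this sketch, and the second call uses a hypothetical sample of only six observations to show the effect when n is close to k.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Figures from the regression output: R Square, Observations, four predictors.
r2_adj = adjusted_r2(0.587974, n=50, k=4)
print(round(r2_adj, 5))  # 0.55135

# Hypothetical case: the same R^2 from only six observations is penalized heavily.
small_sample = adjusted_r2(0.587974, n=6, k=4)
print(small_sample)
```

The first result agrees with the Adjusted R Square reported in the output, while the second shows how strongly the adjustment bites when the number of observations is barely larger than the number of variables.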
$R^2$ can be artificially high because the chance variations of all the variables are used in conjunction with each other to derive the regression equation. Even if the individual correlation coefficients between each independent variable and the dependent variable were all zero, the multiple correlation coefficient could be higher than zero because of sampling error. Hence, both $R^2$ and $R^2_{adj}$ are usually reported in a multiple regression analysis. $R^2$ is the ratio of the explained sum of squares to the total sum of squares; the adjusted version corrects this ratio for the degrees of freedom and is known as the adjusted coefficient of determination. It measures what proportion of the total variation in the dependent variable, i.e. Revenue, can be explained by the four independent variables. If the value of this coefficient is very high, the regression model can be said to be a good fit for the given set of data (Stigler, 1981).

The regression output presented above gives an adjusted R-squared of 0.55135, which suggests that the regression is only able to explain 55.135% of the total variation observed in revenue. The multiple correlation coefficient (Multiple R) is 0.766795; after adjusting for degrees of freedom, 55.14% of the total variation is explained by the regression, whereas 44.86% remains unexplained. The unexplained part could be due to lagged effects in the values, which are not addressed in this analysis. It could also be due to certain values within the data that can be considered outliers.

Current selection of the variables

Comparing mean revenue with the value obtained from the regression shows only a marginal difference between the two. From the summary statistics table, mean revenue is 343,965.7, while the regression value at the mean predictors, computed with the rounded coefficients, is 343,873.66, so mean revenue is 92.04 greater than the value obtained using the regression equation.
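The comparison between mean revenue and the regression value at the mean predictors can be re-checked with a short sketch. It evaluates the fitted equation twice: once with the rounded coefficients used in the text and once with the coefficients as printed in the regression output. The dictionary layout and the helper name `predict_at` are assumptions of this sketch. Because a least-squares fit with an intercept passes through the point of means, the second evaluation would be expected to land on mean revenue up to rounding.

```python
# Means of the variables from the summary statistics table.
means = {"PEOPLE": 5970.26, "INCOME": 41522.96, "COMPTORS": 2.8, "PRICE": 5.68}
mean_revenue = 343965.7

# Coefficients rounded as in the written equation, and as printed in the output.
rounded = {"Intercept": -68363.2, "PEOPLE": 6.44, "INCOME": 7.27,
           "COMPTORS": -6709.4, "PRICE": 15968.8}
reported = {"Intercept": -68363.2, "PEOPLE": 6.439423, "INCOME": 7.272305,
            "COMPTORS": -6709.43, "PRICE": 15968.76}


def predict_at(coef, values):
    """Evaluate intercept + sum(coefficient * value) for the given values."""
    return coef["Intercept"] + sum(coef[name] * x for name, x in values.items())


pred_rounded = predict_at(rounded, means)
pred_reported = predict_at(reported, means)

# Prediction and its gap to mean revenue for each set of coefficients.
print(pred_rounded, mean_revenue - pred_rounded)
print(pred_reported, mean_revenue - pred_reported)
```

Any gap that remains with the rounded coefficients is therefore a by-product of rounding, not evidence about how well the chosen variables explain revenue.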
The difference between mean revenue and the regression value is small relative to mean revenue. The model's ability to account for the variation in revenue is assessed by the coefficient of determination, which shows that the regression is only able to explain 58.79% of the total variation. This suggests that the selected set of variables is acceptable, and the manager will be informed accordingly.

References

Allison, P.D., 1999. Multiple Regression: A Primer. Pine Forge Press.
Berk, R.A., 2003. Regression Analysis: A Constructive Critique. London: Sage.
Freedman, D.A., 2009. Statistical Models: Theory and Practice. Cambridge University Press.
Freund, J.E., Williams, F.J. and Perles, B.M., 1993. Elementary Business Statistics: The Modern Approach. 6th ed. Englewood Cliffs, NJ: Prentice Hall.
Grob, J., 2003. Linear Regression. New York: Springer.
Lindley, D.V., 1957. A Statistical Paradox. Biometrika, pp. 187-192.
Neter, J., Wasserman, W. and Whitmore, G.A., 1993. Applied Statistics. 4th ed. Boston: Allyn & Bacon.
Stigler, S.M., 1981. Gauss and the Invention of Least Squares. The Annals of Statistics, 9(3), pp. 465-474.
Upton, G. and Cook, I., 2008. Oxford Dictionary of Statistics. Oxford University Press.
