Capecari Consulting Statistics Assignment Example | Topics and Well Written Essays

Part A1

Table 1: Used Car Listings from Dealer Websites

| S.No | Make | Model | Km Driven | Selling Price ($) | Dealer Name | Green Star Rating |
|---|---|---|---|---|---|---|
| 1 | Toyota | Tarago | 6,471 | 55,888 | Victoria City | 3.5 |
| 2 | Audi | A8 | 136,627 | 29,990 | Victoria City | 2.5 |
| 3 | Ford | Focus | 18,523 | 26,990 | Victoria City | 4.0 |
| 4 | Ford | Territory | 63,256 | 26,888 | South Australia | 3.0 |
| 5 | Honda | Legend | 129,000 | 22,850 | New South Wales | 3.0 |

Table 2: Source of Listings in Table 1

| S.No | Source URL |
|---|---|
| 1 | http://www.carsales.com.au/dealer/details/Toyota-Tarago-2012/AGC-AD-14982243/?Cr=3&sdmvc=1 |
| 2 | http://www.carsguide.com.au/cars-for-sale/D_2870876/used-2006-AUDI-A8-L-4.2-QUATTRO-4E-05-UPGRADE-Automatic--Sedan-in-.html |
| 3 | http://www.carpoint.com.au/all-cars/dealer/details.aspx?R=AGC-AD-14767039&Cr=3 |
| 4 | http://www.carpoint.com.au/all-cars/dealer/details.aspx?R=AGC-AD-15908126&Cr=4 |
| 5 | http://www.carsguide.com.au/cars-for-sale/D_2989018/used-2008-HONDA-LEGEND--30-Automatic--Sedan-in-.html |

Q1. Include one example (a screenshot as well as link) of an advertisement that you have used as a data point indicating the information above. (2 points)

Figure 1: Screen Shot for Car Listing S.No 1

Part A2

Q2. For the data you have collected, list the measurement properties and describe the sample data for variables model and green star rating using appropriate measures.
Present your answer in tabular format (3 points)

Table 3: Measurement Properties, Sample Data Variables and Green Star Ratings

| S.No | Measurement Properties | Sample Data Variables | Green Star Rating |
|---|---|---|---|
| 1 | Tests using devices, observation, rating scales, fuel economy, drive type | Engine strength, storage capacity, wheel drive | 3.5 |
| 2 | Drive tests, expert interview, transmission, weight, warranty, comfort | Interior design, price, comfortability | 2.5 |
| 3 | Questionnaire, breathalyser, colour, body, steering, running costs | Colour, number of doors, number of seats | 4.0 |
| 4 | Ignition, safety, fuel consumption, registration plate, dimensions | Speed, stability, fuel type | 3.0 |
| 5 | Pedestrian protection, RACV, sensor elements | RACV, price, kilometres driven | 3.0 |

Part A3

Q3. Capecari Consulting has collected and analysed a sample of transaction data on used cars from North American used car markets. The data consists of information about customer id (column panid), total time spent to search (hours) for used cars and transaction price of used car (AUD). Assuming that prices are normally distributed, you wish to examine prices in the two markets. To identify any issues in the price variable what technique is most useful for normally distributed data? Use this technique and identify if there exists any issue in the price variable in Capecari's data. Specifically, identify the data point/s that may be an issue. List the value of the measure that you have used to identify specific data points. How does the mean price of used cars in North American markets compare with the mean selling price in your data? Indicate and highlight your answers clearly. (7 points)

A scatter plot is a useful technique for identifying any issues in the price variable.

Figure 2: Price AUD Scatter Plot

From Figure 2 we observe that the price variable has some outliers.
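Because Q3 assumes normally distributed prices, the visual check can also be backed by a numerical measure: the z-score, (x - mean) / standard deviation, where values beyond roughly 3 in absolute terms are commonly treated as potential outliers. The sketch below is only an illustration of that idea. The price list is hypothetical (it is not the Capecari sample), and the helper names `z_scores` and `flag_outliers` and the threshold of 3 are assumptions.

```python
from statistics import mean, stdev


def z_scores(values):
    """Return (x - mean) / sample standard deviation for each value."""
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]


def flag_outliers(values, threshold=3.0):
    """Return (value, z) pairs whose absolute z-score exceeds the threshold."""
    return [(x, z) for x, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Hypothetical prices for illustration only (NOT the Capecari data)
prices = [12000, 13500, 14200, 15000, 13800, 14900, 12700, 13300,
          14100, 15500, 12900, 14600, 13900, 14400, 13100, 50000]

for value, z in flag_outliers(prices):
    print(f"price {value} has z-score {z:.2f}")
```

With these illustrative values only the 50,000 entry exceeds the threshold. Applied to the real price column, the same calculation would give the "value of the measure" that the question asks for.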
The outliers stand out because most data points lie on or close to the trend line, whereas a few data points lie far above or below it, for instance those at 50,000, 40,000, and 0. The mean price for used cars in North America is 14,045.74, and the mean selling price in my data (the five listings in Table 1) is 32,521.2. The mean selling price in my data is therefore higher, at roughly 2.3 times the North American mean.

Q4. Following from Q3 above, after having checked the data for any issues, it is imperative to study the problem at hand. To understand consumer search for used cars it will be useful to analyse the amount of time that consumers spend to search for cars. It is hypothesized that consumers searching for high priced cars are likely to invest a greater amount of time in their search for used cars. Analyse the magnitude and impact of the bivariate relationship and present your findings. Report the statistical significance and interpret your findings. (5 points)

To determine the relationship between the price of a used car and the hours spent searching for it, we run a regression on the two variables using Excel.
In this case we regress the total search time (hours) on the price (AUD), and the following output is obtained from the regression analysis:

Table 4: Regression Analysis Excel Output

SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.237289 |
| R Square | 0.056306 |
| Adjusted R Square | 0.046577 |
| Standard Error | 32.7159 |
| Observations | 99 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 6194.623 | 6194.623 | 5.787581 | 0.018035 |
| Residual | 97 | 103822 | 1070.33 | | |
| Total | 98 | 110016.7 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 22.42765 | 6.179024 | 3.629642 | 0.000455 | 10.164 | 34.6913 | 10.164 | 34.6913 |
| price (AUD) | 0.000896 | 0.000372 | 2.405739 | 0.018035 | 0.000157 | 0.001635 | 0.000157 | 0.001635 |

From the Excel output above, the model relating the price of the used car to the hours spent searching for it is:

total search time (hours) = 22.428 + 0.000896 × price (AUD)

The intercept, 22.428, is the estimated search time in hours for a car priced at 0 AUD. The slope, +0.000896 (about +0.001), indicates that the total time used to search for a used car increases by roughly 0.001 hours for every 1 AUD increase in the price of the used car, or about 0.9 hours per 1,000 AUD. Both values are sample estimates and are subject to sampling error. More importantly, even though the estimated slope is positive, the true slope may still be zero. If the slope is zero, there is no linear relationship between the total search time (hours) and price (AUD). Therefore, we test the hypothesis that the slope = 0.

Three tools can be used to test the hypothesis that the slope = 0: the t-statistic (t-stat), the p-value (p), and the confidence interval (CI). From the output, we observe that the t-statistic for price (AUD) is 2.406, the p-value is 0.018, and the 95% confidence interval (lower 95%, upper 95%) is (0.0002, 0.0016). In this case, I employ the t-statistic. This tool requires a confidence level. Of the three common confidence levels, 90%, 95%, and 99%, 95% (5% significance level) is the most common and is the one I choose to use.
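The t-statistic and confidence interval quoted above can be reproduced from the coefficient and standard error reported in the Excel output. The sketch below does that arithmetic. The critical value of 1.985 is an assumption taken from standard t tables for 97 degrees of freedom (it is not part of the Excel output), and the variable names are illustrative. Because the inputs are the rounded figures shown in the output, the results differ slightly from Excel's in the last decimals.

```python
# Values copied from the Excel regression output (rounded as displayed)
slope = 0.000896       # coefficient on price (AUD)
std_error = 0.000372   # standard error of the slope
n_obs = 99
df_resid = n_obs - 2   # residual degrees of freedom (97)

# ASSUMPTION: two-sided 5% critical value of the t distribution with 97 df,
# read from standard t tables; the large-sample approximation is 1.96.
T_CRIT = 1.985

t_stat = slope / std_error               # test statistic for H0: slope = 0
ci_lower = slope - T_CRIT * std_error    # lower bound of the 95% interval
ci_upper = slope + T_CRIT * std_error    # upper bound of the 95% interval
reject_null = abs(t_stat) > T_CRIT       # decision at the 5% level

print(f"t = {t_stat:.3f} on {df_resid} df")
print(f"95% CI = ({ci_lower:.6f}, {ci_upper:.6f})")
print(f"reject H0 (slope = 0): {reject_null}")
```

The three tools agree by construction: the t-statistic exceeds the critical value exactly when the 95% interval excludes zero and the p-value falls below 0.05.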
Therefore, at the 5% significance level we test the null hypothesis that β1 = 0 against the alternative that β1 ≠ 0. The t-statistic for price (AUD) is 2.406. The critical value is 1.96 (the large-sample value; the t-table value for 97 degrees of freedom is about 1.98). Since 2.406 > 1.96, we reject the null hypothesis and conclude that, at the 5% significance level, β1 is statistically significant. The p-value (0.018 < 0.05) and the 95% confidence interval, which excludes zero, lead to the same conclusion. This implies that the effect of price (AUD) on total search time (hours) is not zero, although its magnitude (about 0.001 hours per AUD) is small.

Q5. For your analysis in Q4, what do you conclude about the model fit? Based on the above output, what is the value of the measure of association of search time and price (express up to 2 decimal points)? What will be the impact on this model if more information about consumer behaviour is included? (3 points)

The statistical goodness of fit of a model is usually assessed using R-squared. By definition, 0 ≤ R-squared ≤ 1, and a higher R-squared indicates a better fit. From Table 4 above, the R-squared is 0.056, which means that price (AUD) explains only around 5.6 per cent of the variation in total search time (hours). Based on this information, we can conclude that the model is not a good fit. The measure of association of search time and price is the correlation coefficient (Multiple R in Table 4), which is 0.24 to 2 decimal points. If more information about consumer behaviour is included in the model, the coefficients will be interpreted differently (each as an effect holding the other variables constant) and the degrees of freedom will change. R-squared cannot decrease when variables are added, so the fit will improve at least slightly; whether the improvement is meaningful should be judged using the adjusted R-squared.
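The fit statistics discussed in Q5 follow directly from the ANOVA sums of squares in Table 4. As a rough check, the sketch below recomputes R-squared, Multiple R, adjusted R-squared, and the F statistic from those reported values. The variable names are illustrative, and the inputs are the rounded figures shown in the output, so the last decimals may differ slightly from Excel's.

```python
import math

# Values copied from the ANOVA block of Table 4 (rounded as displayed)
ss_regression = 6194.623
ss_residual = 103822.0
ss_total = 110016.7
n_obs = 99
n_predictors = 1
df_resid = n_obs - n_predictors - 1   # 97

# Share of the variation in search time explained by price
r_squared = ss_regression / ss_total

# Multiple R: with one predictor this is the absolute correlation
multiple_r = math.sqrt(r_squared)

# Adjusted R-squared penalises the number of predictors
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# F statistic = mean square regression / mean square residual
f_stat = (ss_regression / n_predictors) / (ss_residual / df_resid)

print(f"R-squared          = {r_squared:.6f}")
print(f"Multiple R         = {multiple_r:.2f}")
print(f"Adjusted R-squared = {adj_r_squared:.6f}")
print(f"F                  = {f_stat:.4f}")
```

Multiple R rounds to 0.24, which is the measure of association between search time and price. The adjusted R-squared is the figure to watch if further consumer-behaviour variables are added, since plain R-squared never falls when a predictor is included.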
