Fuel Economy Analysis

This demo shows how to perform data mining on historical fuel economy data. The data set covers various cars built from 2000 through 2012.

Import Data into Table

Import the data from Excel using a modified version of the function auto-generated by the Import Tool.

```carData = importYearXLS(2007);
```

Table Summary

Display a basic statistical summary of selected variables.

```summary(carData(:,{'RatedHP','MPG', 'CO2'}))
```
```Variables:
    RatedHP: 2595x1 double
        Values:
            min        76
            median    236
            max       631
    MPG: 2595x1 double
        Values:
            min        9.8
            median    24.8
            max       66.6
    CO2: 2595x1 double
        Values:
            min       131
            median    352
            max       878
            NaNs      257
```

Visualize

Plot MPG versus Rated Horsepower

```createMPGFigure(carData.RatedHP, carData.MPG);
```

Examine Grouping Effects of Categorical Data

```% Convert Car-Truck and City-Highway to categoricals
carData.Car_Truck = categorical(carData.Car_Truck);
carData.City_Highway = categorical(carData.City_Highway);

% In order to extract all "cars":
carIDs = carData.Car_Truck == 'car';

% In order to extract "city" data for "trucks":
city_truckIDs = (carData.City_Highway == 'city' & carData.Car_Truck == 'truck');

% City versus Highway
cityIDs = carData.City_Highway == 'city';
highwayIDs = carData.City_Highway == 'highway';
```
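
The logical masks above can index any variable of the table directly, for example to pull out the MPG values of one subgroup. A minimal sketch on a small synthetic data set follows; the variable names vehicle, cycle, and mpg and all of the values are illustrative assumptions, not taken from the fuel economy data.

```matlab
% Synthetic example data (assumed values, for illustration only)
vehicle = categorical({'car'; 'truck'; 'car'; 'truck'});
cycle   = categorical({'city'; 'city'; 'highway'; 'highway'});
mpg     = [22; 18; 35; 27];

% Combine two comparisons into a single logical mask
cityTruckIDs = (cycle == 'city' & vehicle == 'truck');

% Logical indexing keeps only the rows where the mask is true
mpgCityTruck = mpg(cityTruckIDs);
```

The same pattern applies to the masks defined on the real table, such as indexing carData.MPG with city_truckIDs.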

Distributions

Examine the distribution of MPG grouped by City or Highway

```mpgDistribution(carData.MPG(cityIDs), carData.MPG(highwayIDs))
```

Grouped Visualizations

Scatter plot by group.

```figure
gscatter(carData.RatedHP, carData.MPG, ...
{carData.Car_Truck, carData.City_Highway}, ...
'', '.', 10, 'on', 'Rated Horsepower', 'MPG')
```

Look at additional data, engine compression and CO2, then show a matrix of scatter plots by group.

```figure
gplotmatrix([carData.RatedHP, carData.Comp], [carData.MPG, carData.CO2], ...
{carData.Car_Truck, carData.City_Highway}, ...
'', '.', 10, 'on', '', {'Rated Horsepower', 'Compression'}, {'MPG', 'CO2'})
```

Grouped Statistics

Perform group statistics based on specified grouping variables.

```varfun(@mean, carData,'InputVariables',{'RatedHP', 'MPG'},...
'GroupingVariables',{'City_Highway', 'Car_Truck'})
```
```ans =
                     City_Highway    Car_Truck    GroupCount    mean_RatedHP    mean_MPG
                     ____________    _________    __________    ____________    ________
    city_car         city            car          672           253.17          22.693
    city_truck       city            truck        627           246.28          18.501
    highway_car      highway         car          671           251.09          35.542
    highway_truck    highway         truck        625           246.76          27.459
```

Analysis of Variance (ANOVA)

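One-way, two-way, and N-way ANOVA are available. The anovan call in this section fits a two-way model of MPG with an interaction term between vehicle type and MPG type.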
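
As a point of comparison for the two-way model, the one-way case can be sketched with anova1 from the same toolbox. The data below is synthetic and the group labels are illustrative assumptions; the call returns the p-value for the null hypothesis that all group means are equal.

```matlab
% Synthetic MPG values for two hypothetical groups (not real measurements)
mpg   = [21; 22; 23; 31; 32; 33];
group = {'truck'; 'truck'; 'truck'; 'car'; 'car'; 'car'};

% One-way ANOVA; 'off' suppresses the ANOVA table and box plot figures
p = anova1(mpg, group, 'off');
```

A small p-value here indicates that the two group means differ by more than the within-group scatter would explain.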

```anovan(carData.MPG, {carData.Car_Truck, carData.City_Highway}, ...
'varnames', {'Veh. Type', 'MPG Type'}, ...
'model', 'interaction');
```

Boxplots

Boxplots are an integral part of grouped statistics. They provide a useful visualization of grouping effects.

```figure
boxplot(carData.MPG, {carData.Car_Truck, carData.City_Highway}, 'notch','on')
```

Extract Data for Curve Fitting

Create these variables for use in the Curve Fitting app.

```RatedHPCity = carData.RatedHP(cityIDs);
MPGCity     = carData.MPG(cityIDs);

% Use the App to develop a curve fit.
```

Curve Fitting

Equation:

```MPG = b1 + b2 * 1/RatedHP
```
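
Because this model is linear in the coefficients b1 and b2, it can also be estimated with ordinary least squares. The following is a minimal sketch under stated assumptions: the data is synthetic and noise-free, and the coefficient values are hypothetical, chosen only so that the result can be checked.

```matlab
% Synthetic horsepower values and hypothetical coefficients [b1; b2]
% (assumed for illustration, not taken from the fuel economy data)
hp = [100; 150; 200; 300; 400];
b  = [10; 2500];

% Noise-free response generated from the model MPG = b1 + b2 * 1/RatedHP
mpg = b(1) + b(2) ./ hp;

% Design matrix: a column of ones for b1 and a column of 1/hp for b2
X = [ones(size(hp)), 1 ./ hp];

% Solve the linear least-squares problem X * bHat ~ mpg
bHat = X \ mpg;
```

With real, noisy data the same design matrix and backslash solve apply; the recovered coefficients are then estimates rather than exact values.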

We can fit this model using the Curve Fitting Tool.

```cftool(carData.RatedHP, carData.MPG)
```

The following is a modified version of the auto-generated m-file from cftool.

```cf = createMPGFit(carData.RatedHP, carData.MPG);
```

Plot Data and Model

The fit object returned by Curve Fitting Toolbox has a plot method for displaying the fit graphically. We can also choose to display the prediction bounds for the fit.

```figure
hh = plot(cf, 'r', carData.RatedHP, carData.MPG, 'predobs', 0.95);
hh(2).LineWidth = 2;
for ii = [3 4]
    hh(ii).LineStyle = '-';
    hh(ii).Color = [0 0.5 0];
end
```

Plot of Data and Model (for different groups)

We apply the same modeling technique to each combination of groups (Car-Truck and City-Highway).

```% Model different combinations
modelMPG(carData, 'car', 'city')
modelMPG(carData, 'car', 'highway')
modelMPG(carData, 'truck', 'city')
modelMPG(carData, 'truck', 'highway')
```
```ans =
     Linear model:
     ans(x) = a + b*1/x
     Coefficients (with 95% confidence bounds):
       a =       10.12  (9.528, 10.72)
       b =        2663  (2546, 2779)

ans =
     Linear model:
     ans(x) = a + b*1/x
     Coefficients (with 95% confidence bounds):
       a =       21.33  (20.58, 22.09)
       b =        3005  (2857, 3153)

ans =
     Linear model:
     ans(x) = a + b*1/x
     Coefficients (with 95% confidence bounds):
       a =       8.473  (7.579, 9.368)
       b =        2314  (2115, 2514)

ans =
     Linear model:
     ans(x) = a + b*1/x
     Coefficients (with 95% confidence bounds):
       a =       16.26  (15.11, 17.42)
       b =        2589  (2332, 2846)
```