By Sam Weiss
Introduction:
I changed from a 2006 Civic Si to a Mini Cooper JCW. Horsepower was
slightly increased but what really jumped was torque. On the highway the added
torque was welcome. Overtaking someone wasn’t a matter of downshifting to
fourth from sixth. It was more of a matter of pressing the go pedal.
But going around the twisties I find myself downshifting less. No
longer is it necessary to go from 3rd to 2nd just to stay
in the meat of the torque band (which in the Honda, I should add, maxed out at
139 lbft at 6200 RPM). In the Mini, downshifting from 3rd to 2nd isn’t
only unnecessary, it’s counterproductive. I lose time shifting, sure, but more
importantly, redlining the Mini doesn’t give me that sweet spine/groin-tingling
sensation I’d grown accustomed to with my 2.0 liter Honda screamer of an
engine (as a side note, if I listened really carefully I could hear the engine
whisper, ever so faintly, “Soichiro” at full throttle). I’m left with a slightly
quicker car, but I feel like I’m working less to achieve that performance. The
difference in feel is obviously related to how the performance is achieved:
while the Honda achieves max torque at high RPM, the Mini achieves it at
a relatively low RPM.
I’m left pondering whether I’d have more fun in my new Mini if it
had my old Honda engine. I’m not about to swap engines just yet though. The
problems involved in such a project are innumerable. Not only
would it require me to know something about working on cars (I don’t), but
I could also ruin two perfectly nice cars. No, the real question I’m asking
myself is “Am I nuts to want an engine that’s worse on paper than my current
one?”
Which brings me to my question: are high torque, low revving cars
less fun than low torque, high revving cars? This isn’t a new question. In
fact, it’s been debated before by those who, like me, like a workout shifting
while wringing out the revs, and those who are still stuck in the 1960s and
measure engine size in cubic inches. I’m sure there has been many a
drunken bar fight between these car groups, and I don’t believe my
writing skills can convince the other side of my side’s inherent
correctness.
Perhaps my data skillz can answer the question. So I’m going to go
spelunking in EVO Magazine data. I will attempt to build
a model to predict the EVO rating of a car, and see whether torque has a
negative association with it or whether torque RPM (the RPM at which max torque
is achieved) has a positive one. I
think these are both good indicators for my question at hand.
Note on the data:
At the end of EVO Magazine (an English publication) there are
several pages of data. I scanned them and used optical character recognition
software to import that data into an Excel spreadsheet. (I asked on an EVO forum
whether they have the data in Excel format and they said they didn’t. I’m
assuming EVO holds the rights to this data, so I
hope, in the name of science!, they will be OK with me publishing it in Excel
form so anyone can play with it themselves.)
There were many typos in the Excel sheet, so I
looked for outliers using Google Refine before getting to work. My dataset may very
well include incorrect data I missed. But as Donald Rumsfeld said, “You go into
an analysis with the data you have, not the data you might want or wish to have
at a later time” (I think he said that…).
Exploratory data analysis:
The data includes the following
variables: name of car, manufacturer, issue number, price (almost all in English
pounds), number of cylinders, size of engine (in cubic centimeters), Bhp, RPM at
which max Bhp is achieved, LbFt, RPM at which max LbFt is achieved, weight (in kg),
seconds from 0-60 mph, seconds from 0-100 mph, max mph, CO2 emissions (g/km),
EC mpg, and EVO rating (out of 5 stars).
Those cars that are no longer in production have the years sold
instead of a price. Because I think price is a fundamental variable and no
analysis could be complete without it, I will exclude cars that have no price
data and consequently use only those cars still on sale in the UK. Seconds from 0
to 100 will not be used because much of the data is missing. And issue number,
max mph, EC mpg, and CO2 will not be used because I don’t think they are relevant
to the question at hand.
The number of observations I will use for the rest of the
analysis is 289.
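For anyone who wants to repeat the filtering step, here is a minimal sketch in plain Python. The field names and the three sample rows are made up for illustration (they are not taken from the EVO data), and the rule "a price cell that does not parse as a number means the car is out of production" is an assumption about how the spreadsheet looks.

```python
# Hypothetical rows: field names and values are invented for illustration only.
rows = [
    {"name": "Car A", "price": "54980", "zero_to_100": "11.2", "issue": "170"},
    {"name": "Car B", "price": "2004-2008", "zero_to_100": "", "issue": "95"},
    {"name": "Car C", "price": "", "zero_to_100": "", "issue": "160"},
]

# Columns the analysis does not use (assumed column names).
DROPPED_COLUMNS = {"zero_to_100", "issue", "max_mph", "ec_mpg", "co2"}


def parse_price(text):
    """Return the price as a float, or None when the cell is blank or holds years sold."""
    cleaned = text.replace("£", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def prepare(rows):
    """Keep only cars with a usable price and drop the unused columns."""
    kept = []
    for row in rows:
        price = parse_price(row["price"])
        if price is None:
            continue  # no price: car is assumed to be out of production
        slim = {k: v for k, v in row.items() if k not in DROPPED_COLUMNS}
        slim["price"] = price
        kept.append(slim)
    return kept


cars = prepare(rows)
print([car["name"] for car in cars])
```

Only the row with a numeric price survives, which mirrors the "still on sale in the UK" restriction described above.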
Table 1 below gives the general population characteristics.
Table 1: General Population Characteristics

| Variable | Mean (SD) | Median (1st Q – 3rd Q) |
|---|---|---|
| Horsepower (Bhp) | 377.1 (189.66) | 345 (219 – 505) |
| HP RPM | 6342 (1143.3) | 6100 (5500 – 7000) |
| Pound Feet Torque (lbft) | 350.2 (169.8) | 324.5 (221 – 457) |
| FTRPM | 3651 (1772.3) | 3500 (2000 – 5000) |
| Weight (kg) | 1478 (567.11) | 1482 (1202 – 1740) |
| 0-60 (seconds) | 5.27 (5.27) | 4.8 (4.1 – 6.1) |
| Price (English Pounds) | 106800 (209979.4) | 54980 (7630 – 2000000) |
| Engine Size (Cubic Centimeters) | 3579 (1709.675) | 3456 (1997 – 4915) |

| Manufacturer | Observations, n (%) |
|---|---|
| Audi | 23 (8%) |
| BMW | 20 (7%) |
| Porsche | 20 (7%) |
| Mercedes-Benz | 17 (6%) |
| Aston Martin | 17 (5.9%) |
| Jaguar | 13 (4.5%) |
| Bentley | 9 (3.1%) |
| Caterham | 9 (3.1%) |
| Other | 194 (67.8%) |

| Number of Cylinders | Observations, n (%) |
|---|---|
| 2 | 1 (0.3%) |
| 3 | 4 (1.39%) |
| 4 | 103 (36%) |
| 5 | 4 (1.39%) |
| 6 | 47 (16.4%) |
| 8 | 93 (32.5%) |
| 10 | 8 (2.79%) |
| 12 | 23 (8%) |
| 16 | 3 (1%) |

| Rating (Stars) | Observations, n (%) |
|---|---|
| 5 | 63 (22%) |
| 4.5 | 71 (24.8%) |
| 4 | 101 (35.3%) |
| 3.5 | 41 (14.3%) |
| 3 | 8 (2.8%) |
| 2.5 | 2 (0.7%) |
The first thing to note is that the cars in this dataset are not
representative of the total car population at all. For example, Bentley has 9
cars in the dataset while Toyota and Honda have one each (FRS and CRZ). In
addition, with 377 horsepower, 350.2 LbFt, and a cost of £116,000, the average car in this
sample might be considered just a bit skewed towards the high performance,
expensive side of the car spectrum (#subtlebritishhumor). But then again EVO is
no ordinary magazine, and we should accept that its car journalists chose cars
that are interesting to car enthusiasts such as ourselves. I therefore think
that the cars represented here are a good sample of performance vehicles and
well suited to answering the question at hand.
EVO rating (the response variable) is somewhat problematic for
the usual regression analysis. Hypothetically, there are 11 (0, 0.5, 1,…, 4.5, 5)
different ratings that can be given to a car. However, in the sample, only six
different values are present. The mean EVO rating is 4.23, while 22% of the
sample earned the coveted 5 star EVO rating. Conversely, 0.7% received the
lowest score in the sample of 2.5 stars (these were the BMW 750i and BMW X6M).
EVO ratings therefore pose two problems. First, the variability of the rating is very small. Second, the number of potential outcomes is very small. These problems will be addressed below in the analysis section.
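The mean rating and the 5 star share can be recomputed from the rating counts in Table 1. This is a small sketch in plain Python; the counts are the ones in the table, the total is taken from the counts themselves, and the variable names are my own.

```python
# Rating counts as listed in Table 1 (stars -> number of cars).
rating_counts = {5.0: 63, 4.5: 71, 4.0: 101, 3.5: 41, 3.0: 8, 2.5: 2}

total = sum(rating_counts.values())

# Weighted mean of the star ratings.
mean_rating = sum(stars * n for stars, n in rating_counts.items()) / total

# Sample standard deviation, to put a number on "the variability is very small".
variance = sum(
    n * (stars - mean_rating) ** 2 for stars, n in rating_counts.items()
) / (total - 1)
sd_rating = variance ** 0.5

# Share of cars with the top rating, in percent.
share_five = 100 * rating_counts[5.0] / total

print(f"mean = {mean_rating:.2f}, sd = {sd_rating:.2f}, 5 star share = {share_five:.1f}%")
```

A standard deviation of roughly half a star on a scale that nominally runs from 0 to 5 is what makes ordinary regression awkward here.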
Above are the bivariate
relationships between all variables below the diagonal, the histogram of each
variable on the diagonal, and the corresponding correlation coefficients above
the diagonal. There are many significant two-way relationships, including those
between HP and LbFt and between HpRPM and LbFtRPM. The high correlation between
these variables might make it hard to distinguish their separate effects in a model.
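To make the collinearity worry concrete, here is a rough sketch of a Pearson correlation and the variance inflation it implies when two predictors enter the same model. The five horsepower and torque values are invented purely for illustration and are not from the EVO data; with only two predictors, the variance inflation factor reduces to 1 / (1 - r²).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up values for five hypothetical cars (NOT real EVO data).
bhp = [200, 300, 400, 500, 600]
lbft = [180, 290, 370, 480, 560]

r = pearson(bhp, lbft)

# With two predictors, the variance inflation factor is 1 / (1 - r^2):
# the closer r is to 1, the larger the standard errors of both coefficients.
vif = 1 / (1 - r ** 2)

print(f"r = {r:.3f}, VIF = {vif:.0f}")
```

The higher the correlation, the wider the standard errors on both coefficients, which is why HP and LbFt can each look insignificant even if power output as a whole matters.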
Analysis:
To remind the reader, I’m looking
to see whether torque (LbFt) is negatively associated with EVO rating or
whether the RPM at which max torque is achieved (FTRPM) is positively associated
with EVO rating, holding everything else constant.
While I’ve fitted quite a few
models, I’ve decided to show just two final models. Many of the variables
discussed before weren’t statistically significant or of specific interest to
this analysis. The first model is an ordered logit model and is more accurate given
the categorical nature of EVO rating data. The second model is a much simpler
multiple regression. As a note, I ended up log transforming all the variables
for my analysis.
The first model shows that Price.1
has a positive and statistically significant relationship with EVO ratings.
Weight and LbFt are both negatively related to EVO ratings and significant at
the .10 level, though LbFt is not significant at the .05 level. HP and FtRPM are
not significant at the .10 level but come close, with p values of .12 and .14, and
importantly both are positive.
Model 1: Ordered Logit

| Variable | Estimate | Std Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| log(Bhp) | 1.3945 | 0.9043 | 1.542 | 0.12307 |
| log(Weight) | -1.339 | 0.5103 | -2.624 | 0.00869 |
| Price.1 | 1.4605 | 0.3289 | 4.441 | 8.96E-06 |
| log(ftrpm) | 0.474 | 0.3185 | 1.488 | 0.13665 |
| log(lbft) | -1.7578 | 0.9221 | -1.906 | 0.05662 |

Threshold coefficients:

| Threshold | Estimate | Std. Error | z value |
|---|---|---|---|
| 2.5\|3 | 2.52 | 4.09 | 0.616 |
| 3\|3.5 | 4.158 | 4.041 | 1.029 |
| 3.5\|4 | 6.074 | 4.037 | 1.505 |
| 4\|4.5 | 8.247 | 4.057 | 2.033 |
| 4.5\|5 | 9.898 | 4.068 | 2.433 |
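The z values and p values in the Model 1 table follow directly from the estimates and standard errors. As a quick check, this sketch recomputes them in plain Python, assuming the usual two-sided Wald test against a standard normal distribution; the function and variable names are mine.

```python
import math

# (estimate, standard error) pairs copied from the Model 1 table.
model_1 = {
    "log(Bhp)": (1.3945, 0.9043),
    "log(Weight)": (-1.339, 0.5103),
    "Price.1": (1.4605, 0.3289),
    "log(ftrpm)": (0.474, 0.3185),
    "log(lbft)": (-1.7578, 0.9221),
}


def wald(estimate, std_error):
    """Return the z statistic and the two-sided p value under a standard normal."""
    z = estimate / std_error
    # P(|Z| > |z|) for a standard normal equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


results = {name: wald(est, se) for name, (est, se) in model_1.items()}

for name, (z, p) in results.items():
    print(f"{name:12s} z = {z:7.3f}   p = {p:.5f}")
```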
Model One Problems:
Many of the thresholds aren’t
significant. This shouldn’t be too surprising since there are fewer observations
for these categories of ratings (in retrospect I should have taken the two
observations in rating category 2.5, labeled them as 3, and called that
category “3 or less”. Next time…).
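For readers unfamiliar with what the thresholds do, here is a minimal sketch of how an ordered logit turns a car's linear predictor into a probability for each star rating. It assumes the common cumulative-logit form P(rating ≤ j) = logistic(threshold_j − eta), where eta is the sum of coefficient × variable; I am assuming, not asserting, that the fitted model uses this parameterization. The thresholds are the estimates from the table; the value eta = 7.0 is a hypothetical car, not one from the data.

```python
import math

# Threshold estimates from the Model 1 table, in order.
thresholds = [2.52, 4.158, 6.074, 8.247, 9.898]
ratings = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0]


def logistic(x):
    return 1 / (1 + math.exp(-x))


def rating_probabilities(eta):
    """Probability of each star rating for a given linear predictor eta."""
    # Cumulative probabilities P(rating <= j); the last category closes at 1.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return dict(zip(ratings, probabilities))


# eta = 7.0 is a made-up linear predictor for a hypothetical car.
probs = rating_probabilities(eta=7.0)
for stars, p in probs.items():
    print(f"{stars} stars: {p:.3f}")
```

Raising eta (more price, more FtRPM, less weight, less torque, going by the signs in the table) shifts probability mass toward the higher ratings.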
The second model’s results are very
similar to the first model’s results. All the variables have the same sign. The
only difference is that FtRpm is now significant at the .05 level while LbFt
isn’t significant at the .10 level. As a reminder, I include this model for
completeness and to see whether it differs from the ordered
logit model. Since it doesn’t seem to, I suggest the reader focus on the
first model.
Model 2: Linear Regression

| Variable | Estimate | Std Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | 2.10 | 0.98 | 2.15 | 0.03 |
| log(Bhp) | 0.33 | 0.22 | 1.54 | 0.13 |
| log(Weight) | -0.29 | 0.12 | -2.38 | 0.02 |
| Price.1 | 0.28 | 0.07 | 4.24 | 0.00 |
| log(ftrpm) | 0.15 | 0.08 | 2.05 | 0.04 |
| log(lbft) | -0.36 | 0.22 | -1.59 | 0.11 |
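Because the predictors are logged, the coefficients are easiest to read as the effect of doubling a variable. The sketch below works that out for torque and torque RPM in both models. It assumes natural logs, an untransformed star rating as the response in the linear model, and an ordered logit in which a positive coefficient raises the odds of a higher rating; none of those details are spelled out above, so treat the numbers as illustrative.

```python
import math

# Coefficients copied from the Model 2 and Model 1 tables.
linear = {"log(ftrpm)": 0.15, "log(lbft)": -0.36}
ordered_logit = {"log(ftrpm)": 0.474, "log(lbft)": -1.7578}


def star_change_when_doubled(coef):
    """Linear model: change in predicted stars when the variable doubles."""
    return coef * math.log(2)


def odds_multiplier_when_doubled(coef):
    """Ordered logit: factor applied to the odds of a higher rating when the variable doubles."""
    return math.exp(coef * math.log(2))


star_changes = {name: star_change_when_doubled(c) for name, c in linear.items()}
odds_factors = {name: odds_multiplier_when_doubled(c) for name, c in ordered_logit.items()}

for name in linear:
    print(f"{name:11s} stars: {star_changes[name]:+.3f}   odds x {odds_factors[name]:.3f}")
```

Under these assumptions, doubling torque costs about a quarter of a star in the linear model, while doubling the RPM at which peak torque arrives gains about a tenth of a star.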
In Conclusion:
Are cars that have higher FtRpm or lower LbFt,
holding everything else constant, more exciting? Using EVO data and their
ratings, I think the answer is a restrained yes. Or more accurately, I think EVO
Magazine editors have a higher opinion of a car if the engine is a revver.
Obviously, if you prefer high torque, low RPM engines, this analysis isn’t going
to convince you otherwise. Happy Driving.