Saturday, November 2, 2013

Why are Palestinians Killed by Israeli Defense Forces?

I watched 5 Broken Cameras and was surprised that the Israeli Defense Forces (IDF) fired live ammunition at Palestinians protesting the security wall. One Palestinian protester in the documentary was shot dead. I wanted to know how often protesters throwing stones were killed by the IDF.

B'tselem has statistics for each fatality caused by the IDF in both the West Bank and the Gaza Strip since September 2000, with a short description of the event that took place. Some examples of these descriptions are shown below:

2  Killed while on his way to buy candy at a store next to her school.
3  Killed when Border Police came to his house to arrest him.
4  Killed in his house.
5  Killed during the arrest of his brother, who Wanted by Israel. Was not armed.
6  Wanted by Israel. Killed during an exchange of gunfire with soldiers who came to arrest him.

I'd like to classify these descriptions to see the reasons people are killed by the IDF. Doing so can give a better picture of the current conflict and the problems to overcome.

First, some descriptive statistics:
6711 total observations, all considered Palestinian citizens
517 females and 6194 males
Mean age is 25.38 years; the first and third quartiles are 19 and 29.
Above is an age pyramid of Palestinians killed. Men are predominantly the victims, and their age distribution is slightly right-skewed. Women have a much more uniform age distribution.

Overall, the demographic most likely to be killed by the IDF is males in their mid-twenties.

Below is a time series plot of Palestinians killed per month, split by location (Gaza or West Bank).

However, 539 deaths had no description. Below is the time series of those deaths:

Missing descriptions appear to be concentrated in periods with a high number of deaths.

To classify the data I used Latent Dirichlet Allocation (LDA). This method assumes each text description, or post, is a mixture of topics. Each topic is a probability distribution over the set of words in all posts. For each description, the algorithm takes the words as given and produces a probability distribution over the set of topics. In effect, each post is represented as a weighted combination of these topics. The benefit of LDA is that multiple topics can apply to each description.
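To make the idea concrete, here is a minimal Python sketch of LDA fitted with a collapsed Gibbs sampler. It is not the code behind this post (that was written in R), and the inference method, the hyperparameters `alpha` and `beta`, and the four toy descriptions are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA on tokenised documents.

    Returns (theta, phi): per-document topic proportions and
    per-topic word probabilities.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]          # tokens in doc d assigned to topic k
    topic_word = [Counter() for _ in range(n_topics)]   # times word w is assigned to topic k
    topic_total = [0] * n_topics                        # total tokens assigned to topic k
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + n_vocab * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + n_vocab * beta) for w in vocab}
        for k in range(n_topics)
    ]
    return theta, phi

# Toy descriptions, made up for illustration only.
docs = [
    "wanted israel arrest undercover".split(),
    "wanted arrest gunfire soldiers".split(),
    "demonstration stones barrier".split(),
    "demonstration stones soldiers".split(),
]
theta, phi = lda_gibbs(docs, n_topics=2)
```

Each row of `theta` is one description's distribution over topics, and each entry of `phi` is one topic's distribution over words, which are the two outputs described above.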

Topics are displayed below in decreasing order of importance:

Topic 7, Topic 9, Topic 4, Topic 14, Topic 5, Topic 12, Topic 19, Topic 1, Topic 15, Topic 18
Topic 8, Topic 16, Topic 10, Topic 11, Topic 20, Topic 6, Topic 2, Topic 17, Topic 13, Topic 3
Terms shown: checkpoint, yunis, truck, throw, neighbor, assassin, "area", militari, along, later

Two topics I thought were worth mentioning are Topics 11 and 19.

Topic 11 includes the terms "Wanted", "Arrest", "Israel", "Person", and "Undercover". This topic covers Palestinians wanted by Israel who were killed by IDF soldiers.

Above is the average proportion of each post attributed to this topic. One can see that it increased in 2006 in the West Bank, while remaining relatively constant in Gaza. Because Israel has much more control over the West Bank than over the Gaza Strip, it makes sense that Israel tries to arrest more people in the West Bank, and that more deaths result.
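The averaging behind a plot like this can be sketched in a few lines of Python. This is only an illustration: the record layout `(region, year, theta)` and the sample values are hypothetical, not the post's data.

```python
from collections import defaultdict

def mean_topic_share(records, topic):
    """Average one topic's proportion per (region, year).

    records: iterable of (region, year, theta), where theta is a
    description's list of topic proportions (hypothetical layout).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for region, year, theta in records:
        sums[(region, year)] += theta[topic]
        counts[(region, year)] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Made-up records with two topics each.
records = [
    ("West Bank", 2005, [0.2, 0.8]),
    ("West Bank", 2005, [0.4, 0.6]),
    ("Gaza", 2005, [0.1, 0.9]),
]
shares = mean_topic_share(records, topic=0)
```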

Topic 19 covers those who died during demonstrations, the very subject that sparked my interest in this topic. It's interesting that there has been a steady increase in the West Bank while Gaza has seen a slight decrease. The Separation Barrier could be the reason for this divergence in protesting.

R Code


Tuesday, October 1, 2013

Israeli Immigration and Demographics

Israel's Jewish population has increased steadily since independence. The number of Jews in the world has remained fairly constant, increasing only slowly since WW2.

Jews immigrated to Israel in distinct waves throughout the postwar period. The first mass influx occurred in 1948-1951, when most Middle Eastern Jews (from Iran, Iraq, Turkey, Yemen, etc.) and Eastern European Jews (mainly from Poland, Romania, and Bulgaria) came.

Between 1952 and 1971 most migration was from North Africa (Morocco and Algeria) and Eastern Europe. The only large movement of Jews from the Middle East into Israel was from Egypt (I'm assuming the 1956 Suez War was the cause). Itzik informed me that his family moved from Iraq in 1952, which was after most Iraqi Jews had left.

The 1980s were a lull in immigration, with Ethiopian Jews beginning to arrive.

Finally, at the end of the Cold War, the countries of the former USSR (mainly Russia and Ukraine) allowed Jews to emigrate to Israel. This ongoing migration has brought over 1 million Jews, and the dramatic increase in Israel's population is shown below.

Above is the population of Israel, in thousands, by religion. We can clearly see the large increase in the Jewish population in the late 1980s and early 1990s from former-USSR immigrants. One can also see a sudden increase in the Muslim population in 1967 due to the annexation of East Jerusalem.

To get a better understanding of growth rates in Israel, log population values of the same data are shown below.

Here we can see more clearly that the Muslim population is increasing at a much faster rate than the Jewish population or any other group.
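The reason a log plot helps is that a constant growth rate shows up as a straight line whose slope is the rate, regardless of the size of the group. A small Python sketch of that arithmetic, using made-up population figures rather than the data plotted here:

```python
import math

def annual_log_growth(pop_start, pop_end, years):
    """Average slope of log(population) per year, i.e. the continuous growth rate."""
    return (math.log(pop_end) - math.log(pop_start)) / years

# Hypothetical figures (in thousands): a small group that doubles in
# 10 years grows faster than a large group that adds far more people.
small_group_rate = annual_log_growth(100, 200, 10)
large_group_rate = annual_log_growth(3000, 4000, 10)
```

On a raw scale the large group's line would rise more steeply (1,000 thousand added versus 100 thousand), while on the log scale the small group's steeper slope reveals its higher growth rate.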

Also note that in 1981 the number of Druze went up suddenly because Israel annexed the Golan Heights (in all but name). I'm not sure why the Christian population increased in the mid-1990s and then decreased...

Above is population by ethnicity rather than religion. It looks much the same, except that the Christian and Muslim populations are shown as Arabs.

Finally, I'd like to close off by including West Bank and Gaza Population. These territories include over 4 million people. If we include them in our current data then Jews are only slightly in the majority.

Thursday, September 26, 2013

Market Cap and Ownership Structure of Auto Companies

Automotive companies are generally compared by the number of cars sold. For example, Toyota and GM are often in competition to sell the most cars in a year. While interesting, these figures seem kind of irrelevant. For instance, if a company sells many cars but makes no money on them, it won't be in business for long.

I think a more interesting indicator is market capitalization. This is basically the value of the company on the open market and is based on beliefs about future earnings. Below is the value of each company as of September 2, 2013.

Toyota is number one by a large margin and blows GM out of the water in terms of value. The second most valuable company, VW, at $108 billion, is worth just over half of Toyota's $208 billion.

Fiat, which currently owns over half of Chrysler, is worth less than Tesla, a new company that sells relatively few cars. (Chrysler isn't shown because it's owned by the UAW and Fiat, with no shares on the market to price it.)

These market values are not independent, in the sense that some companies own parts of other companies. Below is a network graph with the percentage of ownership along the edges and the size of each vertex proportional to market cap. An arrow pointing to a company from another indicates that the company is partly owned by the other (for example, A <- B means B owns part of A).

Two things to note about the ownership graph:
1. This graph makes it easier to see that VW and Toyota both own part of Suzuki. Of course, since Toyota owns Suzuki only through its holding in Subaru, its holding is minuscule.

2. There is a cool ménage à trois between Nissan, Renault, and Mercedes, where each has ownership in another.

Saturday, September 7, 2013

Mr. Al-Sabah Welcomes you to the Middle East

A very brief letter to the editor of the Financial Times describes relations in the Middle East succinctly and accurately. Quoted in full:

"Sir, Iran is backing Assad. Gulf states are against Assad!

Assad is against Muslim Brotherhood. Muslim Brotherhood and Obama are against General Sisi.
But Gulf states are pro Sisi! Which means they are against Muslim Brotherhood!
Iran is pro Hamas, but Hamas is backing Muslim Brotherhood!
Obama is backing Muslim Brotherhood, yet Hamas is against the US!
Gulf states are pro US. But Turkey is with Gulf states against Assad; yet Turkey is pro Muslim Brotherhood against General Sisi. And General Sisi is being backed by the Gulf states!
Welcome to the Middle East and have a nice day."

These relations are quite difficult to comprehend all together, so I made a graph to try and clarify the relationships.

Yup, looks like the Middle East is a cluster****!
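For anyone who wants to redraw the graph, the letter's statements can be encoded as a signed edge list. The Python sketch below is my reading of the letter rather than the code used for the figure: +1 means "backs" or "pro", -1 means "against", and treating "Turkey is with Gulf states" as a positive edge is an interpretation.

```python
from collections import defaultdict

# (source, target, sign), in the order the letter states them.
RELATIONS = [
    ("Iran", "Assad", +1),
    ("Gulf states", "Assad", -1),
    ("Assad", "Muslim Brotherhood", -1),
    ("Muslim Brotherhood", "General Sisi", -1),
    ("Obama", "General Sisi", -1),
    ("Gulf states", "General Sisi", +1),
    ("Gulf states", "Muslim Brotherhood", -1),
    ("Iran", "Hamas", +1),
    ("Hamas", "Muslim Brotherhood", +1),
    ("Obama", "Muslim Brotherhood", +1),
    ("Hamas", "US", -1),
    ("Gulf states", "US", +1),
    ("Turkey", "Gulf states", +1),
    ("Turkey", "Assad", -1),
    ("Turkey", "Muslim Brotherhood", +1),
    ("Turkey", "General Sisi", -1),
]

def stances(relations):
    """Split each actor's outgoing edges into those it backs and those it opposes."""
    allies, opponents = defaultdict(list), defaultdict(list)
    for source, target, sign in relations:
        (allies if sign > 0 else opponents)[source].append(target)
    return allies, opponents

allies, opponents = stances(RELATIONS)
```

The same edge list can be handed to any graph-drawing tool, with edge color standing in for the sign.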

Thursday, September 5, 2013

Car Magazine Hot Hatch Hall of Fame Graphs

Car Magazine's June 2013 issue picked 17 of the "greatest hot hatches of all time" and included specs. Unfortunately they didn't include any graphs of the numbers to show how hot hatches developed through time, so I took it upon myself as a civic duty (my way of giving back to the community) to display the numbers in graphs. I also included the weight of each car (in lbs of course) and the UK CPI to find the real price of each car when it was sold new.

There is a steady decline in 0-60 times throughout the years. The Peugeot 106 Rallye had the slowest time of all at 10.3 seconds and was sold between 1994 and 1998. I've seen reported times for the Pug at 8.3 seconds, so this might be a typo. Then again, it only has 100 bhp, so it's not going to be a speed demon...

Interestingly, some of the quickest hot hatches were sold in the late 1980s and early 1990s. These included the Lancia Delta Integrale (5.7 sec), Ford Escort Cosworth (6.2 sec), and Nissan Sunny GTI-R (6.1 sec). A plausible reason for the quickness of these cars is that all had AWD and therefore more traction off the line.

The graph above shows the real price of each car when new, in 2009 pounds. The Delta Integrale, Sunny GTI-R, and Ford Escort Cosworth are all relatively expensive in real terms. However, the high cost of those cars was probably due to their very high performance for their time (and for today; look at the 0-60 mph times).

Very generally, cars go through life stages. When new they look bold and modern. When slightly old (5-10 years) they look dated and their faults are well known. But something interesting happens to cars after about 10 years: they start to look cool again and they are appreciated more. I wanted to see if this U-shaped "lifecycle" is present in the price data. To compare cars and their particular life cycle stage, I divided the real cost when new by the current cost today and plotted those values against time. The resulting figure is shown above. In some sense the U is there: a car's value relative to its initial real price declines for the first 10-15 years and then increases again.
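The two calculations described above, deflating a launch price with the CPI and then comparing it with today's price, can be sketched in Python as follows. The function names and all the numbers are hypothetical placeholders, not values from the magazine or from the CPI series.

```python
def real_price(nominal_price, cpi_at_launch, cpi_base_year):
    """Express a launch-year price in base-year (here 2009) pounds."""
    return nominal_price * cpi_base_year / cpi_at_launch

def lifecycle_ratio(real_price_new, current_price):
    """Real cost when new divided by the current cost, as described in the text."""
    return real_price_new / current_price

# Hypothetical car: 10,000 pounds at launch, when the CPI stood at 50
# against 100 in the base year, and worth 4,000 pounds today.
price_2009_pounds = real_price(10_000, cpi_at_launch=50, cpi_base_year=100)
ratio = lifecycle_ratio(price_2009_pounds, current_price=4_000)
```

With these made-up inputs the car cost 20,000 in 2009 pounds and the ratio is 5; the reciprocal, 0.2, is the share of the real launch price that the car still fetches.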

Looking at this graph, I'm thinking maybe I should buy a Peugeot 306 GTi-6. After all, even now its performance is on par with modern cars. In addition, it's at the trough of the coolness curve, so it might appreciate in value.

The two graphs above show the RPM at which maximum horsepower and maximum torque occur for each car. Color indicates whether the car is turbocharged or not. Both graphs show that naturally aspirated engines reach their peaks at increasing RPM through time, while turbocharged engines reach theirs at decreasing RPM.

I'm kind of puzzled by this. Obviously, to get more HP out of a naturally aspirated engine one needs to increase the rev range, and that should work in principle for turbos too. However, turbocharged engines seem to be taking a different route: they achieve performance that surpasses that of a naturally aspirated engine, but at a much lower (and continuously falling) rev band.

Most engines are around 2 liters, and there doesn't appear to be a strong trend in engine size.

Maximum speed shows a strong trend over time.

Data and R Code:

Thursday, May 23, 2013

Can Nissan Double Its Sales by 2017?

Carlos Ghosn announced a target of doubling Nissan's sales by 2017. Is this a reasonable target?

Using data from Wards Auto, I computed the five-year percentage change, (X(t)-X(t-5))/X(t-5), and looked at which companies actually increased sales by at least 100% between 1963 and 2012.
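As a rough illustration of that calculation, here is a short Python sketch of the five-year percentage change and the doubling check. The sales series is a toy example, not the Wards Auto data, and the function names are my own.

```python
def five_year_change(sales_by_year, lag=5):
    """Return {t: (X(t) - X(t-lag)) / X(t-lag)} for every year t with a value lag years earlier."""
    return {
        year: (sales - sales_by_year[year - lag]) / sales_by_year[year - lag]
        for year, sales in sales_by_year.items()
        if year - lag in sales_by_year
    }

def years_doubled(sales_by_year, lag=5):
    """Years in which sales were at least 100% above their level lag years before."""
    changes = five_year_change(sales_by_year, lag)
    return sorted(year for year, change in changes.items() if change >= 1.0)

# Toy series (thousands of cars), made up for illustration.
toy_sales = {2000: 100, 2005: 250, 2010: 400}
changes = five_year_change(toy_sales)
doubled = years_doubled(toy_sales)
```

In the toy series, 2005 is up 150% on 2000 and counts as a doubling, while 2010 is up only 60% on 2005 and does not.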

The last company to accomplish such a feat in 5 years was Hyundai between 1998 and 2002, increasing from 0.57% to 2.19% market share, with a corresponding sales increase from 91,000 to 375,000.

Nissan achieved the goal between 1969 and 1973, with market share increasing from 0.79% to 2.19% and sales going from 91,000 to 319,000.

It's pretty obvious that doubling sales is fairly easy when sales start at a low level. So I looked at the largest starting market share and yearly sales among companies that went on to at least double their sales. Honda started with 0.92% market share and 102,000 sales in 1979 (the largest number of raw sales that at least doubled in 5 years) while Hyundai started out with 113,000 sales and 0.73% market share in 1997. These were the largest increases in 5 years.

Compare those numbers with Nissan's current sales and market share of over 1.1 million and 7.72%. Doubling sales of this magnitude has never been achieved, ever. I don't think Nissan can pull it off.

Friday, May 17, 2013

What cars should be faster/slower around EVO's racetrack?

Lap times depend on variables such as horsepower and weight. More horsepower corresponds to a faster lap time, while more weight leads to a slower lap time. However, there are many other variables not in the specs that can affect the lap time. For example, some cars are hard to handle on the limit and will make the driver more cautious and slower around the track. Conversely, a more confidence-inspiring car will encourage the driver to push the car to its limits and set a quicker lap time. Other factors, such as steering feel, power delivery, and weight transfer through the corners, will affect the lap time in ways that the specs of a car do not capture.

What cars should be faster or slower on a racetrack given their own specifications?

Using EVO lap times and variables including number of cylinders, RPM at max HP, HP, RPM at max lb-ft, lb-ft, engine size, and weight, I built a regression model and looked at the cars that over- or underperformed relative to the model.

This, then, is just looking at the residuals and seeing which cars should be better. This may sound stupid at first. After all, no one says "Well... yeah, your car is 5 seconds faster, but it really should be 10 seconds faster!" Then again, most comparisons involve residuals. Whenever anyone says "good for the price" they're talking about residuals given a price. Here I'm talking about residuals given HP and weight figures.

Below are the cars and how fast they went around the EVO lap in seconds. (Other data can be found here)

Lap Time
Ferrari 458 Italia
Caterham Levante V8
Porsche 997 GT2 RS
Lotus 2-Eleven GT4
Caterham Superlight R500
McLaren MP4-12C
Noble M600
Porsche 997 GT3 RS 4.0
Lamborghini Murcielago LP670-4 SV
Ariel Atom 3 Supercharged
KTM X-Bow (300bhp)
Ferrari 430 Scuderia
Porsche 997.2 GT3 RS (3.8)
Brooke Double R
Lamborghini Gallardo LP560-4
Lamborghini Murcielago LP640
Porsche 997.2 GT3
Porsche Carrera GT
Porsche 997 Turbo S
Nissan GT-R
Lotus 340R (190bhp)
Caterham Superlight R300
Maserati GranTurismo MC Stradale
Mercedes-Benz SLS AMG
Porsche Boxster Spyder
Ferrari California
BMW E92 M3 Coupe
Mercedes-Benz SL65 AMG Black
Audi RS5
Audi R8 Spyder V8
Porsche Cayman R
BMW M5 (F10)
Aston Martin V12 Vantage
BMW 1-series M Coupe
Mitsubishi Evo X FQ-400
Mitsubishi Evo X RS 360
Renaultsport Megane 265 Trophy
Audi TT RS
Aston Martin DBS
Porsche Panamera Turbo
Jaguar XJ220
Mercedes-Benz E63 AMG
Porsche Cayenne Turbo
Lotus Evora
Nissan 370Z
Porsche Panamera S
Lotus Elise SC
Mercedes-Benz C63 AMG Coupe
BMW E46 M3 CSL
Renaultsport Megane R26.R
Vauxhall VXR8 Bathurst S
Audi RS6 Avant
Jaguar XFR
Honda Civic Type-R Mugen 2.0
Lexus IS-F
Porsche Boxster S
Subaru WRX STI
Jaguar XJ Supersport
SEAT Leon Cupra R
Bentley Continental Supersports
Lotus Elise Club Racer
Maserati Quattroporte S
Renaultsport Megane 250 Cup
Honda NSX
Nissan 370Z Roadster
VW Scirocco 2.0 TSI
Ford Focus RS (Mk2)
Renaultsport Clio 200 Cup
VW Golf GTI (Mk6)

Below is a pairs plot of the variables.

I ran a stepwise regression with Lap Time as the dependent variable and all other variables as candidates (excluding weather, since all but 3 laps were dry). The process led to a model including HPRPM, Engine Size, BHP, and Weight, shown below.

Coefficients:
               Estimate     Std. Error   t value   p-value
(Intercept)    88.2097659   2.1478362    41.069    approx. zero
hprpm          -0.0005044   0.0002497    -2.02     0.0475
CubicCenti      0.0003122   0.0002945     1.06     0.293
Bhp            -0.0207499   0.0027082    -7.662    1.16E-10
Weight          0.0056975   0.0008757     6.506    1.28E-08

Residual standard error: 1.974 on 67 degrees of freedom
Multiple R-squared: 0.6635, Adjusted R-squared: 0.6435
F-statistic: 33.03 on 4 and 67 DF
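To show how the fitted model turns specs into an expected lap time, here is a small Python sketch that plugs the coefficients above into the regression equation and defines the residual as actual minus predicted time. The function names are my own, the input units (RPM, cubic centimetres, bhp, and whatever weight unit the data used) are assumed from the variable names, and no real car's specs are used.

```python
# Coefficients copied from the regression table above.
INTERCEPT = 88.2097659
COEF_HPRPM = -0.0005044   # RPM at which peak horsepower occurs
COEF_CC = 0.0003122       # engine size in cubic centimetres
COEF_BHP = -0.0207499     # peak horsepower
COEF_WEIGHT = 0.0056975   # weight, in the unit used by the data set

def predicted_lap_time(hprpm, cubic_centi, bhp, weight):
    """Lap time in seconds that the model expects for the given specs."""
    return (INTERCEPT
            + COEF_HPRPM * hprpm
            + COEF_CC * cubic_centi
            + COEF_BHP * bhp
            + COEF_WEIGHT * weight)

def residual(actual_lap_time, predicted):
    """Negative: quicker than the specs suggest. Positive: slower."""
    return actual_lap_time - predicted

# Effect of 100 extra bhp with the other specs held fixed (hypothetical specs).
base = predicted_lap_time(6000, 2000, 200, 1300)
more_power = predicted_lap_time(6000, 2000, 300, 1300)
gain = more_power - base
```

Holding everything else fixed, the model expects 100 extra bhp to cut roughly 2.07 seconds from the lap, which is just 100 times the Bhp coefficient.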

Using this model, the residuals (actual minus predicted lap time, in seconds) are:

Car Residual
KTM X-Bow (300bhp) -2.93487945
Lotus 2-Eleven GT4 -2.83245965
Nissan GT-R -2.66665528
Ferrari 458 Italia -2.57420938
Porsche 997.2 GT3 RS (3.8) -2.55684266
Mitsubishi Evo X RS 360 -2.50226452
Maserati GranTurismo MC Stradale -2.46471552
Renaultsport Megane 265 Trophy -2.44615874
Porsche Cayenne Turbo -1.91766928
BMW 1-series M Coupe -1.83047226
Porsche 997 GT3 RS 4.0 -1.81502563
Caterham Superlight R500 -1.77237657
Porsche Boxster Spyder -1.6582512
Porsche 997.2 GT3 -1.54770591
Lamborghini Murcielago LP640 -1.35574562
Porsche 997 GT2 RS -1.28253206
Audi TT RS -1.27050135
Porsche Panamera Turbo -1.17816797
Lotus Evora -1.10761578
Porsche Panamera S -1.04072814
Porsche 997 Turbo S -0.83654924
Ferrari 430 Scuderia -0.82185243
Audi R8 Spyder V8 -0.68468002
Porsche Cayman R -0.68457919
Lamborghini Gallardo LP560-4 -0.67647912
KTM X-Bow -0.63837555
Nissan 370Z -0.62867383
BMW E92 M3 Coupe -0.58344003
Audi RS5 -0.56349199
Renaultsport Megane R26.R -0.50017286
Lotus 340R (190bhp) -0.44282995
McLaren MP4-12C -0.40117778
Brooke Double R -0.3714648
Ferrari California -0.33430501
Caterham Superlight R300 -0.3061938
Lamborghini Murcielago LP670-4 SV -0.24397178
BMW M5 (F10) -0.18196681
Mitsubishi Evo X FQ-400 -0.18097319
Ariel Atom 3 Supercharged -0.10631631
Subaru WRX STI -0.08331381
Aston Martin V12 Vantage 0.02661749
Mercedes-Benz E63 AMG 0.11885765
Mercedes-Benz SLS AMG 0.33430532
SEAT Leon Cupra R 0.47876336
Mercedes-Benz C63 AMG Coupe 0.48358713
Aston Martin DBS 0.54115489
Lexus IS-F 0.55513591
Porsche Boxster S 0.67495797
Maserati Quattroporte S 0.67528337
VW Scirocco 2.0 TSI 0.7166074
Mercedes-Benz SL65 AMG Black 0.87429901
Renaultsport Megane 250 Cup 1.0633422
Honda Civic Type-R Mugen 2.0 1.24547732
Noble M600 1.30749315
Jaguar XFR 1.33734955
Bentley Continental Supersports 1.38429526
Audi RS6 Avant 1.61605625
Vauxhall VXR8 Bathurst S 1.76433676
Lotus Elise Club Racer 1.84730036
Jaguar XJ Supersport 1.86583708
Porsche Carrera GT 2.00596389
BMW E46 M3 CSL 2.03663438
Honda NSX 2.27324813
Nissan 370Z Roadster 2.37761095
Lotus Elise SC 2.53179383
Ford Focus RS (Mk2) 2.94808301
VW Golf GTI (Mk6) 3.02997026
Renaultsport Clio 200 Cup 3.87547123
Jaguar XJ220 3.90086627
Caterham Levante V8 4.13508509

The Jaguar XJ220 should be faster given its specs (an underachiever), while the Ferrari 458 should be slower (an overachiever). Makes some sense.