Various new acknowledgements.
Rephrasing of WARNING, TIP, and TECHNICAL DETAILS (“curvy road”) material.
p.66. Figure 3-12 should appear before Figure 3-13.
p.71. Final rule should be:
IF (Balance ≥ 50k) and (Age ≥ 45) THEN Class = No Write-Off
p.84. Fig 4-2 fixed to match points of 4-1.
Fig 4-3. Decision boundary should be expressed with an equals sign, i.e.:
Age = Balance × -1.5 + 60
p.85. Equation 4-1 should be:
p.86. Equation should be:
f(x) = 60 - 1.0 × Age - 1.5 × Balance
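As a quick consistency check between the Fig 4-3 boundary and the function above, the sketch below evaluates f(x) at points on the line Age = Balance × -1.5 + 60. The function names and the sample Balance values are illustrative assumptions, not taken from the book; Balance is in whatever units the figure uses.

```python
def boundary_age(balance):
    """Age on the decision boundary for a given Balance (Fig 4-3 form)."""
    return -1.5 * balance + 60


def f(age, balance):
    """Linear discriminant function; it is zero on the decision boundary."""
    return 60 - 1.0 * age - 1.5 * balance


# Hypothetical Balance values chosen only for illustration.
for balance in [0, 10, 20, 40]:
    age = boundary_age(balance)
    # Every point generated from the boundary equation should give f = 0.
    print(balance, age, f(age, balance))
```

Points with f(x) > 0 fall on one side of the line and points with f(x) < 0 on the other; which side corresponds to which class depends on the figure and is not assumed here.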
Figure 4-9 is wrong. Hinge loss starts on the opposite margin from what is indicated. See new and im
Added references to Lapointe and Legendre’s site containing papers and data: http://adn.biol.umontreal.ca/~numericalecology/data/scotch.html
p.143. Vector XB-XA should point to the left.
p.165. Tweaked Fig 6-6 to make point distances match clusterings.
p.168. Caption of Fig 6-9, removed “At top is the entire dendrogram”.
p.170. Figs 6-10 and 6-11 switched to match citation order.
p.180. Group J second item should say: “The best of its class: Linkwood (Speyside), 12 years, 83 points”
p.181. Fig 6-14 updated to consistent notation. Caption corrected (had leftmost and rightmost leaves reversed).
p.191. Several mis-statements fixed:
• “She could have achieved 90% accuracy but only got 37%” → “She could have achieved 90% accuracy but only got 64%”
• “by correctly identifying all the negative examples but only 30% of the positive” → “by correctly identifying all the negative examples but only 60% of the positive”
• “model A’s accuracy declines to 37% while model B’s rises to 93%” → “model A’s accuracy declines to 64% while model B’s rises to 96%”
p.192. Fig 7-1 altered to be clearer, and shaded areas adjusted to match proportions in example.
p.197. Fig 7-2 altered to remove the phrases “tp rate”, “fp rate”, etc., from the bottom right matrix. Matrix entries are joint probabilities, not the rates (which are conditional probabilities).
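The distinction can be made concrete with a small sketch. The counts below are invented for illustration and are not the values in Fig 7-2; the point is only that a joint probability such as p(Y, p) differs from the rate p(Y | p) by a factor of the class prior p(p).

```python
# Hypothetical confusion-matrix counts, keyed by (predicted, actual).
# Predictions are Y/N; actual classes are p/n.
counts = {("Y", "p"): 30, ("N", "p"): 10, ("Y", "n"): 20, ("N", "n"): 40}
total = sum(counts.values())

# Joint probabilities p(predicted, actual): each count over the grand total.
joint = {cell: c / total for cell, c in counts.items()}

# Class priors p(p) and p(n): marginals over the actual class.
prior = {
    actual: sum(c for (_, a), c in counts.items() if a == actual) / total
    for actual in ("p", "n")
}

# Rates are conditional probabilities p(predicted | actual) = joint / prior.
rate = {
    (pred, actual): joint[(pred, actual)] / prior[actual]
    for (pred, actual) in counts
}

tp_rate = rate[("Y", "p")]  # p(Y | p)
fp_rate = rate[("Y", "n")]  # p(Y | n)
print(joint[("Y", "p")], tp_rate, fp_rate)
```

The joint probabilities sum to 1 over the whole matrix, whereas the rates sum to 1 within each actual class.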
p.199. We buried a point that is now more prominent: “We will express all values as benefits, with costs being negative benefits, so the function we’re specifying is b(predicted, actual).” In other words, at this point in the chapter we merge costs and benefits into a single benefit function, b. Unfortunately, we left a few cost (c function) references in. These should be b’s.
p.201. In the sentence before Equation 7-2, the class priors should be p(p) and p(n).
p.202. In the second line of the expected profit formula, after the plus sign, it should be p(Y|n) * c(Y,n)]
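As a minimal sketch of the factored expected profit calculation these entries refer to, the code below weights each class's conditional expected benefit by its prior, p(p) and p(n). It uses the single benefit function b(predicted, actual) with costs entered as negative benefits, as described in the p.199 entry. The function name and all numeric values are assumptions made for illustration.

```python
def expected_profit(prior, rate, b):
    """Expected profit factored by class priors.

    prior: {actual: p(actual)} for actual in {"p", "n"}
    rate:  {(predicted, actual): p(predicted | actual)}
    b:     {(predicted, actual): benefit}, with costs as negative benefits
    """
    return sum(
        prior[actual]
        * sum(rate[(pred, actual)] * b[(pred, actual)] for pred in ("Y", "N"))
        for actual in ("p", "n")
    )


# Hypothetical inputs, not the book's example.
prior = {"p": 0.5, "n": 0.5}
rate = {("Y", "p"): 0.75, ("N", "p"): 0.25, ("Y", "n"): 0.25, ("N", "n"): 0.75}
b = {("Y", "p"): 100.0, ("N", "p"): 0.0, ("Y", "n"): -2.0, ("N", "n"): 0.0}

profit = expected_profit(prior, rate, b)
print(profit)
```

The term for a false positive, p(Y | n) weighted by its (negative) benefit, is the one that sits after the plus sign inside the p(n) bracket.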
p.204. In the sidebar, the definitions of sensitivity and specificity are reversed. The first should be Specificity and the second should be Sensitivity.
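For reference, the standard definitions can be sketched as follows: sensitivity is the true positive rate, TP / (TP + FN), and specificity is the true negative rate, TN / (TN + FP). The counts are invented for illustration.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives predicted positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives predicted negative."""
    return tn / (tn + fp)


# Hypothetical counts, not taken from the book.
tp, fn, fp, tn = 30, 10, 20, 40
print(sensitivity(tp, fn), specificity(tn, fp))
```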
Chapter 9 was edited for typos in the text and equations. The existence of different versions of Naive Bayes (based on different event models) was introduced and explained in a new sidebar. The subtle notion of the absence of evidence as evidence is now discussed. Some subtle errors in the presentation of the evidence lifts were corrected.
p.242. Delete equation immediately before the sentence “Combining this with Equation 9-3, ...”.
p.242. Last equation should be:
p.246. Star Trek appears twice in Table 9-1. The first appearance (lift 1.39) is Star Trek (Movie); the second entry (lift 1.32) is for Star Trek.
p.286. First paragraph, the expression: p(S|x, not T)⋅u S (x) should be p(S|x, not T) * uS(x)
p.287. The equation for the expected benefit of not targeting should be:
Errata and Corrections
Errata (First edition)
Note: Not all corrections will appear here. We are continually improving the material and making updates based on feedback from many sources: classroom comments, email, the Google group, and the official O'Reilly errata page. We try to respond to every comment and correction, and major changes will be recorded, but we don't document every tweak.
We are currently preparing major updates for 2019!
Figure 4-9 should look like this:
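A minimal sketch of the standard hinge loss, max(0, 1 - y · f(x)) with labels y in {+1, -1}, may help in reading the figure. This is the usual textbook definition rather than a transcription of the book's plot, and the sample scores are made up.

```python
def hinge_loss(y, score):
    """Standard hinge loss for a label y in {+1, -1} and a score f(x)."""
    return max(0.0, 1.0 - y * score)


# Hypothetical scores for a positive example (y = +1).
for score in [2.0, 1.0, 0.5, 0.0, -1.0]:
    print(score, hinge_loss(+1, score))
```

Under this definition the loss is zero for an example on the correct side of the boundary at or beyond its own margin (y · f(x) ≥ 1) and grows linearly from that margin onward, so an example that is correctly classified but inside the margin still incurs some loss.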