Abstract:
Bayesian inference was evaluated as a machine learning technique for identifying
pre-crash activity and crash type from accident narratives describing 3,686 motor
vehicle crashes. It was hypothesized that a Bayesian model could learn from a computer
search for 63 keywords related to accident categories. Learning was described in terms
of the ability to accurately classify previously unclassifiable narratives not containing
the original keywords. When narratives contained keywords, the results obtained using
both the Bayesian model and keyword search corresponded closely to expert ratings
(P(detection)≥0.9 and P(false positive)≤0.05). For narratives not containing keywords,
as the threshold used by the Bayesian model was raised from p>0.5 to p>0.9,
the overall probability of detecting a category assigned by the expert fell from
67% to 12%. False positives correspondingly fell from 32% to 3%. These latter
results demonstrated that the Bayesian system learned from the results of the keyword
searches.
Keywords:
Narrative Text - Bayesian Methods - Epidemiology
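
Illustrative sketch (not part of the original study): the procedure summarized in the abstract, in which a keyword search labels narratives, a Bayesian model learns from those labels, and a probability threshold decides whether a keyword-free narrative is assigned a category, might look roughly like the following. The abstract does not specify the model form, so a multinomial naive Bayes classifier with add-one smoothing is assumed here. The keyword lists, category names, and sample narratives are hypothetical; the study's 63 keywords and 3,686 narratives are not reproduced.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword lists standing in for the study's 63 keywords.
KEYWORDS = {
    "rear_end": {"rear-ended"},
    "rollover": {"rolled", "overturned"},
}


def tokenize(text):
    return text.lower().split()


def keyword_label(text):
    """Return the first category whose keyword occurs in the text, else None."""
    tokens = set(tokenize(text))
    for category, words in KEYWORDS.items():
        if tokens & words:
            return category
    return None


def train(narratives):
    """Fit word counts per category, using keyword hits as the training labels."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text in narratives:
        label = keyword_label(text)
        if label is None:
            continue  # narratives without keywords provide no training label
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def posterior(text, model):
    """Posterior probability of each category (assumed naive Bayes form)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for c in class_counts:
        denom = sum(word_counts[c].values()) + len(vocab)  # add-one smoothing
        score = math.log(class_counts[c] / total)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][tok] + 1) / denom)
        log_scores[c] = score
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def classify(text, model, threshold=0.5):
    """Assign the most probable category only if its posterior exceeds threshold."""
    post = posterior(text, model)
    best = max(post, key=post.get)
    return best if post[best] > threshold else None


# Hypothetical narratives, each containing one of the keywords above.
TRAINING = [
    "vehicle was rear-ended while stopped at a red light",
    "driver stopped in traffic and was rear-ended by a truck",
    "car left the road and rolled down an embankment",
    "vehicle overturned after leaving the road on a curve",
]

model = train(TRAINING)
for narrative in ("truck struck vehicle stopped at a light",
                  "a vehicle and the road"):
    for threshold in (0.5, 0.9):
        print(narrative, threshold, classify(narrative, model, threshold))
```

Neither test narrative contains a keyword, so a keyword search alone cannot classify them. On this toy data the second narrative is assigned a category at p>0.5 but left unclassified at p>0.9, which mirrors the trade-off reported above: a stricter threshold lowers both detections and false positives.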