A survey of domain knowledge elicitation in applied machine learning

Sankey diagram of the survey results, showing the 73 elicitation paths coded according to our taxonomy. Each node represents one low-level code in the taxonomy. The color of a node encodes the top-level category that the node belongs to. The horizontal position of a node encodes the middle-level category that the node is under.

Abstract

Eliciting knowledge from domain experts can play an important role throughout the machine learning process, from correctly specifying the task to evaluating model results. However, knowledge elicitation is also fraught with challenges. In this work, we consider why and how machine learning researchers elicit knowledge from experts in the model development process. We develop a taxonomy to characterize elicitation approaches according to the elicitation goal, elicitation target, elicitation process, and use of elicited knowledge. We analyze the elicitation trends observed in 28 papers with this taxonomy and identify opportunities for adding rigor to these elicitation approaches. We suggest future directions for research in elicitation for machine learning by highlighting avenues for further exploration and drawing on what we can learn from elicitation research in other fields.
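
To make the idea of coding an elicitation path along the taxonomy's four dimensions more concrete, here is a minimal Python sketch. It is illustrative only and is not the authors' analysis code. The `ElicitationPath` class, the `count_links` helper, the left-to-right ordering of the dimensions, and all code labels such as `goal_a` are hypothetical placeholders. The paper's taxonomy is hierarchical, whereas this sketch flattens each dimension to a single label.

```python
from collections import Counter
from dataclasses import dataclass

# The four dimensions named in the abstract. Treating them as an ordered
# left-to-right sequence is an assumption made for this sketch.
DIMENSIONS = ("goal", "target", "process", "use")


@dataclass(frozen=True)
class ElicitationPath:
    """One coded elicitation path (hypothetical, flattened representation)."""
    goal: str
    target: str
    process: str
    use: str


def count_links(paths):
    """Count how often each pair of codes appears in adjacent dimensions."""
    links = Counter()
    for path in paths:
        codes = [getattr(path, dim) for dim in DIMENSIONS]
        # Pair each code with the code in the next dimension.
        for source, dest in zip(codes, codes[1:]):
            links[(source, dest)] += 1
    return links


# Placeholder labels only; these are NOT the taxonomy's actual codes.
example_paths = [
    ElicitationPath("goal_a", "target_x", "process_p", "use_u"),
    ElicitationPath("goal_a", "target_x", "process_q", "use_u"),
    ElicitationPath("goal_b", "target_y", "process_p", "use_v"),
]

links = count_links(example_paths)
for (source, dest), weight in sorted(links.items()):
    print(f"{source} -> {dest}: {weight}")
```

Link counts of this kind are the sort of input a flow diagram such as a Sankey chart typically needs: each distinct code becomes a node, and each counted pair becomes a weighted link between two nodes.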

Materials

PDF | DOI | Supplement | BibTeX

Authors

Daniel Kerrigan

Jessica Hullman

Enrico Bertini

Citation

A survey of domain knowledge elicitation in applied machine learning

Daniel Kerrigan, Jessica Hullman, and Enrico Bertini. Multimodal Technologies and Interaction. 2021. DOI: 10.3390/mti5120073


Khoury Vis Lab — Northeastern University