Uncertainty transfer with knowledge distillation

Ibtihel, Amara; Clark, James J.

Knowledge distillation is a technique for training a student network, usually of low capacity, to mimic the representation space and performance of a pre-trained teacher network, which is often cumbersome, large, and of very high capacity. Starting from the observation that a student can learn about the teacher’s ability to provide predictions, we examine the idea of transferring uncertainty from the teacher to the student network. We show that, through distillation, the distilled network not only mimics the teacher’s performance but also captures the original network’s uncertainty behavior. We provide experiments validating our hypothesis on the MNIST dataset.

Published April 2020, 9 pages

Document

G2023-EIW10.pdf (370 KB)

GERAD

G-2020-23-EIW10
