I think there is a problem with the implementation of the Jensen-Shannon divergence in DoLa and in ReDeEP, a new hallucination detection method. I described the problem here:
Jeryi-Sun/ReDEeP-ICLR#2
The JS divergence code in the two projects is very similar, so I think the same issue is present in DoLa and, as a consequence, in transformers.
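
For reference, here is a minimal sketch of the textbook definition, JSD(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M) with M = 0.5 * (P + Q), that an implementation can be checked against. It is plain Python, not the code from DoLa, ReDeEP, or transformers, and the function names `kl_divergence` and `js_divergence` are my own, chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions given as lists.

    Terms with p_i == 0 contribute 0 by convention and are skipped.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Textbook Jensen-Shannon divergence between p and q, in nats."""
    # Mixture distribution M = (P + Q) / 2. M is the *second* argument
    # of both KL terms below.
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]
    q = [0.1, 0.3, 0.6]
    print(js_divergence(p, q))
    print(js_divergence(q, p))  # symmetric, so this matches the line above
    print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # disjoint supports give log(2)
```

Useful sanity checks are that the value is symmetric in its arguments, zero for identical distributions, and never larger than log(2) in nats. One thing worth checking against it in PyTorch code: `torch.nn.functional.kl_div(input, target)` expects `input` as log-probabilities and computes KL(target || input), so the argument order is easy to get backwards. I am not claiming here that this is exactly what happens in either code base; the linked issue has the details.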