New study in Royal Society Open Science:
> In a direct comparison of LLM-generated and human-authored science summaries, LLM summaries were nearly five times more likely to contain broad generalizations (odds ratio = 4.85, 95% CI [3.06, 7.70], p < 0.001). Notably, newer models tended to perform worse in generalization accuracy than earlier ones.
https://royalsocietypublishing.org/doi/epdf/10.1098/rsos.241776
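
For anyone who wants to sanity-check the quoted statistics, here is a minimal sketch that uses only the three numbers in the quote. It assumes the interval is a Wald-type 95% CI, symmetric on the log-odds scale; the quote does not say how the authors computed it, so treat the derived standard error and z-score as rough approximations rather than the paper's own values. The variable names are mine.

```python
import math

# Figures taken from the quoted summary.
odds_ratio = 4.85
ci_low, ci_high = 3.06, 7.70

# Assumption: Wald-type 95% interval, symmetric around log(OR).
log_low, log_high = math.log(ci_low), math.log(ci_high)

# If the assumption holds, the geometric midpoint of the interval
# should land close to the reported odds ratio.
implied_or = math.exp((log_low + log_high) / 2)

# Back out the standard error of log(OR) from the interval width.
std_err = (log_high - log_low) / (2 * 1.96)

# Wald z-statistic and two-sided p-value under a standard normal.
z_score = math.log(odds_ratio) / std_err
p_value = math.erfc(z_score / math.sqrt(2))

print(f"implied OR from CI midpoint: {implied_or:.2f}")
print(f"SE(log OR) = {std_err:.3f}, z = {z_score:.2f}, p = {p_value:.1e}")
```

Under that assumption the interval midpoint matches the reported odds ratio to two decimals, and the implied p-value is far below 0.001, so the three quoted figures are at least internally consistent.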