Every time I try to replicate or build useful cases for LLMs in technical areas I actually understand (code/software development, corroborating claims, summarizing text, etc.), I have a terrible time getting the model to answer what I asked without hallucinating or letting its obvious training biases shine through.
I have yet to see anyone share "useful output" on a non-trivial task that doesn't show the same flaws.
I'm not surprised, just confused: people keep claiming it works??