Post-transformer inference: 224× compression of Llama-70B with improved accuracy
https://zenodo.org/records/17873275
#HackerNews #PostTransformer #Inference #Llama70B #Compression #ImprovedAccuracy