PLD vs Two-Model Speculative Decoding

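This benchmark compares two n-gram prompt-lookup (PLD) drafters, ngram-simple and ngram-mod, against draft-target (two-model) speculative decoding on synthetic multi-turn edit workloads. In each conversation, turn 1 (T1) creates the artifact; turns 2-4 (T2-T4) each make a small edit and re-emit the full result.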
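
To make the comparison concrete, here is a minimal Python sketch of the n-gram (prompt lookup) drafting idea. It is not the ngram-simple or ngram-mod implementation measured in this benchmark; the function names `pld_draft` and `accepted_prefix`, the parameters `ngram_size` and `max_draft`, and the toy token IDs are all assumptions made for illustration. The drafter finds the most recent earlier occurrence of the trailing n-gram and proposes the tokens that followed it, so no second model is needed.

```python
from typing import List, Sequence


def pld_draft(context: Sequence[int], ngram_size: int = 3, max_draft: int = 8) -> List[int]:
    """Propose draft tokens by finding the trailing n-gram earlier in the context.

    Hypothetical helper: returns up to `max_draft` tokens that followed the most
    recent earlier match, or an empty list when there is no match.
    """
    if len(context) <= ngram_size:
        return []
    tail = list(context[-ngram_size:])
    # Walk backwards so the most recent earlier match wins; the starting index
    # excludes the position of the trailing n-gram itself.
    for start in range(len(context) - ngram_size - 1, -1, -1):
        if list(context[start:start + ngram_size]) == tail:
            follow = start + ngram_size
            return list(context[follow:follow + max_draft])
    return []


def accepted_prefix(draft: Sequence[int], target: Sequence[int]) -> int:
    """Count how many leading draft tokens agree with the target model's tokens."""
    accepted = 0
    for d, t in zip(draft, target):
        if d != t:
            break
        accepted += 1
    return accepted


# Toy edit-style context: the earlier turn emitted 1..8, and the new turn has
# started re-emitting the same artifact with 1, 2, 3.
context = [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3]
draft = pld_draft(context, ngram_size=3, max_draft=4)

# Suppose the target model changes the third drafted token (the "small edit").
target_tokens = [4, 5, 9, 7]
print(draft, accepted_prefix(draft, target_tokens))
```

In this toy case the draft copies the earlier output and is accepted up to the edited token. In a two-model setup, the draft would instead come from a smaller model's forward passes, with the same verify-and-accept step applied by the target model.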

Synthetic prompt dataset: Hugging Face
