Apparently some models of Xerox photocopiers are substituting one number for another in photocopied documents. This isn’t an OCR error, and it isn’t just blurred pixels; entirely different numbers are being printed in what appear to be true copies.
From “Confused Xerox copiers rewrite documents, expert finds”:
[German computer scientist David Kriesel] said the anomaly is caused by Jbig2, an image compression standard.
Image compression is typically used in scanners and copiers to make file sizes of scans smaller.
Jbig2 would substitute figures it thought were the same, meaning similar numbers were being wrongly swapped.
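To see how this kind of substitution can happen, here is a toy sketch in Python of pattern-matching compression. It is not the JBIG2 codec and not Xerox's implementation; the 5x5 bitmaps, the names `hamming`, `compress`, and `decompress`, and the pixel-difference threshold are all assumptions made up for illustration. The idea is that each glyph is stored as a reference to a dictionary entry, and a glyph that is "close enough" to an existing entry reuses it instead of being stored separately.

```python
from typing import List, Tuple

Glyph = Tuple[str, ...]  # rows of '#' (ink) and '.' (blank); illustrative only

# Hypothetical 5x5 bitmaps of the digits 6 and 8; they differ in just 2 pixels.
SIX: Glyph = (".###.", "#....", "####.", "#...#", ".###.")
EIGHT: Glyph = (".###.", "#...#", ".###.", "#...#", ".###.")


def hamming(a: Glyph, b: Glyph) -> int:
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def compress(glyphs: List[Glyph], threshold: int) -> Tuple[List[Glyph], List[int]]:
    """Encode glyphs as indices into a dictionary of stored bitmaps.

    A glyph within `threshold` differing pixels of a stored entry reuses
    that entry, which is where a lossy substitution can creep in.
    """
    dictionary: List[Glyph] = []
    indices: List[int] = []
    for glyph in glyphs:
        for i, stored in enumerate(dictionary):
            if hamming(glyph, stored) <= threshold:
                indices.append(i)  # treated as "the same" symbol
                break
        else:
            dictionary.append(glyph)  # no close match: store a new symbol
            indices.append(len(dictionary) - 1)
    return dictionary, indices


def decompress(dictionary: List[Glyph], indices: List[int]) -> List[Glyph]:
    """Rebuild the glyph sequence from the dictionary and index list."""
    return [dictionary[i] for i in indices]


if __name__ == "__main__":
    page = [SIX, EIGHT, SIX]
    dictionary, indices = compress(page, threshold=2)
    restored = decompress(dictionary, indices)
    # With a loose threshold, the 8 comes back as a crisp, clean 6.
    print(restored[1] == SIX)
```

With `threshold=0` the same round trip is lossless. With a looser threshold, the substituted digit is a perfectly sharp copy of the wrong glyph, which fits the reports that the altered pages look correct at first glance.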
The results are reproducible, and have been found in at least two models of Xerox machines, with both original and recently patched software installed. Photocopied invoices, part numbers, engineering tables, and medical information could be just plain wrong, even if the document being copied was 100% correct.
Mr. Kriesel presents his findings, complete with examples, here: Xerox scanners/photocopiers randomly alter numbers in scanned documents
In this article I present in which way scanners / copiers of the Xerox WorkCentre Line randomly alter written numbers in pages that are scanned. This is not an OCR problem (as we switched off OCR on purpose), it is a lot worse - patches of the pixel data are randomly replaced in a very subtle and dangerous way: The scanned images look correct at first glance, even though numbers may actually be incorrect.