A demonstration · Datalab Surya OCR 2 (650M)
A 650-million-parameter model transcribing historic newspaper scans from Europeana, across seven languages and two scripts. Each page below shows the original scan beside the model’s reading-order transcription. Toggle to the original Europeana OCR to compare the model’s output against that decades-old text, or reveal the layout blocks the model recovered.