• skyzouwdev 6 hours ago

    That’s a cool approach. Running OCR locally avoids the usual privacy and latency trade-offs, and turning the phone into a network-accessible endpoint is clever.

    Curious about performance — how fast is the Vision Framework on-device compared to something like Tesseract or cloud OCR APIs? And does the app stay responsive if the phone is handling multiple requests at once?

    • gumboshoes 8 hours ago

      Like many OCR solutions, this is unfortunately incomplete. For serious work the final output should be something like a PDF of the original image with the OCRed text embedded. Why? Ground truth. OCR is not reliable enough for its output to be separated from the source. The original needs to be available for checking.

      • sgt a day ago

        Cool - although I can't help but think that running a macOS VM and running the Vision Framework tool on it would be less clunky in the long run. Phones don't like running with their screens on 24/7, etc.