Colour separate, OCR, re-combine

Martin_Packer · April 30, 2021, 7:37am

Our process produces 8-colour (maximum) GIFs. Today I happily display these using HTML 5 Canvas, and the same PHP / JavaScript code can swap out a colour for another (more modern?) one. A corollary is that I know from this code which colours are present.

As part of the production process some text appears in 2 of the colours only (black and blue). So I would like to:

Separate each colour into its own graphic (layer?)
OCR the text in each of these two layers.
Replace the text in these layers with the same text in a newer font.
Recombine the layers into a composite graphic.
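
The separate and recombine steps can be sketched in plain Python. This is only a minimal sketch, assuming the GIF has already been decoded into rows of palette indices (an imaging library would do that part; it is not shown here). The names `split_layers` and `recombine`, and the tiny sample image, are hypothetical and not from the original process.

```python
from typing import Dict, List

Grid = List[List[int]]    # rows of palette indices, as in a decoded GIF
Mask = List[List[bool]]   # True where a pixel belongs to the layer


def split_layers(pixels: Grid) -> Dict[int, Mask]:
    """Return one boolean mask per palette index present in the image."""
    present = sorted({p for row in pixels for p in row})
    return {i: [[p == i for p in row] for row in pixels] for i in present}


def recombine(layers: Dict[int, Mask], background: int) -> Grid:
    """Overlay the masks back into a single grid of palette indices."""
    first = next(iter(layers.values()))
    height, width = len(first), len(first[0])
    out = [[background] * width for _ in range(height)]
    for index, mask in layers.items():
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    out[y][x] = index
    return out


# Hypothetical 3x4 image: 0 = white, 1 = black, 2 = blue
image = [
    [0, 1, 1, 0],
    [0, 2, 2, 0],
    [0, 0, 0, 0],
]
layers = split_layers(image)
roundtrip = recombine(layers, background=0)
```

Because every pixel carries exactly one palette index, the masks never overlap, so the order in which layers are recombined does not matter until a layer has been edited.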

The output could be bitmapped (PNG?) or vector (e.g. SVG), as downstream processing can handle either. (In the SVG case, md2pptx uses CairoSVG to turn an SVG into a PNG under the covers.)

Any thoughts on how to accomplish some of the above? I put it under “Mac” because that’s where I expect to be building this.

rlivingston · May 2, 2021, 9:42am

This may seem like a glib answer, but I wonder: can you do this by hand? Are there 15 steps, one after another, that you do to accomplish this task in one or two or three applications?

If you can do it “by hand”, are there any places along the way where you have to make a human judgment? Or can you just move along, passing a picture to some OCR program and then copying and pasting the generated text back where it is needed? If you had some diligent but not savvy intern to train for this task, could you do that? Do you understand the applications you are using well enough to figure out all the steps needed to do what you want?

Once I broke up the task into steps, I would create a Keyboard Maestro macro to do them.

sylumer · May 2, 2021, 10:15am

You are starting with a GIF, a single layer graphic (assuming non-animated of course). Would replacing text with the same text in a different font not leave you with gaps in your image data where the two fonts do not overlap?

Even if your “newer” font was larger, character spacing would still be likely to create misalignment and imperfections.

Based on that, it seems to me like being able to change the original graphic or have that process generate an additional graphic would be required in order to address the potential for image data loss.

Martin_Packer · May 2, 2021, 7:18pm

Actually this would be a frequent thing. And, no, no intern to do this for me. (Nor to rewrite the GDDM calls to use more modern fonts.)

The human judgment would be in whether the OCR had got it right.

Martin_Packer · May 2, 2021, 7:22pm

Actually these are against a white background so even blanking the text out would not leave holes or discontinuities.
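
Blanking the text out could look something like the sketch below. It assumes the image is held as rows of palette indices, that index 0 is white, and that the text sits inside a known box. The function name `blank_region`, the box coordinates and the sample data are all hypothetical. Restricting the change to a box matters if the same black or blue also appears elsewhere in the graph.

```python
from typing import List, Set, Tuple

Grid = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def blank_region(pixels: Grid, box: Box, text_indices: Set[int],
                 background: int) -> Grid:
    """Return a copy with text-coloured pixels inside the box set to background."""
    left, top, right, bottom = box
    out = [row[:] for row in pixels]  # leave the input untouched
    for y in range(top, bottom):
        for x in range(left, right):
            if out[y][x] in text_indices:
                out[y][x] = background
    return out


# Hypothetical image: 0 = white, 1 = black, 2 = blue, 3 = some other colour
image = [
    [0, 1, 2, 0],   # title row: black and blue text pixels
    [0, 1, 3, 0],   # graph body: the black here must survive
    [1, 1, 0, 0],
]
blanked = blank_region(image, box=(0, 0, 4, 1), text_indices={1, 2}, background=0)
```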

And I’d make sure the font was small enough so the text occupied less space than what it’s replacing. And I can centre it over what it’s replacing - given the most likely use case is a graph title.
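
Centring the replacement over the old text comes down to a bounding box and a little arithmetic. The sketch below assumes a boolean mask marking the old text pixels and a replacement whose rendered width and height are already known. `bounding_box` and `centred_origin` are hypothetical names, and the sizes are made up for illustration.

```python
from typing import List, Optional, Tuple

Mask = List[List[bool]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def bounding_box(mask: Mask) -> Optional[Box]:
    """Smallest box containing every True pixel, or None if the mask is empty."""
    xs = [x for row in mask for x, on in enumerate(row) if on]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def centred_origin(box: Box, new_width: int, new_height: int) -> Tuple[int, int]:
    """Top-left corner that centres a new_width x new_height block in the box."""
    left, top, right, bottom = box
    x = left + ((right - left) - new_width) // 2
    y = top + ((bottom - top) - new_height) // 2
    return (x, y)


# Hypothetical mask: old title occupies columns 2..5 of rows 1..2
old_text = [
    [False] * 8,
    [False, False, True, True, True, True, False, False],
    [False, False, True, False, False, True, False, False],
    [False] * 8,
]
box = bounding_box(old_text)
origin = centred_origin(box, new_width=2, new_height=1)
```

As long as the new text is no larger than the box in either direction, the origin stays inside the area that was blanked.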

Anyhow I think I’ll go for bounded OCR - probably cutting the title text out into a fresh graphic to do it.
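
Cutting the title out into a fresh graphic might be sketched as a crop with a small margin, clamped to the image edges. This assumes the title's box is already known and the image is held as rows of palette indices. `crop` and the sample values are hypothetical, and the OCR call itself is not shown.

```python
from typing import List, Tuple

Grid = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(pixels: Grid, box: Box, margin: int = 0) -> Grid:
    """Copy the box, grown by margin on each side, clamped to the image."""
    height, width = len(pixels), len(pixels[0])
    left, top, right, bottom = box
    left, top = max(0, left - margin), max(0, top - margin)
    right, bottom = min(width, right + margin), min(height, bottom + margin)
    return [row[left:right] for row in pixels[top:bottom]]


# Hypothetical image: title pixels (1) near the top, graph colour (3) below
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [3, 3, 3, 3, 3, 3],
]
title = crop(image, box=(2, 1, 4, 2), margin=1)
```

A little white margin around the cropped text tends to help OCR engines, which is why the box is grown rather than cut tight.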

I’ll also, as you say, continue to pursue a “create it right in the first place” approach.