Tesseract OCR Plugin OB2

I made an OCR plugin that works with Tesseract 4.

Author: DJ Hooligan

Screenshots:
The plugin expects a base64-encoded image as input, so first make an HTTP request to the URL of the captcha image, then convert the response to base64 and assign it to a variable named “base64img”.
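
For anyone who wants to see the conversion step outside OB2, here is a minimal Python sketch. It is only an illustration, not the plugin's code: the function name `to_base64` and the sample bytes are made up for the example, and in a real config the bytes would be the body of the HTTP response for the captcha image.

```python
import base64


def to_base64(image_bytes: bytes) -> str:
    """Encode raw image bytes as a base64 ASCII string."""
    return base64.b64encode(image_bytes).decode("ascii")


# Hypothetical stand-in for the captcha image bytes returned by the
# HTTP request (just the 8-byte PNG signature, not a full image).
sample_bytes = b"\x89PNG\r\n\x1a\n"

# Same variable name the plugin block expects in the config.
base64img = to_base64(sample_bytes)
print(base64img)
```

The resulting string is what gets stored in the “base64img” variable; decoding it with `base64.b64decode` gives back the original bytes.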

__________________________

HTTP request:
__________________________

plugin1

__________________________

Conversion to base64:
__________________________

plugin2

__________________________

The plugin block:
__________________________

plugin3

[THE PLUGIN:]
OB2TesseractPlugin_with_info.zip (328.0 KB)
The plugin .zip is inside this zip. I could not upload the manual .pdf, so I’ve zipped it together with the plugin and a custom Tesseract language.

Updated version, with tesseract included:

plugin

The file is too big to upload here…

enjoy :smiley:

Amazing work, thanks for sharing this.

Nice plugin! Maybe you should read about these new attributes I added so users can try the outcome of the OCR with some sample images directly from the block itself!

Also, there’s a parameter of the BlockCategory attribute that sets the color of the foreground text. Maybe you should set it to white, because the dark color on that shade of blue is really hard to see :sweat_smile:

Thanks for the contribution! :heart:

I’m working on it :slight_smile:
image

image

If you encounter any issue please let me know and I will push a fix in the next patch! Thanks!

@Ruri I think you should add an OCR block that uses an engine like Tesseract, as the plugin does. That would be very helpful for many! :smiley: :heart:

Completely reworked version! Built and tested on Windows with OB2 native v0.2.3.
Also tested on Linux with OB2 v0.2.4 [beta]!
It contains the Tesseract and ImageMagick libraries.
Also included are 26 captcha languages (with example images).

All settings can be tested directly from the Stacker view.

part1

:eight_pointed_black_star: A bit of useful info:
Manual and info about the cp languages.zip (294.0 KB)

:eight_pointed_black_star: The Plugin:
OB2TesseractPlugin.zip (99.9 KB)

:eight_pointed_black_star: Plugin Dependencies:
(larger than 4 MB, so I can’t upload it here)
Plugin Dependencies.zip

:eight_pointed_black_star: Tesseract Captcha Languages:
tessdata.zip

:eight_pointed_black_star: And for those who use Linux:
OB2TesseractPlugin_Linux.tar.gz

regards
image

Awesome use of the new plugin features! I’m really glad they were useful to someone ^^ If you have any feedback please share so I can improve them. Thanks for the plugin!

Hi bro, how do I install this plugin?
I unzipped it to the Plugins folder of OB2 but I can’t find the blocks. How do I install Magick.NET?
Can you show me how to install it?

Unzip both OB2TesseractPlugin.zip and Plugin Dependencies (OB2TesseractPluginDependencies.zip) inside your OB2\UserData\Plugins folder.
You can unpack tessdata.zip wherever you want; you don’t have to overwrite or replace your current Tesseract files.
This is how your Plugins folder should look:
Clipboard02

Ok thanks bro but I got this error after testing:
img
Capture

You have to start OB2 as administrator to fix the error.

image

Same for me, even with administrator privileges.

Remove the convert Bytes => Base64 block and put data.RAWSOURCE as the image source.
I don’t know why, but you need to close the config after making these changes, or else the error keeps popping up?!

Bros, I hope you can give me your email or somewhere we can chat privately. I’m not a noob and I’m not ready to learn, but please, I need your service and I will pay.

Thanks for your plugin, bro. But I ran into some problems, please talk with me…
QQ screenshot, 2022-08-01 01:49:37

Try data.RAWSOURCE as the image source.

Hi @DJHooligan. How can I fix the result at 7 letters? The image has 7 letters, but the plugin sometimes gives 7 letters and sometimes 5 or 6.

5

7

6

There is no way to tell Tesseract to look for exactly 7 letters; I’ve read the code more than 100 times. You could use a script, but all you get is an empty result if Tesseract found more or fewer letters…
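
A rough Python sketch of such a script, just to show the idea. Everything here is hypothetical and not part of the plugin: `run_ocr` stands for whatever fetches a fresh captcha and returns the OCR text, and the retry count is arbitrary. Retrying only helps if each attempt gets a new captcha image, since the same image will most likely give the same text again.

```python
from typing import Callable, Optional


def ocr_with_length_check(
    run_ocr: Callable[[], str],
    expected_length: int = 7,
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first OCR result with the expected length, else None."""
    for _ in range(max_attempts):
        # Drop spaces and newlines that OCR output often contains.
        text = "".join(run_ocr().split())
        if len(text) == expected_length:
            return text
    # Every attempt had too many or too few letters.
    return None


# Fake OCR results standing in for three different captcha attempts.
fake_results = iter(["ab12c", "abc12d\n", "abc12de"])
print(ocr_with_length_check(lambda: next(fake_results)))
```

When no attempt matches, the function returns `None`, which is the "empty result" case: the script can reject a wrong-length answer but cannot make Tesseract find the missing letters.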

Hi, where do I put this pack in OpenBullet?