
How To: Compile and Use Tesseract (3.01) on iOS (SDK 5)

2013-11-21 16:53

http://tinsuke.wordpress.com/2011/11/01/how-to-compile-and-use-tesseract-3-01-on-ios-sdk-5/

Update

I don’t have access to a Mac now (it has actually been 3 months), so I couldn’t update this guide to Xcode 4.5 (iOS 6), but the fine gentleman bengl3rt has done so, and the updated script is available at: http://goo.gl/wQea5 (I haven’t been able to test it, so feedback on whether it works would be appreciated!)

I never thought that my last post would reach such a large audience. Among other things, it earned me 3 direct job interview offers (1 of ‘em from Google itself, maintainer of Tesseract), an invitation to write articles for a TI digital e-magazine and a few digital friends, but that’s something to discuss in other posts. Thank you!

Getting back to what really matters: the last post focused on cross-compiling (potentially) any library for iOS (armv6/armv7/i386), and as an example I chose Tesseract, the library I was using on a work project. The response was so great, and both Tesseract and iOS have since received newer versions, that I’ve decided to write this post specifically about compiling Tesseract and using it in your iOS project.

As stated earlier, Tesseract has been officially released as version 3.01 (which now uses an autogen.sh setup script and an improved configure script) and iOS has received a major upgrade, version 5.0. As you may guess, these changes broke my script!



So let’s restart this party! (or: Compiling Tesseract 3.01 for iOS SDK 5.0)

The basics of the script were explained in the last post, so here I’ll just cover the changes and how to use it.

As noted by Rafael, the default C/C++/Objective-C compilers for iOS 5 (bundled with Xcode 4.2) have changed: now you just need Clang, so the CPP, CXX, CXXPP, and CC definitions (inside setenv_all()) now all point to Clang. The exact definitions are in build_dependencies.sh (see Files for Download).
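
A minimal sketch of what Clang-based definitions inside setenv_all() could look like. The DEVROOT default and the exact values are assumptions for illustration, not the contents of the author’s build_dependencies.sh. Note that autoconf-generated configure scripts read the C++ preprocessor from CXXCPP, so the sketch sets that name alongside CXXPP.

```shell
#!/bin/sh
# Sketch only: DEVROOT's default below is an assumed Xcode 4.2 layout,
# and these values are illustrative, not the author's actual script.
DEVROOT="${DEVROOT:-/Developer/Platforms/iPhoneOS.platform/Developer}"

setenv_all() {
    # Clang covers C, C++ and Objective-C; clang++ is its C++ driver.
    export CC="$DEVROOT/usr/bin/clang"
    export CXX="$DEVROOT/usr/bin/clang++"
    # -E stops after preprocessing, so the compilers double as preprocessors.
    export CPP="$CC -E"
    export CXXPP="$CXX -E"
    # autoconf-generated configure scripts look for CXXCPP.
    export CXXCPP="$CXXPP"
}

setenv_all
echo "CC=$CC CXX=$CXX"
```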

Additionally, as Tesseract now has an autogen.sh script to run before configuring, the build script runs it before each configure call.
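
The ordering can be sketched as a small helper that runs autogen.sh and only then configures. The function name configure_with_autogen and its arguments are hypothetical, not taken from the author’s script.

```shell
#!/bin/sh
# Hypothetical helper: run autogen.sh, then configure, inside a source tree.
# Usage: configure_with_autogen <source-dir> [configure arguments...]
configure_with_autogen() {
    src_dir="$1"
    shift
    (
        cd "$src_dir" || exit 1
        # autogen.sh prepares the build files, so it must come first.
        sh ./autogen.sh || exit 1
        ./configure "$@"
    )
}
```

It could be called as `configure_with_autogen ./tesseract-3.01 --enable-shared=no`; the subshell keeps the caller’s working directory unchanged.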

And because Tesseract’s configure script now accepts a path to Leptonica, no hacks are needed any more: passing it one extra parameter is enough.
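
A sketch of such a call. It assumes the parameter is the LIBLEPT_HEADERSDIR variable and that the Leptonica headers were installed under ./dependencies/include; both are assumptions to verify against `./configure --help` and your own layout. The command is only printed here, not executed.

```shell
#!/bin/sh
# Assumptions: LIBLEPT_HEADERSDIR is the variable configure reads, and
# GLOBAL_OUTDIR (a hypothetical name) is where Leptonica was installed.
GLOBAL_OUTDIR="${GLOBAL_OUTDIR:-$PWD/dependencies}"

tesseract_configure_cmd() {
    # Static libraries only, since the app links lib*.a archives.
    printf '%s\n' "./configure --enable-shared=no LIBLEPT_HEADERSDIR=$GLOBAL_OUTDIR/include"
}

tesseract_configure_cmd
```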

To build the libraries, create a directory, which I’ll refer to as “./build/”. Inside it, create the following structure:

./build/
    dependencies/ – which will receive the .h and compiled lib*.a files
    leptonica-1.68/ – directory with the Leptonica 1.68 source files
    tesseract-3.01/ – directory with the Tesseract 3.01 source files
    build_dependencies.sh – our build script (link at the end of the post)
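
The layout can be prepared from the shell. In this sketch only dependencies/ is created; the two source directories and the script are expected to have been unpacked or copied in by hand, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create <build-dir>/dependencies and report anything
# from the expected layout that is still missing.
prepare_build_dir() {
    build_dir="$1"
    mkdir -p "$build_dir/dependencies" || return 1
    missing=0
    for entry in leptonica-1.68 tesseract-3.01 build_dependencies.sh; do
        if [ ! -e "$build_dir/$entry" ]; then
            echo "missing: $build_dir/$entry" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `prepare_build_dir ./build` returns a non-zero status until all three entries are in place.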

Open Terminal, enter our “./build/” directory, cross your fingers (one very important step pointed out by Venusbai) and run the build_dependencies.sh script.

Well, if you’re lucky enough and deserve the holy right to use Tesseract in mobile apps, check the contents of the dependencies folder: there you’ll have all the header and library files needed to play with OCR on your iPhone (I don’t have one, I personally prefer Android, but you get it...).
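
One way to check the result is to list what landed in the dependencies folder. The sketch below assumes nothing about the folder’s internal layout and simply searches it for headers and static libraries; lipo ships with Xcode, so that step is skipped where the tool is not available. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: summarize the headers and static libraries found
# anywhere under the given dependencies folder.
check_dependencies() {
    dep_dir="$1"
    # $(( ... )) also strips the padding some wc implementations add.
    libs=$(( $(find "$dep_dir" -name 'lib*.a' | wc -l) ))
    headers=$(( $(find "$dep_dir" -name '*.h' | wc -l) ))
    echo "libraries: $libs headers: $headers"
    # lipo -info prints the architectures contained in each archive.
    if command -v lipo >/dev/null 2>&1; then
        find "$dep_dir" -name 'lib*.a' -exec lipo -info {} \;
    fi
    # Succeed only if at least one library and one header were found.
    [ "$libs" -gt 0 ] && [ "$headers" -gt 0 ]
}
```

For a build targeting both devices and the simulator, the lipo output would be expected to mention the architectures from the last post (armv6/armv7/i386).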

Great!!! Now what?! (or: Using Tesseract on your iOS project)

Create a new iOS project in Xcode (or just open your existing one)
Add the generated ./build/dependencies/ folder to your project. It contains the needed .h header and lib*.a library files
Add the tessdata folder, containing, well, erhm, hum, the tessdata files you need, to your project. If you don’t know what the “tessdata” folder is: it contains preprocessed data for a certain language so that Tesseract can recognize that language. Download language data from: http://code.google.com/p/tesseract-ocr/downloads/list. Follow the sub-steps below to add it the right way (not the default Xcode way...):
    Right-click your project/group in Xcode
    Choose “Add files to your project”
    Select the “tessdata” folder
    In the same window, check “Create folder references for any added folders”. This is the most important step, as it instructs Xcode to add your “tessdata” folder as a regular folder (and as a resource), not as an Xcode project group.

Create your TessBaseAPI object and start playing with it! The sample Xcode project listed under Files for Download shows this in place.
Make sure that every source file that includes/imports Tesseract header files, directly or through another file, has the .mm extension instead of the regular .m. This makes the compiler treat the file as Objective-C++, so the Tesseract headers are parsed as C++.
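
To find candidates for renaming, the sketch below lists .m files that directly include or import a Tesseract header. It assumes baseapi.h is the header in use and does not follow transitive includes, so treat its output as a starting point; renaming is best done inside Xcode so the project file stays in sync. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print .m files under a source directory that
# directly #include or #import the given header (default: baseapi.h).
find_m_files_using_tesseract() {
    src_dir="$1"
    header="${2:-baseapi.h}"
    find "$src_dir" -name '*.m' -exec \
        grep -l -E "^[[:space:]]*#[[:space:]]*(include|import).*$header" {} +
}
```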

Well, that’s it! I hope you can reproduce this. I also provide a downloadable Xcode 4.2 / iOS SDK 5 project with Tesseract configured and already recognizing one sample image; check it out if you have any trouble following this how-to.

Files for Download

build_dependencies.sh: Xcode 4.2, iOS SDK 5

build_dependencies.sh: Xcode 4.5, iOS SDK 6

Sample Xcode project: Xcode 4.2, iOS SDK 5, Leptonica 1.68, Tesseract v3.01

Final Considerations

I really hope you guys have enjoyed it. If you have any opinion, compliment or suggestion, or just wanna state something, feel free to comment; I’ll try to approve it ASAP.

Acknowledgements

bengl3rt: for the updated script (Xcode 4.5 / iOS 6)
Patrick: for the “spurious ‘/’ in a sed” fix