
dotnet drawing api : OCR in .NET



Eyal
9/7/2005 12:48:05 PM
Hello all.

I'm a C# (web) developer with a task to convert thousands of TIFF images
into text. I am trying to do so programmatically with .NET.

Can anyone tell me how to use OCR in .NET?
Is there a built-in library?
Any tutorials / examples?

Eyal
9/7/2005 1:19:07 PM
Thank you, Bob.

I'm really trying to put together a large-scale automation process. I don't
know if Office* will do.

I have thousands of legal documents that I need to run regexes over to capture
key values from each image, such as addresses, dollar amounts, full names and
so on.

I have all the regex queries ready to go, but I need a way to run through
roughly a thousand images an hour, convert them to text, and save the full
text and selected segments in a database for cataloging.
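
A minimal sketch of the regex step described above, assuming the OCR stage
has already produced a plain string. The class name, method name, sample text
and the dollar-amount pattern are all hypothetical illustrations, not the
poster's actual queries:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class FieldExtractor
{
    // Hypothetical pattern: a dollar sign, digits with optional
    // thousands separators, and optional two-digit cents.
    static readonly Regex DollarAmount =
        new Regex(@"\$\d+(?:,\d{3})*(?:\.\d{2})?", RegexOptions.Compiled);

    // Returns every dollar amount found in the OCR output, in order.
    public static List<string> ExtractDollarAmounts(string ocrText)
    {
        List<string> results = new List<string>();
        foreach (Match m in DollarAmount.Matches(ocrText))
        {
            results.Add(m.Value);
        }
        return results;
    }

    static void Main()
    {
        // Stand-in for text produced by the OCR stage.
        string ocrText =
            "The tenant shall pay $1,250.00 monthly and a deposit of $500.";

        foreach (string amount in ExtractDollarAmounts(ocrText))
        {
            Console.WriteLine(amount);
        }
    }
}
```

Keeping extraction as a pure string-to-list function means it can be tested
without any images, and the OCR engine can be swapped later without touching
the regex code.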

I can't really use the Office* application for this. However, if I could
somehow use the available Office* .dll to dynamically process the images from
my C# code, without launching the Office* application, that could be a way to
go about it.

Is that possible?

Eyal
9/7/2005 1:38:02 PM
THIS IS GREAT!!
I'll try it out and let you all know how it works.

Thank you all..
- Eyal.

Bob Powell [MVP]
9/7/2005 10:04:38 PM
I suggest that you take the angle of getting some of the office automation
to work for you.

I don't know exactly what's available but I don't think you should be
investigating writing your own OCR or any such solution.

I know that there is OCR built in to Office so theoretically you could use
COM automation or at a pinch a Process.Start call to kick it off.

Perhaps you'd be better off over in one of the office groups.

--
Bob Powell [MVP]
Visual C#, System.Drawing

Ramuseco Limited .NET consulting
http://www.ramuseco.com

Find great Windows Forms articles in Windows Forms Tips and Tricks
http://www.bobpowell.net/tipstricks.htm

Answer those GDI+ questions with the GDI+ FAQ
http://www.bobpowell.net/faqmain.htm

All new articles provide code in C# and VB.NET.
Subscribe to the RSS feeds provided and never miss a new article.






Sergey Bogdanov
9/7/2005 11:12:15 PM
This one could be useful:
http://www.codeproject.com/csharp/MODI.asp


--
Sergey Bogdanov [.NET CF MVP, MCSD]
http://www.sergeybogdanov.com
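
For readers who cannot reach the link: assuming it covers MODI (Microsoft
Office Document Imaging), as its URL suggests, a rough late-bound sketch of
driving that COM component from C# might look like the following. It assumes
MODI is installed with Office, that the ProgID is "MODI.Document", and that
the English language constant is 9. The class and method names are
hypothetical, and the sketch needs C# 4 or later for `dynamic`:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class ModiOcrSketch
{
    // Assumed value of MODI.MiLANGUAGES.miLANG_ENGLISH.
    const int MiLangEnglish = 9;

    // Runs OCR over every page of a TIFF and returns the recognized text.
    public static string ReadText(string tiffPath)
    {
        // Late binding avoids a compile-time reference to the MODI interop DLL.
        Type docType = Type.GetTypeFromProgID("MODI.Document");
        if (docType == null)
        {
            throw new InvalidOperationException("MODI does not appear to be installed.");
        }

        dynamic doc = Activator.CreateInstance(docType);
        try
        {
            doc.Create(tiffPath);
            // Arguments: language, auto-orient the image, straighten the image.
            doc.OCR(MiLangEnglish, true, true);

            StringBuilder text = new StringBuilder();
            int pageCount = doc.Images.Count;
            for (int i = 0; i < pageCount; i++)
            {
                text.AppendLine((string)doc.Images[i].Layout.Text);
            }

            doc.Close(false); // false: do not save changes back to the file
            return text.ToString();
        }
        finally
        {
            Marshal.ReleaseComObject((object)doc);
        }
    }
}
```

This runs in-process without showing the Office UI, but it still depends on
Office being installed on the machine that does the conversion.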


dave greaf
9/10/2005 6:16:18 PM
Check out LEADTOOLS; they have a nice .NET port of their tools.
Of course you'll need some bucks for it ...
Also, I'm not sure the Office components are supported in a "server" context.

