Creating an Intelligent Archive
Today's digital archival systems (EDM, CD archivers, etc.) offer numerous indexing strategies.
Manual field indexing and searching is the most commonly used strategy. It allows users to find the desired drawing or document using Boolean logic and traditional wildcard searching, and the results of the query are often presented in a "hit list". Field indexing is very common but requires moderate labor to create the intelligence desired. Some systems include a fuzzy search capability that operates on a "sounds like" basis; the resulting hits are ranked according to their closeness to the original search value.
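A field search of this kind can be sketched in a few lines of Python. This is a minimal illustration, not how any particular archival product works: the record layout, the field names, and the functions field_search and fuzzy_search are all hypothetical. The fuzzy ranking uses plain string similarity from the standard difflib module as a stand-in for a true "sounds like" (phonetic) algorithm.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase

# Hypothetical index records: one dict of manually keyed fields per drawing.
RECORDS = [
    {"drawing_no": "A-1001", "title": "PUMP HOUSING", "discipline": "MECHANICAL"},
    {"drawing_no": "A-1002", "title": "PUMP IMPELLER", "discipline": "MECHANICAL"},
    {"drawing_no": "E-2001", "title": "PANEL WIRING", "discipline": "ELECTRICAL"},
]


def field_matches(record, field, pattern):
    """Return True if the field value matches a wildcard pattern (* and ?)."""
    return fnmatchcase(record.get(field, "").upper(), pattern.upper())


def field_search(records, all_of=(), any_of=(), none_of=()):
    """Boolean query: AND over all_of, OR over any_of, NOT over none_of.

    Each clause is a (field, wildcard_pattern) pair.
    """
    hits = []
    for rec in records:
        if not all(field_matches(rec, f, p) for f, p in all_of):
            continue
        if any_of and not any(field_matches(rec, f, p) for f, p in any_of):
            continue
        if any(field_matches(rec, f, p) for f, p in none_of):
            continue
        hits.append(rec)
    return hits


def fuzzy_search(records, field, value, cutoff=0.6):
    """Return records ranked by similarity of one field to the search value."""
    scored = []
    for rec in records:
        score = SequenceMatcher(None, rec.get(field, "").upper(), value.upper()).ratio()
        if score >= cutoff:
            scored.append((score, rec))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # closest match first
    return [rec for _, rec in scored]


# title starts with PUMP AND discipline starts with MECH, NOT drawing ending in 1002
hit_list = field_search(
    RECORDS,
    all_of=[("title", "PUMP*"), ("discipline", "MECH*")],
    none_of=[("drawing_no", "*1002")],
)

# A misspelled title still finds the intended drawing.
fuzzy_hits = fuzzy_search(RECORDS, "title", "PUMP HOUSNG")
```

Here hit_list contains only drawing A-1001, and fuzzy_hits ranks the same drawing first despite the misspelled search value. The cutoff of 0.6 is an arbitrary choice for the sketch.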
Hollerith Indexing
There are a number of alternatives to manually indexing your database. Aperture cards, found in engineering applications, already contain information specific to a drawing in the form of punched Hollerith data. Aperture card scanners capture this information into a text file associated with the scanned image. This data can be parsed and placed into specific fields of a database, saving time and money and ensuring accurate capture of client records.
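Parsing the captured text into database fields might look like the following sketch. It assumes the card data arrives as one fixed-column text line per card. The column positions and field names below are invented for illustration; actual card formats vary, so the layout must be taken from the card specification in use.

```python
# Hypothetical fixed-column layout: field name -> (start, end) offsets,
# 0-based with end exclusive. Real card formats differ.
LAYOUT = {
    "drawing_no": (0, 15),
    "sheet": (15, 19),
    "revision": (19, 21),
    "doc_type": (21, 23),
}


def parse_hollerith(line, layout=LAYOUT):
    """Slice one line of captured card text into named, trimmed fields."""
    width = max(end for _, end in layout.values())
    padded = line.rstrip("\r\n").ljust(width)  # tolerate short lines
    return {name: padded[start:end].strip() for name, (start, end) in layout.items()}


# One line as it might appear in the text file written by the card scanner.
sample = "D123456".ljust(15) + "0001" + "B " + "DW"
record = parse_hollerith(sample)
# record == {"drawing_no": "D123456", "sheet": "0001", "revision": "B", "doc_type": "DW"}
```

Each resulting dictionary can then be written to the archive database as one index record for the associated image.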
Existing Data Integration
It is also common for companies to have a corporate database (revision histories or document libraries). This information can be imported into an archival system and matched to the appropriate object, thus reducing or eliminating the need for multiple information sets and manual indexing.
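As a rough sketch of such an import, the code below assumes the corporate database can be exported as CSV and that both sides share a drawing number to match on. The column names, the file names, and the functions load_corporate_records and attach_metadata are hypothetical.

```python
import csv
import io


def load_corporate_records(csv_text, key="drawing_no"):
    """Read a CSV export and index its rows by a normalized key."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key].strip().upper(): row for row in reader}


def attach_metadata(archive_objects, corporate, key="drawing_no"):
    """Merge corporate fields into each archive object with a matching key.

    Returns (matched, unmatched); unmatched objects still need manual indexing.
    """
    matched, unmatched = [], []
    for obj in archive_objects:
        row = corporate.get(obj[key].strip().upper())
        if row is None:
            unmatched.append(obj)
        else:
            matched.append({**row, **obj})  # archive fields win on conflicts
    return matched, unmatched


# Hypothetical export from a revision-history database.
export = (
    "drawing_no,revision,title\n"
    "A-1001,C,PUMP HOUSING\n"
    "E-2001,A,PANEL WIRING\n"
)
# Hypothetical scanned objects already in the archive.
scanned = [
    {"drawing_no": "a-1001", "image": "0001.tif"},
    {"drawing_no": "Z-9999", "image": "0002.tif"},
]

matched, unmatched = attach_metadata(scanned, load_corporate_records(export))
```

In this example the first image picks up its revision and title from the export, while the second has no corporate record and is returned in unmatched for manual attention.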
Zone Indexing
Zone OCR/ICR indexing, which is found in the office imaging environment, is a proven technology that promises to ease the manual indexing process. This approach allows OCR (text recognition) to be conducted within specific areas of a scanned image and the results placed into specific fields.
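The zone-to-field mapping can be illustrated with the sketch below. It makes several simplifying assumptions: the zone coordinates and field names are invented, the page is modeled as a list of rows, and the recognize argument is a placeholder for whatever OCR/ICR engine is in use. So that the sketch runs without an OCR library, the demonstration uses a grid of characters and a trivial stand-in recognizer rather than a real scanned image.

```python
# Hypothetical zones: field name -> (left, top, right, bottom),
# with right and bottom exclusive.
ZONES = {
    "drawing_no": (0, 0, 6, 1),
    "revision": (7, 0, 8, 1),
    "title": (0, 1, 12, 2),
}


def crop(page, box):
    """Cut a rectangular region out of a page stored as a list of rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in page[top:bottom]]


def index_by_zone(page, zones, recognize):
    """Run the recognizer on each zone and store its text under the field name."""
    return {field: recognize(crop(page, box)).strip() for field, box in zones.items()}


def fake_recognize(region):
    """Stand-in for an OCR engine: the demo page is already a grid of characters."""
    return "\n".join(region)


# Stand-in for a scanned title block.
page = [
    "A-1001 C",
    "PUMP HOUSING",
]
fields = index_by_zone(page, ZONES, fake_recognize)
# fields == {"drawing_no": "A-1001", "revision": "C", "title": "PUMP HOUSING"}
```

With a real engine, the crop step would operate on pixel data and recognize would call the engine on the cropped region; the mapping from zone to field stays the same.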
Full-Text Indexing
The latest extension of OCR/ICR text recognition and native file parsing is full-text indexing, as opposed to field indexing. In full-text indexing, all text found on a document or drawing is indexed into a searchable database. This technique is currently being used to find CAD drawings, native office files, and scanned office documents, and it has the promise of being expanded into the scanned engineering archive environment as well. The benefit to users is a full index of the contents of their files with little to no up-front indexing cost. In addition to creating a ranked list of matches, more advanced tools highlight the desired text or symbol within the document itself.
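The idea behind full-text indexing can be shown with a small inverted index. This is a simplified sketch, assuming the text has already been extracted by OCR or native file parsing. The document identifiers, the sample text, and the function names are hypothetical, and ranking by raw word count is only one simple choice; real products may rank differently.

```python
import re
from collections import Counter, defaultdict

TOKEN = re.compile(r"[A-Za-z0-9]+")


def build_index(documents):
    """Map each lowercased word to a Counter of {doc_id: occurrences}."""
    index = defaultdict(Counter)
    for doc_id, text in documents.items():
        for word in TOKEN.findall(text.lower()):
            index[word][doc_id] += 1
    return index


def full_text_search(index, query):
    """Return doc ids ranked by total occurrences of the query words."""
    scores = Counter()
    for word in TOKEN.findall(query.lower()):
        scores.update(index.get(word, {}))
    return [doc_id for doc_id, _ in scores.most_common()]


def highlight(text, query):
    """Wrap each query word found in the text in square brackets."""
    words = TOKEN.findall(query.lower())
    if not words:
        return text
    alternatives = "|".join(re.escape(w) for w in words)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: "[" + m.group(0) + "]", text)


# Hypothetical extracted text, keyed by document id.
documents = {
    "dwg-001": "Pump housing assembly. Pump flange detail.",
    "dwg-002": "Panel wiring diagram",
    "memo-17": "Replace pump seal per revision C",
}
index = build_index(documents)
ranked = full_text_search(index, "pump")   # ["dwg-001", "memo-17"]
marked = highlight(documents["memo-17"], "pump")
# marked == "Replace [pump] seal per revision C"
```

No fields are keyed by hand here: every word becomes searchable as soon as the text is extracted, which is where the low up-front indexing cost comes from.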