Tuesday, June 26, 2007

PDF Search Engine: 2D Searches

PDF, an acronym for Portable Document Format, was developed by Adobe Systems beginning in 1993. Its primary purpose is to represent two-dimensional documents in a way that is independent of both the device and the display resolution. A PDF file can typically be read by most web browsers on most devices. A PDF file includes the complete description of a two-dimensional document or, with Acrobat 3D, a three-dimensional one. PDF is currently considered an open standard, and preparation for its submission as an ISO standard is in progress.

Over the years, PDF has become the accepted way of capturing and storing two-dimensional and now three-dimensional images. Government records, college and university research archives, medical records, and many other collections have been converted to PDF files so that they are fully retained and searchable. This allows massive amounts of information to be protected and stored in much the same way that microfiche and microfilm once were. However, because the files are prepared and stored on computers, almost anyone with a desktop machine can access the information. It is no longer necessary to have a bulky microfiche or microfilm reader in order to view the images.

In the past, however, the major drawback of PDF files was that only one file at a time could be searched, and then only by using the 'find' tool available within the open document. To search for a particular phrase occurring in multiple PDF files, the researcher had to pull up the appropriate PDF file, use the 'find' feature, and view the results individually. The researcher would then move on to the next file, call it up, and repeat the process. This approach was both time-consuming and tedious.

A new generation of software has been developed around the concept of searching for data within multiple PDF files. This software allows a whole collection of PDF documents to be searched at once, with the matches from every file presented together, just as on the results page of a Google or Yahoo search.
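The idea of searching many files at once can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the method used by any particular product: it assumes the text of each PDF has already been extracted by some separate tool, and the file names and the functions build_index and search are hypothetical names chosen for this example.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of file names that contain it.

    `documents` is assumed to be a dict of file name -> already-extracted text.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, documents, phrase, context=20):
    """Return sorted (file name, snippet) pairs for files containing the phrase."""
    words = re.findall(r"[a-z0-9]+", phrase.lower())
    if not words:
        return []
    # Only files that contain every word of the phrase can contain the phrase.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    # Confirm the words appear next to each other, in order, in each candidate.
    pattern = re.compile(r"\b" + r"\W+".join(words) + r"\b", re.IGNORECASE)
    results = []
    for name in sorted(candidates):
        text = documents[name]
        match = pattern.search(text)
        if match:
            start = max(0, match.start() - context)
            results.append((name, text[start:match.end() + context]))
    return results


# Hypothetical sample data standing in for text extracted from three PDF files.
sample = {
    "census_1900.pdf": "The Smith family lived in Ohio.",
    "deeds.pdf": "Land deed for John Smith, Ohio county.",
    "letters.pdf": "A letter from Boston.",
}
for name, snippet in search(build_index(sample), sample, "smith family"):
    print(name, "->", snippet)
```

The index is built once for the whole collection, so each later search only has to look inside the few files that contain every word of the phrase, rather than opening each file in turn.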

It should be emphasized that sophisticated PDF search engine software goes beyond just PDF files. It does other types of specialized search engine work as well, such as searching audio, OCR output, and scanned images. In fact, searches can be conducted simultaneously in PDF and over two hundred other formats.

Such technology is a boon to researchers of all types, whether they work with text, photographs, or other media formats. Because the software recognizes PDF files directly, data doesn't have to be converted to .doc or any other format in order to conduct searches. In some cases, such conversions were not possible anyway. The wealth of information available to genealogical and historical researchers is directly due to the continued advances made by companies such as Adobe, which provides the format, and by other software developers, who are instrumental in making the information available in a usable form. PDF has become something of a revolution in storing, searching, and sending electronic documents.
