CiPdf is associated with a CiImage object. It gets and sets the parameters that control how the CiImage.Open, CiImage.SaveAs, and CiImage.Append methods work with PDF files.
The CiImage.SaveAs method writes the Author, Creator, Keywords, Subject, Title, and ModDate metadata to the PDF file.
The CiImage.Open method reads the Author, Creator, Keywords, Subject, Title, CreationDate, and Producer metadata from the PDF file.
PDF is a storage format that encapsulates various types of data: images, text, graphics, annotations, etc. PDF was originally designed to represent textual and graphic data in a system-independent manner. Many scanners and imaging systems use PDF to store multi-page images, as a substitute for the multi-page TIFF format. Such images can be extracted directly from the PDF file.
To obtain an image of non-image page content, such as text, graphics, and annotations (including annotation-based barcodes), the PDF page has to be rasterized. Rasterization is similar to taking a snapshot of the page as displayed by a PDF viewer.
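As a rough illustration of what rasterization resolution means in practice, the sketch below computes the pixel size of a rasterized page from its size in PDF points and a chosen DPI. It is not part of the CiPdf API: the function name `raster_size` is hypothetical, and the rounding rule is an assumption, since the actual renderer may round differently. The only fact relied on is that the default PDF unit is 1/72 inch.

```python
POINTS_PER_INCH = 72  # default PDF user-space unit is 1/72 inch


def raster_size(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Return (width, height) in pixels for a page rasterized at `dpi`.

    Hypothetical helper for illustration only; rounding to the nearest
    pixel is an assumption, not documented renderer behavior.
    """
    width_px = round(width_pt * dpi / POINTS_PER_INCH)
    height_px = round(height_pt * dpi / POINTS_PER_INCH)
    return (width_px, height_px)


# US Letter is 612 x 792 points (8.5 x 11 inches).
print(raster_size(612, 792, 300))  # (2550, 3300)
```

Doubling the DPI quadruples the pixel count, which is one reason to pick the lowest resolution that still suits the content.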
The CiImage.Open method uses the value of the readMode property to treat a PDF file as a collection of images, a collection of rasterized pages, or a mix of the two.
- If readMode = epemImage, the Open method locates images encapsulated in the PDF file.
An image smaller than minImageWidth and minImageHeight (expressed in inches; ignored if 0.0) is skipped.
Some scanners store black-and-white images inside a PDF in a color format. To improve image-processing efficiency, set useMinImageColors to ciTrue. The image is then automatically converted to the minimum number of colors needed, i.e. grayscale or black-and-white.
- If readMode = epemRaster, the Open method rasterizes each page. The color of the resulting image is determined by the rasterColorMode property. If it is set to eprmAuto, the color is chosen based on the colors present on the page. NOTE: eprmAuto mode is slower than setting the color explicitly. The resolution of the rasterized image is specified separately for each of the 3 color modes: black-and-white, grayscale, and color. See dpiRasterBw, dpiRasterGs, dpiRasterRgb.
- If readMode = epemAuto, the Open method analyzes each page of the PDF file for the presence of an image occupying 90% of the page. If such an image is present, the page is considered an image page and the image is extracted. Otherwise, the page is rasterized.
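
The three read modes can be summarized as a per-page decision. The sketch below is a simplified model of that decision, not the library's implementation: `ReadMode`, `Page`, `image_coverage`, and `plan_page` are hypothetical names introduced for illustration, and treating "occupying 90% of the page" as "coverage of at least 0.9" is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class ReadMode(Enum):
    # Hypothetical stand-ins for the epemImage / epemRaster / epemAuto values.
    IMAGE = "epemImage"
    RASTER = "epemRaster"
    AUTO = "epemAuto"


@dataclass
class Page:
    # Fraction of the page area covered by its largest embedded image
    # (0.0 when the page has no embedded image). Hypothetical field.
    image_coverage: float


def plan_page(page: Page, mode: ReadMode, threshold: float = 0.9) -> str:
    """Return "extract" or "rasterize" for one page under the given mode."""
    if mode is ReadMode.IMAGE:
        return "extract"      # look only for embedded images
    if mode is ReadMode.RASTER:
        return "rasterize"    # always render the page
    # AUTO: assumed rule, extract when an image covers at least `threshold`.
    return "extract" if page.image_coverage >= threshold else "rasterize"


pages = [Page(1.0), Page(0.0), Page(0.5)]
print([plan_page(p, ReadMode.AUTO) for p in pages])
# ['extract', 'rasterize', 'rasterize']
```

In this model a scanned page stored as a single full-page image is extracted as is, while a text page or a page with only a small embedded picture is rasterized.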