Imaging Systems
Maja Kruegle
Author's Note

This book was written during the Fall 2013 semester at the Rochester Institute of Technology. Its contents grew out of class discussions and experiments in the Imaging Systems class facilitated by Nitin Sampat.
Contents

Fundamentals
Input
Processing
Output
Fundamentals Chapter One
Imaging Pipeline

The digital imaging process is essentially inspired by historical analog photography, including its darkroom processing and printing. The following pipeline is an overview of the imaging system as a whole and the procedures undertaken in order to create a digital image.
Input

Capture Image
Light enters the camera in the form of photons, which are then converted to electrons.

Sensor
The camera's sensor is made of light-sensitive material that acts as a light meter and records the amount of light at each pixel.

Analog-to-Digital Converter
Within the camera, the voltage collected at each pixel site is quantized into a numerical value. The number of possible digital counts is limited by the bit depth.

Raw Data Packaged in File
The binary information created by the ADC is packaged into a file format known as a "raw" file.

Processing

Opened in Raw Converter
The raw processor deciphers the image file from the given binary numbers. This step includes neutral balance, CFA interpolation ("demosaicing"), and gamma correction.

Artistic Corrections
After the photograph is initially processed by the raw converter, the photographer has the option to make further adjustments based strictly on aesthetics: exposure, brightness, contrast, sharpness, and color.

Output

Convert to Desired File Type
The image is converted to a TIFF, JPEG, or other file type depending on the intended output.

Raster Image Processor
The RIP converts a vector digital image into a high-resolution raster image. It is located in the host, the printer, or a separate unit.

Print
The image is sent through the printing device.
Image A
Histograms

In statistics, histograms graph the frequency with which something occurs. For a photograph, a histogram graphs the levels, or the dynamic range, along the x-axis and the pixel count along the y-axis. From looking at a histogram, a photographer can discern under- or overexposure, clipping, and posterization. Histograms can be viewed while shooting, on the back of a digital camera, or referenced while processing the image.

"Mountains" extending across the entire range indicate a wide tonal range. However, there is no "perfect histogram." Histograms provide vital information about the tonal range of a photograph, and it is up to the photographer to decide whether they want a high-, normal-, or low-key image. Images where most of the tones land in the highlights are considered "high key," while those with tones mostly in the shadows are "low key."

ETTR

In the context of raw digital capture, it is common practice for photographers to "expose to the right" of the histogram and make sure that all of the image information fits between the boundaries of the dynamic range of the detector.

This approach bases proper exposure on the highlights and, arguably, allows for less loss of image information in post-processing while maintaining a good signal-to-noise ratio. However, when operating under this method, it is important to be careful not to expose too far to the right, which results in clipped highlight information that cannot be recovered. Photographs may initially look too bright when exposing to the right, but this is intentionally left to be fixed during digital processing.

Above is the full RGB histogram for Image A. The image has been exposed to the right and has no clipping in the highlights or shadows. To the left, the histograms for the individual color channels are shown.
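The clipping check described above can be automated. Below is a minimal sketch (assuming an 8-bit RGB image; the file name is illustrative) that computes per-channel histograms with NumPy and reports how many pixels sit at the extremes:

```python
import numpy as np
from PIL import Image

# Load the image as an 8-bit RGB array (file name is illustrative).
img = np.asarray(Image.open("image_a.jpg").convert("RGB"))

for i, channel in enumerate("RGB"):
    counts, _ = np.histogram(img[..., i], bins=256, range=(0, 256))
    # Large spikes at level 0 or 255 indicate clipped shadows or highlights.
    print(f"{channel}: {counts[0]} pixels at 0, {counts[255]} at 255")
```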
The Four Resolutions
There are four kinds of resolution that must be explored on the topic of image quality. Together, these resolutions represent all of the information contained in an image. They are: spatial, tonal, spectral, and temporal.
Spectral resolution refers to the number of color channels. The most common example of this is RGB, a 3-channel color space made up of red, green, and blue channels.
Spatial resolution is a sampling of X by Y pixels on a 2D grid. It is calculated by multiplying the width and height of the sensor in pixels. For example, a one-megapixel camera has a sensor made up of 1024 x 1024 pixels. Spatial resolution is particularly important when discussing text, as type requires more spatial resolution than pictures do.

Temporal resolution refers to time and applies to video. It determines the refresh rate on a monitor: the higher the refresh rate, the smoother motion will appear.

Tonal resolution describes the amount of light that a particular sensor can capture. It also represents the number of gray levels contained in a particular image. Tonal resolution can also be referred to as the brightness or dynamic range of an image. Its units can be bit depth (bits/pixel), a contrast ratio, f-stops, or decibels.
Calculating Image Size

Understanding the four types of resolution allows you to calculate image size. The formula is as follows:

Size of 1 Image = Spatial Resolution x Tonal Resolution x Spectral Resolution x Temporal Resolution
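A worked example of the formula, using illustrative numbers (a 1024 x 1024 sensor, 8 bits per pixel, 3 color channels, and a single frame):

```python
# Illustrative numbers, not a specific camera.
width, height = 1024, 1024   # spatial resolution
bits_per_pixel = 8           # tonal resolution
channels = 3                 # spectral resolution
frames = 1                   # temporal resolution (1 for a still image)

size_bits = width * height * bits_per_pixel * channels * frames
print(size_bits / 8 / 2**20, "MB")  # -> 3.0 MB uncompressed
```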
Altering Spatial Resolution

The image size of a photograph can be altered through pixel elimination or replication, a process known as interpolation. When altering the image size of a photograph, Photoshop offers multiple resampling options for increasing or decreasing the size. The three most common interpolation techniques are bilinear, bicubic, and nearest neighbor, and there are advantages and disadvantages to each. Altering spatial resolution may result in interpolation artifacts: blocking, posterization, aliasing, stair-stepping, or haloing may occur.

Nearest neighbor interpolation protects the integrity of the photograph's data, but often results in undesirable jaggedness. However, it requires the least processing time. Bilinear interpolation averages the image information by rows and then columns, and often results in a softening of details. Bicubic interpolation takes the most time to process of the three methods, but results in an image with the best balance between smoothness and sharpness. A code sketch of the three methods follows the image captions below.
Image resized using nearest neighbor interpolation and viewed at 200%
Image resized using bilinear interpolation and viewed at 200%
Image resized using bicubic interpolation and viewed at 200%
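A minimal sketch of the three resampling methods using Pillow's resize filters (the input file name is illustrative):

```python
from PIL import Image

img = Image.open("photo.jpg")  # illustrative file name
double = (img.width * 2, img.height * 2)

nearest = img.resize(double, Image.NEAREST)    # fastest, jagged edges
bilinear = img.resize(double, Image.BILINEAR)  # row/column averaging, softer
bicubic = img.resize(double, Image.BICUBIC)    # slowest, best smooth/sharp balance
```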
A black-to-white gradient is converted to 3, 4, 6, and 8 bits, visually illustrating the effects of higher and lower tonal resolutions.

Above is a visual comparison of tonal resolutions for a color image. A human eye can only see up to around 7 bits (or 128 levels). Therefore, the most practical tonal resolution to print at for an acceptable color image is 8 bits/pixel.
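The gradient comparison can be reproduced in a few lines. A minimal sketch that quantizes an 8-bit ramp down to n bits with NumPy:

```python
import numpy as np

# A black-to-white ramp, 64 rows tall.
gradient = np.tile(np.arange(256, dtype=np.uint8), (64, 1))

def quantize(img, bits):
    levels = 2 ** bits
    step = 256 / levels
    # Snap each pixel to one of 2**bits gray levels spread over 0-255.
    return (np.floor(img / step) * (255 / (levels - 1))).astype(np.uint8)

for bits in (3, 4, 6, 8):
    print(bits, "bits ->", len(np.unique(quantize(gradient, bits))), "levels")
```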
Input Chapter Two
Capturing an Image
CCD and CMOS Sensors
Before digital sensors were created, photographic film was used to detect and store an analog image. In the analog system, a latent image was created on the film, which was then developed into a fixed negative or positive. In our digital world, camera sensors serve the same purpose as analog film: light hits a photosite, which then generates a voltage.
Both CCD and CMOS sensors utilize a color filter array (CFA), an arrangement of RGB color filters, in order to collect their color information. Each pixel site collects a luminance reading in either a red, green, or blue channel. The difference between these two sensor types lies in the method by which the data is collected.
CCDs (charge-coupled devices) are smaller sensors that are often used in compact cameras and cell phones. They are more light sensitive than CMOS sensors, which makes them less susceptible to noise. In general, CCDs utilize a "bucket brigade" or "conveyor belt" process in order to transfer charge. Interline-transfer CCDs have transfer shift registers along each column, while frame-transfer CCDs transfer each row incrementally. The frame-transfer method is used in still photography, while interline transfer is used in camcorders.
A visual of the "conveyor belt" process utilized in CCD sensors.
CMOS (complementary metal oxide semiconductor) sensors are larger than CCDs and are commonly used in DSLR cameras. They use less power and instantaneously collect luminance information through integrated circuitry. However, they are less light sensitive, which makes them more susceptible to noise, so microlenses are utilized to increase light sensitivity. Early versions also had issues with dynamic range. Thankfully, technological advances in recent years have allowed the light sensitivity of CMOS sensors to match that of a CCD sensor.
CMOS sensors utilize integrated circuitry to obtain data more quickly.
Foveon Sensor

A Foveon sensor is a "direct image sensor" made up of three layers of pixels that simultaneously capture all of the color information. This technology is possible due to silicon's inherent ability to absorb different colors at different depths. A Foveon sensor uses a standard CMOS semiconductor process. The necessary processing power is reduced by this method because no interpolation is necessary. Manufacturers claim that these sensors result in more vivid colors and sharper images.
Silicon absorbs different colors at different depths, allowing a Foveon sensor to take in RGB color information at every photosite.
Foveon sensor pixels collect color information for all three RGB channels.
Bayer Pattern

The Bayer pattern is a version of the color filter array that is used in CCD and CMOS sensors. Red, green, and blue filters are arranged in a particular pattern on a photosensor. There are twice as many green filters as red or blue in order to mimic the color sensitivity of the human eye. The raw image is referred to as a Bayer-pattern image. However, this process collects only a third of the color data, so a demosaicing algorithm is used to fill in the missing values.
Red, green, and blue filters separate the visible light so that each photosite in a sensor collects color information for only one of the RGB channels.
There are twice as many green filters as red and blue in order to mimic the color sensitivity of a human eye.
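Demosaicing can be sketched as a simple bilinear interpolation over the mosaic. Below is a minimal illustration; the RGGB layout, the kernel, and the random stand-in image are assumptions for the demo, not a production algorithm:

```python
import numpy as np
from scipy.signal import convolve2d

def bayer_masks(h, w):
    """Binary masks marking which RGGB channel each photosite records."""
    r = np.zeros((h, w)); g = np.zeros((h, w)); b = np.zeros((h, w))
    r[0::2, 0::2] = 1                      # red on even rows/cols
    g[0::2, 1::2] = 1; g[1::2, 0::2] = 1   # green sampled twice as often
    b[1::2, 1::2] = 1                      # blue on odd rows/cols
    return r, g, b

def demosaic_channel(mosaic, mask):
    """Bilinear fill-in: weighted average of the known neighboring samples."""
    k = np.array([[1., 2., 1.], [2., 4., 2.], [1., 2., 1.]])
    return convolve2d(mosaic * mask, k, mode="same") / convolve2d(mask, k, mode="same")

rgb = np.random.rand(8, 8, 3)                                   # stand-in scene
masks = bayer_masks(8, 8)
mosaic = sum(rgb[..., i] * m for i, m in enumerate(masks))      # one sample per site
full = np.dstack([demosaic_channel(mosaic, m) for m in masks])  # 3 channels recovered
```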
Camera Lenses
Cameras do not inherently require lenses in order to create an image. However, the optics of a lens can bring a specific point of the scene into focus. In addition, lenses result in an overall sharper image and create depth of field.

There are positive and negative lenses. A positive lens has a thicker center than edges and can form an image on its own. Negative lenses are thinner at the center than at the edges and must be paired with other lenses in order to create an image. Lenses have varying focal lengths that control magnification, the angle of view, and perspective. The longer the lens, the more magnification occurs; the shorter the lens, the wider the angle of view. Perspective refers to the relative size of objects in a scene.

Normal Lens

Different cameras and sensor sizes have different normal focal lengths. A "normal" lens for a particular camera is based on the normal angle of view of the human eye, which is about 47 degrees. For example, a 50mm lens is considered "normal" for a full-frame DSLR.
Telephoto Lens

A telephoto or long-focal-length lens has a greater focal length than a normal lens. This results in increased magnification, compressed perspective, and a shallow depth of field.
Wide Angle Lens

A wide-angle or short-focal-length lens has a smaller focal length than a normal lens. This results in decreased magnification, a wide angle of view, and deep depth of field. However, wide-angle lenses tend to suffer from distortion.
Zoom Lens

A zoom or variable-focal-length lens can shoot at a range of focal lengths in one lens. Lens elements are moved internally in order to produce the different focal lengths. There are disadvantages to zoom lenses: for example, the pictures produced will never be as sharp as those from a prime (fixed-focal-length) lens, and zooms are often not as well corrected.
Macro Lens

Macro lenses are used to focus at a very short lens-to-subject distance and are corrected for flatness of field at that distance. A "true" macro lens offers 1:1 magnification or greater; the term "macro" is often misused by manufacturers.
Supplementary Lens

Supplementary lenses are additional lenses that can be screwed onto the front of a main lens. They are often measured in diopters and are a cheap way of increasing focal length. However, the amount of correction for distortion is limited.
Processing Chapter Three
File formats act as storage containers for the binary information that makes up an image.

Raw File

A raw file refers to the mosaiced image that comes directly out of the camera. It must undergo demosaicing, gamma correction, and noise reduction before it can be viewed as a photograph on a display. Raw files require raw engines to interpret their data, and they generally range in size from about 10 MB to 40 MB. Changes applied during processing are stored non-destructively as metadata instructions, which allows for more manipulation without image degradation.

Proprietary raw formats are created by the different camera manufacturers; examples of these extensions are NEF, CR2, ORF, etc. Essentially, the metadata for the raw files is organized differently, and the formats are often "locked" in order to prevent non-first-party software from writing to them. The Digital Negative (DNG) is an attempt at standardizing these proprietary formats.

JPEG

JPEG stands for Joint Photographic Experts Group, the organization that created this format. JPEGs are generally 8 bits per color and range in file size from 500 KB to 8 MB. Generational degradation occurs with repeated saves. While shooting, photographers have the option of creating in-camera JPEGs: these files take a 12-bit raw capture and compress it in camera, which allows for a faster write speed. JPEGs can help speed up a photographer's workflow, but editing the images quickly results in compression artifacts.

TIFF

TIFF stands for tagged image file format. Images saved this way can be 8-bit or 16-bit and utilize either lossy or lossless compression. Lossy compression discards information that can become visible, while lossless compression reduces the file size without losing image quality. This is possible due to statistical redundancy, such as inter-pixel or psycho-visual redundancy.
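The generational degradation of JPEGs mentioned above is easy to demonstrate. A minimal sketch (file names and the quality setting are illustrative) that re-saves the same image repeatedly, accumulating compression error with each generation:

```python
from PIL import Image

img = Image.open("original.tif").convert("RGB")  # illustrative source file
for generation in range(10):
    img.save("gen.jpg", quality=75)  # lossy save discards detail each time
    img = Image.open("gen.jpg")      # reload the degraded copy and repeat
```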
Parametric Image Editing (PIE)

Parametric editing has become more and more common in digital image processing. It answers the need to work non-destructively (even with JPEGs or TIFFs), to apply identical adjustments to a variety of photographs (batch processing), and to interpret single images in multiple ways (virtual copies). This method can speed up workflow and processing time as well as reduce storage. Examples of software that use parametric image editing are Adobe Camera Raw, Adobe Lightroom, Capture One, Aperture, etc.

Bitmap vs. Vector

Images stored as pixels with a set resolution are referred to as bitmap images. In this case, the larger the pixel, the less detail the image has. Photographs are always in a bitmap format.

Vector images, on the other hand, are stored as mathematical descriptions. Each time they are opened, they are drawn by following a set of mathematical instructions. They therefore have no set resolution and can be scaled to any size, which makes the format primarily suited to text and graphics, not photographs.

The left "A" is a vector image, while the right "A" is a bitmap image whose size has been increased.
This histogram describes the properly exposed image above it. The image information is all contained within the boundaries of the dynamic range of the detector.
This histogram describes an overexposed image. The highlights are clipped and the histogram hangs off the right edge of the dynamic range.

This histogram describes an underexposed image. The histogram hangs to the left of the dynamic range.

This histogram describes a high-contrast image. The image information peaks in the black and white areas of the histogram.
Noise Reduction

Image noise is a random pattern of variation in brightness or color information, often produced by the sensor or circuitry of a digital camera. In general, noise is considered undesirable, and photographers will do their best to reduce it in post-processing. Reducing noise in a digital image is the equivalent of blowing the dust off of film, in the sense that it "cleans off" the image file. Most photo-editing software offers various noise reduction functions that can be applied to an image.
Image A: Original photograph
Median Filter

Median filtering is a means by which "salt and pepper" noise can be reduced in a digital image while trying to maintain edges.

Image B: Noise has been added to Image A in Photoshop. Viewed at 70% magnification.
Image C: A median filter was applied to Image B. Viewed at 70% magnification.
Noise has been added to the original image (A) for illustration. A median filter is applied to Image B, which results in Image C. A side effect of median filtering is that the image adopts a painting-like visual quality.
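A minimal sketch of the Image B to Image C step using SciPy's median filter; the noisy input here is a random stand-in for an actual photograph:

```python
import numpy as np
from scipy.ndimage import median_filter

noisy = np.random.randint(0, 256, (100, 100)).astype(np.uint8)  # stand-in image
# Each output pixel becomes the median of its 3x3 neighborhood, which
# suppresses isolated salt-and-pepper outliers while preserving edges.
cleaned = median_filter(noisy, size=3)
```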
Image Sharpening

Image sharpness is a big concern for photographers, and sharpness is enhanced in different ways throughout the imaging pipeline. Artistic sharpening, however, is applied during image processing and can be accomplished through various methods. Two of the most common are convolution kernels and unsharp masking. Photoshop also offers High Pass filter and Smart Sharpen methods.

Convolution Kernels

Convolution kernels are small matrices that can be applied to an image in order to achieve blurring, sharpening, edge detection, etc. The scale of a kernel refers to its number of pixels. The offset is a number added to or subtracted from all of the pixels that can increase or decrease the entire effect.

The output value for each pixel is calculated by multiplying each original value by the corresponding kernel value and adding them all together. The kernel's origin corresponds with the input image pixel. A blur filter has an origin value surrounded by all positive numbers; a sharpen filter has a positive origin surrounded by all negative numbers.

The center element of the kernel is placed over the source pixel, which is then replaced with a weighted sum of itself and nearby pixels. http://tinyurl.com/lhrb7ls

Editing in L*a*b* Mode

Due to the human eye's heightened sensitivity to lightness over color, the a* and b* channels of a photograph can be altered without much noticeable change. Noise reduction and sharpening can be applied successfully in this way.
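A minimal sketch of applying a 3x3 sharpen kernel of the kind described above (positive origin, negative neighbors) with SciPy; the image is a random stand-in:

```python
import numpy as np
from scipy.signal import convolve2d

sharpen = np.array([[ 0., -1.,  0.],
                    [-1.,  5., -1.],
                    [ 0., -1.,  0.]])  # weights sum to 1, preserving brightness

image = np.random.rand(64, 64)  # stand-in grayscale image in [0, 1]
# Each output pixel is the weighted sum of itself and its neighbors.
result = np.clip(convolve2d(image, sharpen, mode="same", boundary="symm"), 0, 1)
```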
Image A
Unsharp Mask

Unsharp masking essentially exaggerates differences in brightness along the edges within an image. It does not have the same result as procuring a sharp image in camera, but it gives photographers the ability to create the appearance of more defined edges as they process their image. This method is especially helpful because it is less prone to sharpening noise. In this example, Image A was sharpened using an unsharp mask. This sharpening method works because the higher frequencies in an image correspond to detail, or change of brightness.
To achieve this effect, a high-frequency mask (D) is added to the original image (B). The mask is created by subtracting a blurred version of the image from the original (B - C), which results in an image containing only the high frequencies of the original (D). Adding the mask back to the original doubles the amount of high frequencies, producing the sharpened image (A).
Image B: This is the original, unsharpened image.

Image C: A Gaussian blur has been applied in order to create a blurred version of the original image.

Image D: This is the high-frequency mask, created by subtracting the blurred image from the original.
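A minimal sketch of the steps above, with a random stand-in for Image B: blur to get C, subtract to get the mask D, then add the mask back:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

original = np.random.rand(64, 64)             # Image B (stand-in)
blurred = gaussian_filter(original, sigma=2)  # Image C: blurred copy
mask = original - blurred                     # Image D: high frequencies only
sharpened = np.clip(original + mask, 0, 1)    # Image A: edges exaggerated
```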
Color Spaces
A color space is a system of names or numbers by which the colors in an image are defined. Color spaces describe the spectral resolution of an image. All colors that we see can be created by mixing the three primary colors; therefore, an RGB color space is made up of red, green, and blue channels. RGB utilizes additive color and is generally used when viewing images on monitors. In contrast, CMYK color spaces are comprised of cyan, magenta, yellow, and black (key) channels. CMYK utilizes subtractive color and is most commonly used in the context of printing. These are the two most common color spaces in the imaging pipeline.
However, other color spaces, such as Pantone and CIELAB, also serve specific purposes. The Pantone system is a standardized color reproduction system in which specific color swatches are assigned numbers and names so that they can be matched regardless of the equipment being used. Pantone colors are used in a variety of industries, such as the manufacturing of paints, fabrics, and plastics. The CIELAB (or L*a*b*) color space, on the other hand, maps colors into a "master" space of all perceivable colors: L* represents lightness, while a* and b* constitute color-opponent dimensions. This is a more formal way of mathematically pinpointing colors.
An image in a RGB color space is comprised of red, green, and blue channels.
An image in a CMYK color space is comprised of cyan, magenta, yellow, and black channels.
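The relationship between the additive and subtractive spaces can be sketched with the naive RGB-to-CMYK formula below; it ignores real ink behavior and ICC profiles, and values are assumed to be in [0, 1]:

```python
import numpy as np

def rgb_to_cmyk(rgb):
    k = 1 - rgb.max(axis=-1, keepdims=True)          # black (key) channel
    cmy = (1 - rgb - k) / np.where(k < 1, 1 - k, 1)  # guard against divide-by-zero
    return np.concatenate([cmy, k], axis=-1)

print(rgb_to_cmyk(np.array([1.0, 0.0, 0.0])))  # pure red -> C=0, M=1, Y=1, K=0
```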
Output Chapter Four
PPI

Pixels are the smallest elements of an image that can display different "gray levels" and that can be controlled. Pixels are composed in a series of rows and columns, and their size depends upon the size of the image they make up. If a device is capable of handling multiple gray levels, it should be described in PPI.

The "pixels per inch," or PPI, of an image describes the number of pixels that a device can display or print in an inch. In the printing world, PPI is often referred to as "lines per inch," or LPI, a concept that developed from describing screens as having a certain number of lines per inch. This frequency is fixed because the line passes through the centers of the pixels, no matter their size or shape.

DPI

Dots are the smallest binary elements that can be generated by a device. DPI refers to the spatial density of the dots that can be printed or displayed within an inch.

DPI, or "dots per inch," is used when outputting images on binary printers such as inkjet and laser. However, DPI is often misused within the photography industry: many manufacturers describe the capacities of their products in DPI when they are actually referring to PPI. For example, DPI in reference to a CCD scanner, digital camera, monitor, or continuous-tone printer should really be PPI.

PPI and DPI Conversion

PPI = DPI / (dots per pixel)
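For example (with illustrative numbers), a binary printer rated at 2400 DPI that dedicates 16 dots to each pixel can render 2400 / 16 = 150 PPI.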
Halftoning

Continuous-tone images printed from a digital file must first be converted to halftone in order to be printed by a binary device; this is essentially converting an image from PPI to DPI. In halftoning, spatial resolution is sacrificed in order to gain greater tonal resolution: the continuous tone is broken down into a series of dots, with darker areas containing a higher concentration of dots or larger dots. There are various types of halftone "screens" that can be applied to images, varying in the direction of the lines. Printers have found that a screen applied at an angle is less noticeable than one that is either vertical or horizontal, because of the human eye's sensitivity to vertical and horizontal lines.
Stochastic Halftoning

Stochastic halftoning, also known as FM halftoning, is a method that has become more common in modern desktop inkjet printers. When this process is used, the dots are distributed randomly within a pixel instead of being clustered around a single dot. Either fixed-size dots or dots of variable sizes can be used. This technology helps avoid moiré patterns, but is prone to looking grainier, especially in the highlights. Most desktop printers utilize this method, while lasers generally use AM screening. http://tinyurl.com/kxktmkl
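The difference between a hard threshold and stochastic (FM) dithering can be sketched in a few lines; the gray ramp is a stand-in for a continuous-tone image:

```python
import numpy as np

gray = np.tile(np.linspace(0, 1, 256), (64, 1))  # continuous-tone ramp

# A fixed threshold destroys the tonal ramp entirely.
fixed = (gray > 0.5).astype(np.uint8)

# Random per-pixel thresholds scatter binary dots whose local density
# tracks the original tone: FM halftoning, moire-free but grainy.
stochastic = (gray > np.random.rand(*gray.shape)).astype(np.uint8)
```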
A halftone filter is applied (bottom) to the original color image (top) in order to illustrate how a binary printer may feign continuous tone in a printed image.
Color Halftoning

Color printing is made possible with the use of the CMYK, or subtractive, color space. The process of color halftoning is the same as for a black-and-white image, except that the steps are repeated for each color channel. The small, layered dots are smoothed by the human eye and create the illusion of a continuous-tone color image. Distracting effects can occur in color halftoning, most commonly emphasized edges or moiré patterning. Therefore, the CMYK color channels are oriented at different screening angles in order to avoid moiré patterns. There are multiple variations of these angles depending on the printer and desired effect.
http://tinyurl.com/lp5s4kt
Magnified view of original and halftoned image.
For color halftoning, the CMYK colors are printed individually and at varying screening angles in order to avoid moiré patterning.
http://tinyurl.com/bps9qyz
Analog Printing

Analog printers still have their place in the printing world when it comes to speed. However, analog printing is typically a more expensive process because a plate is made for every page, which makes it necessary to print large quantities of the same thing for a run to be cost effective.
http://tinyurl.com/mwqyzu3
Digital Printing

Digital printing is a slower process than analog, but offers greater flexibility. The analog "plate" is replaced by a computer file. The three major digital printer types are inkjet, laser, and dye-sublimation, and printers may use dyes, pigments, or toners. Dyes are water soluble, so pigments are better for archival purposes.
http://tinyurl.com/bps9qyz
http://tinyurl.com/ov5ca5d
RIP and Marking Engine

RIP stands for "raster image processing" and is a means by which vector digital image information is converted into a high-resolution raster image. A RIP translates the page language, halftones the images, and implements color management. The RIP may reside in the host, in the printer, or as a separate unit.

RIPs operate in three stages. First, the PDLs (page description languages) are interpreted and translated into a representation of each page. Second, the representation is rendered into a continuous-tone bitmap. Finally, the continuous-tone bitmap is converted into a halftone image for printing.

Scanner to Printer

In order to calculate the proper resolution for scanning an image for print, the conversion between PPI and DPI must be taken into account. A scanned image should be discussed in PPI and must be converted to DPI in order to calculate the proper scanning resolution.

If the scale of the original scan and the output size are known, the following formula can be used to determine the scanning resolution:

Scanning Resolution = Output Resolution x Magnification x 2
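As an illustrative example: to print at 300 PPI with a 2x enlargement of the original, the formula gives a scanning resolution of 300 x 2 x 2 = 1200 PPI.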
Laser

Laser printers transfer toner to a paper base, where it is then fused in place. This method results in better light- and water-fastness than ink. Text quality is high, but photo quality is poor due to halftoning. The images are created by means of a laser beam traveling across the printer's photoreceptor.
http://tinyurl.com/ov5ca5d
Inkjet

Inkjet printers create images by dropping very small droplets of ink onto photo paper. The printers use either "drop-on-demand" or "continuous" inkjet technology and can utilize either dyes or pigments. This is the most common printer type for digital image output.
http://tinyurl.com/q9ktpov
Dye-Sublimation

Dye-sublimation printers use heat to transfer colorant from a donor ribbon to the final print. This process results in a continuous-tone image. However, the media and printers are more expensive, slower, and have restricted output sizes. http://tinyurl.com/py4vm89
Display Technologies

There is a variety of display types and monitors that photographers may use to present their images when printing is not the intended output. Characteristics to consider when comparing the visual qualities of displays are resolution (PPI), color gamut, bit depth, contrast ratio (black point), response time, and viewing angles. In addition, one might consider the aspect ratio, size, and power consumption of the device. Digital display is becoming increasingly essential to the photographic workflow.

LCD

LCD (liquid crystal display) screens use RGB filters over a cell of transparent liquid-crystal material. A backlight shines through the LCD in order to create color. One issue with LCDs is that the crystals may get stuck over time, so turning LCDs off when not in use is highly suggested. Another issue is that the highly saturated color filters are very dense and can result in a dim display; most LCDs produce around 60% of the standard color gamut in order to maximize screen brightness and battery running time.
OLED

OLED (organic light emitting diode) displays apply current to organic compounds, which luminesce. OLEDs can be printed by inkjet or screen printers, are lightweight and flexible, have a wide viewing angle, are very bright, and have a faster response time than LCDs. Two other important advantages are that no backlight is necessary, which results in deep black levels and a lighter screen. However, they are expensive to build, have short lifespans, and often develop color casts (due to blue emitters degrading more quickly than red and green).
AMOLED

AMOLED (active matrix OLED) displays are typically used for mobile phones, media players, and digital camera backs.
Plasma

Plasma displays create light by sparking a plasma discharge that emits ultraviolet light. This type of display allows for bright, saturated colors that are not dependent on viewing angle the way an LCD's are. Plasma screens also produce deep blacks, have less motion blur, and offer a higher refresh rate. The disadvantages are that they have low resolutions, are expensive, and age quickly.