DIGITAL IMAGE PROCESSING
Over the past dozen years forensic and medical applications of technology first developed to record and transmit pictures from outer space have changed the way we see things here on earth, including Old English manuscripts. With their talents combined, an electronic camera designed for use with documents and a digital computer can now frequently enhance the legibility of formerly obscure or even invisible texts. The computer first converts the analogue image, in this case a videotape, to a digital image by dividing it into a microscopic grid and numbering each part by its relative brightness. Specific image processing programs can then radically improve the contrast, for example by stretching the range of brightness throughout the grid from black to white, emphasizing edges, and suppressing random background noise that comes from the equipment rather than the document. Applied to some of the most illegible passages in the Beowulf manuscript, this new technology indeed shows us some things we had not seen before and forces us to reconsider some established readings.
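The contrast stretching mentioned above can be sketched in a few lines. This is a minimal illustration only, assuming the digital image is a grid (list of rows) of integer brightness values; the function name `stretch_contrast` and the 0 to 255 output range are assumptions made for the example, not details from the project described.

```python
def stretch_contrast(image, out_min=0, out_max=255):
    """Linearly rescale brightness values so they span [out_min, out_max]."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A perfectly flat image has no contrast to stretch; return a copy.
        return [row[:] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (v - lo) * scale) for v in row] for row in image]


# A faded patch whose brightness only covers the narrow range 100..140.
faded = [[100, 110, 120],
         [110, 120, 130],
         [120, 130, 140]]
stretched = stretch_contrast(faded)
```

After stretching, the darkest value maps to black (0) and the brightest to white (255), so small brightness differences in the original become much easier to see.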
Introduction to Digital Image Processing:
• Vision allows humans to perceive and understand the world surrounding us.
• Computer vision aims to duplicate the effect of human vision by electronically perceiving and understanding an image.
• Giving computers the ability to see is not an easy task: we live in a three-dimensional (3D) world, and when computers try to analyze objects in 3D space, available visual sensors (e.g., TV cameras) usually give two-dimensional (2D) images. This projection to a lower number of dimensions incurs an enormous loss of information.
• In order to simplify the task of computer vision understanding, two levels are usually distinguished: low-level image processing and high-level image understanding.
• Low-level methods usually use very little knowledge about the content of images.
• High-level processing is based on knowledge, goals, and plans of how to achieve those goals. Artificial intelligence (AI) methods are used in many cases. High-level computer vision tries to imitate human cognition and the ability to make decisions according to the information contained in the image.
• This course deals almost exclusively with low-level image processing; high-level image processing is discussed in the course Image Analysis and Understanding, which is a continuation of this course.
Many of the techniques of digital image processing, or digital picture processing as it was often called, were developed in the 1960s at the Jet Propulsion Laboratory, MIT, Bell Labs, the University of Maryland, and a few other places, with application to satellite imagery, wire photo standards conversion, medical imaging, videophone, character recognition, and photo enhancement. But the cost of processing was fairly high with the computing equipment of that era. In the 1970s, digital image processing proliferated, when cheaper computers became available.
Imaging is the creation of a film or electronic image of any picture or paper form. It is accomplished by scanning or photographing an object and turning it into a matrix of dots (a bitmap), the meaning of which is unknown to the computer and known only to the human viewer. Scanned images of text may be encoded into computer data (ASCII or EBCDIC) with page recognition software (OCR).
• A signal is a function depending on some variable with physical meaning.
• Signals can be
o One-dimensional (e.g., dependent on time),
o Two-dimensional (e.g., images dependent on two co-ordinates in a plane),
o Three-dimensional (e.g., describing an object in space),
o Or higher dimensional.
Pattern recognition is a field within the area of machine learning. Alternatively, it can be defined as "the act of taking in raw data and taking an action based on the category of the data". As such, it is a collection of methods for supervised learning.
Pattern recognition aims to classify data (patterns) based on either a priori knowledge or on statistical information extracted from the patterns. The patterns to be classified are usually groups of measurements or observations, defining points in an appropriate multidimensional space.
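To make the idea of classifying points in a multidimensional space concrete, here is a small sketch of one simple supervised method, a nearest-centroid classifier. The choice of method, the function names `train_centroids` and `classify`, and the toy measurements are all assumptions made for illustration; the text above does not prescribe a particular classifier.

```python
import math


def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns {label: mean vector}."""
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(features, centroids):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Hypothetical two-dimensional measurements with known categories.
training = [([1.0, 1.0], "dark"), ([1.5, 2.0], "dark"),
            ([8.0, 9.0], "bright"), ([9.0, 8.0], "bright")]
centroids = train_centroids(training)
label = classify([2.0, 2.0], centroids)
```

The statistical information extracted from the patterns here is just the mean of each class; a new pattern is assigned to the class whose mean lies nearest.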
• The image can be modeled by a continuous function of two or three variables;
• Arguments are co-ordinates x, y in a plane, while if images change in time a third variable t might be added.
• The image function values correspond to the brightness at image points.
• The function value can express other physical quantities as well (temperature, pressure distribution, distance from the observer, etc.).
• The brightness integrates different optical quantities - using brightness as a basic quantity allows us to avoid the description of the very complicated process of image formation.
• The image on the human eye retina or on a TV camera sensor is intrinsically 2D. We shall call such a 2D image bearing information about brightness points an intensity image.
• The real world, which surrounds us, is intrinsically 3D.
• The 2D intensity image is the result of a perspective projection of the 3D scene.
• When 3D objects are mapped into the camera plane by perspective projection a lot of information disappears as such a transformation is not one-to-one.
• Recognizing or reconstructing objects in a 3D scene from one image is an ill-posed problem.
• Recovering information lost by perspective projection is only one, mainly geometric, problem of computer vision.
• The second problem is how to understand image brightness. The only information available in an intensity image is brightness of the appropriate pixel, which is dependent on a number of independent factors such as
o Object surface reflectance properties (given by the surface material, microstructure and marking),
o Illumination properties,
o And object surface orientation with respect to a viewer and light source.
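The claim that perspective projection is not one-to-one can be checked with a tiny sketch. It assumes an idealized pinhole camera at the origin looking along the z axis, with image co-ordinates x' = f x / z and y' = f y / z; the function name `project` and the focal length value are illustrative assumptions.

```python
def project(point, focal_length=1.0):
    """Perspective-project a 3D point (x, y, z) onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same ray through the camera centre.
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)
same_image_point = project(near) == project(far)
```

Both points land on the same image location, so the depth of the original point cannot be recovered from a single image without additional knowledge.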
Digital image properties:
Metric properties of digital images:
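• Distance is an important example.
• The distance between two pixels in a digital image is a significant quantitative measure.
• The Euclidean distance between pixels $(i, j)$ and $(h, k)$ is defined by Eq. 2.42: $D_E((i,j),(h,k)) = \sqrt{(i-h)^2 + (j-k)^2}$
o City block distance: $D_4((i,j),(h,k)) = |i-h| + |j-k|$
o Chessboard distance (Eq. 2.44): $D_8((i,j),(h,k)) = \max\{|i-h|, |j-k|\}$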
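The three distance measures can be compared directly in code. This sketch uses their standard definitions (straight-line length for Euclidean, sum of absolute co-ordinate differences for city block, maximum absolute co-ordinate difference for chessboard); pixels are assumed to be (row, column) tuples and the function names are chosen for the example.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two pixels."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def city_block(p, q):
    """Number of horizontal and vertical steps between two pixels."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def chessboard(p, q):
    """Number of steps when diagonal moves are also allowed."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


a, b = (0, 0), (3, 4)
distances = (euclidean(a, b), city_block(a, b), chessboard(a, b))
```

For the same pair of pixels the chessboard distance is never larger than the Euclidean distance, which in turn is never larger than the city block distance.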
• Pixel adjacency is another important concept in digital images.
• It will become necessary to consider important sets consisting of several adjacent pixels, called regions.
• A region is a contiguous set.
• Contiguity paradoxes of the square grid
• One possible solution to contiguity paradoxes is to treat objects using 4-neighborhood and background using 8-neighborhood (or vice versa).
• A hexagonal grid solves many problems of the square grids ... any point in the hexagonal raster has the same distance to all its six neighbors.
• The border of a region R is the set of pixels within the region that have one or more neighbors outside R ... inner borders, outer borders exist.
• An edge is a local property of a pixel and its immediate neighborhood; it is a vector given by a magnitude and direction.
• The edge direction is perpendicular to the gradient direction, which points in the direction of image function growth.
• Border and edge ... the border is a global concept related to a region, while edge expresses local properties of an image function.
• Crack edges ... four crack edges are attached to each pixel, which are defined by its relation to its 4-neighbors. The direction of the crack edge is that of increasing brightness, and is a multiple of 90 degrees, while its magnitude is the absolute difference between the brightness of the relevant pair of pixels. (Fig. 2.9)
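The definition of a region border given above translates almost word for word into code. In this sketch a region is assumed to be a set of (row, column) pixels, and the names `N4`, `N8` and `inner_border` are illustrative; it computes the inner border under either neighborhood so the effect of the choice can be seen.

```python
# Offsets to the 4-neighbors and 8-neighbors of a pixel on a square grid.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def inner_border(region, offsets=N4):
    """Pixels of the region that have at least one neighbor outside it."""
    return {(r, c) for (r, c) in region
            if any((r + dr, c + dc) not in region for dr, dc in offsets)}


# A 5x5 square region with one corner pixel removed.
region = {(r, c) for r in range(5) for c in range(5)} - {(0, 0)}
border4 = inner_border(region, N4)
border8 = inner_border(region, N8)
```

Pixel (1, 1) touches the missing corner only diagonally, so it belongs to the border under the 8-neighborhood but not under the 4-neighborhood. The border therefore depends on which adjacency is adopted.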
Topological properties of digital images:
• Topological properties of images are invariant to rubber sheet transformations. Stretching does not change contiguity of the object parts and does not change the number of holes in regions.
• One such image property is the Euler-Poincaré characteristic, defined as the difference between the number of regions and the number of holes in them.
• Convex hull is used to describe topological properties of objects.
• The convex hull is the smallest region which contains the object, such that any two points of the region can be connected by a straight line, all points of which belong to the region.
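The Euler-Poincaré characteristic (number of regions minus number of holes) can be computed for a small binary image as sketched below. The sketch assumes object pixels are 1 and background pixels are 0, treats objects with the 8-neighborhood and background with the 4-neighborhood to sidestep the contiguity paradoxes, and counts a hole as any background component that does not reach the outside of the image. The function names are illustrative.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(pixels, offsets):
    """Count connected components of a set of (row, col) pixels."""
    unvisited = set(pixels)
    count = 0
    while unvisited:
        count += 1
        queue = deque([unvisited.pop()])
        while queue:  # breadth-first flood fill of one component
            r, c = queue.popleft()
            for dr, dc in offsets:
                neighbor = (r + dr, c + dc)
                if neighbor in unvisited:
                    unvisited.remove(neighbor)
                    queue.append(neighbor)
    return count


def euler_characteristic(image):
    """Number of regions minus number of holes in a binary image."""
    rows, cols = len(image), len(image[0])
    objects = {(r, c) for r in range(rows) for c in range(cols) if image[r][c]}
    # Pad with a one-pixel frame so the outer background is a single component.
    padded = {(r, c) for r in range(-1, rows + 1) for c in range(-1, cols + 1)}
    background = padded - objects
    regions = count_components(objects, N8)
    holes = count_components(background, N4) - 1  # exclude the outer background
    return regions - holes


# A ring (one region with one hole) next to a solid bar (one region, no hole).
image = [[1, 1, 1, 0, 0],
         [1, 0, 1, 0, 1],
         [1, 1, 1, 0, 1]]
result = euler_characteristic(image)
```

Here there are two regions and one hole, so the characteristic is 1. Stretching or bending the shapes without tearing them would leave this number unchanged.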
A scalar function may be sufficient to describe a monochromatic image, while vector functions are needed to represent, for example, color images consisting of three component colors.
Further, surveillance by humans depends on the quality of the human operator, and a lot of factors, such as operator fatigue and negligence, may lead to degradation of performance. These factors can make an intelligent vision system a better option, as in systems that use gait signature for recognition or in vehicle video sensors for driver assistance.