Received June 9, 2020, accepted June 24, 2020, date of publication June 26, 2020, date of current version July 7, 2020. Digital Object Identifier 10.1109/ACCESS.2020.3005249
Low-Light Image Enhancement Using Volume-Based Subspace Analysis
WONJUN KIM 1, (Member, IEEE), RYONG LEE 2, MINWOO PARK 2, SANG-HWAN LEE 2, AND MYUNG-SEOK CHOI 2
1 Department of Electrical and Electronics Engineering, Konkuk University, Seoul 05029, South Korea
2 Research Data Sharing Center, Korea Institute of Science and Technology Information, Daejeon 34141, South Korea
Corresponding author: Ryong Lee (ryonglee@kisti.re.kr)
This work was supported by a Research and Development project, ‘‘Enabling a System for Sharing and Disseminating Research Data,’’ of the Korea Institute of Science and Technology Information (KISTI), South Korea, under Grant K-20-L01-C04-S01.
The associate editor coordinating the review of this manuscript and approving it for publication was Guitao Cao.
ABSTRACT Low-light image enhancement is a key technique to overcome the quality degradation of photos taken under challenging illumination conditions. Even though significant progress has been made in enhancing poor visibility, the intrinsic noise amplified in low-light areas still remains an obstacle to further improvement in visual quality. In this paper, a novel and simple method for low-light image enhancement is proposed. Specifically, a subspace, which has the ability to separately reveal illumination and noise, is constructed from a group of similar image patches, the so-called volume, at each pixel position. Based on the principal energy analysis of this volume-based subspace, the illumination component is accurately inferred from a given image while the unnecessary noise is simultaneously suppressed. This leads to clearly unveiling the underlying structure in low-light areas without loss of details. Experimental results show the efficiency and robustness of the proposed method for low-light image enhancement compared to state-of-the-art methods.
INDEX TERMS Low-light image enhancement, quality degradation, subspace, volume-based principal energy analysis, illumination component.
I. INTRODUCTION
With the rapid development of mobile devices equipped with cameras, especially smartphones, a vast number of photos are taken and shared every day. However, due to complicated lighting conditions in real-world environments, e.g., uneven illumination, backlight, and cast shadows, acquired images are often underexposed and poorly visible, severely degrading the user’s viewing experience. Furthermore, the loss of details and color distortions in such degraded images lead to significant performance drops in downstream computer vision applications such as object detection, tracking, and segmentation, which demand high-quality inputs for precise results. To tackle this problem, diverse methods for low-light image enhancement have been introduced over the last decades, which can be mainly categorized into two groups: the statistical information-based approach and the decomposition-based approach. The former stretches the dynamic range of the
low-light image by modifying the distribution of intensity values, whereas the latter utilizes the physical model of light reflection to adjust illumination components independently from scene structures. In the early stage, statistical information-based methods, such as histogram equalization (HE) and its variants [1], [2], were widely adopted due to their simplicity and effectiveness in enhancing low contrast. However, images restored via these methods are likely to be partially exaggerated or weakly enhanced under uneven lighting conditions. On the other hand, inspired by the Retinex theory [3], which assumes that the given scene can be regarded as the product of illumination and reflectance, decomposition-based methods have been actively studied. Most algorithms belonging to this category attempt to separate lighting components from a given scene and subsequently combine adjusted illuminations back with the reflectance layer to generate enhanced results. To do this, various optimization techniques, e.g., the variational framework [4] and alternating direction minimization [5], have been adopted to accurately estimate the illumination component by minimizing the difference between target and estimated results with
constraints regarding the general properties of illumination. Even though the visual degradation due to unnaturalness such as over-exaggeration and halo artifacts is successfully resolved, such methods still have respective limitations in estimating the highly nonlinear process of illumination decomposition. Most recently, several researchers have started to adopt deep neural networks (DNN) for low-light image enhancement by using expert-retouched image pairs as training data [7], [8]. Specifically, the problem of image enhancement can be re-formulated as a problem of image generation, and thus DNN-based generative models began to be actively applied to this task without any consideration of the physical model. However, one clear limitation of DNN-based methods, as expected, is that the performance is strongly dependent on the expert’s supervision, which is highly subjective. Furthermore, their supervised learning strategies often fail to provide satisfactory results when examples unseen during training are given at test time. In this paper, a novel yet simple method for low-light image enhancement is proposed. Based on the general principle that similar image patches share similar lighting conditions, we propose to exploit the principal energy of the subspace, which is generated from a group of similar image patches (the so-called volume) in a least-squares sense, as the illumination component. By back-projecting the volume restored via the principal energy into the image space (i.e., the original location of each image patch), the illumination component is accurately estimated with abundant contextual information. One important advantage of the proposed method is that the noise-reduced image, which is helpful for suppressing unnecessary noise in the reflectance, can also be obtained simultaneously during the volume-based subspace analysis. After that, the illumination image is adjusted by a simple operation, e.g., Gamma correction, and re-combined with the reflectance, which is separated from the noise-reduced image according to the Retinex theory, to generate the enhancement result. The main contributions of this paper can be summarized as follows:
• We propose to exploit the principal energy computed from the subspace of the volume as the illumination component. Compared to single local patch-based approaches [9], [15], the illumination component can be reliably isolated from the other components in the subspace since the subspace is constructed by utilizing more contextual information from multiple image patches.
• The proposed method also applies an adaptive truncation scheme to the volume-based subspace for suppressing unnecessary noise amplification in low-light areas during the enhancement process.
• The performance of the proposed method is evaluated on various benchmark datasets. Experimental results show that the proposed method provides reliable enhancement results compared to previous approaches. It is noteworthy that the performance of the proposed method is also comparable to that of DNN-based state-of-the-art methods.
The remainder of this paper is organized as follows. A comparative review of previous methods for low-light image enhancement is provided in Section II. The technical details of the proposed method are explained in Section III. Experimental results on benchmark datasets are demonstrated in Section IV. The conclusions follow in Section V.
II. RELATED WORK
A. STATISTICAL INFORMATION-BASED METHODS
In the beginning, many researchers developed histogram-based adjustment schemes, which are conceptually simple and easy to implement. In particular, the adaptive equalization technique with a locally-clipping constraint [1] has shown successful results for image enhancement in medical as well as natural images. Inspired by the remarkable performance of [1], several studies concentrated on adaptively adjusting the dynamic range of pixel intensities by efficiently imposing data-driven properties on the histogram of the original input image. For example, Lee et al. [10] utilized the layered difference in the two-dimensional histogram, which is constructed based on the relation between neighboring pixels, for efficiently stretching the narrow-shaped histogram. Raju and Nair [11] applied fuzzy membership to the process of histogram modification to mitigate the ambiguity driven by the quantization effect in the histogram. Such histogram-based approaches are effective and fast; however, their performance is limited due to the lack of spatial information.
On the other hand, based on the Retinex theory [3] that the pixel value can be decomposed into illumination and reflectance components, many researchers have attempted to adjust the lighting effect independently of the underlying structure. As an example, the Retinex output [12], [13], which is obtained by computing the difference between the original input and its smoothed version in the log domain, has been popularly employed as the enhancement result. However, the decomposition task is basically ill-posed (i.e., the ground truth for illumination and reflectance is not given). To alleviate this problem, various optimization techniques have been adopted with constraints related to general properties of illumination components. Kimmel et al. [14] proposed to define the energy function with two penalty terms, i.e., smoothness for illumination and piece-wise constancy for reflectance, in a variational framework. Although their optimization scheme provides prominent enhancement results, the log transformation of the penalty terms magnifies noise, especially in highly textured regions, due to its derivative process. To cope with this limitation, Fu et al. [4] devised a weighted variational model, which is able to simultaneously estimate reflectance and illumination components without noise amplification during the enhancement process. Guo et al. [5] initialized the illumination map by selecting the maximum value among the RGB channels at each pixel position
and consistently updated it by imposing a structure prior on their objective function, which is solved by the alternating direction minimization technique. Ying et al. [6] investigated the relationship between two images acquired with different exposures for estimating illumination. Besides, the illumination component at each pixel position can be estimated via the principal energy of the intensity lattice defined in a small local region [15]. Even though textural details are successfully restored in the reflectance component by decomposition-based approaches, visually unpleasant effects such as halo artifacts and intense noise still occur in low-light areas of the enhancement results.
C. LEARNING-BASED METHODS
Most recently, owing to the great success of deep neural networks (DNN), several researchers have started to apply DNN-based generative models to resolve the problem of image enhancement. Specifically, Ignatov et al. [16] utilized stacked residual blocks [17] to generate the enhancement results from the latent feature space. To train their network, loss functions are designed with consideration of four factors, i.e., color similarity, illumination smoothness, type of contents, and naturalness. Gharbi et al. [18] proposed to embed bilateral grid processing and local affine color transforms into the DNN architecture for reproducing enhancement results. Park et al. [19] modeled the human retouching process based on a deep reinforcement learning technique. Chen et al. [7] proposed to adopt the two-way generative adversarial network (GAN) [20], which is trained to develop the ability of generation by tightly maintaining the consistency between the original input and the one re-generated from the enhancement result. Wang et al. [8] attempted to infer the illumination layer by utilizing the ensemble of globally and locally encoded features instead of directly generating the enhancement result, to efficiently cover diverse variations of lighting conditions. Even though learning-based approaches bring significant improvements in enhancement performance even in complicated real-world situations, as expected, the performance of such methods is inevitably dependent on the subjective retouching process. Moreover, the performance drops significantly when images that differ from the training samples (i.e., unseen data) are provided at test time.
III. PROPOSED METHOD
The key idea of the proposed method is to estimate the illumination component of each pixel from the subspace, which is constructed based on a group of similar image patches, i.e., the volume, sharing the lighting condition as well as the underlying structure. Compared to a single image patch, the volume-based subspace contains more contextual information and thus reveals the illumination component more accurately. Since the lighting effect is dominant within a small local area, the principal energy of the volume-based subspace gives a good approximation of the illumination component. Furthermore, by adaptively
truncating the energies distributed along each basis of this volume-based subspace, the noise-reduced image can also be restored simultaneously, which is helpful to clearly separate out the reflectance. This is fairly desirable to suppress unnecessary noise amplification in enhancement results. Technical details of the proposed method will be explained in the following subsections.
A. VOLUME-BASED SUBSPACE ANALYSIS
First of all, the volume is defined at each pixel position by computing the intensity difference between image patches of the original input as follows:

B(x) = { B(y) | dist(B(x), B(y)) < τ_M },   (1)
where x and y denote positions of the current pixel and its surroundings, respectively. B(·) is the image patch centered at the corresponding pixel position and its size is set to N × N pixels. τ_M is the threshold value and 10.0 is used in our implementation. The intensity difference between image patches is computed as follows:

dist(B(x), B(y)) = Σ_{r∈W_x, c∈W_y} |I(r) − I(c)|,   (2)
where W_x and W_y denote the sets of pixel positions belonging to the image patches centered at x and y in the original intensity image I, respectively. Note that we select the top M patches among the candidates satisfying the condition given in (1) to generate the volume. For the subspace analysis, the volume B(x), whose size is N × N × M, is reshaped to the two-dimensional matrix Z(x) of size M × N^2, as shown in Fig. 1.
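As a rough illustration of the volume construction in (1) and (2), the following NumPy sketch (our own assumption; the authors' implementation is in C and may differ, e.g., in how patches are centered and how the distance is scaled) gathers the M most similar N × N patches around the current pixel and reshapes them into the M × N^2 matrix Z(x). The threshold τ_M of (1) is omitted here for brevity.

import numpy as np

def build_volume(I, x, y, N=6, M=16, search=25):
    """Return Z(x) of shape (M, N*N) and the top-left coordinates of the selected patches.
    Patches are anchored at their top-left corner for simplicity (the paper centers them at x)."""
    H, W = I.shape
    ref = I[y:y + N, x:x + N]                                   # reference patch B(x)
    scored = []
    for yy in range(max(0, y - search), min(H - N, y + search) + 1):
        for xx in range(max(0, x - search), min(W - N, x + search) + 1):
            cand = I[yy:yy + N, xx:xx + N]
            scored.append((np.abs(ref - cand).sum(), yy, xx))   # patch distance, cf. (2)
    scored.sort(key=lambda t: t[0])                             # the paper additionally thresholds at tau_M, see (1)
    coords = [(yy, xx) for _, yy, xx in scored[:M]]             # keep the M most similar patches
    Z = np.stack([I[yy:yy + N, xx:xx + N].reshape(-1) for yy, xx in coords])
    return Z, coords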
FIGURE 1. The process of the volume generation for the current pixel position x. Note that the white rectangle and red ones in the original input are the image patches centered at x and its surroundings, respectively.
To estimate the principal energy of the subspace obtained from the volume, we adopt the singular value decomposition (SVD), which is known to show great performance in factorizing independent components of a given distribution [21]. The subspace analysis of Z(x) can be formulated as follows:

Z(x) = U S V^T,   (3)
where S is the diagonal matrix whose elements represent the singular values in descending order, which indicate the energy density of each independent component. Therefore, we exploit the first singular value s_1, i.e., the principal energy, as the illumination component. V denotes the basis vectors indicating each axis of the subspace, and U and V are orthogonal matrices. Moreover, the unnecessary noise of the original
input can be efficiently reduced in the same subspace by adaptively truncating singular values based on the simple principle that high-order singular values are likely to correspond to noise. To do this, we propose to adaptively define the cut-off index p by utilizing the ratio between the first (i.e., principal) and the second singular values as follows:
p = max{ i | s_i > (α / Σ_i s_i) · (s_1 / s_2) },   (4)

where α denotes the scaling factor and is set to 180.0 in our implementation. The ratio between s_1 and s_2 becomes large in homogeneous regions (s_1 ≫ s_2), whereas it decreases in highly textured structures (s_1 ≈ s_2). Therefore, the noise reduction can be adaptively conducted according to the complexity of the content in the corresponding volume. In summary, the illumination and noise-reduced matrices restored from this subspace analysis are represented as follows:

Z̃_L(x) = U S_1 V^T,   Z̃_IN(x) = U S_p V^T,   (5)
where the subscript of each matrix denotes the cut-off index of the singular values to be preserved (the others are set to zero). In the following, these restored matrices are reshaped to volumes, i.e., B̃_L(x) and B̃_IN(x), and each image patch belonging to the volume is backprojected to its original position to generate the illumination and noise-reduced images, i.e., L̃ and Ĩ_N, respectively. An example of the backprojection process is shown in Fig. 2. Since an image patch can be selected multiple times during the enhancement process, the value of each pixel belonging to the corresponding patch is iteratively updated whenever selected as follows:

L(q) ← (w(q) · L(q) + L̃(q)) / (w(q) + 1),   (6)
I_N(q) ← (w(q) · I_N(q) + Ĩ_N(q)) / (w(q) + 1),   (7)
where q denotes the pixel position in image patches backprojected from the volume B̃_L(x) (or B̃_IN(x)), and w(q) is the number of times that the value at the pixel position q has been updated. The initial status of the illumination and noise-reduced images is set to the original intensity image, i.e., L(q) = I_N(q) = I(q) at the initial step. By such an updating process with more contextual information, the corresponding results are efficiently refined, and thus the proposed method is capable of providing reliable illumination and noise-reduced images, i.e., L and I_N, for image enhancement.
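The subspace analysis of (3)-(5) and the backprojection update of (6) and (7) could be sketched as follows (again a NumPy approximation under our own assumptions, not the authors' code): analyze_volume restores the matrices of (5) with one and p singular values, and backproject applies the running-average update to one restored patch.

import numpy as np

def analyze_volume(Z, alpha=180.0):
    """Z: M x N^2 matrix; returns the restored matrices Z_L (illumination) and Z_IN (noise-reduced)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)          # Z = U S V^T, see (3)
    thresh = (alpha / s.sum()) * (s[0] / max(s[1], 1e-8))     # adaptive cut-off rule of (4)
    p = max(1, int(np.sum(s > thresh)))                       # keep at least the principal energy
    Z_L = s[0] * np.outer(U[:, 0], Vt[0, :])                  # U S_1 V^T: first singular value only
    Z_IN = (U[:, :p] * s[:p]) @ Vt[:p, :]                     # U S_p V^T: first p singular values
    return Z_L, Z_IN

def backproject(img, weight, patch, yy, xx, N):
    """Running-average update of (6)/(7) for one restored N x N patch at top-left (yy, xx)."""
    w = weight[yy:yy + N, xx:xx + N]
    img[yy:yy + N, xx:xx + N] = (w * img[yy:yy + N, xx:xx + N] + patch) / (w + 1)
    weight[yy:yy + N, xx:xx + N] = w + 1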
FIGURE 2. The backprojection process for generation of the illumination image. Note that the center dot in the right-most image indicates the position of x.
It is noteworthy that the noise-reduced image is helpful for clearly revealing the reflectance, which leads to better visibility in the enhancement result.
B. IMAGE ENHANCEMENT
To adjust the illumination component estimated in the previous subsection, the conventional Gamma correction is conducted as follows:

L_e(x) = 255 × (L(x) / T)^(1/γ),   (8)

where T is the scaling factor that keeps the adjusted values within the intensity range, i.e., [0, 255], and is set to 255.0 in our implementation. As in most previous methods, we use 2.2 for the γ value. On the other hand, the reflectance can be easily obtained according to the Retinex theory as follows:

R(x) = I_N(x) / (L(x) + ε),   (9)

where ε is a small positive value to avoid division by zero. It should be emphasized that the noise-reduced image I_N, which is restored from (7), is employed instead of the original intensity image I to efficiently resolve the problem of noise amplification in dark areas. Finally, the enhancement result can be obtained by re-combining the reflectance and the adjusted illumination components as follows:

F(x) = L_e(x) × R(x).   (10)
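A minimal sketch of the enhancement step in (8)-(10), assuming the illumination image L and the noise-reduced image I_N obtained above are float arrays in [0, 255]; the concrete value of ε is our own choice, not reported in the paper.

import numpy as np

def enhance_intensity(L, I_N, gamma=2.2, T=255.0, eps=1e-3):
    """Combine the adjusted illumination with the reflectance as in (8)-(10)."""
    L_e = 255.0 * (L / T) ** (1.0 / gamma)      # Gamma-corrected illumination, (8)
    R = I_N / (L + eps)                         # reflectance via the Retinex model, (9)
    return np.clip(L_e * R, 0.0, 255.0)         # re-combined result, (10), clipped to [0, 255]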
To locally supplement the result of the global stretching conducted in (8), the CLAHE [1] operator is applied to the restored result F(x). Finally, the enhanced color image is obtained by converting from HSV back to RGB with the enhanced intensity. The results of the proposed method are shown in Fig. 3. The comparison between several parts of the original input and its enhanced result is shown in Fig. 4. As can be seen, the underlying structures in the dark area are successfully revealed without any significant distortion (e.g., noise amplification). The effect of the noise-reduced image is also shown in Fig. 5. We can see that unnecessary noise amplification is successfully suppressed in dark areas of the enhancement result (e.g., the wall in the background, which is enlarged by the white rectangle) thanks to the noise-reduced image.
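These finishing steps (CLAHE on the restored intensity, then HSV→RGB conversion) could, for instance, be realized with OpenCV as below; the CLAHE clip limit and tile size are our assumptions, since the paper does not report them, and the authors' own CLAHE implementation may differ.

import cv2
import numpy as np

def finalize(rgb_in, F, clip_limit=2.0, tile=(8, 8)):
    """Apply CLAHE to the restored intensity F and rebuild the color image via HSV -> RGB."""
    hsv = cv2.cvtColor(rgb_in, cv2.COLOR_RGB2HSV)                    # rgb_in: uint8 RGB image
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile)
    hsv[..., 2] = clahe.apply(np.clip(F, 0, 255).astype(np.uint8))   # replace the V (intensity) channel
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)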
FIGURE 3. (a) Intensity image of the original input. (b) Estimated illumination image L. (c) Estimated reflectance image R. (d) Enhanced illumination image L_e. (e) Final enhancement result.
FIGURE 4. (a) Original input (left) and its enhanced result (right). (b) Enlarged versions of several parts from the original input and the enhancement result.
FIGURE 5. (a) Original input. (b) Enhancement result with the noise-reduced image IN . (c) Enhancement result without the noise-reduced image (i.e., using the original intensity image I). Note that the unnecessary noise is efficiently suppressed due to the reflectance estimated from the noise-reduced image (see the enlarged region (i.e., white rectangle)).
For the sake of completeness, the summary of the proposed volume-based subspace analysis for low-light image enhancement is provided in Algorithm 1.
Algorithm 1 Volume-Based Subspace Analysis for Low-Light Image Enhancement
Data: I: intensity channel of the input color image; x = (x, y): pixel position; H: height; W: width
Result: Enhanced color image
1. Volume-based subspace analysis: estimate illumination and noise-reduced images
while 1 ≤ x ≤ W and 1 ≤ y ≤ H do
    i) Construct the volume B(x) via M image patches
        → Convert B(x) to the matrix Z(x)
    ii) Conduct the subspace analysis using Z(x)
        → Compute Z̃_L(x) and Z̃_IN(x) (see (5))
        → Restore the volumes, i.e., B̃_L(x) and B̃_IN(x)
        → Backproject the volumes into image patches
    iii) Refine the estimated results iteratively as follows:
        → Illumination: L(q) ← (w(q) · L(q) + L̃(q)) / (w(q) + 1)
        → Noise-reduced: I_N(q) ← (w(q) · I_N(q) + Ĩ_N(q)) / (w(q) + 1)
    Pixel index increases: x = x + 1 and y = y + 1
end
⋆ Generate the illumination image L and the noise-reduced image I_N
2. Image enhancement: generate the enhancement result
while 1 ≤ x ≤ W and 1 ≤ y ≤ H do
    i) Conduct the Gamma enhancement as follows:
        → L_e(x) = 255 × (L(x) / T)^(1/γ)
    ii) Separate the reflectance from the noise-reduced image according to the Retinex theory:
        → R(x) = I_N(x) / (L(x) + ε)
    iii) Restore the intensity: F(x) = L_e(x) × R(x)
    Pixel index increases: x = x + 1 and y = y + 1
end
⋆ Final step: CLAHE is applied to F(x)
⋆ For the color enhancement result, conduct the HSV→RGB conversion with the restored intensity from the final step
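For illustration, the hypothetical helpers sketched in the previous subsections (build_volume, analyze_volume, backproject, enhance_intensity, and finalize) can be chained into a driver that mirrors Algorithm 1. Stepping over non-overlapping patches instead of every pixel is a simplification of ours to keep the example fast; it is not the authors' procedure.

import cv2
import numpy as np

def enhance_image(rgb, N=6, M=16, alpha=180.0):
    """End-to-end pipeline in the spirit of Algorithm 1 (simplified: non-overlapping patches)."""
    I = cv2.cvtColor(rgb, cv2.COLOR_RGB2HSV)[..., 2].astype(np.float64)  # intensity (V) channel
    H, W = I.shape
    L, I_N = I.copy(), I.copy()                      # initial status: L(q) = I_N(q) = I(q)
    wL, wN = np.zeros_like(I), np.zeros_like(I)      # update counters w(q)
    search = min(H, W) // 30                         # adaptive search range (see Sec. III-C)
    for y in range(0, H - N + 1, N):
        for x in range(0, W - N + 1, N):
            Z, coords = build_volume(I, x, y, N, M, search)
            Z_L, Z_IN = analyze_volume(Z, alpha)
            for k, (yy, xx) in enumerate(coords):
                backproject(L, wL, Z_L[k].reshape(N, N), yy, xx, N)
                backproject(I_N, wN, Z_IN[k].reshape(N, N), yy, xx, N)
    F = enhance_intensity(L, I_N)
    return finalize(rgb, F)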
C. IMPLEMENTATION DETAILS
In this subsection, implementation details of the proposed method are explained. First of all, parameters related to the volume-based subspace analysis are determined as follows: the size of the image patch is set to 6 × 6 pixels, i.e., N = 6, and the number of image patches to construct the volume is set to 16, i.e., M = 16. To construct the volume, the distance (i.e., intensity difference) is computed using image patches within the range of 25 pixels from the current pixel position x. Note that this searching range can be adaptively determined according to the image size, e.g., min(width, height)/30 in our test, which yields similar results without a performance drop. The enhancement process is conducted on the intensity channel only, which can be handled separately by utilizing the conversion from the RGB color space to the HSV one. Note that the performance variation according to parameter settings will be analyzed in detail in the following section. The proposed method is implemented in the C language and runs on a single PC equipped with an Intel Xeon 2.2 GHz CPU
and 64 GB of RAM. The processing time of the proposed method is 1.7 s on average for the benchmark datasets. In terms of calculation precision, it is sufficient to declare most variables as single-precision floats since the main process is concentrated on computing the intensity differences and the singular values of the volume-based subspace.
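For convenience, the parameter values reported in this paper can be collected in one place as follows (the dictionary and its key names are ours, not the authors'):

PARAMS = {
    "N": 6,          # patch size: N x N pixels
    "M": 16,         # number of patches per volume
    "tau_M": 10.0,   # patch-similarity threshold in (1)
    "search": 25,    # search range in pixels, or min(width, height) // 30
    "alpha": 180.0,  # scaling factor in (4)
    "gamma": 2.2,    # Gamma value in (8)
    "T": 255.0,      # scaling factor in (8)
}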
IV. EXPERIMENTAL RESULTS
To demonstrate the efficiency and robustness of the proposed method, we evaluate the performance of our model on two benchmark datasets, i.e., the NASA [22] (25 images) and HDR [23] (seven images) datasets, which have been most widely employed for this task in the literature. The resolutions of sample images in the NASA and HDR datasets are 1312 × 2000 (or 2000 × 1312) and 900 × 1350 (or 1350 × 900) pixels, respectively. Details of the performance evaluation will be explained in the following subsections.
FIGURE 6. (a) Original input images selected from the NASA dataset. Enhancement results by (b) SRIE [4], (c) LIME [5], (d) CRM [6], (e) PPEA [15], (f) HDRNet [18], (g) DR [19], (h) DPE [7], (i) UPE [8], and (j) the proposed method. Note that (b)–(e) are based on decomposition schemes while (f)–(i) adopt deep neural networks.
A. QUALITATIVE ANALYSIS
To evaluate the performance of the proposed method, we compare ours with eight representative methods, i.e., four decomposition-based (SRIE [4], LIME [5], CRM [6], and PPEA [15]) and four learning-based (HDRNet [18], DR [19], DPE [7], and UPE [8]) methods. Several enhancement results by all the methods are shown in Figs. 6 and 7. Note that the learning-based methods utilize the MIT-Adobe FiveK dataset [24], which contains 5,000 pairs of under-exposed images and corresponding enhanced versions re-touched by five experts, for training their deep neural networks. For the performance evaluation, we employ the pre-trained models provided by the authors without any modification. Note that all the learning-based methods are tested on a PC with an Intel i7-6850K@3.6GHz CPU and a single NVIDIA GeForce Titan XP GPU. Specifically, LIME and PPEA generally work for low-light images; however, they often yield
over-enhanced (and exaggerated) results in complicated lighting conditions, as shown in the first and the third rows of Fig. 6, for example. Even though SRIE also generates visually acceptable enhancement results, it tends to produce under-enhanced results in very dark areas, as shown in the last rows of Figs. 6 and 7. The enhancement results by CRM are somewhat washed out (see the first and the last rows of Figs. 6 and 7). In the case of learning-based methods, it is easy to see that enhancement results are strongly dependent on the expert supervision. More concretely, DPE is likely to be conservative in illumination adjustment, while HDRNet and DR often fail to avoid over-saturated artifacts occurring in bright areas (see the background of (f) and (g) in Fig. 7). UPE provides quite reliable enhancement results; however, this scheme is prone to yield under-enhanced results in very dark areas, as shown in Fig. 7. It is noteworthy that learning-based approaches are vulnerable to unseen data in their training
FIGURE 7. (a) Original input images selected from the HDR dataset. Enhancement results by (b) SRIE [4], (c) LIME [5], (d) CRM [6], (e) PPEA [15], (f) HDRNet [18], (g) DR [19], (h) DPE [7], (i) UPE [8], and (j) the proposed method. Note that (b)–(e) are based on decomposition schemes while (f)–(i) adopt deep neural networks.
FIGURE 9. 1st and 3rd rows: original input images from NASA and HDR datasets, respectively. 2nd and 4th rows: corresponding enhancement results by the proposed method.
FIGURE 8. (a) Original input. Enhancement results by (b) LIME [5], (c) PPEA [15], (d) UPE [8], and (e) the proposed method. Note that results for white rectangles are enlarged in the second and the third rows.
phase and sometimes restore the underlying structure with different color attributes as shown in the first row of Fig. 6. That is, their performance is strongly dependent on the trained model. In contrast, the proposed method works reliably under various low-light conditions, e.g., shadows in indoor and outdoor environments, cloudy weather, backlight, etc., without any significant background distortion. To show the robustness of the proposed method, we also provide comparison results for some local regions in Fig. 8. As can be seen, the proposed method is able to restore the buried information clearly
compared to previous approaches. In particular, the proposed method successfully preserves the object boundaries without blocky artifacts or noise amplification (compare Fig. 8(b) and (c)) while efficiently improving the contrast (compare Fig. 8(d)). Therefore, it is thought that the proposed subspace, constructed by utilizing abundant contextual information from multiple image patches, i.e., the volume, is effective for estimating the illumination component for low-light image enhancement. More enhancement results by the proposed method are shown in Figs. 9 and 10. We can see that the proposed method yields visually pleasing results for complicated real-world scenarios.
TABLE 1. Performance comparison using the NASA dataset (concerning each metric, the best results are shown in bold).
TABLE 2. Performance comparison using the HDR dataset (concerning each metric, the best results are shown in bold).
FIGURE 10. More examples of image enhancement in real-world environments. Top: original input images. Bottom: enhancement results by the proposed method. Note that these images were taken with a smartphone.
B. QUANTITATIVE ANALYSIS
For the quantitative evaluation of enhancement results, visual quality assessment (VQA) techniques have been widely employed in this field. In our experiments, we adopt three representative VQA metrics, i.e., NIQE [25], BTMQI [26], and NIQMC [27]. These metrics are defined in a no-reference manner, i.e., they do not require a reference image and compute the quality score by only utilizing the given query. Since the problem of image enhancement is generally regarded as an ill-posed problem without ground truth, it is thought that these three metrics are suitable for evaluating the performance of low-light image enhancement. Specifically, the NIQE metric explores natural scene statistics (NSS) based on quality-related image features and computes the perceptual score via this NSS model. BTMQI measures the performance of color tone-mapping, e.g., the colorfulness of enhancement results as well as distortions of color attributes. The NIQMC metric accounts for the entropy of visually salient regions to provide a quality score relevant to the human visual system. Based on such VQA metrics, an objective quality score can be provided by considering various viewpoints corresponding to the general criteria for enhancement results. The performance comparisons based on these metrics are shown in Tables 1 and 2. Note that lower values indicate better performance for the NIQE and BTMQI metrics, while
FIGURE 11. Metric scores computed by the proposed method for all the samples both in NASA (top) and HDR datasets (bottom). Note that the x-axis denotes the image index.
higher values denote better quality in the case of NIQMC. Specifically, we can see that the proposed method generally performs well, i.e., improves low visibility without color and structural distortions, under the various lighting conditions of the NASA dataset compared to previous methods, as shown in Table 1. Since samples from the HDR dataset are acquired in very dark environments, most previous approaches, including deep neural network-based models, have difficulty restoring the contrast of a normal environment, as shown in Table 2. Even though deep neural network-based approaches show good performance when samples from the same dataset used for training are given at test time (e.g., the MIT-Adobe FiveK dataset [24]), it is easy to see that their performance drops notably on our benchmark datasets. This is because the enhancement process is learned only for the given domain in a supervised manner. In contrast, the proposed method provides balanced contrast while clearly revealing textures as well as object boundaries regardless of domain properties (i.e., types of datasets). Therefore, it is thought that the proposed method can be efficiently applied to various real-world applications. Metric scores computed by the proposed method for all the samples in both the NASA and HDR datasets are shown in Fig. 11.
TABLE 3. Performance variation according to parameter settings of the proposed method.
C. PERFORMANCE ANALYSIS BY PARAMETERS
In this subsection, we analyze the performance variation according to the parameter settings of the proposed method. First of all, the effect of the patch size is checked and the corresponding result is shown in Table 3. Even though the performance variation is not significant with the change of the patch size, we selected the size of 6 × 6 pixels for the best performance. Based on this setting, the other comparative experiments are subsequently conducted. Specifically, the maximum number of image patches, i.e., M, which is required to construct the volume, also affects the performance of low-light image enhancement as shown in the second part of Table 3. In general, the performance of low-light image enhancement improves as the number of image patches increases. This is because more contextual information can be exploited for accurately estimating the illumination and noise-reduced images. However, the processing time increases by about 30% between the two cases of M = 16 and M = 20. Based on this trade-off, it is thought that M = 16 is enough to achieve reliable enhancement. In regard to the threshold value τ_M, which determines whether an image patch within the searching range is similar to the image patch defined at the current pixel position, the performance difference between the two cases, i.e., τ_M = 10.0 and τ_M = 15.0, is negligible (see the third part of Table 3); thus τ_M = 10.0 is employed in our implementation.
V. CONCLUSION
A simple and robust method for low-light image enhancement is proposed in this paper. To this end, we proposed to exploit the volume, which is constructed by stacking a group of image patches, for accurately estimating the illumination component via the corresponding subspace analysis. Moreover, an adaptive scheme for truncating the energies defined along each direction of the volume-based subspace is also proposed to restore the noise-reduced image, which greatly helps to suppress unnecessary noise amplification in the enhancement result. Based on various experimental results, we confirm that the proposed method has a good ability to provide visually pleasing enhancement results for low-light images taken under diverse lighting conditions in both indoor and outdoor environments. One important advantage over the learning-based methods is that the proposed method does not require any prior knowledge for guiding the restoration process, and thus can be applied to various real-world applications with reliable performance for low-light image enhancement.
REFERENCES
[1] A. M. Reza, ‘‘Realization of the contrast limited adaptive histogram equalization (CLAHE) for real-time image enhancement,’’ J. VLSI Signal Process.-Syst. Signal, Image, Video Technol., vol. 38, no. 1, pp. 35–44, Aug. 2004. [2] M. Abdullah-Al-Wadud, M. Hasanul Kabir, M. A. A. Dewan, and O. Chae, ‘‘A dynamic histogram equalization for image contrast enhancement,’’ IEEE Trans. Consum. Electron., vol. 53, no. 2, pp. 593–600, May 2007. [3] E. H. Land, ‘‘The retinex theory of color vision,’’ Sci. Amer., vol. 237, no. 6, pp. 108–129, Dec. 1977. [4] X. Fu, D. Zeng, Y. Huang, X.-P. Zhang, and X. Ding, ‘‘A weighted variational model for simultaneous reflectance and illumination estimation,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 2782–2790. [5] X. Guo, Y. Li, and H. Ling, ‘‘LIME: Low-light image enhancement via illumination map estimation,’’ IEEE Trans. Image Process., vol. 26, no. 2, pp. 982–993, Feb. 2017. [6] Z. Ying, G. Li, Y. Ren, R. Wang, and W. Wang, ‘‘A new low-light image enhancement algorithm using camera response model,’’ in Proc. IEEE Int. Conf. Comput. Vis. Workshops (ICCVW), Oct. 2017, pp. 3015–3022. [7] Y.-S. Chen, Y.-C. Wang, M.-H. Kao, and Y.-Y. Chuang, ‘‘Deep photo enhancer: Unpaired learning for image enhancement from photographs with GANs,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 6306–6314. [8] R. Wang, Q. Zhang, C.-W. Fu, X. Shen, W.-S. Zheng, and J. Jia, ‘‘Underexposed photo enhancement using deep illumination estimation,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 6842–6850. [9] W. Zhang, X. Zhong Fang, X. K. Yang, and Q. M. J. Wu, ‘‘Moving cast shadows detection using ratio edge,’’ IEEE Trans. Multimedia, vol. 9, no. 6, pp. 1202–1214, Oct. 2007. [10] C. Lee, C. Lee, and C.-S. Kim, ‘‘Contrast enhancement based on layered difference representation of 2D histograms,’’ IEEE Trans. Image Process., vol. 22, no. 12, pp. 5372–5384, Dec. 2013. [11] G. Raju and M. S. Nair, ‘‘A fast and efficient color image enhancement method based on fuzzy-logic and histogram,’’ AEU-Int. J. Electron. Commun., vol. 68, no. 3, pp. 237–243, Mar. 2014. [12] D. J. Jobson, Z. Rahman, and G. A. Woodell, ‘‘A multiscale retinex for bridging the gap between color images and the human observation of scenes,’’ IEEE Trans. Image Process., vol. 6, no. 7, pp. 965–976, Jul. 1997. [13] J. Ho Jang, Y. Bae, and J. Beom Ra, ‘‘Contrast-enhanced fusion of multisensor images using subband-decomposed multiscale retinex,’’ IEEE Trans. Image Process., vol. 21, no. 8, pp. 3479–3490, Aug. 2012. [14] R. Kimmel, M. Elad, D. Shaked, R. Keshet, and I. Sobel, ‘‘A variational framework for Retinex,’’ Int. J. Comput. Vis., vol. 52, no. 1, pp. 7–23, 2003. [15] W. Kim, ‘‘Image enhancement using patch-based principal energy analysis,’’ IEEE Access, vol. 6, pp. 72620–72628, Dec. 2018. [16] A. Ignatov, N. Kobyshev, R. Timofte, and K. Vanhoey, ‘‘DSLR-quality photos on mobile devices with deep convolutional networks,’’ in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 3297–3305. [17] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image recognition,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778. [18] M. Gharbi, J. Chen, J. T. Barron, S. W. Hasinoff, and F. Durand, ‘‘Deep bilateral learning for real-time image enhancement,’’ ACM Trans. Graph., vol. 36, no. 4, pp. 1–12, Jul. 2017. [19] J. Park, J.-Y. Lee, D. Yoo, and I. S. 
Kweon, ‘‘Distort-and-recover: Color enhancement using deep reinforcement learning,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 5928–5936. [20] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, ‘‘Generative adversarial nets,’’ in Proc. Adv. Neural Inf. Process. Syst., Dec. 2014, pp. 2672–2680.
[21] W. Zhang, X. Zhong Fang, and Y. Xu, ‘‘Detection of moving cast shadows using image orthogonal transform,’’ in Proc. 18th Int. Conf. Pattern Recognit. (ICPR), Aug. 2006, pp. 626–629. [22] Retinex Theory of Color Vision. Washington, DC, USA: NASA, 2001. [23] P. Sen, N. K. Kalantari, M. Yaesoubi, S. Darabi, D. B. Goldman, and E. Shechtman, ‘‘Robust patch-based hdr reconstruction of dynamic scenes,’’ ACM Trans. Graph., vol. 31, no. 6, pp. 1–11, Nov. 2012. [24] V. Bychkovsky, S. Paris, E. Chan, and F. Durand, ‘‘Learning photographic global tonal adjustment with a database of input/output image pairs,’’ in Proc. CVPR, Jun. 2011, pp. 97–104. [25] A. Mittal, R. Soundararajan, and A. C. Bovik, ‘‘Making a ‘completely blind’ image quality analyzer,’’ IEEE Signal Process. Lett., vol. 20, no. 3, pp. 209–212, Mar. 2013. [26] K. Gu, S. Wang, G. Zhai, S. Ma, X. Yang, W. Lin, W. Zhang, and W. Gao, ‘‘Blind quality assessment of tone-mapped images via analysis of information, naturalness, and structure,’’ IEEE Trans. Multimedia, vol. 18, no. 3, pp. 432–443, Mar. 2016. [27] K. Gu, W. Lin, G. Zhai, X. Yang, W. Zhang, and C. W. Chen, ‘‘Noreference quality metric of contrast-distorted images based on information maximization,’’ IEEE Trans. Cybern., vol. 47, no. 12, pp. 4559–4565, Dec. 2017.
WONJUN KIM (Member, IEEE) received the B.S. degree from the Department of Electronic Engineering, Sogang University, Seoul, South Korea, in 2006, the M.S. degree from the Department of Information and Communications, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea, in 2008, and the Ph.D. degree from the Department of Electrical Engineering, KAIST, in 2012. From September 2012 to February 2016, he was a Research Staff Member with the Samsung Advanced Institute of Technology (SAIT), Gyeonggi-do, South Korea. Since March 2016, he has been with the Department of Electrical and Electronics Engineering, Konkuk University, Seoul, where he is currently an Associate Professor. His research interests include image and video understanding, computer vision, pattern recognition, and biometrics, with an emphasis on background subtraction, saliency detection, and face and action recognition. He has served as a Regular Reviewer for over 30 international journal articles, including the IEEE TRANSACTIONS ON IMAGE PROCESSING, IEEE ACCESS, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, the IEEE TRANSACTIONS ON MULTIMEDIA, the IEEE TRANSACTIONS ON CYBERNETICS, the IEEE SIGNAL PROCESSING LETTERS, Pattern Recognition, and so on.
RYONG LEE received the B.S. degree from the School of Electronics, Telecommunication and Computer Engineering, Korea Aerospace University, South Korea, in 1998, and the M.S. and Ph.D. degrees from the Department of Social Informatics, Kyoto University, Japan, in 2001 and 2003, respectively. From 2003 to 2008, he was a Research Staff Member with the Samsung Advanced Institute of Technology (SAIT), South Korea. Since 2013, he has been with the Korea Institute of Science and Technology Information (KISTI), South Korea. He is currently a Senior Researcher with the Research Data Sharing Center, KISTI. His research interests include spatial data analysis, the Internet of Things, smart city, and artificial intelligence. MINWOO PARK received the B.S. and M.S. degrees from the Division of Computer Convergence, Chungnam National University, South Korea, in 1992 and 2004, respectively. Since 1996, he has been with the Korea Institute of Science and Technology Information (KISTI), South Korea. He is currently a Team Manager with the Research Data Sharing Center, KISTI. His research interests include system architecture, information security, the Internet of Things, smart city, and artificial intelligence. SANG-HWAN LEE received the B.S. degree from the Department of Electronic Computing, University of Ulsan, South Korea, in 1992, and the M.S. degree in software engineering from Korea University, in 2004. Since 1995, he has been with the Korea Institute of Science and Technology Information (KISTI), South Korea. He is currently the Director of the Research Data Sharing Center, KISTI. His research interests include big data analysis, large research data, data governance, data ecosystems, and artificial intelligence. MYUNG-SEOK CHOI received the B.S., M.S., and Ph.D. degrees from the Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), South Korea, in 1996, 1998, and 2005, respectively. Since 2005, he has been working as a Senior Researcher with the Korea Institute of Science and Technology Information (KISTI). He has experience in artificial intelligence and big data analytics. His research interests include open science, research data management, and artificial intelligence.