Relationship Between Visual Complexity and Aesthetics: Application to Beauty Prediction of Photos

Litian Sun, Toshihiko Yamasaki, and Kiyoharu Aizawa
The University of Tokyo, Tokyo, Japan
{sun1101,yamasaki,aizawa}@hal.t.u-tokyo.ac.jp

Abstract. Automatic evaluation of visual content by its aesthetic merit is becoming exceedingly important as the available volume of such content expands rapidly. Complexity is believed to be an important indicator for aesthetic assessment and is widely used. However, psychological theories concerning complexity have only been verified in limited settings, and the relationship between complexity and aesthetic experience over a broad scope of applications is not yet clear. To this end, we designed an experiment to test human perception of the complexity of various photos. We then propose a set of visual complexity features and show that the complexity level computed from these features has a near-monotonic relationship with human beings' beauty expectation on thousands of photos. Further applications to beauty prediction and quality assessment demonstrate the effectiveness of the proposed method.

Keywords: Aesthetic assessment · Visual complexity · Beauty prediction

1 Introduction

The image processing and computer vision community has made great efforts to explore computational methods that make aesthetic decisions similar to those of human beings. Predicting the aesthetic scores of photographs is an undoubtedly challenging problem. To understand how people perceive visually pleasing stimuli, psychologists have proposed many theories, in which complexity is known to be an important indicator for aesthetic assessment. Pioneers in computational aesthetics such as D. E. Berlyne [3,4] suggested that the aesthetic appeal of a pattern depends on the arousing and de-arousing influence of its collative or structural properties, that arousing quality is a direct linear function of complexity, or the amount of information, and that pleasantness is generally related to these determinants in an inverted-U manner. Specifically, aesthetic appeal increases with complexity until an optimal level of arousal is reached; beyond this point, further increases in complexity elicit a drop in preference. Psychologists have conducted many experiments to verify or evaluate Berlyne's theory. Many have observed an inverted-U function between complexity and aesthetic experience concerning architecture [1,13], while some observe only the ascending part of the curve [8], and others find no support for an inverted-U relation between preference and entropy [19]. The role that complexity plays in aesthetic preference prediction is also emphasized in the more recent processing-fluency theory [14,15], which goes further to explore the reason behind the relationship. It suggests that aesthetic experience is a function of the perceiver's processing dynamics: the more fluently the perceiver can process an image, the more positive their aesthetic response. Fluency theory works well in predicting aesthetic effects due to many low-level features, such as preferences for larger and more highly contrastive displays.
However, fluency theory does not square well with Berlyne's inverted-U results, in that it implies a monotonic decrease in preference as a function of complexity. Although complexity is regarded as an important indicator for aesthetic assessment, the relationship between complexity and aesthetic appeal is still debated, and further verification is necessary. The main difficulty in psychological experiments is the limited sample size, which leads to an insufficient range of complexity. Empirical experiments become too time-consuming for participants when the sample size runs into the thousands, and therefore cannot yield a general guideline for aesthetic assessment. Furthermore, the scope of application of these psychological theories is not clear, as complexity may vary greatly within and across image categories.

Despite the lack of large-scale verification and compelling evidence in psychological theories, complexity has already been widely used for aesthetic classification of photos [18], art [6,17], and web-page design [20]. The mean gradient value is used as a measurement of complexity in some works [9,16,17]. Following a similar idea, features related to the file size of the compressed image have been found to be a good approximation of judgements of visual complexity and effective in aesthetic classification tasks [5,18], because compression algorithms such as JPEG and fractal compression generate good abstractions of the lines, colors, and repetition information of images. Nevertheless, previous complexity measurements did not take into account other factors that may influence the human sensation of complexity, such as curvature, object number, object size, pattern regularity, and pattern composition.

In this work, we evaluate the role complexity plays in aesthetic assessment and aim to verify Berlyne's inverted-U curve on thousands of photos through computational methods. We use the public AVA (Aesthetic Visual Analysis) database [12] (http://www.lucamarchesotti.com/ava/), which is derived from online photograph challenges and has a rich variety of content. As the aesthetic preference of each image is voted on about 200 times on average, differences between individuals are greatly alleviated. AVA contains photos of 8 categories, with 5000 photos in each category. We first designed a small-scale preliminary experiment to test whether human sensation of complexity is congruous. We then propose a set of visual complexity features capable of summarizing the composition, statistics, and distribution information of patterns in a photo, and apply gradient boosted trees regression on these features to build the complexity model. After that, we calculate complexity levels for a large-scale photo database and analyse the relationship between beauty expectation and complexity level. As applications, we use the proposed visual complexity features to predict beauty scores with gradient boosted trees regression and to determine aesthetic quality with a random forest classifier.

The remainder of this article is organized as follows. The preliminary experiment is described in Section 2. The visual complexity features are detailed and used to train a complexity model in Section 3. The relationship between complexity and aesthetic experience is discussed in Section 4. The application of the proposed visual complexity features to beauty prediction is presented in Section 5. Finally, conclusions are given in Section 6.
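For concreteness, a minimal sketch of the two baseline complexity proxies mentioned above (mean gradient magnitude and compressed file size), assuming OpenCV and NumPy are available; the function names and the JPEG quality setting are ours, not taken from the cited works:

```python
import cv2
import numpy as np

def mean_gradient_complexity(gray):
    """Mean gradient magnitude as a rough complexity proxy (cf. [9,16,17])."""
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    return float(np.mean(np.hypot(gx, gy)))

def jpeg_size_complexity(img, quality=75):
    """Compressed file size per pixel as a complexity proxy (cf. [5,18])."""
    ok, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, quality])
    assert ok
    return len(buf) / (img.shape[0] * img.shape[1])

# Illustrative usage with a placeholder file name.
img = cv2.imread("photo.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(mean_gradient_complexity(gray), jpeg_size_complexity(img))
```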
2 Preliminary Experiment on Subjective Complexity

We selected 10 photos from the 2500 training samples of each of the 8 categories of the AVA dataset, 80 photos in total. The images were selected to be evenly distributed along the range of aesthetic ratings. Specifically, although in online photograph challenges photos can be rated from 1 to 10, the average beauty scores of the 2500 photos in the training set of the “Animal” category vary from 2.62 to 8.25, so we sampled photos at beauty-score intervals of (8.25 − 2.62)/10 ≈ 0.56. In this way we collected photos of different aesthetic ratings from various categories.

Five participants (2 female and 3 male, aged from 23 to 28) took part in this study. All were graduate students with normal or corrected-to-normal vision. As depicted in Fig. 1, 10 images from the same category were shown at one time, and the participants were asked to choose a complexity level for these images from 5 options: 1 (very simple), 2 (simple), 3 (medium), 4 (complex), and 5 (very complex). Photos were shown by category in alphabetical order: “Animal”, “Architecture”, “Cityscape”, “Floral”, “Fooddrink”, “Landscape”, “Portrait”, and “Stilllife”. Photos were arranged randomly to eliminate any possible pattern between complexity and aesthetic score.

Fig. 1. Interface of the complexity labelling experiment for the “Animal” category.

For each image, we calculated the mean and standard deviation of the complexity levels provided by the five participants. The complexity levels of the 80 images, averaged over the 5 participants, ranged from 1.4 to 5.0, and the standard deviation ranged from 0 to 1.33. Even for the image on which participants disagreed the most, at least two of them gave the same complexity level. This indicates that complexity is measurable for human beings. Fig. 2 shows example images labelled with different complexity levels.

Fig. 2. Example images of the 5 complexity levels labelled by participants. The average complexity levels are rounded to integers; from left to right they are 1 (very simple), 2 (simple), 3 (medium), 4 (complex), and 5 (very complex). Images in the same column share the same averaged complexity level.

To better understand how participants disagree on complexity levels, we show the distribution of the standard deviation along the average complexity score in Fig. 3. Participants tend to agree on extreme complexity levels: the standard deviation is low for very simple or very complex images and high for medium images. Table 1 shows the average degree of disagreement of participants for the different categories. People tend to agree on the complexity level of images from the “Cityscape”, “Landscape”, and “Portrait” categories, while disagreement is larger for categories such as “Floral” and “Animal”.

Table 1. Average standard deviation of the complexity labels for each category
  Animal        0.7302
  Architecture  0.6931
  Cityscape     0.4102
  Floral        0.8260
  Fooddrink     0.6914
  Landscape     0.5871
  Portrait      0.5690
  Stilllife     0.7694

Fig. 3. Distribution of the mean and standard deviation of the complexity levels labelled for the 80 images (standard deviation of the complexity score plotted against the average complexity score).
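A minimal sketch of the even sampling procedure described above, assuming the per-category training scores are available as a NumPy array; the tie-breaking rule (taking the first photo that falls in each bin) is our assumption, as the paper does not specify it:

```python
import numpy as np

def sample_evenly(scores, n_bins=10):
    """Pick one photo index per equal-width beauty-score bin,
    e.g. (8.25 - 2.62) / 10 ~= 0.56 for the "Animal" category."""
    lo, hi = scores.min(), scores.max()
    step = (hi - lo) / n_bins
    picks = []
    for i in range(n_bins):
        low, high = lo + i * step, lo + (i + 1) * step
        in_bin = np.where((scores >= low) & (scores <= high))[0]
        if len(in_bin) > 0:
            picks.append(int(in_bin[0]))  # take the first photo in the bin
    return picks
```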
3 Visual Complexity Model

In this section we describe how the visual complexity features are extracted and used to train a model that evaluates image complexity. We regard complexity as the degree of difficulty in reconstructing a description of an image. Visual complexity is correlated with factors such as the distribution of color, texture, and edges, curvature, object number, object size, pattern regularity, and pattern composition. We prepare low-level features such as line segments, contours, and texture using the method of [2] and sharpness using [21]; color information is represented in the CIECAM02 color space. Three categories of complexity information are then extracted. Composition handles an image as a whole and summarizes the way in which patterns are spatially distributed in the image. Statistics complexity treats an image as an abstraction of object or texture patches: we count the number of objects and calculate the similarity between object and texture patches using the mean and standard deviation of certain quantities, such as degree of curvature and texture regularity. Distribution complexity regards an image as pixels and measures, via divergence, the differences between the distributions of a photo and those of a pure noise image. The resulting 114-dimensional feature is summarized in Table 2.

Table 2. Summary of complexity features
  Category      Short name           Dimension
  Composition   Line segment         20
                Color                25
                Sharpness            5
                Relative color       10
  Statistics    Ellipse fitness      10
                Object number        1
                Curvature            8
                Texture entropy      1
                Texture area         2
  Distribution  Line orientation     10
                Texture orientation  10
                Color distribution   12

3.1 Composition

Composition is calculated using the orthogonal variant moments (OVM) method of [10], which is designed to be sensitive to specific perturbations such as transformations while being tolerant, to a certain extent, of unexpected disturbances. For an image $I(x, y)$, OVM generates a 5-D vector $f_{\mathrm{ovm}} = (A, L_x, L_y, D_x, D_y)$, where $A$ is the average value of the input, $L_x$ and $L_y$ are the orthogonal components of the surface area, and $D_x$ and $D_y$ represent the position of the object in the image. With $\eta = \frac{1}{\text{height} \times \text{width}}$, the moments are calculated as

$A = \eta \iint I(x, y)\,dx\,dy \quad (1)$

$L_x = \eta \iint \sqrt{1 + \left(\tfrac{\partial I}{\partial x}\right)^2}\,dx\,dy, \qquad L_y = \eta \iint \sqrt{1 + \left(\tfrac{\partial I}{\partial y}\right)^2}\,dx\,dy \quad (2)$

$D_x = \eta \iint (x + d_x)\,I(x, y)\,dx\,dy, \qquad D_y = \eta \iint (y + d_y)\,I(x, y)\,dx\,dy \quad (3)$

To extract the composition of a photo, we calculate OVM vectors separately from line segments, color, sharpness, and relative color information. The edge map generated by [2] is split into four parts using different thresholds; in this way, patterns displayed with different intensity or importance are separated. Color information is divided into hue (including hue angle, hue eccentricity, and hue composition), chroma, and lightness. According to the Moon-Spencer model [11], color harmony is closely related to relative color. We focus on the region surrounding the contour lines: for each circular region centered on a contour line, the main relative hue and relative chroma are calculated as the difference between the most dominant color and the second most dominant color. To improve computational efficiency, the contour lines are downsampled into center points. In this way we obtain the relative color information along the contour lines, and its composition is summarized using OVM.
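A minimal sketch of the OVM computation in Eqs. (1)-(3), assuming a grayscale image normalized to [0, 1] and taking the offsets $d_x = d_y = 0$ (an assumption on our part; [10] defines these quantities in more detail):

```python
import numpy as np

def ovm_features(I, dx_off=0.0, dy_off=0.0):
    """Orthogonal variant moments f_ovm = (A, Lx, Ly, Dx, Dy) for a 2-D array I."""
    h, w = I.shape
    eta = 1.0 / (h * w)
    gy, gx = np.gradient(I)                  # partial derivatives along y and x
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    A = eta * I.sum()                        # average intensity, Eq. (1)
    Lx = eta * np.sqrt(1.0 + gx ** 2).sum()  # orthogonal surface-area components, Eq. (2)
    Ly = eta * np.sqrt(1.0 + gy ** 2).sum()
    Dx = eta * ((xs + dx_off) * I).sum()     # object position terms, Eq. (3)
    Dy = eta * ((ys + dy_off) * I).sum()
    return np.array([A, Lx, Ly, Dx, Dy])
```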
3.2 Statistics

To measure visual complexity statistically, we use contour information to count the number of objects and to calculate characteristics of the object contours such as extent of fit to an ellipse, angular orientation, circularity, solidity, degree of curve, and relative size compared with the whole picture. Circularity is represented by the ratio of the minor and major axes of the fitted ellipse. Solidity is the ratio of the contour area to its convex hull area. The degree of curve is measured as the ratio of the contour length to the perimeter of its minimum enclosing rectangle. These parameters of the continuous lines in the contour map are summarized using their mean and standard deviation. As curves are believed to be more complex than straight lines, we extract the curvature from the contours using the method of [7]. The granularity and regularity of texture are measured using area statistics and entropy.

3.3 Distribution

Another important measurement of visual complexity is distribution. Distribution information is represented as the combination of a histogram and its differences from the histograms of reference templates, $f_{\mathrm{db}} = [H, D]$. Taking the orientation of line segments as an example, orientations in $[0, 180)$ are accumulated and normalized into a histogram with $k = 8$ bins, $H = [h_1, h_2, \ldots, h_k]$. The difference between the orientation histogram of the line segments and that of a reference image $R$ is measured using the chi-square divergence. We choose two reference histograms: one with a uniform distribution, representing the extreme noisy situation, and one with a single bin valued 1 and all other bins valued 0, representing the extreme regular situation. The divergences from these two reference histograms characterize how irregular or regular the orientation distribution is. The calculation is as follows:

$D = [d_1, d_2], \qquad d_i = \sum_{j=1}^{k} \left(\frac{h_j}{r_{i,j}} - 1\right)^2 h_j \quad (4)$

$R_1 = [r_{1,1}, r_{1,2}, \ldots, r_{1,k}], \qquad r_{1,i} = \frac{1}{k} \quad (5)$

$R_2 = [r_{2,1}, r_{2,2}, \ldots, r_{2,k}], \qquad r_{2,i} = \begin{cases} 1, & i = m \\ 0, & i \neq m \end{cases}, \quad \text{where } m = \arg\max_i h_i \quad (6)$

The color distribution is summarized into a complexity feature in a similar way. Since the hue composition ranges from 0 to 400, we use a histogram with 10 bins for it.
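A minimal sketch of the distribution feature $f_{\mathrm{db}} = [H, D]$ for line orientations: a k-bin histogram plus its chi-square distances to a uniform and a one-hot reference histogram. We use a symmetric chi-square distance with a small epsilon for numerical safety, which may differ in detail from Eq. (4); the function name is ours:

```python
import numpy as np

def distribution_feature(orientations_deg, k=8, eps=1e-8):
    """f_db = [H, D]: normalized k-bin orientation histogram plus chi-square
    distances to a uniform (noisy) and a one-hot (regular) reference histogram."""
    H, _ = np.histogram(orientations_deg, bins=k, range=(0.0, 180.0))
    H = H / max(H.sum(), 1)
    R1 = np.full(k, 1.0 / k)                   # extreme noisy situation (uniform)
    R2 = np.zeros(k); R2[np.argmax(H)] = 1.0   # extreme regular situation (one-hot)
    chi2 = lambda h, r: float(np.sum((h - r) ** 2 / (h + r + eps)))
    D = np.array([chi2(H, R1), chi2(H, R2)])
    return np.concatenate([H, D])              # 8 + 2 = 10 dims, matching Table 2
```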
3.4 Training the Complexity Model

As complexity is a continuous variable, regression rather than classification is the better choice for training a complexity model from the features described above. We employ gradient boosted trees for regression. Parameters such as the number of boosting iterations and the maximal depth of each decision tree in the ensemble are optimized through 5-fold cross-validation. The accuracy of the regression is measured using the root-mean-square error (RMSE).

To test against other complexity features, we randomly select 10 of the 80 photos labelled with complexity levels in the experiment for testing and use the remaining 70 photos for training. We repeat this training and testing 5 times and measure performance by averaging the RMSEs over the 5 runs. We compare the proposed visual complexity features with the compression-file-size-related features proposed in [18] and the sum-of-gradient feature used in [16]. The average RMSE of the proposed features in this random 5-fold test is 0.35/0.83 (training/testing), compared with 0.47/1.05 for [18] and 0.45/0.89 for [16]; the proposed features thus outperform the comparison methods.

In order to model the perceived complexity from the labelled complexity levels as accurately as possible, we also split the 80 photos according to the standard deviation of their complexity scores: the lower the standard deviation, the better the average complexity score approximates the actual visual complexity. We therefore use only photos with low rating disturbance for training and expect the predicted complexity level to be within the variance range of the testing photos. We use the 70 photos with standard deviation less than 0.90 for training and the remaining 10 photos, with standard deviations from 0.90 to 1.33, for testing. The prediction accuracy and maximum absolute error are listed in Table 3.

Table 3. Comparison of visual complexity features in regression
                                                Training          Testing
  Feature                                       RMSE   Max err    RMSE   Max err
  Human perception                              0.63   0.89       1.09   1.33
  Proposed visual complexity features           0.31   0.82       0.68   1.26
  Compression file size related features [18]   0.60   1.58       0.73   1.75
  Sum of gradient [16]                          0.29   0.80       0.69   1.39
  (Complexity levels are in the range [1, 5].)

The worst complexity prediction of the proposed visual complexity features on the testing set has an absolute error of 1.26, which is less than the maximum standard deviation (1.33) of the complexity levels labelled by the participants; on the training set, the absolute error of the worst prediction is also lower than the maximum standard deviation (0.89). Thus we obtain a visual complexity model that captures the human sensation of complexity well.
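A minimal sketch of the complexity-model training described above, assuming a feature matrix X (80 x 114) and the averaged complexity labels y; scikit-learn's GradientBoostingRegressor stands in for the gradient boosted trees, and the parameter grid is illustrative rather than the one actually used in the paper:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import mean_squared_error

def train_complexity_model(X_train, y_train):
    """Gradient boosted trees regression with 5-fold CV over the number of
    boosting iterations and the maximal depth of each tree."""
    grid = {"n_estimators": [50, 100, 200], "max_depth": [2, 3, 4]}
    search = GridSearchCV(GradientBoostingRegressor(), grid, cv=5,
                          scoring="neg_root_mean_squared_error")
    search.fit(X_train, y_train)
    return search.best_estimator_

def rmse(model, X, y):
    """Root-mean-square error of the trained model on (X, y)."""
    return float(np.sqrt(mean_squared_error(y, model.predict(X))))
```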
4 Relationship between Complexity and Beauty

In this section we apply the visual complexity model obtained in Section 3 to the training sets of the AVA dataset (2500 training photos per category), calculate the expectation of the aesthetic score, and explore its relationship with the complexity level.

We calculate the visual complexity level for the photos in the AVA training sets. As illustrated in Table 1, participants tend to agree on the complexity of photos from the “Cityscape” category, so we expect more accurate complexity evaluation on “Cityscape” than on the other categories. Example photos from the “Cityscape” category with different complexity levels are shown in Fig. 4.

Fig. 4. Example photos from the “Cityscape” category at the 5 complexity levels calculated by the proposed visual complexity model. The two images in the same column share the same complexity level. The complexity levels calculated by our method are rounded to integer levels.

We employ one-way analysis of variance (ANOVA) to compare the expectation of beauty experience across complexity levels. The ANOVA results suggest that the beauty score distribution of at least one complexity level is significantly different from those of the other complexity levels (p < .05 for each category); box plots for all 8 categories are shown on the left of Fig. 5. Due to the large variance ranges, the differences between the mean beauty scores of the different complexity levels are not obvious from the box plots alone. To further test the statistical significance of the beauty score expectations, we conduct multiple comparisons (group-by-group t-tests) and show the results on the right of Fig. 5, where the beauty score expectations of the complexity levels coloured red are significantly different from that of the level coloured blue.

Fig. 5. Relationship between aesthetic experience and complexity level for the eight categories, (a) “Animal” to (h) “Stilllife”. The distribution of beauty scores along the complexity levels is shown as box plots on the left, and the significance of the differences between group means is shown on the right.

In the right column of Fig. 5, ascending trends can be observed in the “Cityscape” and “Landscape” categories and descending trends in the “Floral” and “Fooddrink” categories, while in the other categories only weak ascending or descending trends are observed. In the “Portrait” and “Architecture” categories, we observe an ascending trend only for the middle 3 complexity levels, and in the “Animal” category the descending trend is not clear for complexity levels 4 and 5.

For the “Cityscape” and “Landscape” categories, the ascending trends have clear statistical significance, except that the aesthetic assessment expectations of photos with intermediate complexity levels may easily be confused with those of the adjacent complexity level. Taking the “Cityscape” category as an example, the mean beauty score of simple photos (complexity level 2) is significantly different from those of extremely simple, complex, and extremely complex photos (complexity levels 1, 4, and 5), while it is hard to distinguish the mean beauty score of simple photos from that of medium photos (complexity levels 2 and 3).

In summary, on the AVA training photos we observed only the ascending or the descending part of Berlyne's inverted-U curve, depending on the category. The ascending trends in “Architecture”, “Cityscape”, and “Landscape” can be explained by the fact that buildings and landscape scenes are already complex in terms of lines and components, and few photographers would produce excessively complex photos in these categories; the optimal complexity level of Berlyne's inverted-U curve may therefore not be reached by the photos, so the drop in aesthetic experience beyond the optimal level is not observed. For the “Animal”, “Floral”, and “Fooddrink” categories, in which most photos focus on a single object or a small number of objects, the descending trend of beauty expectation is understandable: an overly complex photo distracts the viewer and makes it difficult to focus on the content, while simpler photos express the beauty of these categories better. The “Portrait” and “Stilllife” categories are somewhat different, as their photos convey more semantic meaning and are difficult to model using only low-level features.
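A minimal sketch of the statistical test described above, assuming the per-photo beauty scores and rounded complexity levels of one category are available as NumPy arrays; the pairwise test shown is a plain Welch two-sample t-test, which may differ in detail from the multiple-comparison procedure behind Fig. 5:

```python
import numpy as np
from itertools import combinations
from scipy import stats

def anova_by_complexity(beauty, complexity):
    """One-way ANOVA of beauty scores grouped by (rounded) complexity level,
    followed by pairwise t-tests between all level pairs."""
    levels = sorted(set(complexity))
    groups = {lvl: beauty[complexity == lvl] for lvl in levels}
    f_stat, p_value = stats.f_oneway(*groups.values())
    pairwise = {(a, b): stats.ttest_ind(groups[a], groups[b], equal_var=False).pvalue
                for a, b in combinations(levels, 2)}
    return f_stat, p_value, pairwise
```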
5 Application in Beauty Prediction

As verified in Section 4, visual complexity is closely related to the aesthetic experience of photos. In this section we predict beauty scores for photos using the visual complexity features. The features are first extracted as described in Section 3, and gradient boosted trees are employed to train the regression model; parameters are optimized through 5-fold cross-validation as in Section 3.4. Regression accuracy is measured by the RMSE and by the correlation coefficient between the predicted beauty scores and those labelled by human beings. As shown in Table 4, the proposed visual complexity features outperform the compression-file-size-related features of [18] and the sum of gradient used in [16]. Considering that beauty scores range from 1 to 10 and that the average error of the proposed method is 0.70 in the best case (“Landscape”) and 0.97 in the worst case (“Animal”), the proposed method is capable of giving a reasonable estimation of aesthetic experience.

Table 4. Comparison of beauty prediction results
                Proposed method      Compression related [18]   Sum of gradient [16]
  Category      RMSE   Correlation   RMSE   Correlation         RMSE   Correlation
  Animal        0.97   0.16          0.74   0.22                1.05   0.04
  Architecture  0.83   0.21          0.71   0.17                0.94   0.06
  Cityscape     0.83   0.30          0.82   0.18                0.82   0.13
  Floral        0.83   0.21          0.77   0.18                0.88   0.01
  Fooddrink     0.74   0.31          0.80   0.20                0.83   0.04
  Landscape     0.70   0.38          0.85   0.15                0.76   0.12
  Portrait      0.74   0.26          0.78   0.15                0.84   0.07
  Stilllife     0.78   0.23          0.72   0.24                0.83   0.04
  (Beauty scores are in the range [1, 10].)

We also tested the visual complexity features on a high/low-quality classification task. Photos are divided into a high-quality and a low-quality class by introducing a threshold parameter δ: photos with beauty scores higher than 5.5 + δ are considered high quality, while photos with beauty scores lower than 5.5 − δ are considered low quality. A higher δ leads to more unambiguous training samples, making the classification easier; when δ = 0, the whole training set is used. We employ random forests as the classifier, with the maximum tree depth set to 5 and the maximum number of trees in the forest set to 100. We set the threshold δ to [1.0, 0.9, ..., 0.1, 0.0]. Even for δ = 1, several hundred photos remain in each category, which is enough to test the proposed method. The performance is shown in Fig. 6.

Fig. 6. Performance comparison on the high/low-quality classification task: classification accuracy as a function of the threshold δ for the eight categories, comparing the proposed features, the compression-related features [18], and the sum of gradient [16].
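A minimal sketch of this classification setup, assuming the feature matrix and beauty scores of one category are available; scikit-learn's RandomForestClassifier stands in for the random forest, with the maximum depth of 5 and 100 trees stated above, and the cross-validated evaluation is our choice rather than the exact protocol of the paper:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def quality_classification_accuracy(X, scores, delta):
    """Split photos into high (score > 5.5 + delta) and low (score < 5.5 - delta)
    quality, drop the ambiguous middle, and report cross-validated accuracy."""
    high = scores > 5.5 + delta
    low = scores < 5.5 - delta
    keep = high | low
    X_kept, y = X[keep], high[keep].astype(int)
    clf = RandomForestClassifier(n_estimators=100, max_depth=5)
    return float(cross_val_score(clf, X_kept, y, cv=5, scoring="accuracy").mean())

# Sweep the threshold as in the paper: delta = 1.0, 0.9, ..., 0.0
# accuracies = [quality_classification_accuracy(X, scores, d)
#               for d in np.arange(1.0, -0.05, -0.1)]
```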
These results are consistent with the performance in the regression task. The visual complexity features perform best on photos from the “Landscape” category and worst on the “Portrait” category. This is because the features indicating complexity in landscape photos mostly correspond to more objects, and complex topographies and landforms are easily summarized by the composition and statistics features. In contrast, the complexity of portrait photos, which mainly show human faces, is difficult to measure using only low-level features; semantic interpretation is necessary, and familiarity may be a predominant factor in complexity perception. On average, the proposed visual complexity features outperform the compression-related features of [18] by 8.5% and the sum of gradient of [16] by 14.7% for the “Landscape” category, and by 3.9% and 6.8%, respectively, for the “Portrait” category.

6 Conclusions

Through a small-scale experiment, we found that human beings' judgements of complexity levels are congruous; hence the complexity level of a photo is measurable. We proposed a set of visual complexity features and trained a complexity model on them. We then calculated complexity levels for a large-scale photo database to explore the relationship between beauty expectation and complexity level. Our analysis confirmed the ascending part of Berlyne's inverted-U curve and the importance of complexity in aesthetic assessment. The proposed visual complexity features proved effective in both the beauty prediction and the quality classification tasks. In future work, we intend to enrich the definition of complexity to better model the human sensation of complexity and to include semantic features such as familiarity. To improve the accuracy of the complexity model, we plan to collect complexity labels through crowd-sourcing. We would also like to further explore the role that complexity plays in aesthetic assessment and try to predict aesthetic ranks for photos. (For further information about the photos used in the experiment and the collected complexity labels, please contact sun1101@hal.t.u-tokyo.ac.jp.)

References

1. Akalin, A., Yildirim, K., Wilson, C., Kilicoglu, O.: Architecture and engineering students' evaluations of house façades: Preference, complexity and impressiveness. Journal of Environmental Psychology 29(1), 124-132 (2009)
2. Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 33(5), 898-916 (2011)
3. Berlyne, D.E.: Studies in the New Experimental Aesthetics: Steps Toward an Objective Psychology of Aesthetic Appreciation. Hemisphere (1974)
4. Berlyne, D.E.: Aesthetics and Psychobiology. Appleton-Century-Crofts, New York (1971)
5. Donderi, D.C.: Visual complexity: a review. Psychological Bulletin 132(1), 73 (2006)
6. Forsythe, A., Nadal, M., Sheehy, N., Cela-Conde, C.J., Sawey, M.: Predicting beauty: fractal dimension and visual complexity in art. British Journal of Psychology 102(1), 49-70 (2011)
7. He, X.C., Yung, N.H.: Curvature scale space corner detector with adaptive threshold and dynamic region of support. In: Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), vol. 2, pp. 791-794. IEEE (2004)
8. Heath, T., Smith, S.G., Lim, B.: Tall buildings and the urban skyline: the effect of visual complexity on preferences. Environment and Behavior 32(4), 541-556 (2000)
9. Mallon, B., Redies, C., Hayn-Leichsenring, G.U.: Beauty in abstract paintings: perceptual contrast and statistical properties. Frontiers in Human Neuroscience 8 (2014)
10. Martín H., J.A., Santos, M., de Lope, J.: Orthogonal variant moments features in image analysis. Information Sciences 180(6), 846-860 (2010)
11. Moon, P., Spencer, D.E.: Geometric formulation of classical color harmony. JOSA 34(1), 46-50 (1944)
12. Murray, N., Marchesotti, L., Perronnin, F.: AVA: A large-scale database for aesthetic visual analysis. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2408-2415. IEEE (2012)
13. Nasar, J.L.: What design for a presidential library? Complexity, typicality, order, and historical significance. Empirical Studies of the Arts 20(1), 83-99 (2002)
14. Reber, R.: Processing fluency, aesthetic pleasure, and culturally shared taste. In: Aesthetic Science: Connecting Minds, Brains, and Experience, pp. 223-249 (2012)
15. Reber, R., Schwarz, N., Winkielman, P.: Processing fluency and aesthetic pleasure: is beauty in the perceiver's processing experience? Personality and Social Psychology Review 8(4), 364-382 (2004)
16. Redies, C., Amirshahi, S.A., Koch, M., Denzler, J.: PHOG-derived aesthetic measures applied to color photographs of artworks, natural scenes and objects. In: Fusiello, A., Murino, V., Cucchiara, R. (eds.) ECCV 2012 Ws/Demos, Part I. LNCS, vol. 7583, pp. 522-531. Springer, Heidelberg (2012)
17. Rigau, J., Feixas, M., Sbert, M.: Informational aesthetics measures. IEEE Computer Graphics and Applications 28(2), 24-34 (2008)
18. Romero, J., Machado, P., Carballal, A., Santos, A.: Using complexity estimates in aesthetic image classification. Journal of Mathematics and the Arts 6(2-3), 125-136 (2012)
19. Stamps III, A.E.: Entropy, visual diversity, and preference. The Journal of General Psychology 129(3), 300-320 (2002)
20. Tuch, A.N., Bargas-Avila, J.A., Opwis, K., Wilhelm, F.H.: Visual complexity of websites: Effects on users' experience, physiology, performance, and memory. International Journal of Human-Computer Studies 67(9), 703-715 (2009)
21. Vu, C.T., Phan, T.D., Chandler, D.M.: A spectral and spatial measure of local perceived sharpness in natural images. IEEE Transactions on Image Processing 21(3), 934-945 (2012)