Jeremy Bernardy and Erica Hway INTEGRATING HUMAN AND COMPUTER VISION ARCHITECTURE AS CATALYST 2015 WORKSHOP Andrea Johnson and Jentery Sayers GROUP PORTFOLIO
Project Abstract The final project focused on determining what “Reading Façades” meant to each group. As we explored the space around Pillsbury Hall, we concluded that the façade of the building was not just its face but also the surrounding context that completed a defined journey. To document this, we video recorded a path that traversed the exterior of Pillsbury Hall. Along this path we noticed that what we perceived as the defining moments of the space differed considerably from what the computer read as important. As humans we tended to focus on a specific object, whereas the computer focused on the surrounding context.
jbernardy42
schw1309
BERNARDY / HWAY
INDIVIDUAL SUMMARY Jeremy Bernardy “Reading Façades” took on a spectrum of meanings in my exploration as the week progressed. At the beginning of the week, a façade meant a composition of hierarchical elements. These elements could be materials, windows and doors, entry ports; anything that defined a face of a building. Once the term “computer vision” became part of the discussion, a new understanding began to develop. The focus of the photography expeditions shifted to the study of how computer modelling differs from what we, as humans, perceive in the real-world environment. The final project shifted this understanding and focus once again, to how we can interpret computer vision and human vision together to create a façade that is open and transparent yet still visually discernible as a built boundary.
When we [as a class] first photographed the faces of Rapson Hall, it was interesting to see how the computer program PhotoScan would deal with the various degrees of difference between each person’s input of data. Distance, sunlight, resolution, and what was captured in the frame all played a role in how the final model rendered from the computer. From my exploration I found that computer modelling relies on similarities in environment but not in building or object texture. For instance, it often helped to have the full building in the frame and then overlap each successive picture by at least 1/3 of the previous picture. The PhotoScan model, once proper techniques were used, was actually a fairly accurate representation of the actual object in both two dimensions and three dimensions (proportionally at least).
In the second exploration, of the Pillsbury Hall entrance archway, the usefulness of PhotoScan for the design of architecture became truly apparent. The intricate stone carvings and jagged texture of the stones used would be quite difficult to model on the fly, even in programs such as Rhino. When Dustin Schipper was able to CAD/CAM, with great precision, a foam replica of the face that was carved in the actual stone, it led me to see the value that the program would have for, say, historic preservation architecture, or really any field where the object in question must be handled with care. This was also readily apparent in Jentery Sayers’s lecture, in the antique pin that was modelled with moving parts.
In my final project (with Erica Hway) we focused on how the definition of façade includes not only what you see in the building but also how the surrounding environment contributes. For the project we began by video recording a specified path that we believed defined the “façade” of Pillsbury Hall. Along this path we photographed defined moments where the façade changed dramatically but was also part of the larger whole. From this exploration we then analyzed each of the photographed points in PhotoScan to get a sense of what computer vision could tell us about our larger path. We found that the computer would often model the surrounding context with precision but leave out the object that we [in human vision] were focusing on. The context from each of the models was then imported into Rhino in order to reduce each of the entries to a simplified version of only 50 faces. From this we combined the reduced models into a new model which represents the collection of what the computer interpreted as the “façade” of our recorded path. This new path is then the result of how computer vision reads façades through human vision input.
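The one-third overlap guideline mentioned in this summary can be turned into a rough estimate of how many photographs a single pass along a façade would need. The following is a minimal Python sketch and not part of the workshop workflow. The function name `photos_needed` and the widths used with it are hypothetical, and it assumes the camera moves parallel to the façade so that every frame covers the same width.

```python
import math
from fractions import Fraction


def photos_needed(subject_width, frame_width, overlap=Fraction(1, 3)):
    """Estimate how many photos cover a subject in one straight pass.

    Hypothetical helper: subject_width and frame_width share any unit,
    and overlap is the fraction of each frame shared with the previous one.
    """
    if subject_width <= 0 or frame_width <= 0:
        raise ValueError("widths must be positive")
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be at least 0 and less than 1")
    if subject_width <= frame_width:
        # The whole subject fits in a single frame.
        return 1
    # Each new photo only adds the part of the frame that is not overlap.
    step = frame_width * (1 - overlap)
    return 1 + math.ceil((subject_width - frame_width) / step)
```

For example, `photos_needed(30, 10)` returns 4 for a 30-unit-wide façade photographed in 10-unit-wide frames with one-third overlap. Raising the overlap increases the count, which is consistent with the observation that sparse photo sets produced weaker models.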
INDIVIDUAL SUMMARY Erica Hway At the core of the week, “Reading Facades” was a study and discussion of computer vision versus human vision. These discussions were based on and accompanied by explorations of our physical environment as well as the use of photogrammetry through programs such as PhotoScan and Rhino. Throughout the discussions, questions regarding computer vision and its use in the architectural field ranged from representational to speculative.
The first exploration was around a place known to us, Rapson Hall. After taking a series of photographs and inputting them into the program PhotoScan, a 3D model was generated based on how the computer read our photos (photogrammetry). To my surprise, the model generated barely resembled the actual physical form of the building. However, the program did generate forms that responded to different lighting and material qualities. Furthermore, we realized that while we were able to relinquish control to the computer, the effects rendered in PhotoScan were greatly dependent on the methodology used to take the photos.
With this knowledge, I did an individual exploration of various materials and lighting qualities throughout campus and also used different photographic methods. One revelation that came from this exercise was the amount of context the computer may perceive based on various textures. For example, objects reflected on glazed surfaces may be translated into physical 3D objects in PhotoScan. Considering the speculative end of the spectrum, photogrammetry displays a computer’s unbiased perception of the world around us, with the ability to have multiple foci.
Following several individual explorations and having gained a greater understanding of these systems, we explored two entries as a class: the rec center entry and the Pillsbury entry. The Pillsbury entry was particularly interesting in that large amounts of detail, including sculptural forms, could be perceived along the facade. Furthermore, multiple paths of traversal had different results (e.g. walking across the facade vs. walking through the facade). In my personal exploration, I walked parallel to the entry rather than through it. My results using PhotoScan were interesting in that it blended together landscape elements (trees, rocks, bushes) with the facade of the building. From here, I continued exploration of contextual analysis using photogrammetry and computer vision.
The last step in this explorative process of the week (while not at all final) was in considering the computer’s perception of peripheral boundaries and how it relates to our perceptions of facades. As we move through spaces, we implicitly perceive spatial boundaries but are not always aware of the extents or contents of these boundaries. We often view a facade as a single face and an independent entity within our context. Through the use of photogrammetry, it became evident that our context and periphery are a complex composition of elements with several material and lighting qualities that interact with each other. 3D models generated through the computer often did not distinguish between the branches overhead that spanned to the buildings across the street or the railings below. In fact, the straightforward focal point that humans usually perceive as they travel through space was completely disregarded using photogrammetry, and a sort of “tunnel of facades” was all that appeared. Perhaps the way that computer vision forms our periphery, while not true to the actual physical form, is a more accurate representation of how we implicitly experience our surrounding environment.
The Path The path that we chose used Pillsbury Hall as the focal object, with the surrounding context serving as a study of how computer vision and human vision would interpret the space differently. The path included eight points spaced in a consistent pattern yet placed at defining moments in the space. The video was recorded from a first-person perspective, and the subsequent photographs were taken as 180-degree panorama sets.
Computer Vision
Human Vision
McNamara Alumni Center Texture on the surface of the PhotoScan photogrammetry model gives the facade volume. However, when you rotate the model it is clear that not enough photos were used to create a model with depth. Here only six photographs were used to create a quick study. Future modeling would benefit from additional photographs.
Photographed Critical Points The chosen points on our path are spaced in such a way as to reflect the nature of the context while maintaining a consistent distance. The photographs were part of a larger series taken in a 180° panorama style. The points also present the facade of Pillsbury Hall as we interpreted it to be: a re-presentation of what computer vision looks like with human vision influence.
First
Last
2
3
4
5
6
7
Process The process used for the project was to first record and document the path. Once the photos had been analyzed in PhotoScan to see what the differences between computer vision and human vision were, we ran the meshes through Rhino in order to decimate each model to 50 faces.
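The step of combining the reduced models into one can be illustrated outside Rhino. The following is a minimal Python sketch, assuming each decimated model is available as a list of vertices and a list of faces that index into those vertices. It is not the procedure we used in Rhino, and the function name `merge_meshes` and the mesh layout are hypothetical.

```python
def merge_meshes(meshes):
    """Combine several meshes into a single vertex list and face list.

    Hypothetical layout: each mesh is a (vertices, faces) pair, where
    vertices is a list of (x, y, z) tuples and faces is a list of tuples
    of indices into that mesh's own vertex list.
    """
    merged_vertices = []
    merged_faces = []
    for vertices, faces in meshes:
        # Indices of this mesh must be shifted past the vertices already added.
        offset = len(merged_vertices)
        merged_vertices.extend(vertices)
        for face in faces:
            merged_faces.append(tuple(index + offset for index in face))
    return merged_vertices, merged_faces
```

The merged face count is simply the sum of the individual counts, so if all eight points on the path were modelled and each reduced to 50 faces, the combined model would hold at most 400 faces.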
Combined Final Model The final model represents a façade we created using photogrammetry, computer vision, and human vision. The façade of a building for us meant not only where the building starts and stops but also where the surrounding context begins and ends. This larger whole makes up the true face of the building.