Portfolio
Computational Design Selected Works 2017-2021
Wenjie Huang Computer Science
Wenjie Huang Tel : +86 13719192209 Email : JankinHuang0302@163.com
EDUCATION 17-22
Bachelor of Architecture, Shenzhen University, College of Architecture and Urban Planning
INTERNSHIPS 21-22
XKoolTech, back-end development with Python and C#, research on deep learning in architecture
19
11 Architect, research on educational architecture
EXPERIENCE 21
2021 DigitalFUTURES Workshops: Artificial Intelligence & Big Data Features Prediction System, Tongji University, Philip F. Yuan, Hao Zheng
21
Kaggle – Shopee Price Matching Competition: Exploratory data analysis and multimodal prediction
20
2020 DigitalFUTURES Workshops: Convolutional Neural Network in Architectural Design, Tongji University, Philip F. Yuan, Hao Zheng
HONORS 21/20
DigitalFUTURES Workshops: Excellent team member
21
Kaggle – Shopee Price Matching Competition: Top 11%
20
Second Prize in National Scholarship
19/20
First Prize Excellent Student Cadre Scholarship
SKILLS Skilled in Python development and deep learning applications; skilled in computational design and architectural robotics
LANGUAGE Native in Mandarin Chinese, Native in Cantonese, Fluent in English
CONTENT
01 Multi-Modal Price Matching [EfficientNet, NFNet, BERT]
02 Deep Learning + Design Projects [CNN, GAN, Big Data Features Prediction System]
03 Market Memory Storage [YOLO_v3, Object Detection, Architecture Design]
01 Multi-Modal Price Match Guarantee
EfficientNet, NFNet, BERT
Kaggle: Shopee - Price Match Guarantee
Teammate: Zoey Xu zoeyxu@aliyun.com
Duration: 2021.5
Competition: Team Work (Leader)
Do you scan online retailers in search of the best deals? In this competition, we applied our machine learning skills to build a model that predicts which items are the same product. I led my teammates through exploratory data analysis, feature processing, building a baseline (TF-IDF + ResNet18 + result splicing), and iterative model optimization. I was responsible for tuning the fine-tuning parameters of the BERT model and the TF-IDF parameters, training NFNet for the image branch, selecting the similarity algorithm, fusing the models, and splicing the prediction results. We ranked in the top 11% among over 3000 teams.
ABSTRACT
Kaggle: Shopee - Price Match Guarantee
Description: Retail companies use a variety of methods to assure customers that their products are the cheapest. Among them is product matching, which allows a company to offer products at rates that are competitive with the same product sold by another retailer. In this competition, we applied our machine learning skills to build a model that predicts which items are the same product. The applications go far beyond Shopee or other retailers: our contributions to product matching could support more accurate product categorization and uncover marketplace spam.
Evaluation: F1 score = 2 × (Precision × Recall) / (Precision + Recall)
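To make the evaluation metric concrete, here is a minimal Python sketch of the F1 score for a single posting, treating the predicted and the true matches as sets of posting IDs. The function name `f1_for_posting` and the example IDs are illustrative assumptions, not taken from the competition code.

```python
def f1_for_posting(predicted: set, actual: set) -> float:
    """F1 = 2 * (precision * recall) / (precision + recall) for one posting."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three predicted matches, two of them correct.
predicted = {"post_1", "post_2", "post_3"}
actual = {"post_1", "post_2"}
print(round(f1_for_posting(predicted, actual), 3))  # precision 2/3, recall 1 -> 0.8
```

Over-predicting costs precision and under-predicting costs recall, so the size of each predicted match group matters as much as its correctness.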
DATASETS 34250 (Image + Titles + Phash): 34250 images in the training set, roughly 70,000 images in the hidden test set
[train/test].csv - the training set metadata.
posting_id - the ID code for the posting.
image - the image id/md5sum.
image_phash - a perceptual hash of the image.
title - the product description for the posting.
label_group - ID code for all postings that map to the same product. Not provided for the test set.
[train/test]_images - the images associated with the postings (34250).
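As an illustration of how the image_phash field can be compared, the sketch below counts the differing bits (Hamming distance) between two hex-encoded hashes; a small distance suggests near-duplicate images. The hash values are made up, and the source does not state whether the hash was used this way in the solution.

```python
def phash_hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count differing bits between two hex-encoded perceptual hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


# Made-up 64-bit hashes that differ only in the last hex digit (3 vs. 2 = 1 bit).
print(phash_hamming_distance("94974f937d4c2433", "94974f937d4c2432"))  # 1
```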
OVERVIEW My Solution in the Competition Top 11%
IMAGE EfficientNet-B3 + EfficientNet-B5 + NFNet-L0
TEXT BERT + TF-IDF
PREDICTION KNN & Cosine Similarity
KNN: k-nearest neighbors algorithm
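The prediction step can be sketched as follows: for each posting, rank all postings by cosine similarity of their embeddings, keep the k nearest, and accept those above a similarity threshold. This is a minimal pure-Python sketch; the function names, the values of `k` and `threshold`, and the toy embeddings are assumptions for illustration, not the settings used in the competition.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_postings(embeddings, k=50, threshold=0.8):
    """Map each posting ID to its k nearest neighbors above the threshold."""
    matches = {}
    for pid, vec in embeddings.items():
        scored = sorted(
            ((cosine_similarity(vec, other), oid) for oid, other in embeddings.items()),
            reverse=True,
        )[:k]
        matches[pid] = [oid for similarity, oid in scored if similarity >= threshold]
    return matches


# Toy 2-D embeddings: "a" and "b" point in almost the same direction.
toy = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(match_postings(toy, k=2, threshold=0.8))
```

Every posting matches itself with similarity 1, so a posting with no close neighbor ends up with a self-only match group.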
COMPARISON Comparison to Top 1% Kernel in the Competition
CONCLUSION Limitations & Future works
Future Works: Compared with the winning kernel of the competition, the most serious deficiencies of our kernel are post-processing and voting at prediction time. In the prediction results of our best early LB-score model, we found a large number of unmatched samples, that is, postings whose only match is themselves. For these samples we can relax the matching criterion (increase the matching distance threshold or lower the similarity threshold).
Comparison:
Baseline (TF-IDF + ResNet18 + Result Splicing): LB Score 0.653, Top 80%
My Solution (EfficientNet-B3 + EfficientNet-B5 + NFNet-L0 + TF-IDF + BERT, Multi-Modal): LB Score 0.734, Top 11%
Top Solution (NFNet-L0 + swin-small + EfficientNet-B0 + BERT + Indonesian-BERT + Multilingual-v1 + TF-IDF): LB Score 0.746, Top 1%
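One possible reading of the threshold relaxation described under Future Works is sketched below: a posting that matched only itself is given its single best neighbor, provided that neighbor passes a looser similarity threshold. The function name, the data layout, and the value 0.6 are hypothetical; this is not the post-processing of the winning kernel.

```python
def relax_self_only_matches(matches, neighbors, relaxed_threshold=0.6):
    """Give self-only postings their best neighbor if it passes a looser threshold.

    matches:   posting_id -> list of matched posting_ids (always includes itself)
    neighbors: posting_id -> list of (similarity, other_id), sorted best-first,
               excluding the posting itself
    """
    relaxed = {}
    for pid, matched in matches.items():
        relaxed[pid] = list(matched)
        if matched == [pid] and neighbors.get(pid):
            similarity, other_id = neighbors[pid][0]
            if similarity >= relaxed_threshold:
                relaxed[pid].append(other_id)
    return relaxed


# Hypothetical data: "a" and "c" matched only themselves under the strict threshold.
matches = {"a": ["a"], "b": ["b", "c"], "c": ["c"]}
neighbors = {"a": [(0.7, "b")], "c": [(0.3, "a")]}
print(relax_self_only_matches(matches, neighbors))
```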
02 Deep Learning + Design Projects
CNN, GAN, Big Data Features Prediction System
Projects in DigitalFUTURES and XKoolTech
Tutor: Hao Zheng, Philip F. Yuan, Tai Li
Duration: 2020-2021
Design: 2.2/2.3 Individual Work, 2.1 Team Work (Leader)
2.1 Urban Big Data Features Prediction System
Used the Amap API to collect POI and geographic data for Shanghai and combined it with popularity and housing-price data crawled from the Lianjia agency website as the base dataset. Adopted the pix2pixHD model to build a forecasting model for the second-hand housing distribution map.
2.2 Reform the Ancient Façade
Used style transfer to explore possible applications of deep CNNs in architectural design. Studied the façade style of the Parthenon and applied the ancient Greek architectural style to modern vehicles.
2.3 Landscape Generation with GAN
Collected 120 landscape plans from Pinterest and ArchDaily. Training a pix2pixHD model on them enables automatic generation of landscape plans.
ABSTRACT 2.1 Urban Big data Features Prediction System
Description: Used the Amap API to collect POI and geographic data for Shanghai and combined it with popularity and housing-price data crawled from the Lianjia agency website as the base dataset. Adopted the pix2pixHD model to build a forecasting model for the second-hand housing distribution map.
DATASETS POI Data + Graph Data
POI (Points of Interest) Data (from the Amap API)
Index: Index of second-hand houses in Shanghai (6000 in total).
location: Longitude and latitude of each sample.
Hospnums: Number of hospitals within 1000 m.
Subnumbs: Number of subway stations within 800 m.
Edunumbs: Number of educational facilities within 500 m.
Total_Price: The price of the whole house.
Unit_Price: Unit price per square meter.
Follow_Count: Number of followers on lianjia.com.
Househeat: Househeat(θ) = α / log2(β), where α = number of followers on lianjia.com and β = duration on sale.
* The counts are based on the service radius of each POI type.
Graph Data (from Baidu Map & ArcGIS)
Architectural outline of Shanghai
Urban road Network of Shanghai
District Centroid Distance: Sum of the distance coefficients from the sample to the 5 closest urban centers.
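The Househeat index defined in the POI data above, α / log2(β), can be computed as in the sketch below. The source does not give the unit of β, so days are assumed here; the guard for β ≤ 1 is also an addition, since log2(1) = 0 would divide by zero.

```python
import math


def househeat(followers: int, days_on_sale: float) -> float:
    """Househeat = alpha / log2(beta), with alpha = followers, beta = duration on sale.

    Assumption: beta is measured in days and must be greater than 1.
    """
    if days_on_sale <= 1:
        raise ValueError("duration on sale must be greater than 1")
    return followers / math.log2(days_on_sale)


# Hypothetical listing: 40 followers after 16 days on sale.
print(househeat(40, 16))  # 40 / log2(16) = 40 / 4 = 10.0
```

The logarithm dampens the effect of long listing times, so a listing that gathers followers quickly scores higher than one that gathers the same number slowly.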
OVERVIEW Data Processing + Generative Adversarial Networks (GAN)
ALGORITHM Generative Adversarial Networks (GAN)
[Figure: GAN structure, with labels Encoder, Decoder, Generator, Discriminator]
DATA PROCESSING Data Processing of POI Data of Shanghai Central Area
[Figure panels: Combie/Househeat, Combie/Unitprice, Hospital_numbs, Edu_numbs, District Centroid Distance, EDA and Structure Data]
MODEL TRAINING Ⅰ Several different attempts of model training
Image segmentation (training set)
Input Feature: POI_Combination_Img (8000×8000), cut into POI_Segmentation (225×512×512)
Input Label: Househeat_Img (8000×8000), cut into Househeat_Segmentation (225×512×512)
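The 225 tiles are consistent with a 15 × 15 grid of 512 × 512 crops (15 × 512 = 7680, which leaves a 320-pixel margin of the 8000-pixel map unused). The sketch below generates the crop boxes under that assumption; the source does not state how the tiles were actually cut, and `tile_boxes` is a hypothetical helper.

```python
def tile_boxes(width: int, height: int, tile: int = 512):
    """Return (left, top, right, bottom) crop boxes on a non-overlapping grid."""
    cols, rows = width // tile, height // tile
    return [
        (c * tile, r * tile, (c + 1) * tile, (r + 1) * tile)
        for r in range(rows)
        for c in range(cols)
    ]


boxes = tile_boxes(8000, 8000)
print(len(boxes))  # 15 x 15 grid -> 225
print(boxes[-1])   # (7168, 7168, 7680, 7680)
```

Applying the same boxes to the POI image and the Househeat image keeps each input tile aligned with its label tile.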
Model 01
[Figure: input (POI_Image), real and synthesized (Househeat_Image), epoch 195 - epoch 200]
POI_Image: Subway only, 225 segments
Househeat_Image: Visualization based on thermal radius
Result: Overfits the training set and generalizes poorly to the test set
Reason: Subway POIs alone are not comprehensive enough, and the Househeat visualization method has errors
Model: Pix2Pix, 200 epochs, Tesla V100
Model 02
[Figure: input (POI_Image), real and synthesized (Househeat_Image), epoch 75 - epoch 80]
POI_Image: Urban map of green space and rivers added
Househeat_Image: Visualization based on thermal radius
Result: Underfits the training set
Reason: The Househeat visualization method has errors and is hard to relate to the urban map
Model: Pix2Pix, 200 epochs, Tesla V100
MODEL TRAINING Ⅱ Several different attempts of model training
Model 03
[Figure: input (POI_Image), real and synthesized (Househeat_Image), epoch 190 - epoch 200]
POI_Image: Full features, 50 selected segments
Househeat_Image: Visualization based on heat
Result: Overfits the training set and generalizes poorly to the test set
Reason: The 50 selected segments easily cause overfitting and poor generalization
Model: Pix2Pix, 200 epochs, Tesla V100
Model 04
[Figure: input (POI_Image), real and synthesized (Househeat_Image), epoch 09 - epoch 14]
POI_Image: Full features, 225 segments
Househeat_Image: Visualization based on small radius
Result: Underfits the training set
Reason: Too many black areas in the labels delivered inaccurate information
Model: Pix2Pix, 200 epochs, Tesla V100
Model 05
[Figure: input (POI_Image), real and synthesized (Househeat_Image), epoch 190 - epoch 200]
POI_Image: RGB full features, 225 segments, with mask
Househeat_Image: Visualization based on weighted value
Result: Good performance on both the training set and the test set
Reason: RGB colors distinguish the different features well and can be learned by the model. The mask reduces interference from non-residential areas
Model: Pix2Pix, 200 epochs, Tesla V100
PREDICTION Using Model 05 to predict Househeat in Beijing
03 Market Memory Storage
YOLO_v3, Object Detection, Architecture Design
AI + Architecture Exhibition in Design Society
Tutor: Fei Qu fqu@szu.edu.cn
Duration: 2021.12
Location: Shenzhen, China
Competition: Team Work (Leader)
Design Society, China's first dedicated cultural design hub, has opened its doors to the public in Shenzhen. The new institution is housed within the Sea World Culture and Arts Center (SWCAC), a building designed by Japanese architect Fumihiko Maki, and houses a new V&A gallery. Shekou Fishing Market will be demolished by the end of November 2020. Invited by Design Society and Huawei Cloud, we will build an art installation in the SWCAC to preserve Shekou residents' memory of the market. This installation, which combines AI and architecture, will be displayed in the museum. We will place cameras in the market and use the YOLO_v3 algorithm to detect human figures in real time. Through stepper motors and art devices in the museum, we convey to exhibition visitors the same sense of crowding as in the Shekou market. This sense of crowding will also be preserved in the form of data.
ABSTRACT AI × Architecture Interactive art installation
Description: Shekou Fishing Market will be demolished by the end of November 2020. Invited by Design Society and Huawei Cloud, we will build an art installation in the Sea World Culture and Arts Center (SWCAC) to preserve Shekou residents' memory of the market. This installation, which combines AI and architecture, will be displayed in the museum.
Interactive: We will place cameras in the market and use the YOLO_v3 algorithm to detect human figures in real time. Through stepper motors and art devices in the museum, we convey to exhibition visitors the same sense of crowding as in the Shekou market. This sense of crowding will also be preserved in the form of data.
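The link between detection and motion can be sketched as a mapping from the number of people detected in a camera frame to a stepper-motor target position. The linear mapping, the function name, and the limits `max_people` and `max_steps` below are all hypothetical; the installation itself is driven by a PLC, and the source does not describe the mapping it uses.

```python
def crowd_to_motor_steps(person_count: int, max_people: int = 50, max_steps: int = 2000) -> int:
    """Linearly map a detected person count to a stepper-motor target position.

    Counts are clamped to [0, max_people], so the motor never exceeds max_steps.
    """
    if max_people <= 0:
        raise ValueError("max_people must be positive")
    ratio = min(max(person_count, 0), max_people) / max_people
    return round(ratio * max_steps)


# Hypothetical frame counts from the market cameras.
for count in (0, 25, 80):
    print(count, crowd_to_motor_steps(count))
```

Clamping keeps an unusually crowded frame from driving the fabric past its mechanical limit.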
SHEKOU Design Society Museum © Maki and Associates
Shekou Market since 1956
Design Society Museum since 2017
CAMERA LOCATION Three Cameras in Shekou Market to record People's activities
OVERVIEW Market + Algorithm + Museum
ALGORITHM YOLO_v3 Object Detection
[Figure: Yolo_v3_Structure]
Input: 416×416×3
Backbone: Darknet-53 without FC layer (DBL, Res1, Res2, Res8, Res8, Res4)
Output branches: DBL×5, DBL, conv on each scale; the Y2 and Y3 branches are reached through DBL, up-sampling and concat
Outputs: Y1 13×13×255, Y2 26×26×255, Y3 52×52×255
DBL (Darknetconv2D_BN_Leaky) = conv + BN + Leaky ReLU
Res_unit = DBL + DBL + Add
resn (Resblock_body) = Zero Padding + DBL + Res_unit×n
YOLO_v3 Algorithm Structure
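The three output sizes in the structure diagram follow from the 416×416 input and the strides 32, 16 and 8 of the three detection scales. The 255 channels equal 3 × (5 + 80), assuming the standard configuration of 3 anchors per scale and the 80 COCO classes; the source does not state which class set the installation used, so these defaults are an assumption.

```python
def yolo_v3_output_shapes(input_size: int = 416, num_classes: int = 80, anchors_per_scale: int = 3):
    """Return the (height, width, channels) of the three YOLO_v3 output maps.

    Each anchor predicts 4 box coordinates, 1 objectness score, and one score per class.
    """
    channels = anchors_per_scale * (5 + num_classes)
    return [(input_size // stride, input_size // stride, channels) for stride in (32, 16, 8)]


print(yolo_v3_output_shapes())  # [(13, 13, 255), (26, 26, 255), (52, 52, 255)]
```

The coarse 13×13 map handles large objects and the 52×52 map handles small ones, which suits people filmed at different distances from a camera.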
Video of Shekou Market
Yolo_v3 Loss Function ©Keyird
STRUCTURE Elastic Fabric + Stepper Motor + Prefabricated Frame
EXHIBITION Exhibition from December 2020 to February 2021
MOTION Controlled by Programmable Logic Controller (PLC)
The PLC controls the movement of the stepper motor
Market - Museum Movement in each time period
2D Plan Drawing of PLC