No more Discrimination : Cross City Adaptation of Road Scene Segmenters

One-line summary

The paper applies domain-adversarial training (DANN) to domain adaptation for segmentation. To adapt, the network is trained adversarially so that the gap between source and target in the latent space shrinks.

The target problem is road scene segmentation. Annotated data is always hard to get for segmentation, so the paper reuses the annotations of an existing dataset and tackles segmentation through unsupervised domain adaptation.

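It builds on Domain-Adversarial Neural Networks, which came out last year or so. A discriminator is given the job of telling the domains apart, while the feature extractor keeps adjusting the latent space so that the domains cannot be distinguished.
This paper applies nearly the same idea.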

In the architecture figure, gray is the feature extractor, green is the global domain discriminator, orange is the class-wise domain discriminator, and the other orange-ish (?) block is the predictor.

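In the global adversarial learning, the discriminator classifies which domain each feature came from.
eq(2) builds the loss as a minimax objective like a standard GAN, but the paper says this makes the discriminator fall into a local minimum. I don't really understand what that means; wouldn't alternating training be fine?
Anyway, they split it into two losses and train iteratively; see eq(4) and eq(5). Semantically this is the same as eq(2). It is done at grid level, by dividing the feature map into grids.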
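The two-loss split can be sketched as below. This is only an illustration, under the assumption that eq(4) is an ordinary binary cross-entropy for the discriminator and eq(5) is the same loss with the domain labels flipped, used to update the feature extractor; the exact form should be checked against the paper. The function names and the convention that each array holds per-grid probabilities of "this grid is from the source domain" are my own.

```python
import numpy as np

def discriminator_loss(p_src, p_tgt, eps=1e-7):
    """Cross-entropy for the domain discriminator (assumed form of eq(4)).

    p_src, p_tgt: per-grid probabilities, predicted by the discriminator,
    that a grid comes from the source domain.
    """
    p_src = np.clip(np.asarray(p_src, dtype=float), eps, 1.0 - eps)
    p_tgt = np.clip(np.asarray(p_tgt, dtype=float), eps, 1.0 - eps)
    # Source grids should be scored as source, target grids as target.
    return float(-(np.log(p_src).sum() + np.log(1.0 - p_tgt).sum()))

def inverse_domain_loss(p_src, p_tgt, eps=1e-7):
    """Same loss with flipped domain labels (assumed form of eq(5)).

    Minimizing this with respect to the feature extractor pushes the
    features toward fooling the discriminator.
    """
    return discriminator_loss(p_tgt, p_src, eps)

# A discriminator that separates the domains well: its own loss is small,
# while the flipped-label loss seen by the feature extractor is large.
p_src = np.array([0.9, 0.8])
p_tgt = np.array([0.1, 0.2])
d_loss = discriminator_loss(p_src, p_tgt)
inv_loss = inverse_domain_loss(p_src, p_tgt)
```

In training, one step would minimize `discriminator_loss` over the discriminator weights and the next step would minimize `inverse_domain_loss` over the feature extractor weights, alternating between the two.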
In the class-wise adversarial learning, once the global adaptation above is finished, the predictor predicts labels for source and target, and the target predictions are used as pseudo labels. The resulting pixel-wise labels are converted to grid-level labels (eq. 6, 7, 8: simply assign the most frequent class in each grid), and class-wise adversarial learning is run on these. That is, with 10 classes there are 10 discriminators, each trying to tell whether its input is source or target. As before, the discriminators and the feature extractor are trained iteratively (eq 11).
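The pixel-to-grid conversion, read as "take the most frequent class in each grid", could look like the sketch below. This follows that simplified reading rather than the exact eq. 6 to 8; the function name, the requirement that the image divides evenly into grids, and the tie-breaking (lowest class index wins) are assumptions of this example.

```python
import numpy as np

def grid_majority_labels(pixel_labels, grid_rows, grid_cols, num_classes):
    """Turn an (H, W) map of integer class ids into a (grid_rows, grid_cols)
    map holding the most frequent class inside each grid cell."""
    pixel_labels = np.asarray(pixel_labels)
    h, w = pixel_labels.shape
    if h % grid_rows != 0 or w % grid_cols != 0:
        raise ValueError("image size must be divisible by the grid size")
    cell_h, cell_w = h // grid_rows, w // grid_cols
    out = np.empty((grid_rows, grid_cols), dtype=np.int64)
    for r in range(grid_rows):
        for c in range(grid_cols):
            cell = pixel_labels[r * cell_h:(r + 1) * cell_h,
                                c * cell_w:(c + 1) * cell_w]
            # Count pixels per class; argmax picks the lowest index on ties.
            counts = np.bincount(cell.ravel(), minlength=num_classes)
            out[r, c] = counts.argmax()
    return out

# Toy 4x4 pseudo-label map reduced to a 2x2 grid.
pseudo = np.array([[0, 0, 1, 1],
                   [0, 2, 1, 1],
                   [2, 2, 3, 0],
                   [2, 1, 3, 3]])
grid = grid_majority_labels(pseudo, grid_rows=2, grid_cols=2, num_classes=4)
```

Each grid cell then goes to the discriminator of its assigned class, so a discriminator only ever compares source and target features of the same class.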
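Section 4.3 compares past and present photos of the same road-view location, finds static objects via superpixel segmentation, and uses the result to refine the predictions on target images. -> I don't think this is very meaningful, and it doesn't look very effective either.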

Overall, the gist is: train the discriminator and the feature extractor adversarially so that the latent space of source and target is adjusted to work well on the target.

For the results, Table 1 shows the numbers before and after adaptation (GA is global adaptation, CA is class adaptation, full method is GA+CA+Static, pre-trained is before adaptation, and finetune is trained with supervision).

Besides their own dataset, they also evaluate on SYNTHIA -> Cityscapes.

Everything is measured with mean IoU.
I don't know how much the baseline was tuned (the numbers seem lower than I'd expect), but in any case performance does improve.
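For reference, mean IoU can be computed from a confusion matrix as in the sketch below. This is the generic definition, not the paper's evaluation script; the function name is my own, and skipping classes that appear in neither the prediction nor the ground truth is an assumption of this example (benchmarks also typically ignore a "void" label, which is not handled here).

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred, target: integer class ids of the same shape.
    Classes absent from both pred and target are left out of the mean.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # conf[i, j] = number of pixels with true class i predicted as class j.
    conf = np.bincount(target * num_classes + pred,
                       minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())

# Class 0: IoU 1/2, class 1: IoU 2/3.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```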
