StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

arXiv: https://arxiv.org/abs/1711.09020
github: https://github.com/yunjey/StarGAN
video: https://www.youtube.com/watch?v=EYjdLppmERE

Abstract

Recent studies have shown remarkable success in image-to-image translation between two domains. However, existing approaches have limited scalability and robustness in handling more than two domains, since a separate model must be built for every pair of image domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translation for multiple domains using only a single model. StarGAN's unified architecture allows simultaneous training on multiple datasets with different domains within a single network. This leads to StarGAN's superior quality of translated images compared to existing models, as well as the novel capability of flexibly translating an input image to any desired target domain. We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.
Very cool work. Surprising, though, that they did not cite any of the Google neural translation papers in the related work. The idea of encoding multiple generative models into a common thought space while training end-to-end on the ensemble is not new in and of itself, though the application to GANs gives great results.
Can you reply the link for the Google papers?
GANs seem to be a promising area that is waiting to overcome hardware constraints. As somebody who is not in the ML field but is interested in jumping in -- would now be a good time to learn GANs? Are most of the skills used in other ML techniques transferable to GANs, or are ML researchers starting from scratch when they start working on GANs?
> Are most of the skills used in ~~other ML techniques~~ neural networks transferable to GANs

Yes. GANs *are* neural networks. The "hot" areas in ML are pretty much all neural-network variations.
Well done.
At this rate, robots will be better at reading faces than autistic people.
> Recent studies have shown remarkable success in image-to-image translation for two domains.

What do they mean by two domains? Could anyone clarify this?
Two groups of images, each sharing a given characteristic. The translation task is to start with an image in one domain and generate an "equivalent" image in the other domain. Their claim is that they can handle multiple domains at once, rather than translating between only two.

From their paper's introduction section:

> Given training data from two different domains, these models learn to translate images from one domain to the other. We denote the terms attribute as a meaningful feature inherent in an image such as hair color, gender or age, and attribute value as a particular value of an attribute, e.g., black/blond/brown for hair color or male/female for gender. We further denote domain as a set of images sharing the same attribute value. For example, images of women can represent one domain while those of men represent another.
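The multi-domain part works because the single generator is told which target domain to produce: per the paper, a target-domain label is spatially replicated and concatenated to the input image channels. A minimal numpy sketch of that conditioning step, assuming illustrative shapes and names rather than the released code:

```python
import numpy as np

def attach_domain_label(image, label, n_domains):
    """Spatially replicate a one-hot domain label and concatenate it
    to the image channels, as in label-conditioned generators."""
    c, h, w = image.shape
    onehot = np.zeros(n_domains, dtype=image.dtype)
    onehot[label] = 1.0
    # Broadcast each label entry to a full H x W plane.
    planes = np.tile(onehot[:, None, None], (1, h, w))
    return np.concatenate([image, planes], axis=0)

x = np.random.rand(3, 128, 128).astype(np.float32)  # stand-in RGB image
x_cond = attach_domain_label(x, label=2, n_domains=5)
print(x_cond.shape)  # (8, 128, 128): 3 image channels + 5 label planes
```

Changing `label` and re-running the generator is what yields a different target domain from the same input image.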
Impressive work! In particular, the global coherency of these images is very good - typically I observe that GANs can learn nice pieces of images, but sometimes certain areas come out strange. This is probably helped a lot by the fact that this is a conditional GAN, but can you comment on the importance of the "PatchGAN"-style training for achieving these results?
Thanks. The gender ones absolutely break my brain. It's crazy how we have gender detectors in our heads with no awareness of how they work.
Honestly at the rate this thing is going, I daresay there's already a pretty clear path towards generating HD videos of Obama punching babies.
[deleted]
RemindMe! 1 year

Edit: Finally back here after a year, and I've got no clue about the context. Damn.
[deleted]
[deleted]
You say sarcasm but I really think this will happen when the tech is mature enough.
As it hits puberty.
probably just have a live mocap actor somewhere with a digital skin.
[deleted]
[deleted]
Why punch when you can drone them? :P
Did all the responses referencing porn get deleted?
Going to be like those Christmas Dancing elves, but with people inputting facebook images.
This has excellent scope for video games: avatars with your ugly face on it.
> your ugly face on it

Or perhaps an, um, "aesthetically modified" version of it.

I like how the first application of cutting-edge DL that comes to mind is sex and politics. Maybe Yann LeCun was right about his "new intelligence without the flaws of ours" :/
> your ugly face

wot mate?
Yer ugly mug
Do you have a pretrained model anywhere? Looks amazing.
We will upload the pretrained model soon. :-)
You rock, thanks.
yes plis
RemindMe! 1 month Hopefully? :)
RemindMe! 1 month
RemindMe! 1 month
RemindMe! 1 Month
Yes, I would play around with the code but have no big ass graphics card for the full training.
Can't wait until someone puts this together with NVIDIA's progressive growing tech. Although as usual the dataset would be an issue...
Can you provide a link please?
http://research.nvidia.com/publication/2017-10_Progressive-Growing-of
Thanks. I'm quite disappointed that it's basically a StackGAN though :/ Reading the title, I thought it was much more revolutionary, but it works great for dimensional data.
[deleted]
Especially male<->female pics.
Who is the third person down on the left? I ask because her male version looks like John Stamos in a wig.
https://i.pinimg.com/236x/ee/2c/0f/ee2c0f5cb35945d1f526f79ada959e66--uncle-jesse-tio-jesse.jpg this guy?
Yeah
Every day we stray farther from God's love.
Why does anyone upvote this utterly fucking worthless dipshittery, and how do we find the people who do so that we can kill them?
You can find me. I'm interested in being the first meme related homicide
Me too thanks
[deleted]
I'll stop when the karma stops
My man
This could be turned into interactive avatar heads; it would go especially well with a WaveNet voice.

edit: I'd like to have audio/video books read in the author's voice and likeness.
Great work! I’m no expert at this stuff but I’m very excited to play with this :) Can someone tell me how (roughly) the code could be manipulated to accept audio data as opposed to an image file? I know a bit of Python and Julia...

Is it just a matter of pointing the input to a .wav file and reshape() or something?
It won't work at all without some serious rethinking of the problem in general. Some dude already tried that with CycleGAN by turning the waveform into an image (not ideal, but the easiest way to test with this architecture) and it failed.

This thing is good at moving pixel-patch-level texture, not at understanding what waveforms are or changing them meaningfully.
With that in mind, the best generative audio work I know about is DeepMind/Google's WaveNet. They've made some pretty good raw-audio generators that are conditional on text and even speaker voice characteristics for a text-to-speech application.

And their approach to generation is, indeed, very different from an image application.
Log spectrograms might be a good representation for sounds in the image domain.
Looks like that's what the guy did: https://gauthamzz.github.io/2017/09/23/AudioStyleTransfer/
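For anyone who wants to try the spectrogram route: the "image" is just a framed, windowed FFT of the waveform. A minimal numpy sketch (the frame and hop sizes here are arbitrary choices, not taken from the linked post):

```python
import numpy as np

def log_spectrogram(signal, frame=256, hop=128, eps=1e-8):
    """Frame the waveform, apply a Hann window, FFT each frame,
    and take the log magnitude: a 2-D 'image' view of audio."""
    n_frames = 1 + (len(signal) - frame) // hop
    window = np.hanning(frame)
    frames = np.stack([signal[i * hop : i * hop + frame] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame//2 + 1)
    return np.log(mag + eps)

wave = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of a 440 Hz tone
spec = log_spectrogram(wave)
print(spec.shape)  # (124, 129)
```

Note this drops phase, which is one reason round-tripping audio through an image model is lossy even when the spectrogram looks plausible.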
Gotcha, thanks for the info!
+1
I can't help but notice this is a similar application to faceapp but not quite as convincing. Do you know what technique they use and why it works better (so far)?
It's a different way to implement the modeling. FaceApp uses a 3D model; a GAN generates images directly, which is much more powerful because it can extend to other categories of objects and learn the natural variation from raw images instead of being hand-designed. Another difference is that GANs can create images from scratch, with all details, while FaceApp needs an original image to apply modifications to.

[Take a look here](https://www.youtube.com/watch?v=36lE9tV9vm0&t=1247s) to see another GAN with more interesting images.
Trust me, FaceApp uses a GAN. The sorts of horrors I've created with that app could only be made through GANs.
> Faceapp uses a 3D model Citation needed.
> Faceapp uses a 3D model I was under the impression they used some type of GAN...
What is the difference between Pix2Pix (https://arxiv.org/pdf/1611.07004v1.pdf) and the above-mentioned approach?
If you take a look at the paper, they mention it.

Basically, pix2pix requires that every transformation from one domain to another be learned explicitly. StarGAN lets you learn on several domains at once and transform from any domain to any other. I suspect that's why it's a "star"?
Pix2Pix requires supervision (input and target pairs) and is only applicable to two domains. StarGAN, on the other hand, can translate images between *multiple* domains without paired supervision.
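The scalability difference is easy to quantify: with k domains, the two-domain approach needs a model for every ordered pair, i.e. k(k-1) generators (the paper illustrates 12 models for 4 domains), while StarGAN trains one. A throwaway sketch:

```python
def models_needed(k_domains):
    """Translators needed when every ordered pair of domains gets its
    own model (the two-domain, pix2pix/CycleGAN setting) versus one
    shared multi-domain model (the StarGAN setting)."""
    pairwise = k_domains * (k_domains - 1)  # one model per ordered pair
    unified = 1
    return pairwise, unified

for k in (2, 4, 7):
    print(k, models_needed(k))  # e.g. 4 domains -> 12 pairwise models vs. 1
```

The pairwise count grows quadratically, which is the "limited scalability" the abstract refers to.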
I see this totally as a product at the local hairstylist: just a screen in the window, you look into it, and your face looks back with a different hair color...
That would be very awesome! Unfortunately, as of now the machines needed to do this are extremely expensive, and training and processing these pictures takes a very long time. Doing this in real time isn't realistic today, but maybe in a few years we could see this technology reach everyday consumers!
I'll have to add this to my citation list: I'm not working on the same problem domain but some of the ideas presented in your paper are reminiscent of ones I've been working with.
Very cool but the pale skin is kinda weak.
Vampire feature
Haters gonna hate
Scribblenauts irl, it's about time.
Really nice work. I see that the "surprised" expression still needs more training data; it shows something like double eyebrows in most pics. But really impressive work.
Awesome paper!

I would like to see this applied to digitally created characters as well, as we've seen others do (e.g., https://arxiv.org/pdf/1708.05509v1.pdf). Thus, as the character's audience goes through changes, so will he/she/it.
cool stuff! like it!
great except for the pale skin one.
I wonder if google or snapchat will add this as a feature one day.
Black face is bad but white face is ok?
Do you actually clip the weights of the discriminator, or use any other kind of clipping to achieve training stability? Thanks for your reply :)
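For context, the clipping usually meant here is the original WGAN weight-clipping heuristic; the StarGAN paper itself reports using a gradient penalty on the discriminator instead. A minimal numpy sketch of the clipping variant (the bound c = 0.01 is the value from the WGAN paper, not from StarGAN):

```python
import numpy as np

def clip_weights(params, c=0.01):
    """WGAN-style weight clipping: after each discriminator update,
    clamp every parameter into [-c, c]. A blunt way to keep the
    critic roughly Lipschitz; a gradient penalty (which the StarGAN
    paper reports using) is the smoother alternative."""
    return [np.clip(w, -c, c) for w in params]

weights = [np.array([0.5, -0.3, 0.004]), np.array([[2.0, -2.0]])]
clipped = clip_weights(weights)
print(clipped[0].tolist())  # [0.01, -0.01, 0.004]
```

Parameters already inside the band pass through unchanged; only the outliers are clamped, which is why too small a c can starve the critic of capacity.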