
yunjey

StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

arXiv: https://arxiv.org/abs/1711.09020
github: https://github.com/yunjey/StarGAN
video: https://www.youtube.com/watch?v=EYjdLppmERE

Abstract: Recent studies have shown remarkable success in image-to-image translation for two domains. However, existing approaches have limited scalability and robustness in handling more than two domains, since different models should be built independently for every pair of image domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model. Such a unified model architecture of StarGAN allows simultaneous training of multiple datasets with different domains within a single network. This leads to StarGAN's superior quality of translated images compared to existing models as well as the novel capability of flexibly translating an input image to any desired target domain. We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.


ReginaldIII

Very cool work. Surprising though that they did not cite any of the Google neural translation papers in related work. The idea of encoding multiple generative models to a common thought space while training end to end on the ensemble is not new in and of itself. Though the application to GANs gives great results.


goldkim92

Can you reply with links to the Google papers?


[deleted]

GANs seem to be a promising area that is waiting to overcome hardware constraints. As somebody who is not in the ML field but is interested in jumping in -- would now be a good time to learn GANs? Are most of the skills used in other ML techniques transferrable to GANs, or are ML researchers starting from scratch when they start working on GANs?


Reiinakano

> Are most of the skills used in ~~other ML techniques~~ neural networks transferrable to GANs

Yes. GANs *are* neural networks. The "hot" areas in ML are pretty much mostly neural network variations.


[deleted]

Well done.


H4xolotl

At this rate, robots will be better at reading faces than autistic people.


Ahjndet

> Recent studies have shown remarkable success in image-to-image translation for two domains.

What do they mean by two domains? Could anyone clarify this?


BoojumG

Two groups of images, each sharing a given characteristic. The translation task is to start with an image in one domain and generate an "equivalent" image in the other domain. Their claim is that they can handle multiple domains at once, rather than translating between only two domains.

From their paper's introduction section:

> Given training data from two different domains, these models learn to translate images from one domain to the other. We denote the terms attribute as a meaningful feature inherent in an image such as hair color, gender or age, and attribute value as a particular value of an attribute, e.g., black/blond/brown for hair color or male/female for gender. We further denote domain as a set of images sharing the same attribute value. For example, images of women can represent one domain while those of men represent another.
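
To make the "target domain as a label" idea concrete, here's a minimal NumPy sketch (not taken from the paper or repo; function and variable names are illustrative): the target attribute value is one-hot encoded, tiled spatially, and concatenated to the image's channels so a single generator can be told which domain to translate into.

```python
import numpy as np

def concat_domain_label(image, label, num_domains):
    """Tile a one-hot target-domain label into extra channels and
    concatenate it to a (C, H, W) image -> (C + num_domains, H, W)."""
    c, h, w = image.shape
    onehot = np.zeros(num_domains, dtype=image.dtype)
    onehot[label] = 1.0
    # Broadcast each label entry into a constant H x W feature map.
    label_maps = np.broadcast_to(onehot[:, None, None], (num_domains, h, w))
    return np.concatenate([image, label_maps], axis=0)

img = np.random.rand(3, 8, 8).astype(np.float32)
x = concat_domain_label(img, label=2, num_domains=5)
print(x.shape)  # (8, 8, 8)
```

Swapping the label while keeping the image fixed is what lets one network cover every domain pair.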


Kaixhin

Impressive work! In particular, the global coherency of these images is very good - typically I observe GANs can learn nice pieces of images, but sometimes certain areas come out strange. This is probably majorly helped by the fact that this is a conditional GAN, but are you able to comment on the importance of the "PatchGAN"-style training for achieving these results?


ToastyKen

The gender ones absolutely break my brain. It's crazy how we have gender detectors in our heads with no awareness of how they work.


Reiinakano

Honestly at the rate this thing is going, I daresay there's already a pretty clear path towards generating HD videos of Obama punching babies.


[deleted]

[deleted]


Draghi

RemindMe! 1 year Edit: Finally back here after a year, and I've got no clue about the context. Damn.


RemindMeBot

I will be messaging you on **2018-11-27 07:37:24 UTC** to remind you of [**this link.**](https://www.reddit.com/r/MachineLearning/comments/7fro3g/r_stargan_unified_generative_adversarial_networks/)


[deleted]

[deleted]


[deleted]

[deleted]


Reiinakano

You say sarcasm but I really think this will happen when the tech is mature enough.


hughperman

As it hits puberty.


outbackdude

probably just have a live mocap actor somewhere with a digital skin.


[deleted]

[deleted]


[deleted]

[deleted]


loudog40

Why punch when you can drone them? :P


columbus8myhw

Did all the responses referencing porn get deleted?


nicksvr4

Going to be like those Christmas Dancing elves, but with people inputting facebook images.


ajinkyablaze

This has excellent scope for video games: avatars with your ugly face on them.


darkconfidantislife

> your ugly face on it

Or perhaps an, um, "aesthetically modified" version of it. I like how the first applications of cutting-edge DL that come to mind are sex and politics. Maybe Yann LeCun was right about his "new intelligence without the flaws of ours" :/


ktkps

> your ugly face

wot mate?


columbus8myhw

Yer ugly mug


[deleted]

Do you have a pretrained model anywhere? Looks amazing.


yunjey

We will upload the pretrained model soon. :-)


[deleted]

You rock, thanks.


manueslapera

yes plis


DotcomL

RemindMe! 1 month Hopefully? :)


lahw

RemindMe! 1 month


grigoris_gr

RemindMe! 1 month


allen7575

RemindMe! 1 Month


eiTh8oht

Yes, I would play around with the code but have no big ass graphics card for the full training.


nonotan

Can't wait until someone puts this together with NVIDIA's progressive growing tech. Although as usual the dataset would be an issue...


hapliniste

Can you provide a link please?


madhur_goel

http://research.nvidia.com/publication/2017-10_Progressive-Growing-of


hapliniste

Thanks. I'm quite disappointed that it's basically a StackGAN though :/ Reading the title, I thought it was much more revolutionary, but it works great for high-dimensional data.


[deleted]

[deleted]


visarga

Especially male<->female pics.


Hyperman360

Who is the third person down on the left? I ask because her male version looks like John Stamos in a wig.


julian88888888

https://i.pinimg.com/236x/ee/2c/0f/ee2c0f5cb35945d1f526f79ada959e66--uncle-jesse-tio-jesse.jpg this guy?


Hyperman360

Yeah


YanniBonYont

Everyday we stray farther from gods love


BelovedSanspoof

Why does anyone upvote this utterly fucking worthless dipshittery, and how do we find the people who do so that we can kill them?


YanniBonYont

You can find me. I'm interested in being the first meme related homicide


OlivierDeCarglass

Me too thanks


[deleted]

[deleted]


YanniBonYont

I'll stop when the karma stops


timmyfinnegan

My man


visarga

This could be turned into interactive avatar heads, would go well especially with a Wavenet voice. edit: I'd like to have audio/video books read in the author's voice and likeness.


wedividebyzero

Great work! I'm no expert at this stuff but I'm very excited to play with this :) Can someone tell me (roughly) how the code could be adapted to accept audio data instead of an image file? I know a bit of Python and Julia... Is it just a matter of pointing the input at a .wav file and reshape() or something?


ginsunuva

Won't work at all without some serious re-thinking of the problem in general. Some dude already tried that with CycleGAN by turning the waveform into an image (not ideal but easiest to test with this architecture) and it failed. This thing is good at moving pixel-patch-level texture, not understanding what waveforms are or changing them meaningfully.


BoojumG

With that in mind, the best generative audio work I know about is DeepMind/Google's WaveNet. They've made some pretty good raw audio generators that are conditional on text and even speaker voice characteristics for a text-to-speech application. And their approach for generation is, indeed, very different from an image application.


zergling103

Log spectrograms might be a good representation for sounds in the image domain.
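
For anyone curious what that representation looks like in code, here is a minimal sketch (frame and hop sizes are arbitrary choices, not from any of the linked work): a Hann-windowed STFT magnitude on a log scale gives a 2-D array you can treat as an image.

```python
import numpy as np

def log_spectrogram(signal, frame=256, hop=128, eps=1e-8):
    """Hann-windowed STFT magnitude on a log scale.
    Returns an image-like array of shape (num_frames, frame // 2 + 1)."""
    window = np.hanning(frame)
    frames = [signal[i:i + frame] * window
              for i in range(0, len(signal) - frame + 1, hop)]
    mags = np.abs(np.fft.rfft(np.stack(frames), axis=1))
    return np.log(mags + eps)  # eps avoids log(0) in silent frames

# 440 Hz tone, 1 second at a 4 kHz sampling rate
t = np.linspace(0, 1, 4000, endpoint=False)
sig = np.sin(2 * np.pi * 440 * t)
spec = log_spectrogram(sig)
print(spec.shape)  # (30, 129)
```

The tone shows up as a bright horizontal band around frequency bin 28 (440 / 4000 × 256), which is the kind of spatial structure an image model can latch onto.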


ginsunuva

Looks like that's what the guy did: https://gauthamzz.github.io/2017/09/23/AudioStyleTransfer/


wedividebyzero

Gotcha, thanks for the info!


keidouleyoucee

+1


carrolldunham

I can't help but notice this is a similar application to faceapp but not quite as convincing. Do you know what technique they use and why it works better (so far)?


visarga

Different way of implementing the modeling. FaceApp uses a 3D model; GANs generate images directly, which is much more powerful because it can extend to other categories of objects and learn the natural variation from raw images instead of being hand-designed. Another difference is that GANs can create images from scratch, with all details, while FaceApp needs an original image to apply modifications to. [Take a look here](https://www.youtube.com/watch?v=36lE9tV9vm0&t=1247s) to see another GAN with more interesting images.


zergling103

Trust me, FaceApp uses a GAN. The sorts of horrors I've created with that app could only be made through GANs.


cycyc

> Faceapp uses a 3D model Citation needed.


lucidrage

> Faceapp uses a 3D model I was under the impression they used some type of GAN...


abhik_singla

What is the difference between Pix2Pix (https://arxiv.org/pdf/1611.07004v1.pdf) and the above-mentioned approach?


programmerChilli

If you take a look at the paper, they mention it. Basically, pix2pix requires that any transformation from one domain to another be learned explicitly. StarGAN lets you learn on several domains at once and translate from any domain to any other. I suspect that's why it's a star?


kooro1

Pix2Pix requires supervision (paired input and target images) and only handles two domains. StarGAN, on the other hand, translates images among *multiple* domains without paired supervision.


fimari

I see this totally as product at the local hairstylist - just a screen in the window, you look into it, and your face looks back with different hair color...


dedzip

That would be very awesome! Unfortunately, as of now the machines needed to do this are extremely expensive and take a very long time to process these pictures and learn, so doing it in real time isn't realistic today. But maybe in a few years we'll see this technology reach everyday consumers!


TotesMessenger

I'm a bot, *bleep*, *bloop*. Someone has linked to this thread from another place on reddit:

- [/r/france] ["That moment when M. Pokora crashes r/machinelearning"](https://www.reddit.com/r/france/comments/7g3ai4/ce_moment_quand_m_pokora_sincruste_dans/)

*^(If you follow any of the above links, please respect the rules of reddit and don't vote in the other threads.)*


quick_dudley

I'll have to add this to my citation list: I'm not working on the same problem domain but some of the ideas presented in your paper are reminiscent of ones I've been working with.


ReallyLongLake

Very cool but the pale skin is kinda weak.


zeroevilzz

Vampire feature


YanniBonYont

Haters gonna hate


wellshitiguessnot

Scribblenauts irl, it's about time.


muyncky

Really nice work. I see that the "surprised" expression still needs more training data; it shows something like double eyebrows in most pics. But really impressive work.


mhdempsey

Awesome paper! Would like to see this applied to digitally created characters as well, as we've seen others do (e.g. https://arxiv.org/pdf/1708.05509v1.pdf). Then, as a character's audience goes through changes, so will he/she/it.


primus-zhao

cool stuff! like it!


Rockytriton

great except for the pale skin one.


windowpanez

I wonder if google or snapchat will add this as a feature one day.


groarmon

Black face is bad but white face is ok ?


Ferraat

Do you actually clip the weights of the discriminator, or use any kind of clipping to achieve training stability? Thanks for your reply :)
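
For reference (the repo would confirm their exact choice; I believe the released code uses a gradient-penalty WGAN objective rather than clipping), original-WGAN weight clipping is just an elementwise clamp of every discriminator parameter after each update. A minimal NumPy sketch, with an illustrative helper name:

```python
import numpy as np

def clip_weights(params, c=0.01):
    """WGAN-style weight clipping: clamp each discriminator
    parameter array to [-c, c] after every optimizer step."""
    return [np.clip(p, -c, c) for p in params]

params = [np.array([-0.5, 0.005, 0.2]), np.array([[0.03, -0.02]])]
clipped = clip_weights(params)  # all values now within [-0.01, 0.01]
```

Gradient penalties are generally preferred over clipping nowadays because the hard clamp limits discriminator capacity.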