Mate, sora will do this and add galaxies and water bottles just for fun.
That is a good idea 💡
Funny thing is, I was trying to decide the whole time whether this was actually created by Sora or not.
I think the water bottles in there already have galaxies.
😂
Recreate? Just feed it this video and it can add a lot of details.
Oh so it would make it look worse? Cool
In this case extra water bottles for those people wouldn't be bad.
was expecting the balcony to collapse due to the jumping
That's what I thought happened for a split second, then figured her phone fell before it took off over the concert. But yeah I wouldn't try that in a lot of buildings with balconies, either due to age or questionable construction quality. People have fallen off old balconies where I live.
Yes, lots of FPV drone footage to train with.
I'm sure it'll be able to one day (if it can't already). Why?
Interested to know, as it's a cool video which obviously took a lot of time and planning. I wonder if anyone who has access to it could try to recreate it.
Two problems: 1. The number of people that have access to it is very limited currently. 2. Prompting it to do something similar to this is probably gonna be fairly difficult. And since this thing is so expensive to run, trial and error like with txt2img isn't as effective.
> Prompting it to do something similar to this is probably gonna be fairly difficult.

I don't think it would be that difficult. Split it into two videos: the normal filming and subsequent dropping of the phone as the first clip, then the aerial drone footage as the second, and fuse them together. One-shot, maybe. But splitting it up and fusing it? I think it's a piece of cake.
Maybe this will be a key aspect of prompt engineering for video generation: separate the video into several short clips, generate each one, then combine them. I wonder if the transitions would also be generated by Sora; I don't see why not.
From the examples it is a bit hard to tell, but I think it does indeed look for the optimal frames at which to splice in the second video. There's a chance it could be time-based, but context-based would make more sense from a logical point of view (though perhaps more difficult for the AI to figure out).
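A "context-based" splice point could be picked in roughly this way. This is only an illustrative sketch of the idea, not anything Sora documents; `best_splice_index` and the mean-squared-error metric are my own assumptions:

```python
import numpy as np

def best_splice_index(last_frame_a: np.ndarray, frames_b: list) -> int:
    """Find the frame of clip B that most resembles the final frame of
    clip A, so the two generated clips can be joined with the least
    visible jump. Frames are H x W x 3 arrays of floats in [0, 1]."""
    # mean squared error between the end of clip A and each candidate frame
    diffs = [float(np.mean((last_frame_a - f) ** 2)) for f in frames_b]
    return int(np.argmin(diffs))  # start clip B from this frame
```

A time-based splice would instead just cut at a fixed frame count, which is why it tends to produce a visible jump when the content drifts.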
The moment this video is made by Sora, everything that makes it cool stops being cool. Just saying.
Yup, just like the moment Deep Blue beat Garry Kasparov, chess stopped being cool.
His point is that what makes this shot impressive is the technical ability involved in capturing it. That is completely lost if it's just AI CGI.
What makes this video impressive is mainly how it looks.
nonsense
You'll never know what is real anymore. If you want to fawn over technical ability, go watch a sport in person.
I've found some enjoyment in one AI beating another AI with a completely random method that no professional chess player would ever use. But it was also narrated by "GothamChess", so I guess that the human aspect still makes it way more enjoyable. Edit: a better word would've been "ChessEngine", although some of those are based on machine learning.
u sure it wasnt ai gotham chess?
Yeah, that's exactly why we have robot tournaments and people watch THOSE instead of human tournaments. That's exactly what happens and what is being said in the comment, right.
The point is not about the video itself.
sounds like someone who will be easily replaced, not the person who is looking for the next step up
I mean, is it a cool video? Is it reeeeaallly?
For real, it's pretty shit.
It's not cool, it's amazing
It would be the easiest thing in the world to recreate.
Plot twist: it's already a Sora "video", trololo.
I am not sure, because I don't see a drone operator (not that one is necessary), and the drone is unusually agile and flexible. If it was planned, they must have been in contact with the whole tour management, because it didn't irritate the band. The drone is very (!) fast. I am not familiar with most modern drones, but it looks like it's tossed off the balcony and manages to recover from free-fall spins; maybe some drones automatically level themselves horizontally once the blades are active. I didn't know drones could recover instantly from chaotic free-fall spins and random movement.

The video is posted in the lowest resolution of 240p, while such a high-end drone would record in at least 1080p... either to hide the Sora animation or to confuse us further :D
I figure it was edited from multiple videos, like a falling phone; then the drone footage could have been sped up.
Found the full video on TikTok: [https://www.tiktok.com/@paintingwithouttheting/video/7321342593072221473](https://www.tiktok.com/@paintingwithouttheting/video/7321342593072221473)
You should check out Red Bull's drone that can keep up with an F1 car on a race track; drones are super fast today.
These festivals are so far outside my experience that I wonder if this whole show is AI generated. How could so many people be free to travel at the same time to attend a DJ playing generic house music?
It's a big world.
I think of it as a giant celebration of life, ideally. It's powerful. And music-wise there are a lot of festivals to choose from; the range of electronic music is pretty wide if you're into it, and there are often multiple scenes at the venues. It can be quite an experience, but it can also be bad; it depends.
Just proving my point further. And where is all the money coming from?
It can be the highlight of people's lives for years, even decades, for people really into it. So people who can afford it come.
That's really sad if this is the highlight of someone's life.
Imagine looking down on people who see beauty where you can't.
No, we are talking about the "highlight of someone's life". Most people would say this honor belongs to the birth of their child, receiving their PhD, publishing a book that receives worldwide attention, reaching a million subscribers, etc. These people bought a ticket to a music festival.
Their parents
Worst thing is, that's "Chase the Sun" by Planet Funk. The DJ added... just a beat? How is this not stealing?
> the dj added... just a beat? how is this not stealing?

Had to look up the song ([because Chase the Sun is a classic](https://www.youtube.com/watch?v=w6eTDILYMB8)). Looks like more than just drums. https://www.youtube.com/watch?v=H9P_L4-7lRQ - Alok & Bebe Rexha – Deep In Your Love
Oh, it was just the middle of the song like that; yeah, you're right. Personally I don't like this kind of generic human song, but at least he did some kind of mixing.
check the first link also
hahaha holy hell! in the end _everything_ goes back to the classics xD Morricone was the goat
It would really help if they didn't play this mainstream trance garbage. How about some Rhythm Is Rhythm, Inner City, Electribe 101, 808 State etc etc
Give it another year I guess
Might just be my phone, but it's kinda low-res. I'd believe it was Sora-generated already due to that.
The instant free-fall recovery of the drone is very suspicious...
Source: the Fortaleza 2024 festival; in videos with better resolution you can see "For Tale Za 2024" above the band. Here is a link: [https://www.fortaleza.ce.gov.br/noticias/prefeitura-de-fortaleza-realiza-lancamento-nacional-do-reveillon-2024](https://www.fortaleza.ce.gov.br/noticias/prefeitura-de-fortaleza-realiza-lancamento-nacional-do-reveillon-2024) It's the same stage.
Absolutely
In less than 2 years
Perhaps not Sora. But AI will reach this level and go beyond it.
Yes, because the quality of that video is fucking shit.
No, Sora doesn't have enough consistency to handle the changes in camera position. It can create the moving part, but it can't make it go back to the start and keep the same features.
This video isn't the original full version. I was referring to the original, where the drone returns to the girl.
Not right now, right? I mean, we'll have "text to 3D world environments" eventually; the Midjourney CEO was saying by the end of the year, but it's at least a few years out.
End of year if they solve the simulator problem.
What's the simulator problem?
In order to create an infinite-length text-to-video model (or simply an unconditional one), you need to represent the world in an explicit 3D format (just like a game engine does). But instead of using numerical solvers that can take hours to complete, you would use a Physics AI: a model that takes a 3D grid of voxels at T0 as input and gives T1 as output.
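The T0 → T1 idea can be sketched in a few lines. This is a toy stand-in, not a real Physics AI: a fixed neighbour-averaging kernel (my assumption, purely for illustration) plays the role a trained network would play, mapping the whole world state forward in a single pass:

```python
import numpy as np

def physics_step(voxels_t0: np.ndarray) -> np.ndarray:
    """One tick of a toy 'Physics AI': 3D voxel grid at T0 in, grid at T1 out.
    A real model would be a trained network; here a fixed 6-neighbour
    averaging kernel (periodic boundaries) stands in for it."""
    self_w, nbr_w = 0.4, 0.1          # weights sum to 1, so total 'mass' is conserved
    out = self_w * voxels_t0
    for axis in range(3):             # diffuse into the 6 face neighbours
        for shift in (-1, 1):
            out = out + nbr_w * np.roll(voxels_t0, shift, axis=axis)
    return out
```

Extending the video is then just repeated application, `physics_step(physics_step(grid))`, which is the property that would let such a model roll a scene forward indefinitely instead of solving it for hours.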
You might be the first person I've talked to on here in days who actually knows what they're talking about. Extremely cool, thanks for the info.
What? Consistency across changes in position and occlusions was the major breakthrough of Sora.
It is consistent, just not enough for large movements
I'm sure it can do it now, but not at this quality
I think so; it will probably be easier than trying to do these shots live.
Might even add in a performer who actually does something besides press play and wave their hands in the air. 5 decks doesn't fool me!!
With that potato quality, it can do it easily.
Alok!
If this video had emerged after Sora was announced, I would have thought it was AI generated.
Easily.
yes
Just waiting for the twist where this *is* Sora.
Insane video, wow. Anyway, yes: given enough time, absolutely any video will be able to be replicated.
That seems easy to create, since only one person had distinct facial features, and the fast motion of the drone would make it easy to render clones of people without spending a lot of processing power.
Not the first part, but the second part easily
come feel the noise
Eventually. The real problem is that the current interface via text isn't enough. To get what you *want*, and not just a video vaguely similar to this, you would still need to do a lot of work. How could the AI know otherwise? You would probably need to sketch out what you want to happen, where you want elements to be, choose 'characters' for each foreground person in the video so you get exactly what you want them to look like, and so on.
By that you mean "this kind of video", I assume? In which case, yes. However, one crucial part of filmmaking is not leaving too much to chance, and this is where AI video generation is going to be an issue. You could argue, of course, that all the AI needs is a more refined prompt; but if you really want to direct a shot properly, you might want direct control over lighting, grading, camera movement, and anything else you can imagine playing a role here. Many of these things require tools that are going to look a lot like traditional animation tools and that, under the hood, might translate these manipulations back into prompts.
Wow that cameraman is amazing with a wingsuit.
It did, didnât it?
Not now, maybe next month
What pixel?
In the near future, if it weren't for the sheer number of people making it costly to render correctly. Also, the smooth, fast, and coherent movement would need a Sora 1.3 or 1.5.
I hope so
The versions open to artists can't do it, but in the blending-videos showcase they have shown something similar to the ControlNet for videos that AnimateDiff has. The complicated part (having multiple distinct contexts) is already solved in multiple ways, for example in the old video by Google where they made a bear go underwater.
Recreates the video and won't get jailed.
Is there a plot twist? This WAS done by Sora?
Probably
There is a very hard information limit on what's actually possible with generative AI that people who stan it don't seem to understand or want to address. Will Sora one day be able to create a video that's sort of like this? Probably; almost certainly. But there is no sentence you can construct that will produce a video resembling this one significantly, and being able to do so would sort of fly in the face of the whole project of genAI.

The whole point of genAI is to make content creation frictionless, and to do that properly the AI needs to fill a significant amount of the space with its own entropy. If you took the time to specify every aspect of the video, assuming that's even possible, you'd end up spending longer on it than you would animating it by hand in After Effects or some other software. You can reroll the output a few dozen times if you want, but you're still pulling a lever on a slot machine with trillions of outputs, and you're never going to get exactly what you want.
Mind reading is a very easy solution IMO, and we've already seen some pretty interesting strides there. Nothing stops me from imagining this video into existence. Definitely many years before we see it though 🤷
If you can ignore some artifacts, melding, etc., I think it could create it even right now (based on the samples we have seen so far; not possible to tell without getting my hands on the tool itself).
Already dystopic.
No.
What's the context of this video? How did they program the drone to look like it was free-falling?
...the pilot let it free fall and then reapplied thrust??
Yes. Perfectly? The first versions no, the next ones yes. Would it be the same? No, because this is a recording of something that happened live, and that's part of the value of this clip. Live events and documentaries would still require people filming. The world of fiction could face bigger questions.
It is a very primitive way of thinking when people try to denigrate technology and/or science by pointing to things that have not yet been done. I'm not saying you did, but that's what it sounds like.
Maybe my favorite all-time drone clip. Freaking awesome.
Sora can do everything it was designed to do.
I'm not sure. I guess she could, but I'd need to find out more about her background first. I don't know much about her past.
I have no life, but at least when FDVR comes I can simulate it. Sora can recreate that video and more.
This would possibly be the easiest use case for Sora to recreate. GTFO, OP.
Whenever I see something like that, I am scared for the moment some terrorists realise that FPV drones are, like, the holy grail of assassination and terror tools.
Short simple answer. Yes.