Not yet, but I can definitely imagine a future where these tools get more capable and refined, to the point where all the shortcomings listed above are overcome. Knowledge about cameras and scene composition is already encoded in the networks to some degree; it just needs to become more accessible. There's probably also a better way to seed new images than starting from random noise, which would make it easier to get similar variations. We've already made the big step towards creativity and a real-world understanding of objects and their lighting; the remaining issues are more technical, and unless we're incredibly unlucky and run into a true show-stopper, we'll probably all have access to a high-quality digital artist that can reduce production times dramatically.
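
For what it's worth, the closest thing today is pinning the initial noise through the RNG seed, so nearby seeds give loosely related variations. A minimal sketch, assuming the Hugging Face diffusers library (the checkpoint name and prompt are just placeholders):

  # Reuse the initial-noise seed to get controlled variations.
  import torch
  from diffusers import StableDiffusionPipeline

  pipe = StableDiffusionPipeline.from_pretrained(
      "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
  ).to("cuda")

  prompt = "a wooden cabin at dusk, 35mm photo"
  base_seed = 42

  # Same seed -> same starting noise -> reproducible image;
  # neighbouring seeds vary the composition while keeping the prompt fixed.
  for offset in range(3):
      generator = torch.Generator(device="cuda").manual_seed(base_seed + offset)
      image = pipe(prompt, generator=generator).images[0]
      image.save(f"variation_{offset}.png")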


You need to give some information about the scene to the network.

Camera settings are just shorthand for describing the field of view and depth of field (at the very least). Even if you make those settings implicit, you'd still need to give the network the solid angle (steradians), focal length, circle of confusion, and so on that you want your image to use.
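
To make that concrete, here's roughly what the shorthand expands into, using the standard thin-lens approximations (the sensor size, lens and subject distance below are made-up example numbers):

  # Field of view and depth of field from focal length, aperture
  # and circle of confusion.
  import math

  def field_of_view(sensor_width_mm, focal_length_mm):
      """Horizontal angle of view in degrees."""
      return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

  def depth_of_field(focal_length_mm, f_number, subject_dist_mm, coc_mm=0.03):
      """Near/far limits of acceptable sharpness, in mm."""
      hyperfocal = focal_length_mm ** 2 / (f_number * coc_mm) + focal_length_mm
      near = subject_dist_mm * (hyperfocal - focal_length_mm) / (
          hyperfocal + subject_dist_mm - 2 * focal_length_mm)
      far = (subject_dist_mm * (hyperfocal - focal_length_mm) / (
          hyperfocal - subject_dist_mm)
          if subject_dist_mm < hyperfocal else float("inf"))
      return near, far

  # Full-frame sensor, 50mm lens at f/1.8, subject 2m away.
  print(field_of_view(36.0, 50.0))          # ~39.6 degrees
  print(depth_of_field(50.0, 1.8, 2000.0))  # roughly 1.92m to 2.09m

Every one of those inputs is something a prompt like "50mm, f/1.8" currently smuggles in for free.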

You'd need to understand everything in Hecht's Optics to tweak all the parameters of an AI-generated image.


That's an implementation problem, not a conceptual one. Diffusion models have shown that they can learn practically all of these things if you make them sufficiently large.



