
Promising-looking tool. It would be useful to add a performance section to the README giving a ballpark of what to expect, even if it's just a reference point from a single GPU.

I've been considering building something similar, but focused on static targets like watermarks, so just single masks. From the DiffuEraser page, performance seems brutally slow: less than 1 fps at 720p.

For watermarks you can use ffmpeg's blur filters, which are of course super fast and look good on mostly uniform content like a sky, but terrible and very obvious against most backgrounds. I've gotten really good results with videos shot on static cameras by generating a single inpainted frame and then using it as the "cover": cropped and blurred over the watermark (or any object, really). Results are even better after fully stabilizing the video and balancing the color if it drifts slightly over time. This of course only works if nothing moving intersects the removed target; if the camera is moving, you need every frame inpainted.
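For reference, the ffmpeg approach described above might look roughly like this (file names, coordinates, and region sizes are placeholders you'd adjust to your video):

```shell
# Blur a static watermark region with ffmpeg's built-in delogo filter
# (x/y/w/h describe the watermark's position and size).
ffmpeg -i input.mp4 -vf "delogo=x=20:y=20:w=200:h=60" -c:a copy blurred.mp4

# "Cover" technique: crop the same region out of a single clean,
# inpainted still (clean_frame.png) and overlay it on every frame.
ffmpeg -i input.mp4 -i clean_frame.png -filter_complex \
  "[1:v]crop=200:60:20:20[patch];[0:v][patch]overlay=20:20" \
  -c:a copy covered.mp4
```

Note that crop takes `w:h:x:y` while overlay takes `x:y`, so the patch lands back exactly where it was cut from.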

Thus far, all full-video inpainting like this has been too slow to be practically useful for casually removing watermarks: videos take tens of minutes instead of seconds, where I'd really want processing to be close to real time. I've wondered what knobs, if any, can be turned to sacrifice quality for performance. My main ideas: automate detecting where that single-frame technique can be applied and use it for as much of the video as possible, then separately process the remaining chunks with diffusion scaled down to some really small size like 240p, and finally apply AI-based upscaling to those chunks, which seems fairly fast these days compared to diffusion.
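A crude sketch of that automatic detection step (hypothetical code, assuming grayscale frames as NumPy arrays and a boolean mask over the removed region): flag a frame for the slow diffusion path only when the content under the mask actually changes between frames.

```python
import numpy as np

def frames_needing_diffusion(frames, mask, threshold=4.0):
    """Return indices of frames where motion intersects the masked region.

    frames: list of 2-D grayscale numpy arrays (all the same shape)
    mask:   boolean array, True over the removed target (e.g. a watermark)

    A frame is flagged when the mean absolute difference to the previous
    frame, measured inside the mask, exceeds `threshold`. Unflagged runs
    can be covered with a single inpainted frame; flagged runs go to the
    slow diffusion inpainter. (For an opaque watermark you'd measure in a
    ring just outside the mask instead.)
    """
    flagged = []
    for i in range(1, len(frames)):
        diff = np.abs(frames[i].astype(np.float32)
                      - frames[i - 1].astype(np.float32))
        if diff[mask].mean() > threshold:
            flagged.append(i)
    return flagged
```

This is only a first-pass heuristic; a real implementation would probably also hysteresis-filter the flags so short blips don't fragment the video into many tiny diffusion chunks.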



Good point — I’ll add that to the README.

Masking is fast — more or less real-time, maybe even a bit faster.

However, infill is not real-time. It runs at about 0.8 FPS on an RTX 3090 at 860p (the default resolution of the underlying networks).

There are much faster models out there, but as of now none that match the visual quality and can run on a consumer GPU. The use case for VideoVanish is geared more toward professional or hobby video editing — e.g., you filmed a scene for a video or movie and don't want to spend two days doing manual inpainting.

VideoVanish does have an option to run the infill at a lower resolution and then patch only the infilled areas with the low-resolution output; that way you can trade visual fidelity for speed. Depending on what's behind the patches, this can be a very viable approach.
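The compositing step behind that option can be sketched like this (hypothetical names; assumes NumPy arrays and that the low-resolution infill has already been upscaled back to the frame size by some external step):

```python
import numpy as np

def patch_lowres_infill(full_frame, upscaled_infill, mask):
    """Composite a low-res (then upscaled) infill into the original frame.

    full_frame:      H x W x 3 uint8, the original full-resolution frame
    upscaled_infill: H x W x 3 uint8, the infill computed at low
                     resolution and upscaled back to H x W
    mask:            H x W boolean, True where pixels were removed

    Only masked pixels come from the infill, so everything outside the
    removed region keeps its full original fidelity.
    """
    out = full_frame.copy()
    out[mask] = upscaled_infill[mask]
    return out
```

Since the infill only has to look plausible inside the mask, the quality hit is confined to those patches, which is why this trade-off works well when the background behind them is soft or low-detail.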



