Playing with NeRF
Attention: Images/Videos have been removed for privacy reasons.
I already had some experience with nerfstudio from a research project, so I knew how easy and fast it is to create NeRFs. While procrastinating on very important work, I decided to create a simple NeRF as a fun demo to show my less technically inclined friends/family what NeRFs are all about, and to give a sense of how approachable they are.
Neural Radiance Fields (NeRF) is a relatively recent technique for rendering photorealistic 3D scenes from a set of images. If you’re familiar with Structure-from-Motion or photogrammetry, it’s kind of like a deep-learning version of that. Although it’s an implicit radiance field representation rather than a point cloud, to the layperson it produces results similar to (but way better than) a densification (multi-view stereo, aka MVS) step. There is lots of information available online, so I won’t go into further detail.
FWIW, nerfstudio is amazing - highly recommend. The only tricky part is getting tiny-cuda-nn installed properly, but it’s worth toughing it out since it’s pretty much mandatory for fast-training NeRFs in all the popular NeRF packages. It cuts training time down from ~2.5 hours to ~15 minutes.
I also recommend using conda/pip and letting them figure out the CUDA dependencies (as opposed to installing CUDA versions system-wide with sudo). As long as nvidia-smi doesn’t give errors, conda will be able to install the correct CUDA versions. I ran with CUDA 11.7 and PyTorch 1.13.1 (as in the nerfstudio instructions).
Note that (like me) you might get an error during the tiny-cuda-nn install saying that the CUDA version doesn’t match the version PyTorch was compiled with:
The detected CUDA version (10.1) mismatches the version that was used to compile PyTorch (11.7). Please make sure to use the same CUDA versions.
Assuming the version PyTorch reports (11.7 in this case) is the CUDA version you want to be using, and 10.1 is the system CUDA version (e.g. the one in /usr/lib/cuda/version.txt), the most likely cause is that the tiny-cuda-nn PyTorch extension gets compiled from source during pip install, and the build finds the system nvcc because the conda environment doesn’t ship a CUDA compiler of its own.
The fix: install the matching compiler into the conda environment with
conda install cuda-nvcc=11.7 -c nvidia
You might also need
conda install pytorch-cuda=11.7 cuda-toolkit=11.7.1 -c nvidia
If it’s still not working, you may have a linking issue where the pip build from source is using the system CUDA compiler instead of the conda one. I didn’t run into this myself, so I don’t have a tested fix, but it shouldn’t be too difficult (maybe, as a hack, add the conda lib path to LD_LIBRARY_PATH, or point CUDA_HOME at the conda environment, since PyTorch’s extension builder consults that variable).
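To see the mismatch before kicking off a long build, a small Python sketch like the one below can compare the nvcc found on PATH with the CUDA version PyTorch was built against. The helper names (parse_nvcc_release, versions_match, check_environment) are made up for illustration, and it only inspects PATH, which is an assumption about what the build will actually pick up (CUDA_HOME, if set, can override it).

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def versions_match(nvcc_release: Optional[str], torch_cuda: Optional[str]) -> bool:
    """True only when both versions are known and agree on major.minor."""
    if nvcc_release is None or torch_cuda is None:
        return False
    return nvcc_release.split(".")[:2] == torch_cuda.split(".")[:2]


def check_environment() -> bool:
    """Compare the nvcc on PATH with the CUDA version PyTorch was built with."""
    import torch  # imported lazily so the helpers above work without PyTorch

    nvcc_path = shutil.which("nvcc")
    if nvcc_path is None:
        print("No nvcc found on PATH")
        return False
    result = subprocess.run(
        [nvcc_path, "--version"], capture_output=True, text=True, check=True
    )
    nvcc_release = parse_nvcc_release(result.stdout)
    # torch.version.cuda is a string like "11.7" (None for CPU-only builds)
    print(f"nvcc at {nvcc_path}: CUDA {nvcc_release}; PyTorch: CUDA {torch.version.cuda}")
    return versions_match(nvcc_release, torch.version.cuda)
```

Calling check_environment() inside the activated conda environment should return True once the 11.7 compiler is the one being found; if it prints a path under /usr, the system compiler is still winning.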
The input video I used is shown below:
Note that I shot this on my iPhone at 240fps (“slow-mo”) to try to reduce motion blur. I also have another video shot at the usual 30fps, but I didn’t test whether its results were better or worse, though I doubt it makes much of a difference. I then downsampled the 240fps video to 30fps with
ffmpeg -i input.mp4 -r 30 output.mp4
and preprocessed it using nerfstudio’s
ns-process-data video \
    --data data/nerfstudio/VA_SB_pier/IMG_8527_30fps.mp4 \
    --output-dir data/nerfstudio/VA_SB_pier \
    --num-frames-target 50
I tried both with the full 305 frames (default) and with only 50 frames (as in the command above).
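For a rough sense of what those two settings mean in frame terms, here is the arithmetic as a small Python sketch. It assumes plain integer-stride subsampling (keep every k-th frame); frame_stride and kept_indices are names invented for illustration, and nerfstudio’s actual frame selection may differ in detail.

```python
from typing import List


def frame_stride(num_frames: int, num_frames_target: int) -> int:
    """Stride k such that keeping every k-th frame leaves roughly the target count."""
    if num_frames <= 0 or num_frames_target <= 0:
        raise ValueError("frame counts must be positive")
    return max(1, num_frames // num_frames_target)


def kept_indices(num_frames: int, stride: int) -> List[int]:
    """Indices of the frames that survive a fixed-stride subsampling."""
    return list(range(0, num_frames, stride))


stride = frame_stride(305, 50)     # 305 // 50 = 6, i.e. every 6th frame
kept = kept_indices(305, stride)   # frames 0, 6, ..., 300
print(stride, len(kept))           # prints: 6 51

# The same arithmetic applied to one second of footage: 240fps -> 30fps
print(frame_stride(240, 30))       # prints: 8
```

Under this assumption a target of 50 actually keeps 51 frames, which is why the target is best read as approximate.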
Left: 50 frames, Right: 305 frames
Comparing the left (every 6th frame) and the right (all frames), the only visible difference is the amount of clouds/wisps, presumably because more frames means more 3D supervision. The eval images (90/10 train/val split) probably also play a role, since held-out frames are not used for training.
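To put numbers on the split, the sketch below counts train and held-out images for both runs. It assumes the 90/10 split is realized by holding out one image in ten; split_counts is a hypothetical helper, and nerfstudio’s own rounding and choice of eval frames may differ.

```python
from typing import Tuple


def split_counts(num_images: int, eval_every: int = 10) -> Tuple[int, int]:
    """Return (train, eval) counts when one image in `eval_every` is held out."""
    if num_images <= 0 or eval_every <= 1:
        raise ValueError("need at least one image and eval_every > 1")
    num_eval = num_images // eval_every
    return num_images - num_eval, num_eval


for n in (50, 305):
    train, held_out = split_counts(n)
    print(f"{n} frames -> {train} train, {held_out} eval")
# prints:
# 50 frames -> 45 train, 5 eval
# 305 frames -> 275 train, 30 eval
```

So the 50-frame run trains on only about 45 views, which makes the gap in supervision between the two runs slightly larger than the raw frame counts suggest.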
Generated using nerfstudio.
Left: Ground Truth, Right: NeRF view synthesis (305 frames)
Observe that the NeRF result is a bit pixelated / not max quality. Probably I forgot to change a default setting somewhere and it’s using downsampled images, or the network config I’m using is just too small, or the non-centered scene is causing numerical issues, since the majority of the “interesting” bits of the scene are actually outside the [-1, 1] scene box.
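One way to test the scene-box guess is to measure what fraction of the scene actually falls inside [-1, 1] on every axis. The Python sketch below does that for a list of 3D points; fraction_inside_box is a hypothetical helper, and the points are assumed to already be in the same normalized coordinates the trainer uses (how to obtain them is outside this sketch, and the sample values are invented).

```python
from typing import Iterable, Sequence


def fraction_inside_box(points: Iterable[Sequence[float]], half_extent: float = 1.0) -> float:
    """Fraction of points whose coordinates all lie within [-half_extent, half_extent]."""
    points = list(points)
    if not points:
        raise ValueError("need at least one point")
    inside = sum(1 for p in points if all(abs(c) <= half_extent for c in p))
    return inside / len(points)


# Made-up points for illustration: two inside the box, two far outside it
sample_points = [(0.0, 0.0, 0.0), (0.5, -0.9, 0.2), (3.0, 0.0, 0.0), (0.0, -4.0, 1.0)]
print(fraction_inside_box(sample_points))  # prints: 0.5
```

A low fraction would support the idea that most of the interesting geometry sits outside the box, in which case re-centering or rescaling the scene would be the thing to try first.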
Training (on an RTX 3080) achieved decent results after just a few seconds, very good results after just 1 minute, and basically full-quality results after 3-4 minutes. I ran each for 10 minutes but they didn’t get much better. Using full-resolution images or a larger network might have given better results.
NeRF is very cool, and open-source tools have made it super easy to generate great-looking NeRFs from very standard videos! You should try it if you haven’t already :)
All the videos/pointclouds/etc. are backed up on Google Drive.