Implicit Function:
A function that maps 3D points to a scalar value.
Implicit surface:
S = {x | f(x) = τ}, i.e. the level set of all points where the implicit function equals a chosen threshold value τ.
Occupancy function:
f: R**3 -> [0, 1]. If the value is 0, the point is empty space and completely transparent; if the value is close to 1, there is an object at that position.
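As a concrete illustration, a minimal occupancy function for a hypothetical unit sphere at the origin might look like this (the function name and the sharpness factor are assumptions, not part of any library):

```python
import torch

def sphere_occupancy(points: torch.Tensor, radius: float = 1.0) -> torch.Tensor:
    # points: (..., 3) batch of 3D locations.
    # Returns values in [0, 1]: ~1 inside the sphere, ~0 outside.
    dist = points.norm(dim=-1)
    # A steep sigmoid turns the signed distance into a soft occupancy.
    return torch.sigmoid(-50.0 * (dist - radius))
```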
Rendering an implicit surface: Rendering implicit surfaces with PyTorch: given a camera and a 3D image location, shoot a ray that starts at the camera center and passes through that pixel of the image plane. Sample multiple 3D points along the ray at uniform intervals. For each point, evaluate the occupancy function to check whether that location is occupied, and evaluate a coloring function at the same ray points. Raymarching then adds up the weighted color values along the ray to get the final pixel value.
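A minimal sketch of that loop for a single ray, assuming an occupancy function like sphere_occupancy above and a hypothetical color_fn (both names and the near/far bounds are illustrative):

```python
import torch

def render_ray(origin, direction, occupancy_fn, color_fn,
               near=0.1, far=4.0, n_samples=64):
    # Sample depths at uniform intervals along the ray.
    depths = torch.linspace(near, far, n_samples)
    points = origin + depths[:, None] * direction          # (n_samples, 3)

    occupancy = occupancy_fn(points)                       # (n_samples,)
    colors = color_fn(points)                              # (n_samples, 3)

    # Raymarching: weight each sample by its occupancy times the probability
    # that the ray reached it without being absorbed earlier.
    transmittance = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - occupancy[:-1]]), dim=0
    )
    weights = transmittance * occupancy
    return (weights[:, None] * colors).sum(dim=0)          # final RGB value
```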
RaySampler:
How ray tracers work: a ray is shot through each pixel of the camera's output image. Because the scene sits in a 3D environment, points are sampled along each ray as it travels through that space. Each sample is an (x, y, z) point paired with the ray's viewing direction. This data point is passed to the MLP network, which outputs an RGB color and the density at that specific location. The RaySampler takes care of generating the rays from the camera and decides which points along them are going to be sampled.
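A rough sketch of how those per-pixel rays could be generated for a simple pinhole camera (the camera convention, focal length, and image size here are assumptions; in PyTorch3D this job is done by the raysampler objects below):

```python
import torch

def make_rays(image_height=64, image_width=64, focal=70.0):
    # Pixel-centre grid in image space.
    ys, xs = torch.meshgrid(
        torch.arange(image_height, dtype=torch.float32),
        torch.arange(image_width, dtype=torch.float32),
        indexing="ij",
    )
    # Ray directions through each pixel for a pinhole camera at the origin
    # looking down the -z axis (camera coordinates).
    dirs = torch.stack([
        (xs - image_width * 0.5) / focal,
        -(ys - image_height * 0.5) / focal,
        -torch.ones_like(xs),
    ], dim=-1)                                    # (H, W, 3)
    dirs = dirs / dirs.norm(dim=-1, keepdim=True)
    origins = torch.zeros_like(dirs)              # every ray starts at the camera
    return origins, dirs
```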
Second object: the ray marching object decides on the algorithm used to convert the per-point colors and occupancies along each ray into the final rendered color.
Implicit surface model - a differentiable model based on nn.Module.
The forward pass goes through the renderer: it takes the implicit function and the cameras and renders the image.
NDCGridRaysampler() - samples one ray for every pixel of the image on a regular grid in NDC space.
MonteCarloRaysampler() - samples rays at random image locations; useful for memory-heavy architectures where a full grid of rays would be too expensive.
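A sketch of how these two samplers might be instantiated (the image size, depth range, and ray counts are illustrative assumptions; the class names follow the PyTorch3D API, which renamed NDCGridRaysampler to NDCMultinomialRaysampler in newer releases):

```python
from pytorch3d.renderer import MonteCarloRaysampler, NDCGridRaysampler

# Full image grid: one ray per pixel, 128 uniform samples per ray.
grid_raysampler = NDCGridRaysampler(
    image_width=64,
    image_height=64,
    n_pts_per_ray=128,
    min_depth=0.1,
    max_depth=3.0,
)

# Random subset of rays per image, which keeps memory usage low during training.
mc_raysampler = MonteCarloRaysampler(
    min_x=-1.0, max_x=1.0,
    min_y=-1.0, max_y=1.0,
    n_rays_per_image=750,
    n_pts_per_ray=128,
    min_depth=0.1,
    max_depth=3.0,
)
```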
Ray Marching:
EmissionAbsorptionRaymarcher(): before implementing this, we use the raysampler to first get the points that we want. The points are (x, y, z) locations that are going to be passed to the MLP network. Since the ray crosses the object in two places (entering and leaving it), a plot of occupancy along the ray shows the highest values at those two locations. The first ray-surface intersection, where the occupancy function crosses the threshold, is the first point at which the camera ray hits the object in the scene.
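A sketch of how the sampler and marcher might be composed into a PyTorch3D ImplicitRenderer and used for the forward pass mentioned above (the placeholder volumetric function, image size, and depth range are assumptions; the class names follow the PyTorch3D API):

```python
import torch
from pytorch3d.renderer import (
    EmissionAbsorptionRaymarcher,
    ImplicitRenderer,
    NDCGridRaysampler,
    ray_bundle_to_ray_points,
)

raysampler = NDCGridRaysampler(
    image_width=64, image_height=64,
    n_pts_per_ray=128, min_depth=0.1, max_depth=3.0,
)
raymarcher = EmissionAbsorptionRaymarcher()
renderer = ImplicitRenderer(raysampler=raysampler, raymarcher=raymarcher)

def volumetric_function(ray_bundle, **kwargs):
    # Placeholder implicit function: a soft unit sphere at the origin.
    points = ray_bundle_to_ray_points(ray_bundle)           # (..., n_pts, 3)
    densities = torch.sigmoid(
        -50.0 * (points.norm(dim=-1, keepdim=True) - 1.0)   # (..., n_pts, 1)
    )
    features = points.clamp(0.0, 1.0)                       # dummy RGB from position
    return densities, features

# Forward pass: render the implicit function from the given cameras.
# images, sampled_rays = renderer(
#     cameras=cameras, volumetric_function=volumetric_function
# )
```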
Neural Radiance Fields use the same architecture
Ray samplers and ray marchers use the same techniques. Tricks: transform the raw 3D coordinates into a high-dimensional representation vector before feeding them to the network; represent the occupancy with an MLP; feed the ray direction into the color function c; use a Monte Carlo photometric loss. See the sketch below.
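A minimal sketch of such an MLP with a direction-conditioned color head (the class name, layer sizes, and input dimensions are assumptions; 60 and 24 correspond to positional encodings with 10 frequency bands for position and 4 for direction):

```python
import torch
import torch.nn as nn

class OccupancyColorMLP(nn.Module):
    # d_xyz / d_dir are the dimensions of the encoded position and direction.
    def __init__(self, d_xyz=60, d_dir=24, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(d_xyz, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density_head = nn.Linear(hidden, 1)          # occupancy / density
        self.color_head = nn.Sequential(                  # color also sees the ray direction
            nn.Linear(hidden + d_dir, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, xyz_encoded, dir_encoded):
        h = self.trunk(xyz_encoded)
        density = torch.relu(self.density_head(h))
        color = self.color_head(torch.cat([h, dir_encoded], dim=-1))
        return density, color

# Monte Carlo photometric loss: compare rendered colors with the ground-truth
# pixels only at the randomly sampled ray locations.
# loss = ((rendered_colors - gt_pixels_at_sampled_rays) ** 2).mean()
```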
Volume Rendering:
To get the color, traditional volume rendering techniques are used. Volume rendering is inherently differentiable. The final color is a weighted combination of alpha values and colors along the ray: each sample's weight is its alpha value times the transmittance T, where the T values are the cumulative product of (1 - alpha) over the samples in front of it, and the alpha values come from the densities as alpha = 1 - e**(-sigma * delta), with delta the spacing between adjacent samples.
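Written out, this is the standard emission-absorption compositing:

```latex
C = \sum_{i=1}^{N} T_i \, \alpha_i \, c_i ,
\qquad
T_i = \prod_{j=1}^{i-1} \left( 1 - \alpha_j \right),
\qquad
\alpha_i = 1 - e^{-\sigma_i \delta_i}
```

where c_i and sigma_i are the color and density of the i-th sample along the ray and delta_i is the distance to the next sample.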
Positional Encodings:
Positional encodings are used in NeRFs because ablation studies show that networks perform poorly at representing high-frequency variation in color and geometry when trained directly on xyz inputs and the viewing direction. However, if the inputs are mapped to a higher-dimensional space using high-frequency functions such as sine and cosine, the network performs better because it can fit data that contains high-frequency variation. Images naturally contain high-frequency variation in densities, xyz locations, and colors, so encoding the inputs this way significantly improves model performance.
For the purposes of neural scene representation through MLP models, the function is redefined as a composition F = F' ∘ γ, where F' is the learned function and γ represents a mapping from R into the higher-dimensional space R^(2L). The γ function is:
γ(p) = (sin(2^0 πp), cos(2^0 πp), ..., sin(2^(L-1) πp), cos(2^(L-1) πp))
The function γ is applied to each of the x, y, z coordinates separately, after they have been normalized to lie in [-1, 1].
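A sketch of this encoding for a batch of normalized coordinates (the function name and default L are assumptions):

```python
import math
import torch

def positional_encoding(p: torch.Tensor, L: int = 10) -> torch.Tensor:
    # p: (..., 3) coordinates already normalized to [-1, 1].
    # Returns (..., 3 * 2 * L): sin and cos at frequencies 2^0 * pi ... 2^(L-1) * pi.
    freqs = (2.0 ** torch.arange(L, dtype=p.dtype, device=p.device)) * math.pi
    angles = p[..., None] * freqs                        # (..., 3, L)
    return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(-2)

# Example: (N, 3) xyz inputs become (N, 60) encoded inputs for L = 10.
```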