Passionate – Dedicated – Breathtaking
Holmenkollen 2000 Gigapixel 360°
Holmenkollen 2000 Gigapixel 360° is a world-record-breaking, monumental, and inspiring visual creation that captures an extensive panorama of beautiful Oslo, Norway, shot from one of the most breathtaking vantage points in the country. It is characterized by its extraordinary size, resolution, and level of detail, pushing the boundaries of technology and optical resolution.
Discover Oslo in detail
The panoramic image provides an engaging, immersive experience that invites people to explore the city from new perspectives. You can zoom in on streets, landmarks, and hidden corners, revealing both the well-known sights and the small details that make Oslo unique.
This project is dedicated to showcasing the beauty, culture, and everyday life of Norway’s capital in a way that encourages curiosity and exploration.
Current world record
The current world record is held by the Centre for Content Creation at Limkokwing University of Creative Technology. Their breathtaking PanaXity Kuala Lumpur image is 846 gigapixels in size. It was created by a 17-person crew over months of preparation, seven days of shooting, and three months of post-production. If you view the night image, click on the moon to see the high-resolution day image. Click here for their making-of video. The goal of Holmenkollen360 is to break the 1,000-gigapixel barrier, but the final image may end up much bigger.
Feel free to visit Wikipedia for more breathtaking images.
Interested?
Contact us for an exclusive preview of our work-in-progress.
Technique
The heart of upscaling images lies in an algorithm called sub-pixel alignment, which aligns several nearly identical images on top of each other and attempts to restore the resolution lost when using only one picture. This is similar to the pixel-shift algorithms used in several cameras today. However, pixel shift works by moving the camera sensor by a fraction of a pixel, which requires the camera not to move at all between shots. That method is ineffective with long focal-length lenses in windy environments, where the image through the lens moves by tens of pixels in every direction even if the sensor remains stationary. Moving the sensor by fractions of a pixel in such conditions is meaningless.
For sub-pixel alignment, movement by tens of pixels poses no problem. Sub-pixel alignment is applied after rough alignment, rotation, and cropping of the images, so unpredictable movement in every direction is actually welcome. Good results can even be achieved with an 800mm handheld lens. My tests have shown that about 20 images are sufficient to increase the resolution and enhance the image quality.
This method is not new and has been proven to work by many others over the last decades. Sub-pixel alignment is performed by custom-built CUDA software, optimized for speed and quality, which aligns and scales up 20 images in about 15 seconds. The software runs on a fast computer with dual RTX 4090 GPUs.
Questions and answers
How long did the shooting take?
About four days. Due to technical issues it took some time to get started, and some areas were captured twice to be on the safe side.
How long has the project been running?
This project has been ongoing for approximately three years. Most of that time has gone into building up hardware capable of processing the vast amount of data, many hours of researching and developing the CUDA software used for processing the images, preparing the raw data for rendering, and of course the rendering and post-processing themselves.
What was the most challenging part?
I can imagine that the public perception of this project focuses on the challenge of saving all the images to a hard drive and then figuring out how to stitch them together. While that is indeed part of the project, it is actually one of the easier tasks. The true essence of this project lies in solving the unexpected problems that arise, and they appear at every turn. Some of the challenges include:
- Designing a rigid rig capable of holding the heavy lens and camera without excessive wobbling.
- Capturing 20 RAW images per second, hour after hour, while coping with moving parts, cables, batteries, memory cards, and so on.
- Developing a storage system capable of continuously storing and moving hundreds of thousands of small files non-stop for several days.
- Keeping a computer running at full capacity, utilizing two RTX 4090s at their full speed.
- Developing software capable of processing more than 380,000 images of all kinds. Some algorithms struggle when there are no details in an image to work with.
- Correcting colors and removing haze from images captured across several days with varying light conditions.
- This panorama is made of four smaller panoramas, and most of the post-production was done on each of them separately. But even editing a quarter of a panorama this size is quite a challenge: almost every picture had to be corrected, and to fit images from one quarter side by side you need an 8K display, plus a virtual desktop to make the workspace even bigger.
- Working with images of this size requires a vast amount of fast local storage. There is a practical limit to how much fast storage you can fit in a standard PC, and that is a challenge in itself. The cost of stepping up to pro-grade servers with huge amounts of fast local storage is astronomical, and that solution is not for our wallets.
These are just a few of the problems, but there are many more. As the final result shows, all of them can be solved.
What were the processing steps?
- Group the images into sub-panoramas based on optimal suitability.
- Divide the images into groups, each dedicated to creating one 180 MP super-resolution image.
- Develop the RAW files into 16-bit TIFF format.
- Correct colors and remove haze for improved clarity.
- Sort the images within each group by sharpness, discarding the two least sharp images.
- Identify features for a rough alignment of the images.
- Use the Enhanced Correlation Coefficient (ECC) method to align the images with sub-pixel precision.
- Compute the median and enhance sharpness to create a master image for optical flow correction.
- Apply optical flow correction to compensate for atmospheric turbulence.
- Compute the median of the corrected images.
- Upscale the images by 4x using TensorFlow EDSR (Enhanced Deep Residual Networks).
- Apply an unsharp mask to restore fine details.
- Merge the four panoramas into a single, large 360° panorama.
- Merge in the sky and render the final panorama.
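To give a feel for the alignment stage, here is a minimal sketch (my own illustration, not the project's code) that recovers the translation between two frames with FFT phase correlation; the pipeline then refines alignment to sub-pixel precision with the ECC method, which OpenCV exposes as cv2.findTransformECC.

```python
import numpy as np

# Minimal rough-alignment sketch: recover the translation between two
# frames with FFT phase correlation. Stand-in illustration only; the
# real pipeline refines this to sub-pixel precision with ECC
# (cv2.findTransformECC in OpenCV).

rng = np.random.default_rng(0)
frame = rng.random((128, 128))

dy, dx = 7, -3                                   # ground-truth drift
moved = np.roll(frame, (dy, dx), axis=(0, 1))    # second, shifted shot

# The normalized cross-power spectrum turns the shift into a single peak.
F = np.fft.fft2(frame)
M = np.fft.fft2(moved)
cross = np.conj(F) * M
corr = np.fft.ifft2(cross / (np.abs(cross) + 1e-12)).real

peak = np.unravel_index(np.argmax(corr), corr.shape)
# Fold the peak coordinates back into signed shifts.
shift = [int(p) if p < s // 2 else int(p) - s for p, s in zip(peak, corr.shape)]
print(shift)  # the recovered (dy, dx) drift
```

Phase correlation handles the large, unpredictable drift between shots; ECC then minimizes an intensity-correlation criterion to squeeze out the final sub-pixel fraction.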
How long did the processing take?
These are rough estimates of the processing time alone:
- About one week to develop 360,000+ images from RAW to TIFF.
- About one week to merge and upscale the images into roughly 14,000 180 MP super-resolution images.
- About one week to render the final panorama.
- About three days to convert the single huge image into tiles that could be published to the webserver.
Some of this processing was done several times to get the best results.
Before the processing could start, the images had to be prepared. Correcting colors, removing haze, and preparing for stitching took the most time: weeks, even months.
Why are there ghosts and half-people in the image?
Each image is composed of approximately 20 nearly identical images stacked on top of each other, and capturing one sequence takes about one second. The “ghosts” you see are likely caused by people moving during this time, as is the case with moving cars. Occasionally, you might notice a person’s foot without the rest of their body. This happens because the foot was the only part that remained stationary during the exposure, while the rest of the body was removed by the algorithm used for stacking the images.
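The effect is easy to reproduce. The sketch below (synthetic frames, my own illustration) stacks 20 shots with a per-pixel median, as the pipeline does: a moving "pedestrian" covers any given pixel in only a frame or two, so the median keeps the background and the walker vanishes.

```python
import numpy as np

# Why movers vanish when stacking: each output pixel is the per-pixel
# median of ~20 aligned shots. A pedestrian occupies a given pixel in
# only a few frames, so the median picks the background value.
# (Toy sketch with synthetic frames, not the project's RAW data.)

rng = np.random.default_rng(1)
background = rng.uniform(50.0, 60.0, size=(32, 32))

frames = []
for i in range(20):
    shot = background + rng.normal(0.0, 0.5, background.shape)  # sensor noise
    shot[10:14, i:i + 3] = 255.0        # bright "pedestrian" walking right
    frames.append(shot)

stacked = np.median(np.stack(frames), axis=0)
print(np.abs(stacked - background).max())  # small: the walker is gone
```

A body part that stays put through most of the burst, like a planted foot, survives the median for exactly the same reason the background does.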
How many people worked on this?
We are a two-man crew.
How does sub-pixel alignment work?
The method, known as sub-pixel alignment, has been used in astrophotography for decades. Imagine a perfect scenario where a ray of light strikes the exact center of a pixel. In this case, that pixel would receive 100% of the light’s intensity, while the surrounding pixels would receive nothing. Now, consider the same ray of light hitting the border between two pixels; in this situation, each pixel would receive 50% of the light’s intensity. In a sense, this is the reverse of anti-aliasing.
Next, imagine capturing 20 images of the same object, with slight camera movements between each shot. By combining this information, you can reconstruct a super-resolution image, recovering some of the detail that would otherwise be lost if you relied on just a single exposure.
Can you achieve unlimited resolution by taking more images?
No, you cannot achieve infinitely high resolution by taking an infinite number of images and analyzing how light falls between pixels. The resolution is always limited by the lens’s ability to resolve details, a limitation fundamental to the glass and construction of the lens. For instance, if you have a 300mm lens and the image is blurry, sub-pixel aligning multiple images will not make the result any sharper.
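A rough diffraction calculation illustrates the bound. The numbers below are hypothetical examples (green light at 550 nm, an f/5.6 aperture, a 4 µm pixel), not the project's actual gear: once the Airy disk projected by the lens is about the size of a pixel, more frames stop adding real detail.

```python
# Back-of-the-envelope diffraction limit. All numbers below are example
# values, not the project's actual lens or sensor specifications.
wavelength = 550e-9    # green light, metres
f_number = 5.6         # hypothetical aperture
pixel_pitch = 4.0e-6   # hypothetical sensor pixel size, metres

# Radius of the first Airy minimum projected onto the sensor:
# r = 1.22 * lambda * N  (Rayleigh criterion)
airy_radius = 1.22 * wavelength * f_number
print(round(airy_radius / pixel_pitch, 2))  # ~0.94 pixels
```

With these example values the diffraction spot is already nearly a full pixel wide, so sub-pixel alignment can recover detail down to roughly that scale and no further.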
Additionally, beyond the lens’s limitations, one must also consider atmospheric effects. The air is constantly in motion, refracting light in all directions. This limitation is especially noticeable when photographing toward the horizon.
How was the sky captured?
We had limited access to the jump tower, restricted to its opening hours. To make the most of this valuable time, we chose to focus on capturing the lower part of the panorama rather than the sky. Since clouds are constantly in motion, it is challenging to achieve a cohesive and visually appealing sky, especially when capturing it takes several hours spread across multiple days. To overcome this, we captured the sky early in the morning at another location, before significant cloud formation, resulting in a clean and aesthetically pleasing sky with minimal cloud coverage. This was actually done twice, at two different locations; the last attempt gave the best result.
Was AI used to enhance the image?
There are several methods for increasing the resolution and enhancing an image using AI.
Let’s start by considering this scenario: Suppose you have a very low-resolution image of a face, such as a thumbnail, no larger than perhaps 10×10 pixels. AI can use this low-resolution thumbnail to generate a high-resolution image of a face that closely matches it. This is a fact. But the question is: have you really identified the correct person based on that 10×10 pixel thumbnail? The answer, of course, is no.
It has been mathematically proven that you cannot increase the details in an image without introducing more information. For example, you can slightly improve resolution by applying a sharpening algorithm, as it corrects known errors that are predictable. However, you cannot keep sharpening an already sharpened image and expect to achieve better results.
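The sharpening case is easy to demonstrate. This minimal 1D unsharp-mask sketch (my own illustration of the same operation used late in the pipeline) shows that sharpening only exaggerates edges that are already resolved:

```python
import numpy as np

# Unsharp masking adds back the difference between an image and a blurred
# copy of itself: sharp = img + amount * (img - blur). It exaggerates
# edges that are already there but cannot create information the lens
# never captured.

def box_blur(signal, radius):
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    return np.convolve(signal, kernel, mode="same")

def unsharp(signal, radius=2, amount=1.0):
    return signal + amount * (signal - box_blur(signal, radius))

edge = np.concatenate([np.zeros(20), np.ones(20)])  # a hard step edge
sharpened = unsharp(edge)

# The transition now overshoots on both sides of the step -- increased
# perceived sharpness, but no new information about the scene.
print(sharpened.min() < 0.0, sharpened.max() > 1.0)  # True True
```

The overshoot is the predictable correction mentioned above; running unsharp on its own output just stacks more overshoot without revealing anything new.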
Now, let’s consider a different scenario: you can create a system that learns how objects like houses, trees, and grass look. It can learn the shape of a building’s corner or recognize repeating patterns; even without being explicitly trained to recognize letters, the system can learn about the alphabet. By training this system on real-life images, the algorithm will perform well when it needs to predict what colors to use for a pixel. In this case, more information is being added to the process by AI models trained on real data, which goes beyond mere guessing.
There are models designed to scale up images by 2x, 4x, and 8x. However, just like with sharpening algorithms, applying the scaling process multiple times does not lead to better results.