Digital Filmmaking Basics 101 Part 001 - Knowing your tools.

Hollywood has had a huge impact on all of us. It has become such a huge part of so many cultures that it's difficult to separate our lives from it. Sometimes, there is a voice inside of us that says, “That’s what I want to do,” and we’re lucky. Technology is making it easier for anyone to become a film-maker today. The entry threshold, the barrier holding us back from making and telling great stories, is dissolving. The tools of the trade are plentiful and easy to operate, and the quality you get for your money keeps getting higher.

This website is oriented toward the beginner, the amateur film-maker. There are a lot of books and websites which discuss and cover similar and supplementary subjects. What I’d like to offer you here is the basics, what I think you ought to know in order to get started making moving pictures and living the dream you had that day when someone else’s story had that impact on you, (and me.) I’m going to start off with some very technical information, which will empower your decision making process as you progress and have to make choices about the equipment you are going to use, or are stuck using for now, and how to navigate around the jargon.

Part of this independent film scene is fostering the up and coming, contributing to the success of those that follow you, as those that came before you hopefully did for you, and you know who you are.

Let’s get started.

Bandwidth

Bandwidth is an expression used to describe a pipeline. Imagine you have a 1 inch diameter pipe three inches long. You find that your three inch 1” diameter pipe can fill a one gallon bucket with water in 60 seconds at maximum pressure. The bandwidth of this pipe could be described as being 1/60 of a gallon per second. This is a finite expression.

You find that a 3 inch long, 3 inch diameter pipe carries as much water at the same pressure in 20 seconds. It has an inherent higher bandwidth than the 1” pipe and you would want to upgrade your pipe if your demand for water increased and there was not enough time in the day for the 1” diameter pipe to provide for you.

Your internet connection is the same. You found that a 56k modem was fine at first until larger and more abundant images, and video and sound began to take longer to download due to their increased size and number. So what did you do? You increased the bandwidth of your internet connection. Compression was used to reduce the file sizes of the music, images, and video that you were downloading in a counter-effort to increase flow through your data channels.

Your computer processor’s speed can be described as having throughput or bandwidth limitations. The higher frequency of the processor means that more information can be processed in a given time. We are constantly upgrading processors in response to the greater demands we have on our computers. The PCI bus has bandwidth limitations, as do your peripherals on other busses. All of these are important factors to consider when deciding on a workflow with digital video. Throughput and quality are determined by compression, resolution, and frame rate; compounded, they make up the data rate. Let’s take a look at data rates and how they are expressed.

Data Rates

Data rates are measured by the second, for example in Kb/sec, or Kilobits per second. You may notice this is bits per second. I’m sure you have heard the term bytes per second as well, and yes, there is a difference. For the record, there are 8 bits in a byte, and bits are represented in data rates with a lowercase ‘b’ and bytes with an uppercase ‘B’. Data rates are expressed in levels, multiplied by 1000. There are ‘roughly’ 1000 Kilobytes in one Megabyte; again using the correct notation, 1000 KB is equal to ‘roughly’ 1 MB.

Here is a conversion chart. I know this can be confusing at first because, due to marketing practices, bytes can be confused with bits. You buy an 11 Megabit wireless system and you think “Wow 11 MegaBytes, that’s fast!”, because your hard drives are marketed using that scale. Mb or MB? Megabit or megabyte? There is a factor of 8 dividing them.

TERM       ABBR   APPROXIMATE            ACTUAL
Byte       B      1 byte                 1 byte
Kilobyte   KB     1,000 bytes            1,024 bytes
Megabyte   MB     1,000,000 bytes        1,048,576 bytes
Gigabyte   GB     1,000,000,000 bytes    1,073,741,824 bytes

TERM        ABBR   APPROXIMATE    ACTUAL
Bit         b      1 bit          1 bit
Byte        B      1 byte         1 byte or 8 bits
Kilobits    Kb     1,000 bits     1,024 bits
Kilobytes   KB     1,000 bytes    1,024 bytes or 8,192 bits

Lastly, one bit is expressed as either a ‘0’ or a ‘1.’ Everything in the digital world is expressed using these two digits. A byte of information would look like ‘01001001.’ I don’t know what that string represents, but that’s as far as I’ll go into this.
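If you would like to check these conversions yourself, here is a minimal Python sketch. The function name is my own invention for illustration, and the only facts it relies on are the ones in the chart: 8 bits to a byte, and powers of 1000 versus powers of 1024.

```python
BITS_PER_BYTE = 8

def megabits_to_megabytes(megabits: float) -> float:
    """Convert a figure in megabits (Mb) to megabytes (MB)."""
    return megabits / BITS_PER_BYTE

# The 11 Megabit wireless system from the text, expressed in megabytes.
wireless_mb_per_sec = megabits_to_megabytes(11)
print(f"11 Mb/sec is {wireless_mb_per_sec} MB/sec")

# Approximate (powers of 1000) versus actual (powers of 1024) sizes.
for name, power in [("Kilobyte", 1), ("Megabyte", 2), ("Gigabyte", 3)]:
    print(f"{name}: approx {1000 ** power:,} bytes, actual {1024 ** power:,} bytes")

# The example byte from the text, read as an unsigned binary number.
example_byte = int("01001001", 2)
print(f"01001001 is the number {example_byte}")
```

Read as a plain number the example byte is 73, which happens to be the code for a capital ‘I’ in ASCII text.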

The reason I go this deep into the basics of digital expression is that digital video, and audio, both have a data rate, and it’s important for you to understand this very basic concept clearly, because it is part of the formula that determines quality. When it comes to distribution, data rate is very important, as it determines a lot of things including the size of your finished product.

A 20 minute Short Film or Documentary with a file size of 20 Gigabytes (20GB) would be very hard to distribute on the internet. That’s roughly 1GB per minute of information, or a data rate of about 17 MB/sec. With most broadband connections delivering data rates in the hundreds of Kb/sec range, that’s a lot of data to transmit, and it would take an extremely long time to download a 20GB video file.
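To put a number on ‘extremely long’, here is a rough sketch of the arithmetic. The 500 Kb/sec connection speed is purely my assumption, picked from the ‘hundreds of Kb/sec’ range, and the function names are made up for this example. It uses the binary definition of a gigabyte (1,024 MB).

```python
def data_rate_mb_per_sec(file_size_gb: float, minutes: float) -> float:
    """Average data rate of a file, in megabytes per second of running time."""
    megabytes = file_size_gb * 1024
    return megabytes / (minutes * 60)

def download_hours(file_size_gb: float, link_kilobits_per_sec: float) -> float:
    """Hours needed to move the file over a link rated in Kb/sec."""
    total_bits = file_size_gb * 1024 ** 3 * 8
    seconds = total_bits / (link_kilobits_per_sec * 1000)
    return seconds / 3600

# The 20 minute, 20GB film from the text.
rate = data_rate_mb_per_sec(20, 20)
# Hypothetical 500 Kb/sec broadband connection (an assumption, not a measurement).
hours = download_hours(20, 500)

print(f"Data rate: {rate:.2f} MB/sec")
print(f"Download time at 500 Kb/sec: {hours:.1f} hours")
```

Under those assumptions the download runs to roughly four days, which is why the factors that follow matter so much.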

What can you do in this case? You want to be able to share your film on the internet. There are a few factors which determine data rate, and subsequently file size. Let’s take a look at those now.

Aspect Ratio

This is another source of confusion. What we are living with is the legacy of the difference between two competing and now symbiotic industries, cinema and television.

Cinema came first. Television adopted a couple of things from the cinema, one being the aspect ratio of the image being projected in theatres. In an effort to bring its audience back into the theatre and away from their television sets, cinema introduced wide-screen films with a variety of aspect ratios and mechanics/optics to draw these crowds. The introduction of the home video market created a synergy between the two businesses, and films were presented on the fairly square television screens in a variety of ways. Let’s look at the basic shapes and solutions for a second.

I’ve made all four examples here 150 pixels high for comparison. These are the four most common aspect ratios used.

Classic 4:3

Wide-Screen 16:9, 1.85:1, 2.35:1

Now, I am going to show you these four different ratios with the same width of 400 pixels.

4 to 3 is used by both early American Cinema and also by Standard Definition Television. This is also the aspect ratio of the 35mm negative.

16 by 9 is the aspect ratio adopted by the High-Definition Television technology we have here in the United States at the present.

1.85 to 1 is the ‘flat’ widescreen ratio widely used in American Cinema. It is achieved in the theatre by projecting a ‘matted’ image, cutting off the top and bottom of the film frame, and such films would later be shown on television in ‘full frame.’

2.35 to 1 was established by an anamorphic system called CinemaScope. Anamorphics are ways of optically ‘squeezing’ one shaped image onto another, a process later reversed during projection.

So What!!?!? Yeah, so what? How do you show a 1.85 to 1 image on a 4:3 television? You’ve got a couple of paths to choose from.

First, you do what is called a ‘letterbox.’ We’ve all seen this, black bars at the top and bottom of the screen. The advantage of a letterbox is that the composition and width of the original image is preserved.

There is also what is called a ‘pan and scan’ method, which fills the full screen of the television, but which changes the composition and sometimes creates confusion when the full frame is used to show or create distance between two subjects or characters.

The pan and scan is achieved by moving the 4:3 target area around the 2.35:1 image to reveal what the editor feels is most important in the frame. This widescreen format was never used without intention, and the aspect ratio is an integral part of the art of the image. In films like Lawrence of Arabia, the emptiness and isolation of the vast wasteland of the desert is enhanced and felt when the image of Lawrence is juxtaposed and contrasted against it. Some of this impression is lost when almost half of the image is removed in transfer to video and Lawrence is shown with a much smaller desert as the backdrop. My point is that aspect ratio is an artistic decision, and it is something you will have to decide on as well.
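The cost of each path can be worked out with a little arithmetic. This is only a sketch: the 640x480 screen is a hypothetical 4:3 display with square pixels that I picked for round numbers, and the function names are mine.

```python
def letterbox(screen_w: int, screen_h: int, source_ratio: float):
    """Fit a wider source to the screen width.

    Returns (image_height, bar_height), where bar_height is the size of
    each black bar at the top and bottom of the screen.
    """
    image_h = round(screen_w / source_ratio)
    bar_h = (screen_h - image_h) / 2
    return image_h, bar_h

def pan_and_scan_kept(screen_ratio: float, source_ratio: float) -> float:
    """Fraction of the source width still visible when the screen is filled."""
    return screen_ratio / source_ratio

# Hypothetical 640x480 (4:3) screen showing 1.85:1 and 2.35:1 sources.
print(letterbox(640, 480, 1.85))
print(letterbox(640, 480, 2.35))

kept = pan_and_scan_kept(4 / 3, 2.35)
print(f"Pan and scan keeps {kept:.0%} of a 2.35:1 image")
```

With these numbers the pan and scan keeps a little over half of the 2.35:1 picture, which matches the ‘almost half of the image is removed’ figure above.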

There is another method people are using to fill the screen now that the shape of our televisions has changed. It is the stretch. Our airwaves are being shared right now by both High-Definition, HD, and Standard Definition, SD programming. The HD shows are natively in a 16:9 aspect ratio, and the SD programming is broadcast in the 4:3 aspect ratio. What happens when people are watching SD programming on an HD capable monitor or television is that the original 4:3 image does not fill the screen and there are grey or black bars on the screen, similar to the pan and scan illustration above. These TVs are capable of filling the entire screen utilizing a couple of different methods: full, which stretches the 4:3 image across the 16:9 plane, distorting it to a certain degree, and zoom, which enlarges the image until it fills the 16:9 plane both vertically and horizontally, cutting off the top and bottom of the 4:3 image. The zoom method is rarely used but is one solution for showing letterboxed 4:3 programming without distorting the image.

These two 4:3 images have the same vertical resolution as the 16:9 example. They are the same height, but not the same width. Of course with these pictures here, it’s easy to imagine having to stretch the 4:3 image horizontally in order to get the image to fill the 16:9 screen.

Allow me to illustrate the zoom technique.

These two images show the monitor and its source at the same magnification level and in their native aspect ratios. Remember, when you zoom into the letterboxed 4:3 image so that they have the same width, not height, the top and bottom are going to be offscreen and not shown on the monitor.

These images are all the same width, which is what you achieve when you zoom in using the controls on the television. Personally, I feel this is the best way to watch letterboxed 4:3 source material as it retains the original aspect ratio and does not distort the image by stretching it. Because some 4:3 programming, like the news, uses the area cut off in a zoom, it is not necessarily the best way to view 4:3 TV on a 16:9 monitor.
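The difference between the stretch and the zoom comes down to two fractions. Here is a small sketch using Python’s exact fractions; the variable names are my own.

```python
from fractions import Fraction

SCREEN = Fraction(16, 9)   # the HD monitor
SOURCE = Fraction(4, 3)    # the SD picture

# Stretch: the picture keeps its height and is widened to fill the screen.
stretch_factor = SCREEN / SOURCE

# Zoom: the picture is enlarged to the screen width, so only part of
# its height still fits on the screen.
zoom_visible_height = SOURCE / SCREEN

print(f"Stretch makes everything {stretch_factor} times as wide")
print(f"Zoom shows {zoom_visible_height} of the picture height")
```

A 16:9 letterbox inside a 4:3 frame also occupies three quarters of the frame height, so the zoom cuts off the black bars and nothing else.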

FYI: On some SD consumer cameras, you have the option of ‘squeezing’ an anamorphic 16:9 image onto the 4:3 chip. Tests have shown that using this option has a level of detail higher than that of the 4:3 image being letterboxed and zoomed into. You can then distribute both an anamorphic, natively 16:9 widescreen image for HDTVs, and a letterboxed 4:3 version for SDTVs.

I am presenting this information now so that you will remain end focused. We’d all like our movies to be distributed on DVDs and shown in our audience’s homes. There are a variety of screen sizes, shapes, and resolutions to consider when distributing our films nowadays, and I’d like you to know what your options are, as well as what you can do to retain as much control over the final image as possible, so as to preserve your artistic intentions. Sadly, a great part of the population is uninformed about these differences and is turned off by the lack of simplicity when viewing, resulting in distortions like using the stretch feature when viewing a letterboxed 4:3 image on a 16:9 screen, instead of the zoom.

Resolution

In print, resolution is described as DPI, or dots per inch. You’ve probably had to use a scanner, and when you were scanning your picture or document, you had to choose a resolution to scan with. Common resolutions are 150, 300, or 1200 dpi. You probably noticed that raising the dpi you scanned at also increased your file size.

A 5” by 7” picture scanned at 150 dpi would contain 787,500 pixels (0.787 Megapixels).

Here is the math behind that:

5 inches at 150 dots per inch is 750 dots or pixels horizontally.

7 inches at 150 dots or pixels per inch is 1050 total pixels vertically.

The image, at this size and at this resolution, would measure 750 pixels by 1050 pixels, or 750x1050, a total of 787,500.

The same image scanned at 1200 dpi would have 50,400,000 pixels (50.40 Megapixels), 64 times as many pixels, and would result in a raw file size that many times greater.
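The scanning math above fits in a few lines of Python if you want to try other sizes and resolutions. The function name is made up for this sketch.

```python
def scan_pixels(width_in: int, height_in: int, dpi: int):
    """Return (pixels_wide, pixels_high, total_pixels) for a scan."""
    wide = width_in * dpi
    high = height_in * dpi
    return wide, high, wide * high

low = scan_pixels(5, 7, 150)
high = scan_pixels(5, 7, 1200)

print(f"150 dpi:  {low[0]}x{low[1]} = {low[2]:,} pixels")
print(f"1200 dpi: {high[0]}x{high[1]} = {high[2]:,} pixels")
print(f"Ratio: {high[2] // low[2]} times as many pixels")
```

Raising the dpi by a factor of 8 multiplies both dimensions by 8, so the pixel count goes up by 8 times 8.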

Film has been measured to have a resolution of between 4000 and 6000 dpi, or pixels per inch. It’s an organic medium, and unlike digital files, the size of the grain, or pixel, used to capture the subject is neither uniform nor consistent.

The file of a 35mm negative scanned at 4000 dpi would contain approx. 24,000,000 pixels. Uncompressed, it would result in a rather large file and the dimensions of the image expressed in pixels would be 6000x4000.

High-Definition Video has two standard resolutions: 1920x1080 and 1280x720. They both have an aspect ratio of 16:9.

Standard Definition Video has many standards. There are approximately 525 lines of resolution. DV, a popular format, uses a resolution of 480 lines and produces sharp and vivid images. It is a format with a lot of advantages, it is easy for beginners to use, and it is what I would recommend working with when you start.

The following illustrations do not represent true resolution or size. Computer screens are designed to display graphics at 72 dpi, and the largest of these images would be far too big to display on your monitor. So, to make my point, I have first reduced them to 25% of their native resolution for comparison.

[Illustrations, shown at 25% of native size: High-Definition (HDTV 1080, HDTV 720) and Standard Definition - NTSC (D1, DV NTSC). Shown at actual size: DV NTSC and HD 1280 x 720.]

In this world, resolution and display size are relative. 720x480 is exactly that big on your computer monitor at that size at that resolution. The same frame at that resolution displayed on a 13” television would be a different size and so on. Another standard for theatrical presentation is being developed as well. Images with 2000 lines of horizontal resolution are being projected on 50 foot screens.

Now that we’ve touched on a couple of different SD and HD resolutions, let’s talk a little bit about the displays that we’ve got to choose from, what’s most popular, and what kind of image they produce.

Displays

CRT – Cathode Ray Tube

CRTs were, for a while there, the only way we had to view our televisions and computers. They have certain characteristics which are hard to find in the other forms of monitors. Since the CRT has been around a long time, many picture technologies have been developed around it, which makes the image on a CRT difficult to compete with, as the signal was designed for it.

Pros: Rich deep blacks. High contrast ratio; the grades available between what we see as black and what we see as white. Sharp detailed images without pixelation. Truer color representation. Inexpensive when comparing products of the same size. Can display interlaced images.

Cons: Size and weight. SD CRT televisions grew to about 36 inches diagonally. This TV would weigh about 250 lbs. and be 30 inches deep. You could get an SD rear projection unit up to 60” with about the same depth or footprint. HD CRTs with a 16:9 ratio are available up to 34” diagonally. Again, they weigh about 250 lbs. + and are quite deep when compared with the plasma and LCD offerings with the same screen size.

They do offer one of the best pictures with the truest colors and highest level of detail. Although they seem to be on the way out, I recommend using a CRT for monitoring your output when editing or compositing. Most of your audience watches movies on these in SD. As with all consumer television sets, these systems will overscan the image, cutting off about 5% of the sides and top of the image. This is normal, and can be monitored while in workflow.

LCD – Liquid Crystal Display

LCDs have come a long way since we were introduced to them with digital watches and hand-held Donkey Kong in the 80’s. Slowly, over the last few years, they have replaced CRTs as the default display for the personal computer. When your project hits the home video market, chances are, it will be viewed on one of these monitors. A lot of younger people use their computers to watch movies and television. This presents some problems, as IMHO, LCD reproduces the worst ‘film-like’ image of our four choices. Since they are used for gaming and computing, the way that most people set them up is hot and bright with super-saturated color. This is not the film look.

Before we look at the pros and cons, let’s step out of this display discussion for a second and talk about achieving the look of film with video.

Film is an amazing medium. Compared to video, film has twenty different levels for white to every one video has to use. This is called latitude. The same is true for the dark parts of the image, and it is also recognizable when viewed in the tonal values film produces when it reproduces colors. Video can be adjusted to more closely mimic the look and values of film.

When I worked as a high-end home theatre installer, we always went through a sort of ritual when setting up the screen and projector. The room was always darkened, with minimal light, just like a theater. That way the image didn’t have to compete with any ambient or other light sources. This is important. It also sets the stage. Then we turned on the set, or projector and let it warm up, 15-30 mins. This is for CRTs and may not be applicable to other types of monitors or displays. With a CRT, it allows the colors and levels to stabilize, otherwise your setting will drift and you’ll be adjusting for days…

We then removed all of the color information from the image. All the way down to zero. With NTSC, color was an afterthought and rides on its own frequency, away from the rest of the image.

We then adjusted the contrast and brightness settings to reproduce a pleasing Black and White image. We would use color bars from a Criterion Laserdisc to set the tonal range, with Brightness controlling your blacks and Contrast controlling your whites and greys. Set up your screen now using this image. You may or may not have control over color saturation or hue due to your video controller and/or monitor. Make sure you can see the two greys and the black in the second box from the right on the bottom of the image. White is the second box from the left on the bottom.

At this point we would reintroduce color back into the image. We like to hit a wall, then bring it back. I noticed working in a lot of homes that our color saturation setting was about 65-75% of what they would set it at. It’s tempting to really saturate the image, but working with film as a photographer, I knew film didn’t ‘do’ color like video does. You’ve got to cut it back to get it right. Take the color up until the bars start to bleed and lose their separation, and then take it back a few steps, to taste. Then take it back some more. You may grow to like the change if this is different than how you normally view TV. You may see things you didn’t see before. If not, you can always crank it up later.

There is a flip side to this. Video has properties you can exploit, like saturated colors and bright highlights. Many films now develop looks that exploit the properties of video. Look at Hollywood films and you’ll see a shift in production values and looks going away from the classical palette of the sixties and seventies (The Good, The Bad, and the Ugly), the dark and the electric colors of the eighties (Blade Runner, TRON, E.T.) and the Technicolor of the fifties (Any Elvis film) towards using the very properties of video we’re trying to cut back on here. There are no absolute truths here, only subjectivity and majorities to contend with.

Back to LCD.

Pros: Inexpensive. Small footprint. Good level of detail. Saturated colors. Hooked up to 99% of computers in the USA and they all have a DVD drive by now. Available in 4:3 or 16:9 widescreen.

Cons: Low contrast ratio. Colors tend to bleed when saturated. Sharpness can be extreme. Interlaced images have to be de-interlaced in order to be shown properly, added cost.

Overall, the cost to size ratio of these displays is very tempting. Ultimately, you will want to see your work on as many different displays as possible, at different sizes and at different resolutions.

DLP – Digital Light Processing

This is a technology that is kind of specialized when compared to the other technologies presented here. DLPs are used in projectors and can be found in rear projection sets as well as being used in professional applications such as Digital Cinema.

Pros: Excellent cost/performance ratio in rear projection sets. 10% marketshare. Smooth, jitter free images. Excellent brightness. Low light source repair cost.

Cons: Larger footprint than plasma or LCD panels. Softer image than CRT or Plasma. Fan noise. Off axis viewing difficult in rear projection, there is a sweetspot for viewing rear-projection systems. Mechanical, lots of moving parts.

Plasma – everyone’s favorite

Every one of us has gone to the electronics store and stared at one of these for ten minutes only to remember $7000.00 is a lot of money.

Pros: Beautiful rich colors, nice contrast levels, big picture.

Cons: Expensive, rumored to have a short life span compared to other technologies.

CRTs are most abundant and SDTV is not going away any time soon. Plan on your work being seen on a 4:3 CRT or a 16:9 plasma or LCD. At this point, as a film-maker, I am beginning to originate all material in 16:9. I’ve found the kind of stories I tell lend themselves to the language of film, and wide-screen production is an aspect of this. Shooting this way allows me to finish a native 16:9 version for display on HDTVs and a 4:3 letterbox version for SDTV. I am lucky to have a CRT, a Plasma, and an LCD to reference at home.

Frame Rate

Frame rate is the frequency at which images are shown, expressed by the second. There are three major frame rate/formats in use in the US today, 60i, 30p, and 24p. The small ‘i’ stands for interlaced and the small ‘p’ stands in for progressive. In Europe, they use the different standards of 50i and 25p. Allow me to explain the reason for the difference.

The electrical grid here in the US has a voltage of 110v and a cycling frequency of 60 Hz. When television was standardized in the US post WWII, research had shown that a higher resolution image could be produced using significantly lower bandwidth by interlacing 525 lines at 60Hz, or 60 times a sec. The European PAL system operates at 50i or 50 Hz, due to the 50 cycle electrical grid they use.

Interlaced images are created by first scanning the odd lines and then the even lines. The scan of the image starts at the top left of the frame and repeats. American NTSC interlaced video begins with the lower field first and then alternates between upper and lower.

Progressive images are created by scanning the complete frame, line by line, and then repeating.
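To make the two scanning methods concrete, here is a toy sketch that treats a frame as a simple list of scan lines. The function names and the six-line ‘frame’ are my own; a real frame has hundreds of lines, and real de-interlacing is more involved than this.

```python
def split_fields(frame):
    """Split a progressive frame (a list of scan lines, top to bottom)
    into its two interlaced fields."""
    upper = frame[0::2]  # lines 1, 3, 5... (the odd lines)
    lower = frame[1::2]  # lines 2, 4, 6... (the even lines)
    return upper, lower

def weave(upper, lower):
    """Rebuild a full frame by alternating lines from the two fields."""
    frame = []
    for upper_line, lower_line in zip(upper, lower):
        frame.extend([upper_line, lower_line])
    return frame

# A tiny six-line frame, numbered from 1 at the top.
frame = [f"line {n}" for n in range(1, 7)]
upper, lower = split_fields(frame)

print("Upper field:", upper)
print("Lower field:", lower)
print("Woven back: ", weave(upper, lower))
```

An interlaced camera captures the two fields at different moments in time, which is why weaving them back together shows comb-like edges on anything that moved.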

They both have visual language attached to them through history. Interlaced video images lend themselves to sports coverage, reality television, and pornography.

One of my first long-form video projects was shot using 60i. No matter what I did, when viewed, my cameraman kept saying it looked like a porno. There were several things I had to do to the video, including de-interlacing, to take the visual language away from the familiarity of video production and make it view more like a film.

Film is a progressive medium. Full frames are created and are played one after the other. Another difference is the rate. Film is shot and shown at 24 frames per second, or 24fps. Recent technology created a video medium at this frame rate called 24p, created to emulate the progressive nature and speed of film.

Motion depiction is one of the differences between video and film. Scanning 60 ‘half frames’ per second versus 24 progressive frames per second makes a difference in how the motion is captured and reproduced when viewed. If you created a flip-book cartoon with 60 frames, the motion would appear smoother than a flip-book showing the same animation with only 24 frames.

24fps was actually decided on and adopted by the studios due to economics. It was shown that 24fps was the minimum frame rate that could be used to show smooth motion. Why shoot at a higher rate and use more film than you had to? More frames per second meant more film per second, which meant more money per second.

30p is a progressive frame rate popularized by music video and is an alternate frame rate you can use, simulating the look of film.

60i is an interlaced format creating 60 fields per second video and is most common in television reality show production and sports due to its cost and excellent display of motion.

24p has been developed for video producers to emulate the progressive look and frame rate of film camera production. It is a format, as well as being a frame rate. This is the arena where less is really more, and there are 4 major 24fps video formats to choose from: 24p, Panasonic’s 24pA, Canon’s 24F, and Sony’s Cineframe24 technology. Words of warning: when choosing a format to shoot with, research the production workflow online. Some of the newer formats are unsupported and require expensive workarounds. More on the different digital video formats later.

50i and 25p are video formats used in Europe. 25p production is sometimes substituted for 24p production, where the 25p footage is slowed down to match 24p’s frame rate. It is a workaround for US 24p producers.

12fps or 15fps are used by animators depending on whether they are shooting film or video. It allows them to work twice as fast drawing half of the frames ‘live action’ production would use. You can use these frame rates to work in the visual language of animation as well as the language of antique motion pictures made before 24fps was standardized. It makes things choppy, like a Harold Lloyd film.

Some cameras will allow you to shoot 12fps progressive. This is called under-cranking, and its best use would be to create fast motion. 12fps shown as 24fps would double the speed of the original shot and show the motion more quickly.

Similarly, over-cranking, or shooting at higher frame rates than that of your project, 30p and 60p, and then showing them at the slower rate of the project, 24p, would create slow-motion effects. 30p shown at 24p is a nice slo-mo, similar to the scene in Dazed and Confused when Wooderson is walking through the pool hall saying ‘hi!’ to everyone. 60p shown at 24p would be great for a punch to the face when the spit flies out and the face jiggles, like in ‘Rocky’ or any action film. Don’t get too excited, there are few cameras out there available to the ‘no-money’ indie producer that shoot variable frame rates. A lot of these effects can be created on the cheap in post-production, although the motion is not as smooth for the slo-mo effects. They pass for the real thing and you see them all of the time on TV. They’re easy to spot. The post-effects have a juddering quality, while the camera effects are fluid.
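The speed change from under- or over-cranking is just the ratio of the two frame rates. A quick sketch, with a function name of my own choosing:

```python
def playback_speed(shot_fps: float, project_fps: float) -> float:
    """How fast the action appears, where 1.0 is real time.

    Above 1.0 is fast motion (under-cranked); below 1.0 is slow motion
    (over-cranked).
    """
    return project_fps / shot_fps

for shot_fps in (12, 24, 30, 60):
    speed = playback_speed(shot_fps, 24)
    print(f"Shot at {shot_fps}fps, shown at 24fps: {speed}x speed")
```

So 30p in a 24p project plays at 80% of real time, and 60p plays at 40%.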

There is one more use for frame rate manipulation: file size. Although there are many codecs, or compression schemes, available to you for internet distribution, one quick way to make your internet distributed film smaller is by changing the frame rate. Some codecs won’t allow you to, but depending on your project’s motion translation needs, you can distribute a 60i or 30p file at 15fps and a 24fps file at 12. Experiment and see what you can get away with. You’ll be surprised, and a 512K file downloads in half of the time of a 1MB file.

Timecode

The best analogy for understanding timecode is the leap year. Due to the difference between the orbit of the planet and the calendar we use, we have to throw an extra day in the calendar every four years in order to have union between the two.

Timecode is the same type of compromise and was a product of the offline video editing days. An ‘hour’ of timecode at 30fps and an hour of clock time differ by 3.59 seconds, almost a minute and a half over the course of a day. Drop frame timecode is used to compensate for the difference and allow for frame and time accurate editing of video over time. On average once every 1000 frames, a frame number is omitted, not a frame. This compensates for the difference.

There are plenty of in-depth technical articles on the web describing the how and why of timecode. In the non-linear editor, or NLE, drop frame timecode use is transparent as the timecode is embedded in the original video file at the time of shooting. DV NTSC uses 29.97 drop frame timecode and DV 24p uses 23.976.
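For the curious, here is a sketch of how drop frame numbering can be computed. It assumes the usual 29.97 drop frame rule, which the text above only summarizes: two frame numbers are skipped at the start of every minute, except for every tenth minute. That works out to 108 skipped numbers per hour, or one per 1000 frames on average. The function and constant names are my own.

```python
FRAMES_PER_10_MIN = 17982  # real frames in ten minutes: 10*60*30 - 9*2
FRAMES_PER_MIN = 1798      # frame numbers used in a minute that drops two

def drop_frame_timecode(frame_number: int) -> str:
    """Label a zero-based frame count with 29.97 drop frame timecode."""
    tens, rest = divmod(frame_number, FRAMES_PER_10_MIN)
    skipped = 18 * tens  # nine 'dropping' minutes in each ten, two numbers each
    if rest > 2:
        skipped += 2 * ((rest - 2) // FRAMES_PER_MIN)
    n = frame_number + skipped
    frames = n % 30
    seconds = (n // 30) % 60
    minutes = (n // 1800) % 60
    hours = n // 108000
    return f"{hours:02d}:{minutes:02d}:{seconds:02d};{frames:02d}"

# An 'hour' of 30fps timecode numbers played at 29.97fps, in clock seconds.
non_drop_hour_seconds = 108000 * 1001 / 30000

print(drop_frame_timecode(1799))    # last frame of the first minute
print(drop_frame_timecode(1800))    # numbers ;00 and ;01 are skipped here
print(drop_frame_timecode(107892))  # one hour of real time
print(non_drop_hour_seconds)
```

No picture is ever thrown away. Only the labels jump, which is what keeps the timecode in step with the clock on the wall.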

Color – RGB & bit depth

The color space, or mode your television and monitor uses is called RGB. It is an additive color process. Red, Blue, and Green are mixed with different values of white to make up the colors of the spectrum.

In 8-bit per channel RGB, the most common, each color shade is made by mixing one value from each of the three color channels. There are 256 shades of each color, represented by an integer from 0 to 255. 24 bit RGB is also known as True Color, and when all of the color combinations are added together there are a total of approximately 16.7 million colors. Red is listed first, followed by Green, and then by Blue in the color shade notation. The RGB notations for the colors in the picture above are given below.

(0,0,0) is black – zero white added to each color

(255,255,255) is white – 100% white (255) is added to each channel

(255,0,0) is red – 100% white added to red with no other value added

(0,255,0) is green – 100% white added to green with no other value added

(0,0,255) is blue – 100% white added to blue with no other value added.

(255,255,0) is yellow – 100% white added to both red and green

(0,255,255) is cyan – 100% white added to both green and blue

(255,0,255) is magenta – 100% white added to both red and blue.
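The table above can be reproduced with a few lines of Python. This is only an illustration of additive mixing; the mix function is my own invention and simply adds the channels together, capping each at 255.

```python
LEVELS_PER_CHANNEL = 2 ** 8            # 8 bits give 256 shades per channel
TOTAL_COLORS = LEVELS_PER_CHANNEL ** 3 # three channels combined

RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)

def mix(*colors):
    """Additively mix RGB colors, channel by channel, capped at 255."""
    return tuple(min(255, sum(channel)) for channel in zip(*colors))

print(f"{TOTAL_COLORS:,} possible colors")
print("red + green  =", mix(RED, GREEN))        # yellow
print("green + blue =", mix(GREEN, BLUE))       # cyan
print("red + blue   =", mix(RED, BLUE))         # magenta
print("all three    =", mix(RED, GREEN, BLUE))  # white
```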

If you are confused, look at where the color circles overlap and make note of the values in the table. The primary colors we were taught to paint with are red, yellow, and blue. Paint mixes in a similar way, adding blue to red to get purple, yellow to red to make orange, although mixing pigment is strictly a subtractive process. It’s the same concept here, only we’re mixing light rather than tints.

The way that the television works is that each pixel has three elements, a red, a blue, and you guessed it, a green. By varying the brightness of each color element in proximity to each other, the combination process produces these colors.

Bit-depth per channel is a factor in image quality. You will find later on in production that some cameras capture and/or process color information in 8, 10, or 12 bits per channel. Increasing the bit depth does increase the size of the file. Let’s look at a couple of examples to gain some clarity, remembering that bandwidth is finite and that by increasing one value we may have to employ another method to ensure smooth and consistent data throughput.

An SD, 24bit True Color 720x480 frame would have 8 bits per channel and 345,600 pixels. With 8 bits per byte and three channels of color information, the raw image would be 1,036,800 bytes in size or 1.0368 Megabytes.

Let’s look at the same uncompressed frame at 10 bits per channel. 10 bits per channel with three channels of color information at 345,600 pixels is 1.296 Megabytes, a 25% increase in information. Take a look at http://www.cineform.com/technology/Demo10bitVS8bit.jpg in a separate window for me, for a second, please. What this image shows, I believe, is one of the advantages of working in a 10 bit color space over an 8 bit. After the gamma change, it’s possible that the banding produced in the 8bit uncompressed YUV file is a limitation of the color space. There are other factors to consider, but the limited bit depth of that color space is a factor in producing the unwanted artifacts, and the possible advantage of using the higher bit depth is what I wanted you to consider.
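Here is the frame size arithmetic as a short sketch, in case you want to plug in other resolutions or bit depths. The function name is mine, and it assumes a raw frame with three full color channels and no compression or padding.

```python
def raw_frame_bytes(width: int, height: int, bits_per_channel: int,
                    channels: int = 3) -> float:
    """Size in bytes of one uncompressed frame."""
    pixels = width * height
    return pixels * channels * bits_per_channel / 8

eight_bit = raw_frame_bytes(720, 480, 8)
ten_bit = raw_frame_bytes(720, 480, 10)
increase = ten_bit / eight_bit - 1

print(f"8 bits per channel:  {eight_bit:,.0f} bytes")
print(f"10 bits per channel: {ten_bit:,.0f} bytes")
print(f"Increase: {increase:.0%}")
```

Going from 8 to 10 bits adds a quarter more data per frame, and that extra quarter applies to every frame of every second.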

Now, let’s take a look at http://www.kenstone.net/fcp_homepage/review_color_finesse.html in another window please. This is a review of another product which allows you to work with the colors in your video image in 16 or 32 bits per channel, even if they originated as 8 bit per channel files.

I think the “Color Finesse HSL Saturation at 1000 at 16-bit” image, when compared to the “Adobe After Effects Saturation at 100% at 8-bit” image, tells the story here and shows the advantages of increasing the bit depth per channel when making changes to an image. The banding you saw in the Cineform image above is happening again in 8 bits with a simple increase in the saturation of the image.

These tools that we use to manipulate our work have characteristics built in as well. Compare the “Adobe After Effects "Brightness & Contrast" plugin at 100% at 16-bit” image to the one below it, the “Color Finesse HSL Brightness at 100% at 8-bit or 16-bit” image. The standard After Effects tool shows uneven processing across the color spectrum, blooming white into blues more than any other color and destroying the gradual shift from white to color at the top of the gradient.

Tools we take for granted to be uniform are not, and the effect of each action will differ from platform to platform and from application to application. I would recommend auditioning tools in side-by-side comparisons of images processed by different actions in each program. On the Macintosh you are limited in the applications you can work with, but most Windows editing and image processing applications and plug-ins have watermarked demo versions or limited duration tryout periods that you can download, install, and try out. This may sound like a pain in the ass, but believe me, as you continue to work with this medium you will reach some of its limitations.

Most professional non-linear editing programs are in the $500 plus price range. You may also want to invest in both a professional still and a moving image manipulation application. Many software developers are packaging multiple applications together with considerable savings. To start off with, use iMovie or Windows Movie Maker. Both apps are great for getting your legs in Digital Video, and by the time you would be ready to use all of the features you purchased with a professional package, there’s an update ready and another couple of hundred dollars to shell out. Remember, baby-steps…

I hope I’ve been able to show you the benefits of working with higher bit depths with higher color resolutions. We’ve covered RGB here. YUV, another color space, was mentioned in the Cineform article, and the Color Finesse review mentioned HSV, another color space we can work in. After Effects is a post-production tool that manipulates images and video and creates motion graphics, and we will get into its uses and purpose in another article.

HSV

Still on color, let’s look into the HSV color space next. It is another way of defining colors and the letters in the name stand for Hue, Saturation, and Value. It is also known as HSB (Hue, Saturation, Brightness).

I’m going to link you to another page with some excellent visualizations and illustrations. Again, please open this up in another window or tab.

http://en.wikipedia.org/wiki/HSV_color_space.

The triangle in the middle of the wheel expresses the Saturation and Value of the color on a two-dimensional plane. As the triangle turns, it points to a specific Hue, or color. Any color can then be created by combining the Hue that the triangle points to, the Saturation level of that Hue on a scale of 0 – 100, and a Brightness Value, again on a scale of 0 – 100. The Hue can be described either on a normalized scale of 0 to 100 or as a degree in the range of 0 – 360.

Converting from RGB to HSV is easy. The computer has all of the information that it needs to make the conversion for you.
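To show how little work this is for the computer, here’s a minimal sketch using the colorsys module from Python’s standard library. The wrapper rgb8_to_hsv and its rounding are my own choices; I’m assuming 8 bit RGB input and reporting Hue in degrees with Saturation and Value as percentages.

```python
import colorsys

def rgb8_to_hsv(r, g, b):
    """Convert 8 bit RGB to (hue in degrees, saturation %, value %)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(v * 100)

print(rgb8_to_hsv(255, 0, 0))    # pure red
print(rgb8_to_hsv(255, 255, 0))  # yellow
print(rgb8_to_hsv(0, 0, 128))    # dark blue
```

Pure red sits at 0 degrees, yellow at 60, and blue at 240. The dark blue keeps full Saturation and only drops in Value.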

HSL

There is another color space, HSL (Hue, Saturation, Luminosity), and it is not to be confused with HSV or HSB. It is widely used and can be found in the Microsoft Windows System Color Picker and also in Microsoft Paint.

As you can see in this example, the Hue is determined on the Horizontal axis and Saturation on the Vertical. The separate slider on the right, in grayscale in this image, determines the Luminosity or Luma value. In the color picker example shown, a screen shot from Microsoft Paint, the Hue, Sat, and Lum values all scale from 0 – 239.

Provided is an RGB conversion for translating values from that color space while in this one. The example shows (0,0,0), or black. In the HSL color space there are almost 60,000 ways to express black, as the color will always be black as long as the Luminosity value is 0.
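You can count those ways to express black yourself. This sketch assumes 240 steps (0 to 239) each for Hue and Sat, as in the picker described above, and uses Python’s colorsys module to do the conversion. Note that colorsys orders the values as hue, lightness, saturation.

```python
import colorsys

STEPS = 240  # assumed: 240 steps (0-239) each for Hue and Sat

black_count = 0
for hue in range(STEPS):
    for sat in range(STEPS):
        # colorsys takes hue, lightness, saturation, each scaled 0.0 to 1.0
        rgb = colorsys.hls_to_rgb(hue / STEPS, 0.0, sat / STEPS)
        if rgb == (0.0, 0.0, 0.0):
            black_count += 1

print(black_count)  # number of Hue/Sat pairs that are black when Lum is 0
```

Every one of the 240 x 240 combinations comes out black, which is where the figure of almost 60,000 comes from.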

There are some excellent illustrations and explanations at http://en.wikipedia.org/wiki/HSL_color_space.

YUV

YUV is a color space that defines color in terms of one luminance component and two color, or chrominance, components. Analog YPbPr component and digital YCbCr are derived from YUV, and digital video uses the components of the YUV model to describe pixel color.

One of the advantages YUV presents is that some of the color information can be thrown away, reducing the bandwidth necessary to transmit the signal. The human eye is more sensitive to green, and with Y being weighted most heavily toward the Green component of RGB, it carries the most information. With the eye being less sensitive to blue and red, NTSC takes advantage of this and discards the majority of the information in these channels to reduce the bandwidth of the signal.

Y is the sum of all three channels multiplied by different factors. It contains elements of R, G, and B.

U is the scaled difference Blue – Y.

V is the scaled difference Red – Y.

Y carries the most information as everything else in the signal is referenced around it. As long as you have Y, Blue – Y, and Red – Y, you can work back to G, which contains the most color information.

Digital video takes the same approach and most formats use a color subsampling technique to reduce the bandwidth of the information at the expense of color resolution, but with the advantage of having higher pixel resolution at the same data rate. This sampling of the colors in the image is a form of compression. Compression is a way of taking notes about the image rather than having to describe each pixel individually.

Color Sampling and Resolution - 4:4:4

 

Let’s start by converting an 8 bit per channel RGB image to YCbCr. Without any color processing, each pixel is still represented by 24 bits of color information. When no chroma subsampling is used, as in this example, the sampling rate is noted as 4:4:4. The first 4 notes that the standard frequency of 13.5 MHz was used to digitize the analog video information. The next two numbers stand in for the rate at which Cb and Cr were sampled in relation to the luma. In this case they are also 100%, or 13.5 MHz, or 4. The resulting color space uses 3 bytes per pixel to store the information when 8 bits per channel are used, uncompressed. MPEG-2 supports 4:4:4 sampling.

4:2:2

In ITU-R BT.601 4:2:2, both Cr and Cb are sampled at a horizontal resolution of half that of the luma, every other pixel, with every second Y using the chroma info from the previous sample. The sampling looks like this: Y1, Cb1, Cr1, Y2, Y3, Cb3, Cr3, Y4, Y5, Cb5, Cr5, Y6 and the mapping for the six pixels would read [Y1, Cb1, Cr1] [Y2, Cb1, Cr1] [Y3, Cb3, Cr3] [Y4, Cb3, Cr3] [Y5, Cb5, Cr5] [Y6, Cb5, Cr5].

I’ve also seen online an unnamed 4:2:2 sampling scheme where the encoding bitstream looks like this: Y1, Cb1, Y2, Cr2, Y3, Cb3, Y4, Cr4, and so on. This differs from the example above in that the chroma info alternates between Cb and Cr samples on alternating pixels. The mapping of this form of 4:2:2 encoding reads [Y1, Cb1, Cr2] [Y2, Cb1, Cr2] [Y3, Cb3, Cr4] [Y4, Cb3, Cr4].

A 4:2:2 color space at 8 bits per channel stores all of the color information for two pixels in only 4 bytes instead of 6, a 33% decrease in bandwidth.
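Here’s a rough Python sketch of the first 4:2:2 mapping described above, where the second pixel in each pair borrows the chroma of the first. The function name and the sample pixel values are made up for illustration, and a real encoder may filter the chroma rather than simply dropping every other sample.

```python
def subsample_422(pixels):
    """Keep every Y, but only the chroma of the first pixel in each horizontal pair."""
    out = []
    for i in range(0, len(pixels), 2):
        _, cb, cr = pixels[i]            # chroma comes from the first pixel of the pair
        for y, _, _ in pixels[i:i + 2]:  # each pixel keeps its own luma
            out.append((y, cb, cr))
    return out

# Six hypothetical (Y, Cb, Cr) pixels, 8 bits per component.
row = [(16, 100, 200), (32, 110, 210), (48, 120, 220),
       (64, 130, 230), (80, 140, 240), (96, 150, 250)]

decoded = subsample_422(row)
print(decoded)

bytes_444 = len(row) * 3                    # Y, Cb, Cr stored for every pixel
bytes_422 = len(row) + (len(row) // 2) * 2  # every Y plus one Cb/Cr pair per two pixels
print(bytes_444, bytes_422, 1 - bytes_422 / bytes_444)
```

For these six pixels the storage drops from 18 bytes to 12, the same one third saving as going from 6 bytes to 4 for a single pair.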

Digital video formats using the 4:2:2 color space include Digital Betacam and DVCPRO50.

4:2:0

The 0 in 4:2:0 means that the chroma subsampling takes place at half the vertical resolution of the luma, as well as at half the horizontal resolution of the luma as designated by the 2. Chroma subsampling takes place in a four pixel matrix instead of the two pixel sample, reducing the bandwidth requirements of 4:2:0 by 50% compared to uncompressed 4:4:4; the color information takes up 6 bytes of space per 2x2 macropixel instead of 12.

There are differing chroma sampling techniques used in 4:2:0. In MPEG-2 4:2:0, the Cr and Cb sampling for the pixel block takes place on the first two vertically adjacent pixels, and the average of the two is used to calculate the values for all four pixels using the individual Y values that were sampled.

The conversion from RGB to YCbCr 4:4:4 is considered lossless and the math looks like this:

RGB to YUV

Y = 0.299R + 0.587G + 0.114B

U = -0.147R - 0.289G + 0.436B

V = 0.615R - 0.515G - 0.100B
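If you want to try the conversion yourself, here’s a minimal Python sketch. It assumes RGB values normalized to the range 0.0 to 1.0 and uses the standard BT.601-style weights (0.299, 0.587, 0.114 for Y, with U and V as scaled Blue – Y and Red – Y). The function name is my own.

```python
def rgb_to_yuv(r, g, b):
    """Convert normalized RGB (0.0 to 1.0) to YUV using BT.601-style weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b
    return y, u, v

for name, rgb in [("white", (1, 1, 1)), ("red", (1, 0, 0)), ("green", (0, 1, 0))]:
    y, u, v = rgb_to_yuv(*rgb)
    print(f"{name}: Y={y:.3f} U={u:.3f} V={v:.3f}")
```

A quick sanity check on the weights: the Y factors add up to 1, and the U and V factors each add up to 0, so white comes out as full luminance with no color difference at all. Green alone contributes more to Y than red and blue combined.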

It is when you change to the 4:2:0 color space that the actual color saved is different from the one sampled. The colors used in the MPEG-2 4:2:0 pixel squares have been accurately converted from RGB to the MPEG-2 4:2:0 color space. Note the slight difference in the shades when compared to the original colors, although the luminance is virtually the same.

In the MPEG-1, H.261 4:2:0 macropixel, Cr and Cb are calculated by averaging the chroma values of all four pixels and then using that average to calculate the pixel color using the four Y values in the sample.
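As a sketch of that four pixel averaging, here’s a short Python example. The block values are hypothetical and the function name is my own. It only shows the averaging step described above, not a full encoder.

```python
def subsample_420_block(block):
    """Average Cb and Cr over a 2x2 block; every pixel keeps its own Y."""
    pixels = [p for row in block for p in row]
    cb_avg = sum(p[1] for p in pixels) / len(pixels)
    cr_avg = sum(p[2] for p in pixels) / len(pixels)
    return [[(y, cb_avg, cr_avg) for y, _, _ in row] for row in block]

# A hypothetical 2x2 block of (Y, Cb, Cr) values.
block = [[(50, 100, 200), (60, 110, 190)],
         [(70, 120, 180), (80, 130, 170)]]

for row in subsample_420_block(block):
    print(row)
```

All four pixels end up sharing one Cb and one Cr value, while the four Y values survive untouched. That’s why the shades shift slightly but the brightness detail holds.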

In the PAL DV 4:2:0 example below, Cr, or V, is first determined by averaging the values of the first two vertically adjacent pixels on the first horizontal axis. Cb, or U, is then determined by averaging the next two vertically adjacent pixels, adjacent to the right. Y values are taken from each of the four pixels and the Cr and Cb averages taken are used for all four pixels to determine their color in this space.

Digital video formats that use the 4:2:0 sampling routine include PAL DVCAM and DV, HDV, and most common forms of MPEG encoding, including DVD. Some instances of MPEG-4 allow the use of the higher quality 4:4:4 color sampling.

4:1:1

NTSC DV uses a 4:1:1 sampling routine where the four pixel macropixel is shaped differently than the 2x2 4:2:0 macropixel. In NTSC DV 4:1:1, a chroma sample is taken from the first of four horizontally adjacent pixels, a 4x1 pixel matrix, and those values are then used for the remaining three.

DVCPRO, NTSC DVCAM, and NTSC DV use the 4:1:1 color sampling routine for encoding and decoding digital video.

The reason you need to understand and know this is that some schemes are more difficult to work with than others in certain situations. For example, color keying a sharp mask from DV footage is more difficult than keying from footage with less aggressive subsampling, like 4:2:2 or 4:4:4. This is due to the heavy chroma subsampling and the resulting recorded image. A four pixel chroma sample across an edge smears the color difference and makes the edge soft in the recorded image.
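Here’s a small Python sketch of why a key edge suffers in 4:1:1. It follows the simple rule described earlier, where the chroma of the first pixel in each group of four is reused for the other three. The pixel values are hypothetical, and a real DV codec also filters and compresses the picture, so treat this as an illustration only.

```python
def subsample_411(row):
    """Reuse the chroma of the first pixel in each group of four; keep every Y."""
    out = []
    for i in range(0, len(row), 4):
        _, cb, cr = row[i]
        out.extend((y, cb, cr) for y, _, _ in row[i:i + 4])
    return out

GREEN_SCREEN = (150, 50, 30)  # hypothetical (Y, Cb, Cr) for the backdrop
ACTOR = (120, 120, 160)       # hypothetical (Y, Cb, Cr) for the foreground

# The true edge falls between pixel 1 and pixel 2.
row = [GREEN_SCREEN] * 2 + [ACTOR] * 6
decoded = subsample_411(row)

for original, stored in zip(row, decoded):
    print(original, "->", stored)
```

Pixels 2 and 3 belong to the actor, but they come back with the backdrop’s chroma. The keyer sees the color edge two pixels away from where the brightness edge really is, and that mismatch is what you end up fighting when you pull the matte.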

It helps bring cost down while maintaining resolution, but color subsampling is a re-interpretation of the image and can be problematic at times. 4:1:1 is also characteristically noisy in underexposed areas and is difficult to resample and work with. Research and trial and error are the best ways to understand these tools.

Audio

By now, we’re all accustomed to CD quality audio, and the good news is that most cameras are capable of recording this quality. Starting out, most of your problems when you come to finishing your product will be in the arena of sound, unless you put as much effort into designing and capturing quality audio as you put into the picture.

Using the built-in microphone may be ok at first, while you are learning, but ASAP you want to invest in an outboard microphone and a camera capable of interfacing properly with it or an outboard mixer.

The reason you don’t want to use the built-in is noise. It’s a cheap mic, it’s attached to the motor that runs your tape mechanism, and it will pick up that motor.

The nice thing about digital video is sync sound. It makes it easy to edit and work with.

Compression schemes are sometimes applied to the audio you record. Be aware of your equipment and what it records and how that information is stored.

The audio is half of the image you’re planning on projecting. As a director, plan on spending some time doing sound design during pre-production. It is one area of the film where you can add production value by spending a little time considering the final experience of your audience and manipulating it to make the whole experience have more depth.

We will cover audio more in-depth in another article.

Video Compression

Compression is what makes affordable, desktop digital video possible. It’s the computer’s way of taking notes. Color subsampling is one way of compressing the total amount of information needed to create the image. Video also uses spatial and temporal compression to reduce the overall data rates of video files.

Spatial compression works much like color subsampling in that it will subsample surrounding pixels in an effort to find similarities and take notes.

Temporal compression takes notes across differing frames, and the notes are taken and shared across time. No sense in saving all of the information for two different frames if they are 98% identical.
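To make the note-taking idea concrete, here’s a toy Python sketch of temporal compression. It only records the pixels that changed between two frames. Real codecs are far more sophisticated than this, and the frames and function names here are invented for illustration.

```python
def encode_delta(previous, current):
    """Record only the pixels that changed since the previous frame."""
    return {i: value
            for i, (old, value) in enumerate(zip(previous, current))
            if old != value}

def decode_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the notes."""
    return [delta.get(i, value) for i, value in enumerate(previous)]

frame_1 = [10] * 98 + [200, 200]  # hypothetical 100 pixel frame
frame_2 = [10] * 98 + [210, 220]  # only the last two pixels differ

delta = encode_delta(frame_1, frame_2)
print(delta)                                    # the notes: two changed pixels
print(decode_delta(frame_1, delta) == frame_2)  # the frame rebuilds exactly
```

The two frames are 98% identical, so the notes for the second frame hold two values instead of a hundred.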

There are many different forms of compression and some are better than others for distributing digital video, and some are better for working with digital video. Some are better for internet distribution and some are better for archival. Like I said, there’s a lot to learn.

I’m not going to go into the different compression schemes, or codecs. I’ll save that for another article when I can take the time and show you the differences.

Note: Looks like Adobe beat me to it. Check this .pdf out, http://www.adobe.com/designcenter/premiere/articles/DV_Compression_Primer/DV_Compression_Primer.pdf for the skinny on video compression.

Video Formats

The formats you’re going to be using have various combinations of the things we have discussed in this article. NTSC DV has a resolution of 720x480, a color space subsampled at 4:1:1, either a 16x9 or 4x3 aspect ratio, uses a spatial (intraframe) video compression scheme in which each frame is compressed on its own, records uncompressed 12 or 16 bit audio, and has a video data rate of 25Mbps.
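To tie those NTSC DV numbers together, here’s a back-of-the-envelope Python sketch. It assumes 8 bits per component and the NTSC frame rate of roughly 29.97 frames per second, which isn’t stated above, and it looks at the 25Mbps video stream only, ignoring audio and other overhead.

```python
WIDTH, HEIGHT = 720, 480
FPS = 30000 / 1001       # NTSC frame rate, roughly 29.97 (assumed)
BYTES_PER_PIXEL = 6 / 4  # 4:1:1 at 8 bits: four Y plus one Cb and one Cr per four pixels
DV_VIDEO_MBPS = 25

uncompressed_mbps = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS * 8 / 1_000_000
compression_ratio = uncompressed_mbps / DV_VIDEO_MBPS
megabytes_per_minute = DV_VIDEO_MBPS / 8 * 60

print(round(uncompressed_mbps, 1))  # data rate after 4:1:1 subsampling, before compression
print(round(compression_ratio, 1))  # how hard the codec has to squeeze
print(megabytes_per_minute)         # disk space per minute of video
```

Even after 4:1:1 subsampling, the picture still has to be compressed by about 5 to 1 to fit in 25Mbps, and a minute of DV video takes up roughly 187.5 Megabytes.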

Some other digital video formats you may work with could be DVCAM, DVCPro, HDV, etc… They all have their up and downsides. We’ll go a bit more in depth with the various formats when we get into cameras.

Quality

I agree, this is all getting very technical, and you may be wondering what this has to do with you, but believe me, this is information you need to know now in order to make the informed decisions about production that you’re going to have to make later on.

Remember, we’re working for ourselves here, producing our own content, and planning on distribution, cutting out a lot of experts in the process. We have to be the expert and our quality goals should be oriented towards producing what is considered “broadcast quality,” if you want to make some money doing this.

Broadcast quality is a subjective term, but we all know what it is. I strongly believe any indie producer has got to be focused on quality and has to understand the factors that are going to determine the quality of what they are working on, and working towards. In order to explore the established markets we have now as well as to help determine the guidelines on the emerging markets we have such as the internet today, we need to understand the mechanics of the craft as well as the spirit of the art.

That said, production values are not an end in themselves. I’ve watched movies with terrible production values that moved me. In this business, content is king, and I’ve watched and walked out on movies with high-production values because the quality of the material was poor and no amount of money or lighting or special effects could have kept me in the theatre or on that channel. So what is quality if it’s not production values?

So, I’m contradicting myself here and not. Erring on the side of caution again, I personally make every effort to educate myself on production values/technical limitations as well as the methods employed to achieve the various levels of quality that I’ve seen and enjoyed. It’s refreshing to hear about a super-low budget movie hitting it big, it’s kind of punk, kinda rebellious. Sure, money helps when you want to add value to the work by using a higher resolution format and getting the most out of it. But, you can also add value by learning the photographic theories and history behind basic image composition. This isn’t going to cost you anything. You can also add value by adding depth and realism to the characters you employ by doing background research and learning the craft. There are so many elements making up this art form that these numbers and facts and concepts I’m presenting will only make up so much of the foundation you need in order to understand quality in digital video and for you to work with it and add value to your project. As Robert Plant claims in Kashmir, “all will be revealed.”

If you are wondering what the most important factors in determining image quality are, I would have to say data rate and color space. You will want to work in the highest data rate and color space you can afford. This will give you the perceived image quality you desire.

If you have any questions, feel free to write me an email at los.mongolitos “at” gmail.com. Thanks for reading and understanding.

Brad

 

Copyright 2007. Angelis Digital Studio. All Rights Reserved.