Network Consumption of IP Cameras

How much network bandwidth does an IP camera consume?

This is undoubtedly one of the questions I hear most often from our customers when we talk about using the existing TCP/IP network for a CCTV project. It should also be a concern when dimensioning a network dedicated exclusively to the CCTV system.

This question does not have a simple answer, because many factors influence the “size” or “weight” of the images. Some we can predict easily because we control them; others are more complicated and beyond our control.

The factors that influence the “weight” of the image and that we can control are:

– Resolution (Image size);

– Number of fps (frames per second);

– Compression used.

The factors we cannot control are:

– Amount of movement in front of the camera;

– Quantity of information in the image.

Well, let’s go through them one by one.

Resolution is the size of the image in pixels, that is, how many pixels that image has.

A VGA image has approximately 300,000 pixels.

A 1 megapixel image has 1 million pixels, and a 2 megapixel image has, as you might guess, 2 million pixels. So it’s obvious that the higher the resolution, the bigger the weight of the image.
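To make the numbers concrete, here is a minimal sketch that computes the pixel count from the width and height. The two resolutions are the ones that appear in the table later in this article; the function name is just an illustration.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in an image: width multiplied by height."""
    return width * height


# Resolutions taken from the table later in this article.
resolutions = {
    "VGA": (640, 480),
    "1 megapixel": (1280, 800),
}

for name, (w, h) in resolutions.items():
    print(f"{name}: {w} x {h} = {pixel_count(w, h):,} pixels")
# VGA: 640 x 480 = 307,200 pixels
# 1 megapixel: 1280 x 800 = 1,024,000 pixels
```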

Number of fps (frames per second). A video broadcast or a movie is formed by a sequence of photos that we call frames (each picture is a frame). Displaying the photos in sequence is what gives the sensation of movement, and the more frames per second, the smoother that sensation.

In general, IP cameras can generate up to 30 fps (frames per second), but in practice we use much less than this, depending on the purpose of the system.

Compression used. An uncompressed image is too large, or too heavy, to simply be transmitted over the network, so we use a compression algorithm to reduce that weight.

The most common are JPEG, MJPEG, MPEG4 and H264.

Of these, JPEG is the least efficient in compression and H264 is the most efficient in compression.

You may ask why camera makers do not simply use H264 everywhere. The reason is that the best compression is not always the best option for a particular system. H264 compresses the most, but to do so it consumes a lot of processing in the camera to compress, and then more processing in the NVR or CMS to decompress. A JPEG image, by contrast, can be compressed and decompressed without consuming much processing in the camera, NVR or CMS.

Amount of movement in front of the camera. Most IP camera models offer motion detection, which makes it possible to create a simple rule so that the camera transmits images over the network only when there is movement in front of it. This saves a lot of network consumption, since images are transmitted only when something is happening in front of the camera.

How much movement there is we cannot control, and it can vary greatly from one camera to another.

For example, imagine a camera in the reception of a commercial condominium: there will probably be movement in front of it all the time. Now imagine a camera in the air-conditioned room of the same condominium: there will probably be movement in front of it only when someone enters the room.
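A rough way to picture the saving is to multiply the continuous consumption by the fraction of time in which there is movement. The sketch below assumes the camera sends nothing at all when idle, and both the 1.1 Mbps rate and the motion fractions are hypothetical values chosen only for illustration.

```python
def average_mbps(continuous_mbps: float, motion_fraction: float) -> float:
    """Rough average consumption when the camera transmits only during motion.

    Assumption: the camera sends nothing while there is no movement.
    motion_fraction is the share of time with movement, from 0.0 to 1.0.
    """
    if not 0.0 <= motion_fraction <= 1.0:
        raise ValueError("motion_fraction must be between 0 and 1")
    return continuous_mbps * motion_fraction


# Hypothetical figures: a busy reception versus a rarely visited room.
reception = average_mbps(1.1, 0.90)
quiet_room = average_mbps(1.1, 0.02)
print(f"reception: {reception:.2f} Mbps, quiet room: {quiet_room:.2f} Mbps")
```

Keep in mind that this is only an average: while there is movement, the camera still consumes the full continuous rate.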

Quantity of information in the image. For me this is the most interesting factor and the hardest to predict.

To understand this better, look at these 3 images, all of the same size (76,800 pixels).

In figure 1 we have a black image. This image “weighs” 16 KB.

In figure 2 we have a black image with a red detail. This image “weighs” 24 KB.

In figure 3 we have a black image with colored details. This image “weighs” 40 KB.

What does that mean? That every second the camera will generate images with a different “weight”: at one moment we will have a person with a white shirt in front of the camera, at another we will have two people with colored shirts, and so on.

Try a test: take your camera and photograph a wall, then put something like a vase or a person between the wall and the camera and take another photo. You will have two images with very different weights.
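If you do not have a camera at hand, the same effect can be simulated. The sketch below is only an illustration: it uses Python's general-purpose `zlib` compression as a stand-in for a real image codec, assumes a 320 x 240 image (which gives 76,800 pixels), and fills part of the image with random colors to play the role of "detail". The absolute sizes will not match a JPEG, but the trend should be the same.

```python
import random
import zlib

WIDTH, HEIGHT = 320, 240  # assumed dimensions: 320 x 240 = 76,800 pixels


def black_image() -> bytearray:
    """Raw RGB buffer (3 bytes per pixel), completely black."""
    return bytearray(WIDTH * HEIGHT * 3)


def add_detail(image: bytearray, rows: int, seed: int = 0) -> bytearray:
    """Return a copy with the first `rows` rows filled with random colors."""
    rng = random.Random(seed)
    out = bytearray(image)
    n = rows * WIDTH * 3
    out[:n] = bytes(rng.randrange(256) for _ in range(n))
    return out


def compressed_size(image: bytearray) -> int:
    """Size in bytes after zlib compression (stand-in for a real codec)."""
    return len(zlib.compress(bytes(image)))


plain = black_image()
some_detail = add_detail(plain, rows=20)
more_detail = add_detail(plain, rows=80)

for label, img in [("black", plain),
                   ("some detail", some_detail),
                   ("more detail", more_detail)]:
    print(f"{label}: {compressed_size(img)} bytes")
```

All three buffers have exactly the same number of pixels, yet the more varied the content, the larger the compressed result.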

Well, I have said a lot, but I have not yet answered the question: what is the network consumption of an IP camera?

The calculation is done like this:

(resolution x fps x compression), where the compression factor depends on the codec used. This gives the approximate consumption with continuous transmission, that is, without motion detection.

Camera manufacturers offer tools to calculate the network consumption of their cameras. Naturally these give approximate values, but to make it easier, below are approximate values for each image size. Each value is the consumption for one frame per second, so multiply it by the fps:

VGA image (640 x 480px)

MJPEG = 0.28 Mbps (each frame)

MPEG4 = 0.11 Mbps (each frame)

H264 = 0.06 Mbps (each frame)

1 megapixel image (1280 x 800px)

MJPEG = 0.95 Mbps (each frame)

MPEG4 = 0.36 Mbps (each frame)

H264 = 0.22 Mbps (each frame)

2 megapixel image (for example, 1600 x 1200px)

MJPEG = 1.92 Mbps (each frame)

MPEG4 = 0.73 Mbps (each frame)

H264 = 0.44 Mbps (each frame)
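The list above can be turned into a small lookup table. This is just a sketch: the numbers are copied from the approximate values in this article, while the dictionary keys and the function name are my own choices.

```python
# Approximate Mbps for one frame per second, copied from the values above.
MBPS_PER_FRAME = {
    "VGA": {"MJPEG": 0.28, "MPEG4": 0.11, "H264": 0.06},
    "1MP": {"MJPEG": 0.95, "MPEG4": 0.36, "H264": 0.22},
    "2MP": {"MJPEG": 1.92, "MPEG4": 0.73, "H264": 0.44},
}


def camera_mbps(resolution: str, codec: str, fps: float) -> float:
    """Approximate continuous consumption of one camera, in Mbps."""
    return MBPS_PER_FRAME[resolution][codec] * fps


# One VGA camera using MJPEG at 10 fps.
print(f"{camera_mbps('VGA', 'MJPEG', 10):.2f} Mbps")  # 2.80 Mbps
```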


As an example, take an installation with 4 IP cameras of 1 megapixel using H264 compression, transmitting images at 5 fps:

0.22 Mbps x 5 fps = 1.1 Mbps

1.1 Mbps x 4 cameras = 4.4 Mbps

So 4.4 Mbps will be the network consumption while the 4 cameras are transmitting.
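The same arithmetic can be written as a short function, which also makes it easy to add up groups of different cameras. This is a minimal sketch; the function name and the tuple layout are assumptions made for the example.

```python
def total_mbps(groups: list[tuple[float, float, int]]) -> float:
    """Sum the consumption of several groups of cameras.

    Each group is (Mbps per frame, fps, number of cameras).
    """
    return sum(rate * fps * count for rate, fps, count in groups)


# The example above: 4 cameras of 1 megapixel, H264 (0.22 Mbps per frame), 5 fps.
per_camera = total_mbps([(0.22, 5, 1)])
all_cameras = total_mbps([(0.22, 5, 4)])
print(f"{per_camera:.1f} Mbps per camera, {all_cameras:.1f} Mbps in total")
# 1.1 Mbps per camera, 4.4 Mbps in total
```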
