BMP File Encryption

An algorithm for encrypting and decrypting bmp files. The encrypted images have a uniform distribution of pixel values in each color channel, so as to hide the uneven distribution of pixel values in the original image that can provide relevant statistical information.

Compile and run

encryption_script.c

To compile: gcc encryption_script.c encryption_functions.c -o script

To run: ./script

encryption_verifier.c

To compile: gcc encryption_verifier.c -o verifier

To run: ./verifier

Name of encrypted file to test: output/BMPencrypted.bmp

Output for the encryption_script.c

BMPencrypted.bmp	BMPdecrypted.bmp

Project theme

Cryptography is a very important area of mathematics and computer science. The project theme is the encryption and decryption of a message (customized in this project to an image).

The scenario to be implemented in this project is the following: person A sends to person B an image I encrypted using an encryption algorithm. B can decrypt the encrypted image received from A thus obtaining the initial image I.

In this project we will be working with color images, which we can manipulate in C language as binary files. Images are in BMP (bitmap) format. Unlike other formats (JPG, JPEG, PNG, etc.) the BMP format does not compress images, but stores 3 bytes per pixel for color images. This feature makes the BMP format suitable for this project as we can explicitly access the pixel intensity values that make up the color image. To view the images on our personal computer we must have a specific program installed (IrfanView, Paint, Gimp, Preview, ImageJ etc.).

The project will be structured in one module that deals with the problem of encryption / decryption of a message.

BMP format

BMP format (bitmap) is a file format used to store two-dimensional digital images of arbitrary width, height and resolution, monochrome or color. In this project you will only work with color images. Basically, in BMP format, the image is viewed as a pixel array, and each pixel is a full value. The first 3 bytes of a pixel representation are the intensity of the 3 color channels R (Red), G (green), B (blue). Consequently, the intensity of each color channel R, G, B is given by a natural value between 0 and 255. For example, a pixel with values (0, 0, 0) represents a black pixel, and a pixel with a black color the values (255, 255, 255) represent a white pixel. At RGB Color Codes Chart you can find the correspondence between RGB triplets and colors.

The BMP format is described here: BMP file format in a comprehensive way. The BMP format includes a fixed-size data area, called a header, and a variable-size data area that contains the pixels of the image itself. The header, which occupies the first 54 bytes of the file, contains information about the BMP format, as well as information about the image size, the number of bytes used to represent a pixel, etc.

The size of the image in bytes is specified in the header by an unsigned integer value, so it is stored on 4 bytes, starting with the byte with the order number 2. The image size in pixels is expressed as W × H, where W represents the number of pixels per width, and H represents the number of pixels per height. The image width expressed in pixels is stored on four unsigned bytes starting with the 18th byte in the header, and the height is stored on the next 4 unsigned bytes in the header, respectively starting with the 22nd octet in the header.

For fast processing of images in reading and writing, BMP images have the property that each line has a number of bytes that is a multiple of 4. This is achieved by adding padding bytes so that the total number of bytes on a line becomes multiple of 4. The number of bytes of a line is 3 × width (3 bytes per pixel on a line). Thus, if an image has 11 pixels in width (as with the templates we will work with) the number of bytes of padding is 3 (3 × 11 = 33 bytes per line, so at the end of each line, 3 padding bytes will be added so that we have 33 + 3 = 36 multiple of 4 bytes). Usually the padding bytes added are 0.

Because the encoding of a BMP image in a binary file follows the little-endian standard, the bytes corresponding to the 3 color channels R, G, B are stored from right to left, that is, in order B, G, R.

The encryption / decryption module

Cryptography is a branch of mathematics that deals with the secure transmission of information. One of the desires of cryptography is to ensure the confidentiality of data, so that certain information can only be accessed by authorized persons. We achieve this goal by using an encryption process by which intelligible information (a text, an image, an audio file, etc.) is transformed into an unintelligible one. The inverse transformation, by which the encrypted information is returned to the original shape (intelligible) is performed by means of a decryption process.

A symmetrical figure consists of a pair of algorithms that ensure the encryption and decryption processes, as well as a common secret key (known only to people who want to communicate with each other in a secure way) that controls the two processes.

Basically, the common secret key represents an entry date for encryption and decryption algorithms, so not knowing its exact value will lead to the impossibility of decrypting that message by an unauthorized person.

In the following, we will describe a simple symmetric cipher that can be used to ensure the confidentiality of an image. First, we will present some notations:

Be I = ( I_{i, j}), 0≤ i < H, 0≤ j< W, a color image with the width of W pixels and the height of H pixels in matrix form. The linearization of the image I implies the creation of a one-dimensional array L by aligning the lines of the two-dimensional picture I, from top to bottom.

matrix form	linearized form

considering a color image I in linear form, the value of each pixel I_k will be a triplet of unsigned bytes I_k = (I_k^R, I_k^G, I_k^B).
we will note with ⨁ the operation or-exclusively (XOR) between 2 unsigned bytes.
for 2 pixels P₁ = (P₁^R, P₁^G, P₁^B) and P₂ = (P₂^R, P₂^G, P₂^B) we will consider P₁⨁P₂ to be the pixel (P₁^R⨁P₂^R, P₁^G⨁P₂^G, P₁^B⨁P₂^B)
for a pixel P = (P^R, P^G, P^B) and a 32-bit unsigned integer X consisting of unsigned bytes (X₃, X₂, X₁, X₀) we will note with P⨁X the pixel (P^R ⨁X₂, P^G⨁X₁, P^B ⨁X₀), so the most significant byte X₃ of X will not be used.
a pseudo-random number generator is a deterministic algorithm that generates a sequence of numbers having statistical properties similar to those of a perfectly random sequence of numbers (that is, a sequence of numbers for which the probability of occurrence of a given value is independent of all values previously generated) starting from a seed value. For example, the XORSHIFT32 generator proposed by George Marsaglia in 2003 generates unsigned 32-bit integers with very good pseudo-random character, using shift operations on bits and XOR Xorshift.

Encrypting steps:

Using the above notations, the encryption algorithm of a color image P (plain image) of size W × H in linear form P = (P₀, P₁ , ..., P_{W * H - 1}) using a secret key S is the following:

generate R: a 2 * W * H - 1 long sequence consisting of 32-bit unsigned random integers, using the XORSHIFT32 generator initialized with a non-zero value R₀
generate a random permutation σ of size W × H, using Durstenfeld's algorithm - the modern version of the Fisher-Yates algorithm and the pseudo-random numbers R₁, ..., R_{W * H - 1}
the pixels of the image P are permuted according to the permutation σ, obtaining an intermediate image P'. For example, considering the image P = (P₀, P₁ , P₂, P₃) and the permutation σ = (⁰₃¹₀ ²₂ ³₁) we will obtain the image P' = (P₁, P₃ , P₂, P₀) because the pair (k, σ (k)) indicates that the pixel at position k in the image P will be moved to the position σ(k) in the image P', that is P'_σ(k) = P_k for any k ∈ {0,1, ..., W ∗ H - 1}
the encrypted image C = (C₀, C₁, ..., C_{W * H - 1}) is obtained by applying the following relation of substitution to each pixel of the image P' = P₀, P₁ , ..., P_{W * H - 1}:

where SV (starting value) is a 32-bit non-zero integer.

Shared secret key of this cipher is composed of the R and SV, both non-zero unsigned 32-bit integers. To ensure a high level of security of this figure, the common secret key must be changed before each use of the figure with one that has not been used before!

Decrypting steps:

The cipher being symmetrical, the decryption algorithm is complementary to the encryption. Thus, the decryption algorithm of a ciphered color image C of size W × H in linear form C = (C₀, C₁, ..., C_{W * H - 1})using a secret key S is the following:

generate R: a 2 * W * H - 1 long sequence consisting of 32-bit unsigned random integers, using the XORSHIFT32 generator initialized with a non-zero value R₀
generate a random permutation σ of size W × H, using Durstenfeld's algorithm - the modern version of the Fisher-Yates algorithm and the pseudo-random numbers R₁, ..., R_{W * H - 1}. Then, calculate its inverse σ^-1
apply to every pixel in the encrypted image C = (C₀, C₁, ..., C_{W * H - 1}) the inverse of the substitution relation used in the encryption process, obtaining an intermediate image C' = (C'₀, C'₁, ..., C'_{W * H - 1}):

decrypted image D = (D₀, D₁, ..., D_{W * H - 1}) is obtained permuting the pixels of image C' according to the permutation σ^-1 , respectively D_σ(k)^-1 = C'_k for any k ∈ {0,1, ..., W ∗ H - 1}

The correctness of the decryption algorithm is very easy to prove, using the properties of XOR!

The χ² test

On an encrypted image, an unauthorized person (an attacker who does not know the secret key) can perform several types of attacks to find either the secret key or the image in clear (in many cases, even determining only part of the image can provide very important information to the attacker). One of the simplest attacks is the frequency attack, whereby the attacker will first determine, on each color channel, the frequencies of the pixel values, after which he will try to associate their values, in the decreasing order of the frequencies, with the values he considers most likely to be the right ones.

For example, if the attacker predicts that original image contains several warships on missions, he can assume that most pixels are blue (or shades of blue), so they have RGB values of the form (0, 0, 255) . Consequently, he will replace, in all pixels, the maximum frequency values on the R and G color channels with 0, and the maximum frequency value on the color channel B with 255. After that, he can assume that the rest of the pixels represent military ships, so the respective pixels may be white (or another color specific to warships) and the previous procedure will resume. Even if in this way he will not be able to reconstruct the original image, it is possible to partially reconstruct the outlines of the ships, so he will find out how many ships are the image - a very important information!

To withstand such statistical attacks, a cipher must produce, regardless of the secret key used, encrypted images with a uniform distribution of pixel values in each color channel, thus hiding the uneven distribution of pixel values in the initial image that can provide relevant statistical information.

The most commonly used visual analysis tool to study the distribution of pixel values of a color image is the color histogram, which graphically represents the frequencies of the pixel values of an image, separately for each color channel.

In the figure below you can see a clear image and the corresponding encrypted image, together with their histograms:

It is noted that following the clear image encryption process, having a highly uneven color distribution, an encrypted image was obtained with a uniform distribution of pixel values for each color channel, so a potential attacker will not be able to extract information of statistical nature regarding the initial image or the secret key used.

The visual analysis of the uniformity of the distribution of the pixel values using the color histogram presents several disadvantages, one of them being that the accuracy of the evaluation depends in a decisive way on the visual acuity of the human assessor and on the graphical quality of the histogram. The disadvantages of visual analysis can be eliminated by using a statistical tool to evaluate the uniformity of the distribution of values in a row, respectively the χ² test (the chi-square test).

The χ² test value corresponding to an image of size m × n is calculated using the following formula:

where f_i is the frequency of the value i (0 ≤ i ≤ 255) on a color channel of the image, and f^- = (m*n)/256 is the theoretically estimated frequency of any value i. A test value less than or equal to the theoretical threshold value χ₀²= 293.25 indicates a uniform histogram on the respective color channel, while a larger value indicates a non-uniform histogram.

For the two images above, the χ² test values on the RGB color channels are the following: • for the clear image: 211104.79, 348462.19, 197839.49; • for encrypted image: 264.48, 251.41, 330.49.

From the previous numerical values it can be observed that the image in clear has strongly uneven distributions of pixels values on all the color channels (an easy fact to observe from the visual analysis of its histogram), while the encrypted image has a uniform distribution of the values pixels on the R and G channels, but show a slight unevenness of the distribution on the B channel (a fact almost impossible to detect by visual analysis!).

Requirements:

Implement the XORSHIFT32 pseudo-random number generator as a function or macro definition.
Write a function that loads a BMP image into internal memory in linear form. The function will have as a parameter the path of the BMP image.
Write a function that saves in the external memory a BMP image stored linearly in the internal memory. The function will have as a parameter the path of the BMP image.
Write a function that encrypts a BMP image using the encryption algorithm presented. The function will have as parameters the path of the initial image, the path of the encrypted image and the path of a text file containing the secret key.
Write a function that decrypts an encrypted BMP image using the presented decryption algorithm. The function will have as parameters the path of the initial image, the path of the encrypted image and the path of a text file containing the secret key.
Write a function that displays the χ² test values for a BMP image on each color channel. The function will have as a parameter the image path.

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
input_resources		input_resources
output		output
Documentation in romanian.pdf		Documentation in romanian.pdf
README.md		README.md
encryption_functions.c		encryption_functions.c
encryption_functions.h		encryption_functions.h
encryption_script.c		encryption_script.c
encryption_verifier.c		encryption_verifier.c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

input_resources

input_resources

output

output

Documentation in romanian.pdf

Documentation in romanian.pdf

README.md

README.md

encryption_functions.c

encryption_functions.c

encryption_functions.h

encryption_functions.h

encryption_script.c

encryption_script.c

encryption_verifier.c

encryption_verifier.c

Repository files navigation

BMP File Encryption

Compile and run

encryption_script.c

encryption_verifier.c

Output for the encryption_script.c

Project theme

BMP format

The encryption / decryption module

Encrypting steps:

Decrypting steps:

The χ² test

Requirements:

About

Releases

Packages

Languages

Natasa-C/BMP-file-encryption

Folders and files

Latest commit

History

Repository files navigation

BMP File Encryption

Compile and run

encryption_script.c

encryption_verifier.c

Output for the encryption_script.c

Project theme

BMP format

The encryption / decryption module

Encrypting steps:

Decrypting steps:

The χ2 test

Requirements:

About

Resources

Stars

Watchers

Forks

Languages

The χ² test