SimpleCrypt algorithm details

From Qt Wiki
Revision as of 10:09, 16 June 2015 by NetZwerg (talk | contribs) (Cleaning up, Tables, DocLinks, Layout...)

SimpleCrypt algorithm and format

This page explains in more detail the format used by the Simple_encryption code. The encryption and decryption code works from a QByteArray to a QByteArray; the options to use QString are provided only for convenience. The first two sections detail how strings and byte arrays correspond, for both the plaintext and the cypher text. The format of the binary QByteArray cypher text is then described.

String plaintext encoding

If a QString is used as the plaintext to encrypt, the string is encoded to a QByteArray using the UTF-8 codec. If a QString is requested as the result of a decryption, the binary plaintext resulting from the decryption is interpreted using the same codec to construct the QString.
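The round trip described above can be sketched in a few lines. This is a Python illustration rather than the Qt code itself, and the function names are hypothetical:

```python
def plaintext_to_bytes(text: str) -> bytes:
    """Encode a string plaintext to bytes with UTF-8 before encryption."""
    return text.encode("utf-8")


def bytes_to_plaintext(data: bytes) -> str:
    """Interpret decrypted bytes as UTF-8 to rebuild the string."""
    return data.decode("utf-8")


sample = "Grüße"
encoded = plaintext_to_bytes(sample)
# Non-ASCII characters take more than one byte, so the byte array
# can be longer than the string it was built from.
print(len(sample), len(encoded))
print(bytes_to_plaintext(encoded) == sample)
```

The binary plaintext is what actually gets encrypted; the string form only exists at the edges of the API.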

String cypher text format

SimpleCrypt can work with both QString and QByteArray cypher texts. The QString format is built on top of the QByteArray format: the binary cypher text is base64-encoded, and the result is converted to a QString using the ASCII codec.

For decryption, the reverse happens. The QString is converted to a QByteArray using the ASCII codec, and that QByteArray is then base64-decoded into the binary cypher text.
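As a rough illustration of the two directions, the following Python sketch uses the standard base64 module. The function names are hypothetical and the sketch assumes the standard base64 alphabet:

```python
import base64


def cypher_bytes_to_string(cypher: bytes) -> str:
    """Binary cypher text -> base64 -> ASCII string."""
    return base64.b64encode(cypher).decode("ascii")


def cypher_string_to_bytes(text: str) -> bytes:
    """ASCII string -> base64-decoded binary cypher text."""
    return base64.b64decode(text.encode("ascii"))


binary = bytes([0x02, 0x00, 0xFF])
as_text = cypher_bytes_to_string(binary)
print(as_text)
print(cypher_string_to_bytes(as_text) == binary)
```

Because base64 output contains only ASCII characters, the string form is safe to store in text-based settings files.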

Binary cypher text format

The binary cypher text (represented in a QByteArray) consists of a header and a payload section:

bytes   description
0-1     Header
2-N     Payload. This section is encrypted.

Header

The header of the cypher text contains two fields: a version number, and a field with flags.

bytes   description
0       Version
1       Flags

Version

The version number is a char. The current version number is 2.

Flags

The flags field is a single-byte bitfield. Each bit represents a flag that describes how the payload is to be interpreted.

bit   value   description
0     0x01    Compression. If set, compression has been applied to the plaintext before encryption.
1     0x02    Protection Checksum. If set, a CRC16-CCITT checksum, encoded in a quint16, has been used to protect data integrity.
2     0x04    Protection Hash. If set, an SHA-1 cryptographic hash (20 bytes long) has been used to protect data integrity.

Bits 1 and 2 should not be set at the same time. If bit 1 has been set, bit 2 is ignored.

Bits 3 to 7 are reserved for future use.
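The header layout and flag rules above could be read as follows. This is a Python sketch with hypothetical names (Header, parse_header, and the FLAG_ constants); it does not check the version number, and it applies the rule that bit 2 is ignored when bit 1 is set:

```python
from typing import NamedTuple, Tuple

FLAG_COMPRESSION = 0x01  # bit 0
FLAG_CHECKSUM = 0x02     # bit 1
FLAG_HASH = 0x04         # bit 2


class Header(NamedTuple):
    version: int
    compressed: bool
    checksum: bool
    sha1_hash: bool


def parse_header(cypher: bytes) -> Tuple[Header, bytes]:
    """Split a binary cypher text into its 2-byte header and its payload."""
    if len(cypher) < 2:
        raise ValueError("cypher text is shorter than the 2-byte header")
    version, flags = cypher[0], cypher[1]
    checksum = bool(flags & FLAG_CHECKSUM)
    # Bit 2 is ignored when bit 1 is set.
    sha1_hash = bool(flags & FLAG_HASH) and not checksum
    header = Header(version, bool(flags & FLAG_COMPRESSION), checksum, sha1_hash)
    return header, cypher[2:]


header, payload = parse_header(bytes([2, 0x05, 0xAA, 0xBB]))
print(header, payload)
```

Bits 3 to 7 are simply not inspected here, since they are reserved.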

Payload

The contents of the payload data block are encrypted with an eight-byte (quint64) key. The key is split into eight chars, which are used consecutively, starting again from key char 0 after key char 7 has been used. The QByteArray is encrypted by replacing each byte with the value of that byte XOR-ed with the key char, and XOR-ed again with the result of this operation for the previous byte in the QByteArray (or 0 for byte 0 of the array). The operation is reversed by XOR-ing each cypher text byte with the same two values that were used during encryption: the key char and the previous cypher text byte.

This results in the following encryption schema:

payload byte   key byte   second XOR value
0              0          the value 0
1              1          the result of this operation for byte 0
2              2          the result of this operation for byte 1
7              7          the result of this operation for byte 6
8              0          the result of this operation for byte 7
9              1          the result of this operation for byte 8
N              N mod 8    the result of this operation for byte N-1

The idea is that using the result of the previous byte as part of the key for the current byte makes it impossible to use a simple attack that analyzes the cypher text as 8 separate cypher texts, each with its own 1-byte key. Such an attack is possible against a basic Vigenère-type cipher.
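The chained XOR scheme described in this section can be sketched as below. This is a Python illustration with hypothetical function names. How the quint64 key maps onto key chars 0 to 7 is not stated on this page, so the sketch assumes key char 0 is the least significant byte:

```python
def key_chars(key: int) -> bytes:
    """Split a 64-bit key into eight key chars.

    Assumption: key char 0 is the least significant byte of the key.
    """
    return key.to_bytes(8, "little")


def encrypt_payload(data: bytes, key: int) -> bytes:
    chars = key_chars(key)
    out = bytearray()
    previous = 0  # byte 0 is chained with the value 0
    for i, value in enumerate(data):
        previous = value ^ chars[i % 8] ^ previous
        out.append(previous)
    return bytes(out)


def decrypt_payload(cypher: bytes, key: int) -> bytes:
    chars = key_chars(key)
    out = bytearray()
    previous = 0
    for i, value in enumerate(cypher):
        # Undo both XORs: the key char and the previous cypher text byte.
        out.append(value ^ chars[i % 8] ^ previous)
        previous = value
    return bytes(out)


key = 0x0807060504030201
cypher = encrypt_payload(b"\x10\x10\x10", key)
print(cypher.hex())
print(decrypt_payload(cypher, key) == b"\x10\x10\x10")
```

In the example, three identical plaintext bytes produce three different cypher text bytes, which is the effect the chaining is meant to achieve.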

Decrypted payload

Once the payload data has been decrypted, it may need further processing before the plaintext is available and verified. Which processing is needed depends on the flags set in the header of the cypher text.

In case the Protection Checksum flag has been set, the layout of the decrypted payload looks like this:

bytes   value
0       A random one-byte number. This number is ignored after decryption.
1-2     A quint16 containing a CRC16-CCITT checksum value of bytes 3-N.
3-N     (Compressed) plaintext.

In case the Protection Hash flag has been set, the layout of the decrypted payload looks like this:

bytes   value
0       A random one-byte number. This number is ignored after decryption.
1-20    A 20-byte SHA-1 cryptographic hash of bytes 21-N.
21-N    (Compressed) plaintext.

On decryption, the checksum or hash value must be checked by recalculating it for the (compressed) plaintext and comparing that value with the data in the corresponding bytes of the decrypted payload. If they do not match, an error flag is set and the decryption algorithm must return an empty byte array. If a wrong key is used, it is very unlikely that the checksum will still match, and even less likely that the SHA-1 hash will. Note that the random leading number is not included in the calculation of the checksum or the SHA-1 hash.

If neither of the Protection flags is set, the layout is simply:

bytes   value
0       A random one-byte number. This number is ignored after decryption.
1-N     (Compressed) plaintext.
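The three layouts above can be handled by one routine, sketched here in Python. Several details are assumptions rather than facts from this page: the function names are hypothetical, the quint16 is assumed to be stored big-endian, and binascii.crc_hqx with an initial value of 0xFFFF is used only as a stand-in CRC-16-CCITT variant that may not match the one SimpleCrypt uses. The error flag is left out; only the empty result is modelled.

```python
import binascii
import hashlib


def crc16_ccitt(data: bytes) -> int:
    # Stand-in CRC-16-CCITT variant (assumption, see the text above).
    return binascii.crc_hqx(data, 0xFFFF)


def verify_and_strip(payload: bytes, flags: int) -> bytes:
    """Return the (compressed) plaintext, or b"" if the integrity check fails."""
    body = payload[1:]  # byte 0 is the random number and is ignored
    if flags & 0x02:  # Protection Checksum takes precedence over bit 2
        if len(body) < 2:
            return b""
        stored = int.from_bytes(body[:2], "big")  # byte order is an assumption
        data = body[2:]
        return data if crc16_ccitt(data) == stored else b""
    if flags & 0x04:  # Protection Hash
        if len(body) < 20:
            return b""
        stored_hash, data = body[:20], body[20:]
        return data if hashlib.sha1(data).digest() == stored_hash else b""
    return body


data = b"hello"
good = b"\x7f" + hashlib.sha1(data).digest() + data
print(verify_and_strip(good, 0x04))
print(verify_and_strip(good[:-1] + b"p", 0x04))
```

The second call returns an empty result because the last plaintext byte no longer matches the stored hash.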

Rationale for using a leading random number

The SimpleCrypt algorithm uses the result of encrypting the previous character as part of the encryption of the current character. As a result, even if the same character appears at byte 0 and byte 7 of the plaintext, the two will most likely be encrypted to different cypher text bytes. That makes it much harder to work out the key if an attacker has the opportunity to feed in their own plaintext and see the resulting cypher text. For byte 0 of the byte stream, however, this is not possible: a (known) 0 value is used instead of the value of the previous byte. That leads to a weakness, especially if no Protection flags and no compression are used. The number of characters likely to appear at the beginning of a password is not large, which reduces the number of possibilities. This effectively reduces the key strength by several bits, a loss of about an order of magnitude in strength.

Putting a random number in front of the string makes it effectively impossible to use such heuristics to reduce the key strength, because there is no predictor for the leading random number. The situation is still less than ideal, because a pseudo-random number generator is used; achieving true randomness is very hard and well outside the scope of SimpleCrypt.
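The effect of the leading byte can be seen in a small Python sketch. All names are hypothetical, the key-to-key-char byte order is an assumption, and Python's random module stands in for whatever pseudo-random generator the real code uses:

```python
import random


def encrypt_payload(data: bytes, key: int) -> bytes:
    chars = key.to_bytes(8, "little")  # assumed key char order
    out = bytearray()
    previous = 0
    for i, value in enumerate(data):
        previous = value ^ chars[i % 8] ^ previous
        out.append(previous)
    return bytes(out)


def prepend_random_byte(body: bytes) -> bytes:
    """Put one pseudo-random byte in front of the data to be encrypted."""
    return bytes([random.randrange(256)]) + body


key = 0x0123456789ABCDEF
# Two fixed leading bytes are used here so that the comparison is repeatable.
first = encrypt_payload(b"\x01" + b"secret", key)
second = encrypt_payload(b"\x02" + b"secret", key)
print(first.hex())
print(second.hex())
print(all(a != b for a, b in zip(first, second)))
```

Because each byte is chained to the previous cypher text byte, a different leading byte changes every byte of the output, including the one that carries the first plaintext character.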

Note that the use of either Protection mode and the use of compression also add to the security of the cypher, as both decrease the predictability of the plaintext.

Compression

If the Compression flag is set, the data in the examples above is compressed using {{{1}}}. The compression level used is 9 (maximum). On decryption, the data must then be uncompressed using {{{1}}}.
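The name of the compression routine is missing from the text above, so the following Python sketch only illustrates the idea of level 9 compression. It assumes a zlib-style compressor, and the byte layout it produces is not guaranteed to match what SimpleCrypt writes. The function names are hypothetical:

```python
import zlib


def compress_plaintext(data: bytes) -> bytes:
    """Compress at level 9 (maximum) before integrity protection and encryption."""
    return zlib.compress(data, 9)


def uncompress_plaintext(data: bytes) -> bytes:
    """Reverse step, applied after decryption and verification."""
    return zlib.decompress(data)


original = b"abc" * 100
packed = compress_plaintext(original)
print(len(original), len(packed))
print(uncompress_plaintext(packed) == original)
```

Compressed data also looks far less regular than typical text, which fits the earlier remark that compression decreases the predictability of the plaintext.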

Result

The resulting binary plaintext is the data retrieved from the decrypted payload or, if compression was used, the output of the decompression step above.