Encrypt your data and software with Python

Utpal Kumar   5 minute read      

Say you want to scramble a file so it isn’t readable at a glance, and then unscramble it later with a password you chose. In this post we build a tiny Python script that does exactly that — using nothing but the XOR (exclusive-or) operation — and by the end you can download a ready-to-use command-line version. Along the way you’ll see why the same function both locks and unlocks the file, and — just as importantly — where this trick stops being real security.

Key idea — XOR is its own inverse. Flip a byte with value ^ key and you get scrambled bytes. Flip the result with the same key and the original byte comes straight back, because (v ^ k) ^ k == v. That single property is why “encrypt” and “decrypt” here are literally the same operation — no separate reverse algorithm needed.

[Figure: XOR is its own inverse. Plaintext bytes (your file) ⊕ key → ciphertext (scrambled bytes); ciphertext ⊕ same key → plaintext bytes (recovered). Encrypt and decrypt are the same operation.]
XOR the file with your key to scramble it; XOR the scrambled bytes with the same key to get the original back.
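If you would rather check the claim than take it on faith, here is a minimal sketch that tests the identity for every possible byte and every possible one-byte key. The names `message` and `key_byte` are illustrative only and do not come from the script.

```python
# Check (v ^ k) ^ k == v for every byte value v and every one-byte key k.
all_round_trip = all((v ^ k) ^ k == v for v in range(256) for k in range(256))
print(all_round_trip)  # True

# The same holds for a whole message XOR-ed with a single key byte.
message = b"hello"
key_byte = 0x2A  # arbitrary illustrative key byte
scrambled = bytes(b ^ key_byte for b in message)
restored = bytes(b ^ key_byte for b in scrambled)
print(scrambled != message, restored == message)  # True True
```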

Encrypt a file

The logic is simple. We read any file in binary mode (hence 'rb'), turn the raw bytes into a mutable bytearray, and XOR every byte with each character of the key in turn. The XOR (“exclusive or”) compares two bits: same bits give 0, different bits give 1. The key is your passphrase, and you’ll need the exact same key to get the file back. Finally we write the XOR-ed bytes to a new file.
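To see the bit rule at work, here is a small sketch that XORs one data byte with one key byte and prints all three values in binary. The two values are picked purely for illustration.

```python
a = 0b01100001  # 97, the byte for the letter "a"
k = 0b01110001  # 113, the code point of "q", used here as a key byte
c = a ^ k       # a bit is 1 only where a and k differ

print(format(a, "08b"))  # 01100001
print(format(k, "08b"))  # 01110001
print(format(c, "08b"))  # 00010000
```

Only the fifth bit from the right differs between the two inputs, so it is the only bit set in the result.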

Write an encryption function

def Encrypt(filename, key, pref):
    # read the input file as raw bytes
    with open(filename, "rb") as file:
        data = bytearray(file.read())  # mutable array of bytes

    # XOR every byte with each key character in turn
    for kk in key:
        kk = ord(kk)
        for index, value in enumerate(data):
            data[index] = value ^ kk

    # write the output file under a prefixed name
    with open(pref + "-" + filename, "wb") as file:
        file.write(data)

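ord(kk) turns each key character into its integer code point so it can be XOR-ed against a byte. Stick to ASCII characters in the key: a code point above 255 pushes the result out of byte range, and the bytearray assignment then raises ValueError. The output is written with a prefix (e.g. CC-) so the original file stays untouched.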
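One consequence of the nested loop is worth seeing concretely. Because every byte is XOR-ed with every key character, the whole key behaves like a single byte: the XOR of all its code points. The sketch below mirrors the loop in a hypothetical helper, `xor_every_key_char` (not part of the script), and compares it with a one-byte XOR.

```python
from functools import reduce

def xor_every_key_char(data: bytes, key: str) -> bytes:
    """Mirror the nested loop: XOR every byte with each key character in turn."""
    out = bytearray(data)
    for ch in key:
        k = ord(ch)
        for i, v in enumerate(out):
            out[i] = v ^ k
    return bytes(out)

key = "password"
# Fold all key characters into one value with XOR.
effective = reduce(lambda a, b: a ^ b, (ord(ch) for ch in key))

sample = b"seismic waveform"
same = xor_every_key_char(sample, key) == bytes(v ^ effective for v in sample)
print(same)  # True

# A key made of a repeated pair cancels itself and leaves the data unchanged.
print(xor_every_key_char(sample, "aa") == sample)  # True
```

So with this loop there are at most 256 distinct keys, no matter how long the passphrase is.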

Decrypt a file

Because XOR is its own inverse, decryption is the same byte-flipping loop with the same key — there’s no separate “reverse” math to write.

Write a decryption function

def Decrypt(filename, key, pref):
    # read the encrypted file as raw bytes
    with open(filename, "rb") as file:
        data = bytearray(file.read())  # mutable array of bytes

    # XOR every byte with each key character in turn (same loop as Encrypt)
    for kk in key:
        kk = ord(kk)
        for index, value in enumerate(data):
            data[index] = value ^ kk

    # remove the encryption prefix: split on the first hyphen only,
    # so a name like CC-my-script.py becomes my-script.py
    defilename = filename.split("-", 1)[-1]
    with open(pref + "-" + defilename, "wb") as file:
        file.write(data)

The only new line strips the encryption prefix off the filename so the recovered file gets a clean name. It splits on the first hyphen only, so hyphens inside the original name survive.
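Hyphens inside the original file name are the case to watch when stripping the prefix. The sketch below compares splitting on every hyphen with splitting on the first hyphen only. Both helper names are hypothetical and not part of the script.

```python
def strip_prefix_all(name: str) -> str:
    # Split on every hyphen and keep the last piece.
    return name.split("-")[-1]

def strip_prefix_first(name: str) -> str:
    # Split on the first hyphen only and keep everything after it.
    return name.split("-", 1)[-1]

print(strip_prefix_all("CC-myscript.py"))       # myscript.py
print(strip_prefix_first("CC-myscript.py"))     # myscript.py
print(strip_prefix_all("CC-my-script.py"))      # script.py (part of the name is lost)
print(strip_prefix_first("CC-my-script.py"))    # my-script.py
```

Neither variant handles a prefix that itself contains a hyphen, so keep the prefix simple.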

Quick check: Why doesn’t Decrypt need a different formula from Encrypt?

  • Because Python reverses the byte order automatically on read
  • Because XOR is self-inverse: applying the same key twice cancels out — (v ^ k) ^ k == v
  • Because the prefix CC- stores the original bytes
  • Because bytearray remembers the file’s previous contents

Important — this is a learning exercise, not real security. A repeating-key XOR (a Vigenère-style stream cipher) is trivially breakable: because the same key bytes repeat over the file, an attacker can recover the key with known-plaintext or frequency analysis, and short keys fall to brute force. Do not use this to protect anything that actually matters (passwords, personal data, licensed software). It’s a great way to understand XOR and file I/O — nothing more.
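To make the known-plaintext point concrete, here is a minimal sketch of the attack on a repeating-key XOR. It assumes the attacker knows (or guesses) that the file starts with the PDF magic bytes and that the key is five bytes long. The helper `repeating_xor` and the sample data are illustrative, not taken from the script.

```python
def repeating_xor(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the key byte at the same position, cycling the key."""
    return bytes(v ^ key[i % len(key)] for i, v in enumerate(data))

secret_key = b"quake"
plaintext = b"%PDF-1.7 confidential report body"
ciphertext = repeating_xor(plaintext, secret_key)

# The attacker never sees secret_key, only the ciphertext and a guessed prefix.
known_prefix = b"%PDF-"
recovered = bytes(c ^ p for c, p in zip(ciphertext, known_prefix))

print(recovered)                                          # b'quake'
print(repeating_xor(ciphertext, recovered) == plaintext)  # True
```

Since ciphertext = plaintext ⊕ key, XOR-ing the ciphertext with any known stretch of plaintext hands back the key bytes used at those positions.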

For real encryption, use a vetted library instead of rolling your own:

  • cryptography — its high-level Fernet recipe gives you authenticated symmetric encryption (AES-128 in CBC mode + HMAC) in a few lines:

    from cryptography.fernet import Fernet

    key = Fernet.generate_key()          # store this safely
    f = Fernet(key)
    with open("secret.txt", "rb") as fh:
        token = f.encrypt(fh.read())     # authenticated ciphertext
    plain = f.decrypt(token)             # raises InvalidToken if tampered with
    
  • Derive a key from a password (not the raw password) with a slow KDF such as Scrypt or PBKDF2HMAC, also in cryptography.
  • For general-purpose file encryption on the command line, age and GnuPG are well-reviewed, modern choices.
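As a rough sketch of the password-to-key step, the block below uses the standard library's `hashlib.pbkdf2_hmac` instead of the PBKDF2HMAC class from cryptography, so it runs without extra installs. The function name `derive_fernet_key`, the 16-byte salt, and the iteration count are assumptions for illustration; check current guidance before choosing real parameters.

```python
import base64
import hashlib
import os

def derive_fernet_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Stretch a password into 32 bytes, then encode them the way Fernet expects."""
    raw = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
    return base64.urlsafe_b64encode(raw)

salt = os.urandom(16)  # random per file; store it next to the ciphertext
key = derive_fernet_key("correct horse battery staple", salt)
print(len(key))  # 44 (32 bytes in URL-safe base64)
```

The salt is not secret, but it has to be kept: the same password and salt must be fed in again to rebuild the key for decryption.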

Complete Script

You can download the complete script from my GitHub repository.

Usage

$ python endecrypt_with_python.py -h
usage: endecrypt_with_python.py [-h] [-encpref ENCRYPT_PREFIX]
                                [-decpref DECRYPT_PREFIX] [-en] [-de] -inp
                                INPUTS [INPUTS ...] -k KEY

Python utility program to encrypt/decrypt multiple files (by Utpal Kumar, UC
Berkeley, 2021/11)

optional arguments:
  -h, --help            show this help message and exit
  -encpref ENCRYPT_PREFIX, --encrypt_prefix ENCRYPT_PREFIX
                        prefix for the encrypted files
  -decpref DECRYPT_PREFIX, --decrypt_prefix DECRYPT_PREFIX
                        prefix for the decrypted files
  -en, --encrypt        encrypt the files
  -de, --decrypt        decrypt the files
  -inp INPUTS [INPUTS ...], --inputs INPUTS [INPUTS ...]
                        input files to encrypt or decrypt
  -k KEY, --key KEY     key to encrypt or decrypt the files

enter filename(s) and key to encrypt and decrypt
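For readers curious how a help screen like this comes about, here is a hypothetical argparse sketch that accepts the same flags. It is reconstructed from the help text alone and is not the script from the repository. The CC default is inferred from the example file names, and the DD default for the decrypt prefix is purely an assumption.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Python utility program to encrypt/decrypt multiple files",
        epilog="enter filename(s) and key to encrypt and decrypt",
    )
    parser.add_argument("-encpref", "--encrypt_prefix", default="CC",
                        help="prefix for the encrypted files")  # default inferred
    parser.add_argument("-decpref", "--decrypt_prefix", default="DD",
                        help="prefix for the decrypted files")  # default assumed
    parser.add_argument("-en", "--encrypt", action="store_true",
                        help="encrypt the files")
    parser.add_argument("-de", "--decrypt", action="store_true",
                        help="decrypt the files")
    parser.add_argument("-inp", "--inputs", nargs="+", required=True,
                        help="input files to encrypt or decrypt")
    parser.add_argument("-k", "--key", required=True,
                        help="key to encrypt or decrypt the files")
    return parser

# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["-inp", "myscript.py", "helloworld.py", "-k", "password", "-en"]
)
print(args.inputs, args.encrypt, args.decrypt)
# ['myscript.py', 'helloworld.py'] True False
```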

Example

Encrypt files:

$ python endecrypt_with_python.py -inp "myscript.py" "helloworld.py" -k "password" -en 

Decrypt files:

$ python endecrypt_with_python.py -inp "CC-myscript.py" "CC-helloworld.py" -k "password" -de

Try it: prove XOR round-trips

Run this in a REPL to convince yourself the same key undoes itself, byte for byte:

data = bytearray(b"seismic waveform")
key  = "quake"
enc  = bytearray(v ^ ord(key[i % len(key)]) for i, v in enumerate(data))
dec  = bytearray(v ^ ord(key[i % len(key)]) for i, v in enumerate(enc))
print(enc)          # scrambled, unreadable bytes
print(dec.decode()) # -> 'seismic waveform'

Then change one character of the key on the decrypt line and watch the output turn to garbage — that’s the whole “you need the right key” idea in one experiment.

Recap

  • XOR scrambles and unscrambles with one operation. value ^ key hides the byte; the same ^ key brings it back, because XOR is self-inverse.
  • The key is symmetric. Encrypt and decrypt must use the identical key string — lose it and the file is stuck scrambled.
  • File handling matters. Read/write in binary ('rb'/'wb') so the trick works on any file — images, executables, PDFs — not just text.
  • It’s a teaching cipher, not protection. Repeating-key XOR is easy to break; reach for cryptography’s Fernet, age, or GnuPG when the data genuinely needs to stay secret.

