Hi guys. It’s been a long time since I wrote in here. I’m busy with numerous other projects, but I was thinking about something the other day that I just couldn’t pass up the opportunity to discuss. Now it should be noted before I begin that I’m not an expert in any of the topics I am discussing, but I’m offering the subjects up to see what you guys think. So don your philosophical hats, and let’s take a weird trip into the essence of existence.
Let’s start with a simple concept. I have some data. This data is the word ‘happy’. Now, if I have that data stored somewhere and I can retrieve it, it could be said that I possess that data. Hopefully you’re all happy with this (pun intended). If someone else takes a copy of the same data, they could be said to possess the same data as me. If they have done this without my consent, we could naturally assume that they have stolen this data, since they now possess the same data as I do.
As previously mentioned, I’m not an expert in any of these fields, but this seems to be borne out by the current state of law and affairs. If someone copies a piece of software, or a video file, without the consent of the owner of the data (and let us note here that the owner is not necessarily the possessor of the data), then they are liable to be prosecuted for stealing the data.
Speaking in riddles
Now that we have that nailed down, let’s mix things up a little. Consider that we have our word, ‘happy’, and that we encrypt it with a very simple encryption algorithm. This algorithm is implemented in the following manner. First, turn each individual letter into its numeric counterpart, mapping a-z to 1-26, so a = 1, b = 2 and so on.
Now we come up with an encryption “key”. This key will be the same length as the data we are trying to encrypt. In this case let’s use the word ‘weird’. We use the same numeric conversion on our “key” and add the two words together, numerically, one letter at a time and then turn the resulting numbers back into letters via the same numeric mapping.
h a p p y == 8 1 16 16 25
w e i r d == 23 5 9 18 4
? f y ? ? == 31 6 25 34 29
The problem here is that some of our numbers are over 26 and so can’t be represented. To rectify this, if we go over 26, we’ll just subtract 26 from the result. The final encrypted version then becomes:
e f y h c == 5 6 25 8 3
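For anyone who would rather see that as code, here is a minimal sketch of the scheme just described. The function names (`to_number`, `to_letter`, `encrypt`) are my own for illustration, and it assumes lowercase a-z only.

```python
import string

ALPHABET = string.ascii_lowercase  # 'a' .. 'z'


def to_number(letter):
    """Map a letter to its position in the alphabet: a = 1 ... z = 26."""
    return ALPHABET.index(letter) + 1


def to_letter(number):
    """Map a number in 1..26 back to a letter."""
    return ALPHABET[number - 1]


def encrypt(plain_text, key):
    """Add the key to the plain text one letter at a time, wrapping past 26."""
    if len(plain_text) != len(key):
        raise ValueError("the key must be the same length as the data")
    result = []
    for p, k in zip(plain_text, key):
        total = to_number(p) + to_number(k)
        if total > 26:
            total -= 26  # anything over 26 wraps back round
        result.append(to_letter(total))
    return "".join(result)


print(encrypt("happy", "weird"))  # efyhc
```

The wrap-around is the only subtle part: two letters can add up to at most 52, so a single subtraction of 26 is always enough.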
Though that was laborious for some of you, it was necessary to proceed to the next step in my musings. So we now have ‘efyhc’. Anyone looking at that “word” isn’t going to have a clue what it means. That’s the purpose of encryption, right? To “hide” the data.
Possession is 9/10ths of the law
What’s interesting now, though, is this: if I only hold the encrypted version of the data, does the data still exist? It’s actually surprisingly similar to the old argument, “if a tree falls in the forest and no one is around, does it make a sound?” Without the encryption key, the string of data ‘efyhc’ is essentially just random data.
What is it that separates it from actual random data? The goal of an encryption algorithm is to make the cipher text undecipherable: in essence, to make it as random as possible, so that no patterns exist. To all intents and purposes we could call this a random string. After all, it could turn up in a random string quite easily.
That sad little man inside me wanted to test this, and wrote a little Python script. After running the script several times, my trusty precious data took between 1 second and 1 minute to turn up. Now remember that with a truly random number generator my word could have turned up first. It could also have never turned up at all, no matter how long I ran it. That, my friends, is the beauty of random.
Let’s go back to our idea of the existence of the data. Imagine now that we destroy our encryption key. In our example, the real data is pretty easy to remember, but let’s now assume it isn’t. Assume it is a large document. If we destroy our encryption key, does the data still exist?
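The script itself isn’t reproduced here, but a rough sketch of that kind of experiment could look like the following. The name `tries_until_match` and the `max_tries` cut-off are my own assumptions. Note too that Python’s `random` module is a pseudo-random generator, not a truly random one, so this only approximates the thought experiment.

```python
import random
import string


def tries_until_match(target, rng, max_tries=1_000_000):
    """Draw random lowercase strings until one equals target.

    Returns the number of draws it took, or None if max_tries ran out first.
    """
    for attempt in range(1, max_tries + 1):
        candidate = "".join(rng.choices(string.ascii_lowercase, k=len(target)))
        if candidate == target:
            return attempt
    return None


rng = random.Random()  # unseeded, so every run gives a different count
# Only the first three letters are searched for here to keep the demo quick.
print(tries_until_match("efy", rng))
```

Searching for the full ‘efyhc’ means hitting one string out of 26^5 (about 11.9 million), so `max_tries` would need raising well beyond that, and the number of draws will swing wildly from run to run.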
Build me a wardrobe squire
So do you have an answer? It’s a little different to its physical analogue. Often people refer to encryption as locking something away, ensuring people don’t get access to it. Though the end goal is the same, securing the data from people who we don’t want to have access, the mechanisms are quite different. In the physical world, if we put a padlock on something, the physical item always still exists. We may be denied access to it, but it still “physically” exists. What this means is that, given enough effort, we can gain access to it by cutting off the lock, blowing a hole in a safe or some other means. We may not even know what the physical object is, but we know it exists.
In the digital world, the “object”, for want of a better word, our data, is transformed into something different: something which bears no resemblance to the original at all, at least if we use a good enough encryption algorithm. We spoke earlier of the effort to make our data look like a good old random number sequence. It’s interesting to consider how one would actually implement encryption of physical objects in the physical world. Perhaps the best example I can think of is flat-packed furniture. Here, the parts in the pack would be the cipher text and the instructions would be the key. We shouldn’t be able to make the furniture (gain access to the data) without the instructions.
The difference between the two, encryption and flat-packed furniture, then really just becomes the difference between the physical and virtual worlds, which we are all already aware of.
Give me back my stuff!!
So let’s turn our attention back to possession again for a while. Let’s consider the duality of the physical and virtual worlds together. If I possess just the flat pack furniture, do I possess a wardrobe, if I don’t have the instructions? Similarly if I just possess the cipher text, do I really possess the data, if I don’t have the encryption key?
Well, here is where some differences lie. And we are very shortly going to leave our faithful mahogany companion behind and concentrate on more “virtual” things. If we possess the cipher text, and do not have, or have destroyed, the encryption key, the data cannot exist. As we stated earlier, the cipher text is just random characters.
But surely the cipher text holds the best chance possible of being able to get the data back again? We just need to know what the secret key is? Well, let’s think about that for a second. How secret is that key? If we model the cipher text as a random string of characters, and the encryption key as a random string of characters, then something interesting happens. The key to possessing the data isn’t knowing the key. It’s knowing which key and which cipher text go together.
But that’s not fair!!
Remember our super secret word earlier? Shhhh don’t tell anyone….’happy’. When we paired it with our secret choice of encryption key ‘weird’, we came up with our cipher text, ‘efyhc’. We rested. Our secret was safe. No one would be able to get our data back, without first knowing our encryption key, our secret choice of encryption key. Correct?
Well, as it turns out not exactly. Consider the following.
w o z d t == 23 15 26 4 20
o n j n u == 15 14 10 14 21
h a p p y == 8 1 16 16 25
Hold the phone!!! That’s our very secret data right there in the open. But how? As it turned out, I wrote a small program to take two random strings of data, and use one as the cipher text and one as the key. I then did the reverse of our first example, i.e. I decrypted the data by subtracting the key from the cipher text, adding 26 whenever the result dropped below 1. Et voilà!! Without knowing anything about either the original key or the original cipher text, I have managed to come up with the real data.
Turning this back to the real problem at hand, I also wrote a script to take the known cipher text and generate a random string of characters that I used as a “key”. I quickly found my secret key too, and then subsequently my original data. This should be of no surprise to us. As I mentioned before, neither the cipher text nor the key is secret. We already know them to be random strings of data.
So the problem isn’t so much being able to generate the data. The astute reader will have realised by now that we could just use a random string to obtain the original data, without any kind of encryption process. No, the real problem is being able to validate the data. Without the original, we can’t be sure that our version is an accurate copy of the original. So in short, to possess the data, we need to possess the data.
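To check the arithmetic, here is a small sketch of the decryption step, using the same a = 1 to z = 26 mapping as before. The `decrypt` name is mine, and the code assumes lowercase letters and a key as long as the cipher text.

```python
import string

ALPHABET = string.ascii_lowercase  # 'a' .. 'z'


def decrypt(cipher_text, key):
    """Subtract the key from the cipher text one letter at a time (a = 1 ... z = 26)."""
    result = []
    for c, k in zip(cipher_text, key):
        diff = (ALPHABET.index(c) + 1) - (ALPHABET.index(k) + 1)
        if diff < 1:
            diff += 26  # wrap back into the range 1..26
        result.append(ALPHABET[diff - 1])
    return "".join(result)


print(decrypt("efyhc", "weird"))  # happy
print(decrypt("wozdt", "onjnu"))  # happy
```

Two completely unrelated pairs of strings, one identical result.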
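The validation problem can be made concrete with one more sketch. With this letter-adding scheme, for any cipher text and any message of the same length there is some key that “decrypts” one into the other. The `key_for` helper and the alternative word ‘angry’ are my own hypothetical additions, not part of the original experiment.

```python
import string

ALPHABET = string.ascii_lowercase  # 'a' .. 'z'


def subtract(left, right):
    """Letter-wise left minus right with a = 1 ... z = 26, wrapping into 1..26."""
    result = []
    for a, b in zip(left, right):
        diff = (ALPHABET.index(a) + 1) - (ALPHABET.index(b) + 1)
        if diff < 1:
            diff += 26
        result.append(ALPHABET[diff - 1])
    return "".join(result)


def key_for(cipher_text, wanted_plain_text):
    """Find the key that turns cipher_text into whichever plain text we fancy."""
    return subtract(cipher_text, wanted_plain_text)


print(key_for("efyhc", "happy"))  # weird
print(key_for("efyhc", "angry"))  # drrpd
print(subtract("efyhc", "drrpd"))  # angry
```

Since ‘efyhc’ decrypts just as readily to ‘angry’ as to ‘happy’, the cipher text alone gives us no way of telling which one was the original.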
Existence and possession Part 2
Now, the revelation there may actually appear not to be so much of a revelation. To be able to possess the data we must possess the data. Seems pretty obvious. Let’s just think for a second though. What we really mean is this: the encrypted form of the data, on its own, doesn’t contain the data. It is just a random string which, when “energised” with the “key”, produces the original data. One must possess both pieces of the puzzle to possess the original data.
With this in mind it’s interesting to think about theft a little more. If I steal an encrypted version of Apple’s flagship product, but I don’t possess the decryption key, then really, I possess nothing more than a random string of data. Otherwise, the fact that I possessed a random number generator would be enough to make me liable to prosecution.
I’ll leave you with a final thought though. If you possessed a true random number generator that was capable of running forever, and you printed everything it produced, you’d be in possession of everything that ever has been, is being, or will be produced. Mind blowing 🙂