Cryptographic obfuscation and ‘unhackable’ software

I have a thing for extraordinary cryptanalysis headlines — by and large because I enjoy watching steam come out of researchers ’ ears when their cultivate gets wholly misrepresented. And although I ’ ve seen quite a few thoroughly ones, last week WIRED managed a doozy .
The headline in doubt, Cryptography Breakthrough Could Make Software Unhackable, managed to accomplish something that few cryptography headlines do. It sent its own protagonist, Amit Sahai, into the comments section to perform cerebral garbage cartridge .
In contrast to the headline, which is quite bad, the article is actually reasonably decent. still, the discussion around it has decidedly led to some confusion. As a solution, many now think an amazing breakthrough has taken place — one that will last make software secure. They ’ rhenium probably right about the first part. They may be disappointed about the pillow.

The truth, as usual, is complicated. There is, indeed, something very neat going on with the new bewilderment results. They ’ re fair not likely to make software ‘ unhackable’ anytime soon. They might, however, radically expand what we can do with cryptanalysis. Someday. When they ’ ra ready. And in ways we don ’ triiodothyronine in full understand so far .
But before I go into all that, it ’ second credibly helpful to give some background .

Program obfuscation

The Wired article deals with the subject of ‘ program obfuscation ‘, which is a term that software developers and cryptographers have hanker been matter to in. The motivation here is pretty simpleton : find a manner that we can give people programs they can run — without letting them figure out how the programs work .
eminence that the last separate necessarily covers a batch of ground. In principle it includes aspects ranging from the nature of specific secret algorithm used — which may be proprietary and thus worth money — to secret information like passwords and cryptanalytic keys that might be hardcoded into the broadcast .
For a childlike case, consider the play along everyday :
// Check a password and print out top secret information if it ’ randomness right
SuperSecretPasswordProtectedStuff ( string passwd ) {
if ( password == “ 0xt438fh27266629zn28366492923aai3jnqobbyc4t ! ” ) {
print ( “ Congratulations. here ’ s some ace confidential private information : ….\n ” ) ;
} else {
print ( “ Wrong password, fool.\n ” ) ;
If you ’ re like me, you probably wrote a plan like this at some point in your life. You may have finally realized how ineffective it would be — against a smart attacker who could dump the program code. This is because most programs ( and even compiled binaries ) are reasonably easy to read. An attacker could just look at this program to recover the mystery password .
Program mystification is motivated by the theme that many useful programs would benefit if we could somehow ‘ stop ’ people from doing this, while still letting them own and run the code on their own computers .
In real populace software systems, ‘ obfuscation ’ normally refers to a collection of ad-hoc techniques that turn nice, sensible programs into a mire of GOTOs and spaghetti code. sometimes important constants are chopped up and distributed around the code. Some portions of the code may flush be encrypted — though only temporarily, since decoding keys must ship with the platform so it can actually be run. Malware authors and DRM folks love this kind of obfuscation .

A chunk of ‘birken’, one of the winning entries in the 2013 obfuscated C contest.

While these techniques are nice, they ’ ra not very ‘ unhackable ’ or even secure against reverse-engineering in the hard feel we ’ d like. Given enough time, brains and tools, you can get past most common software mystification techniques. If you have, say, Charlie Miller or Dion Blazakis on your side, you credibly do it promptly. This international relations and security network ’ deoxythymidine monophosphate equitable because those guys are smart ( though they are ), it ’ south because the techniques are not mathematically rigorous. Most virtual obfuscation amounts to roadblocks — designed to slow reverse-engineers down and make them give up .

So what does it mean to securely ‘obfuscate’ a program?

The hapless timbre of existing software mystification set cryptographers up with a bang-up problem. specifically, they asked, could we define a strong definition of program bewilderment that would improve on, to put it politely,   the crap people were actually using ? Given such an obfuscator I could hand you my obfuscate program to run while provably protecting all fond information — except for the legitimate inputs and outputs .
If this could be done, it would have an incredible phone number of applications. stock traders could obfuscate their proprietary deal algorithm and send them to the cloud or to end-customers. The result programs would still work — in the sense that they produced the right results — but customers would never learn anything about ‘ how they worked ’. The internals of the plan would be privy sauce .
unfortunately, when the first researchers started looking at this problem, they ran into a very serious problem : nobody had any idea what it meant to  securely ‘obfuscate a program in the first place.

Do think about it for a irregular before you roll your eyes. What do you think it means to obfuscate a program ? You ’ re credibly thinking something like ‘ people shouldn’t learn stuff ‘ about the course of study. But can you explain what stuff ? And does ‘ stuff ‘ depend on the course of study ? What about programs where you can efficiently learn the whole program from sending it input and seeing the results ? What does it mean to obfuscate those ?
clearly before advancement could begin on solving the problem, cryptographers needed to sit down and figure out what they were trying to do. And indeed, several cryptographers immediately began to do precisely this .

Black-box cryptographic obfuscation

The first base definitions cryptographers came up addressed a very herculean type of obfuscation called ‘ virtual black box obfuscation ‘. Roughly talk, it starts from the following intuitive thought experiment .
Imagine you have a course of study P, deoxyadenosine monophosphate well as some code obfuscation proficiency   you ’ ve develop. It ’ south authoritative that the mystification be efficient, meaning it doesn ’ thymine slow the program down excessively a lot. To determine whether the obfuscation is successful, you can conduct the following experiment .

  1. Give Alice a copy of the obfuscated program code Obf(P), in a form that she can look at and run on her own computer.
  2. Take the original (unobfuscated) program P, and seal it in a special computer located inside an armored ‘black box’. Let a user Sam interact with the program by sending it inputs and receiving outputs. But don’t let him access the code.

roughly address, what we want from secure obfuscation is that Alice should learn ‘no more ‘ data after seeing the obfuscate program code, than could some corresponding drug user Sam who interacts with the lapp plan locked up in the black box .
What ’ mho nice here is that this is we have the beginnings of an intuitive definition. In some feel the obfuscation should render the plan itself basically opaque — Alice should not get any more information from seeing the obfuscate program than Sam could get simply by interacting with its input/output interface. If Sam can ‘ learn ’ how the platform works barely by talking to it, tied that ’ s ok. What ’ s not all right is for Alice to learn more than Sam .
The problem with this intuition is, that of course, it ’ randomness hard to formulate. One thing you may have noticed is that the exploiter Alice who views the obfuscate program truly does learn something more than the user Sam, who lone interacts with it via the black corner. When the experiment is over, the drug user who got the obfuscated program  still has a copy of the obfuscated program.  The user who interacted with the black box does not .
To give a practical exercise, let ’ s say this platform P is a certificate bless broadcast that has a digital sign key hard-coded inside of it — say, Trustwave ’ s — that will happily attach valid digital signatures on certificates ( CSRs ) of the exploiter ’ randomness devising. Both Alice and Sam can formulate certificate sign requests and send them into the program ( either the obfuscate imitate or the adaptation in the ‘ black box ’ ) and get valid, signed certificates out the early end .
But hera ’ s the thing : when the experiment is over and we shut down Sam ’ s access to the black box, he won ’ thymine be able to sign any more  certificates. On the other bridge player, Alice, who actually who learned the obfuscate program Obf(P), will however have the program ! This means she can keep signing certificates on and evermore. Knowing a broadcast that can sign arbitrary certificates is a reasonably mend significant ‘ piece ’ of information for Alice to have !
Worse, there ’ s no way the ‘ black box ’ user Sam can ever learn a similar slice of information — at least not if the signature system is any good. If Sam could   ask the black box for a bunch together of signatures, then somehow learn a program that could make more  signatures, this would imply a fundamental weakness in the digital signature scheme. This cuts against a basic place we expect of secure signatures .
Barak et al. proposed a clever manner to get around this trouble — rather than outputting an arbitrary musical composition of data, Alice and Sam would be restricted to outputting a single bit at the end of their experiment ( i, computing a predicate function ). This helps to avoid the problems above and seems generally like the “ right ” definition .

An impossibility result

Having proposed this nice definition, Barak et al. went on to do an irritating thing that cryptographers sometimes do : they proved that even their definition doesn ’ triiodothyronine work. specifically, they showed that there exist programs that merely can’t be obfuscated under this definition .
The reason is a fiddling rickety, but I ’ molarity going to try to give the season for it below .
imagine that you have two programs A and B, where A  is exchangeable to our password program above. That is, it contains a hard-coded, cryptanalytic secret which we ’ ll announce by password. When you run A( x ), the program checks whether ( x == password ) and if therefore, it outputs a second gear cryptographically strong password, which we ’ ll cleverly denote by password_two. If you run A on any other stimulation, it doesn ’ metric ton output anything .
now imagine the second program B  works similarly to the first one. It contains both password and password_two hardcoded within it. A major remainder is that B  doesn ’ t take a string as input. It takes another computer program.   You feed a program in equally input signal to B, which then :

  1. Executes the given program on input password to get a result r.
  2. If (r == password_two)it outputs a secret bit.

It should be apparent to you that if you run the platform B  on input the program A — that is, you compute B(A) — it will constantly output the hidden bit. * It doesn ’ t even matter if either B or A has been obfuscated, since mystification doesn ’ deoxythymidine monophosphate change the behavior of the programs .
At the same meter, Barak et al. pointed out that this trick only works if you actually have code for the program A. If I give you access to a black box that contains the programs A, B and will just let you query them on chosen inputs, you ’ re screwed. ampere long as password and password_two are cryptographically strong, you won ’ thymine be able to get A  to output anything in any reasonable total of time, and therefore won ’ deoxythymidine monophosphate be able to get B to output the clandestine bit .
What this means is that Alice, who gets the obfuscate copies of both programs, will constantly learn the unavowed moment. But if Sam is a fair drug user ( that is, he can ’ deoxythymidine monophosphate run long adequate to brute-force the passwords ) he won ’ triiodothyronine be able to learn the secret bite. basically, the obfuscate pair of programs constantly gives more data than black-box access to them .
It remains only to show that the two programs can be combined into one program that inactive can ’ t be obfuscated. And this is what Barak et al. did, completing their work and showing the general programs can ’ metric ton be obfuscated using their definition .
now you can be forgiven for looking askance at this. You might, for example, point out that the example above alone shows that it ’ mho hard to obfuscate a specific and mildly ridiculous program. Or that this program is some kind of eldritch exception. But in general it ’ s not. There are many other useful programs that besides can ’ deoxythymidine monophosphate obfuscate, particularly when you try to use them in real systems. These points were cursorily pointed out by Barak et al., and late in other practical settings by Goldwasser and Kalai, among others.

You might besides think that obfuscation is complain impossible. But here cryptography strikes again — it ’ mho never quite that bare .

We can obfuscate some things!

Before we completely write off black box bewilderment, let ’ s take a here and now to go back to the password checking program I showed at the beginning of this post .
hera ’ s a slightly bare version :
// Check a password and return true if it ’ second correct, false differently
bool  SuperSecretPasswordProtectedStuff ( string passwd ) {
if ( password == “ 0xt438fh27266629zn28366492923aai3jnqobbyc4t ! ” ) {
render true;
} else {
return false;
This routine is an case of a ‘ point function ‘ : that is, a functions that returns false on most inputs, but returns true at precisely one point. As you can see, there is precisely one compass point ( the correct password ) that makes this routine felicitous .
point functions are interesting for a two simple reasons : the foremost is that we use them all the time in very systems — password check programs being the obvious model. The second is that it turns out we can obfuscate them, at least under some hard assumptions. even better, we can do it in a way that should be pretty companion to actual system designers .
Let be a fasten hashish function — we ’ ll get back to what that means in a second — and consider the following obfuscated password checking routine :
// Check a password and return ‘ true ’ if it ’ s adjust, ‘ false ’ otherwise
// But do it all *obfuscated* and material
bool ObfuscatedSuperSecretPasswordProtectedStuff ( string passwd ) {

  static string HARDCODED_SALT          = 0x……; // this is a salt value

inactive string HARDCODED_PASSWORD_HASH = 0x…… ; // this value is H ( salt + target password )

// First, hash the input password with the salt
hashedPass = H ( HARDCODED_SALT + passwd ) ;
if ( hashedPass == HARDCODED_PASSWORD_HASH ) {
refund true ;
} else {
recurrence false ;
note that our ‘ obfuscated ’ course of study nobelium longer stores the plaintext of the password. alternatively we store its hash alone ( and salt ), and wrap it inside a plan that merely compares this hashish to a hash of the user ’ randomness input. This should be familiar, since it ’ s the way you should be storing password files nowadays. **
A phone number of cryptographers looked at formulations like this and showed the comply : if the password is very hard to guess ( for exercise, it ’ s mystery and tie at random from an exponentially-sized space of passwords ) and the hashish affair is ‘ firm ’ enough, then the hash-checking program counts as a fasten ‘ bewilderment ’ of the basic password comparison platform above it .
The intuition for this is fairly bare :   think Alice and Sam don’t know the password. If the password is impregnable, i.e., it ’ south pull from a boastfully enough quad, then their probability of ever getting the course of study to end product ‘ true ’ is negligible. If we assume an ideal hash function — in the simplest event, a random oracle — then the hard-coded hash value that Alice learns is basically fair a useless random string, and hides all partial input about the real password. This means, in drill, that any general question Alice can answer at the end of the experiment, Sam can answer with about the like probability. ***

Finding better definitions

More than a decade after the definition was formulated, there are basically two kinds of results about ‘ potent ’ virtual-black-box mystification. The first set shows that it ’ s impossible to do it for general programs, and furthermore, that many of the interesting functions we want to obfuscate ( like some signatures and pseudorandom functions ) can’t be obfuscated in this brawny sense .
The second classify of results shows that we can black-box obfuscate certain functions, but only very limited ones like, say, point functions and re-encryption. These results are neat, but they ’ re hardly going to set the universe on fire .
What cryptographers have been trying to achieve since then is a different — and necessarily weaker — definition that could capture interest things we wanted from obfuscating general programs, but without all the filthy impossibility results .
And that, ultimately, brings us to the advances described in the WIRED article .

Indistinguishability Obfuscation

In shooting down their own proposed definitions, Barak et al. besides left some breadcrumbs towards a type of obfuscation that might actually work for general programs. They called their definition “ indistinguishability obfuscation ” ( IO ) and roughly speaking it says the take after :
think we have two programs C1, C2 — which we ’ ll identify as similarly-sized circuits — that compute the lapp function. More concretely, let ’ s say they have precisely the same input/output behavior, although they may be implemented very differently inwardly. The definition of identity obfuscation states that it should be possible to obfuscate the two circuits C1, C2 such that no effective algorithm will be able to tell the difference between Obf(C1) from Obf(C2). 
While this idea was proposed years ago, cipher actually knew how to build any such thing, and it was left as one of those ‘ open problems ’ that cryptographers love to tear their hair out over. This remained the case until merely last year, when a group of authors from UCLA, UT Austin and IBM Research proposed a ‘ campaigner construction ’ for building such obfuscators, based on the new area of multilinear map-based cryptography .
Another concern random variable of this impression is called extractability obfuscation ( EO ), which implies not entirely that you can ’ deoxythymidine monophosphate distinguish between Obf(C1) and Obf(C2), but furthermore, that if you could distinguish the two, then you could necessarily find an stimulation value on which both C1 and C2 would produce different outputs. furthermore, other work indicates IO and EO give basically the ‘ best possible ’ obfuscation you can provide for general programs .
The motion you ’ re probably asking is : so what ? What can we do with identity bewilderment ?
And this is where the WIRED article differs well from the reality. The accuracy is that IO will probably bring us major cryptanalytic and software advances — it ’ second already been shown to bring about compendious functional encoding for all circuits, deoxyadenosine monophosphate well as advances in deniable encoding. furthermore it can be used to build know forms of encoding such as public-key encoding based on new techniques that differ from what we use today — for example, by obfuscating ‘ symmetrical ’ primitives like pseudorandom functions. ****
And if this doesn ’ deoxythymidine monophosphate seem quite equally exciting as the WIRED article would imply, that ’ randomness because we ’ re still learning what the possibilities are. The new techniques could, for exemplar, allow us to build exciting new types of security systems — or else show us radical newfangled ways to build systems that we already have. evening the latter is very significant, in lawsuit advances in our field result in ‘ breaks ’ to our existing constructions .
What IO and EO will probably not do is make programs unhackable, since that implies something a whole fortune stronger than either of these techniques presently provide. In fact, it ’ s not acquit what the heck you ’ d need to make programs unhackable .
And the cause for that is quite simple : even obfuscated software can calm suck .

Thanks to Susan Hohenberger and Zooko for comments on this post. 

The bit in the Barak et al. proof doesn ’ deoxythymidine monophosphate have to be secret .
** With a pretty bloody big caveat, which is that in real life ( as opposed to cryptographic examples ) users pick frightful passwords, which makes your password hashes vulnerable to dictionary attacks. We have a variety of countermeasures to slow down these attacks, including salt and computationally intensive password hashing functions.

*** God I hate footnotes. But it ’ mho worth unpacking this. Imagine that Alice learns ‘ some information ’ from seeing an average program containing the hash of a impregnable password. If the password is impregnable, then Alice can merely guess the right input signal password with negligible probability. then Sam can ‘ use ’ Alice to get the same data as follows. He formulates a ‘ juke ’ program that contains a completely bogus password hashish and mails it to Alice. Whatever data she learns from his program, he takes and outputs as the information he ‘ knowing ’. It remains to show that if the password hashish is potent ( a random oracle ), Alice international relations and security network ’ deoxythymidine monophosphate going to be able to learn more from a random hash than she would from the hashish of a potent password .
**** The basic estimate here is that symmetrical encoding ( like AES, say ) can ’ deoxythymidine monophosphate be used for public identify encoding, since anyone who learns your encoding key besides learns your decoding key. But if you can obfuscate an encoding program that contains your symmetrical encoding identify, then the newfangled course of study becomes like a populace key. You can hand it out to people to encrypt with .

Leave a Reply

Your email address will not be published.