In cryptography we typically share a secret which allows us to decrypt future messages. Commonly this is a password that I make up and submit to a Web site, then later produce to verify I am the same person.

I missed Kazue Sako’s Zero Knowledge Proofs 101 presentation at IIW last week, but Rachel Myers shared an impressively simple retelling in the car on the way back to San Francisco, which inspired me to read the notes and review the proof for myself. I’ve attempted to reproduce this simple explanation below, also noting additional sources and related articles.

Zero Knowledge Proofs (ZKPs) are very useful when applied to internet identity: with an interactive exchange you can prove you know a secret without actually revealing the secret.

Understanding Zero Knowledge Proofs with simple math:

x -> f(x)

A simple one-way function: it is easy to go from x to f(x), but mathematically hard to recover x from f(x).

The most common example is a hash function. Wired: What is Password Hashing? provides an accessible introduction to why hash functions are important to cryptographic applications today.
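As a quick illustration (my own toy example, not from the IIW notes), Python’s standard library makes the easy direction trivial; note that real password storage adds a salt and uses a deliberately slow hash rather than a bare SHA-256, but the one-way property is the same idea:

```python
import hashlib

# Hashing is cheap in the forward direction...
digest = hashlib.sha256(b"correct horse battery staple").hexdigest()
print(digest)
# ...but recovering the input from the digest is computationally
# infeasible -- that asymmetry is what makes it a one-way function.
```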

f(x) = g ^ x mod p

Known (public): g, p
* g is a constant
* p has to be prime

Knowing x, it is easy to compute g ^ x mod p, but recovering x from the result is hard (this is the discrete logarithm problem).
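Here is that asymmetry with toy numbers of my own choosing (far too small for real use, picked only so the arithmetic is easy to follow):

```python
g = 5     # public base
p = 23    # public prime modulus
x = 6     # the secret

fx = pow(g, x, p)   # easy direction: fast modular exponentiation
print(fx)           # 8 -- recovering x from 8, g, and p is the hard direction
```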

Interactive Proof

Alice wants to prove to Bob that she knows x without giving any information about x. Bob already knows f(x). Alice can make f(x) public and then prove that she knows x through an interactive exchange with anyone on the Internet, in this case, Bob.

  1. Alice publishes f(x): g^x mod p
  2. Alice picks random number r
  3. Alice sends Bob u = g^r mod p
  4. Now Bob has an artifact based on that random number, but can’t actually calculate the random number
  5. Bob returns a challenge e, either 0 or 1
  6. Alice responds with v:
    If 0, v = r
    If 1, v = r + x
  7. Bob can now calculate:
    If e == 0: Bob has the random number r, as well as the publicly known variables, and can check that u == g^v mod p
    If e == 1: Bob can check that u * f(x) == g^v (mod p)

I believe the e == 1 check in step 7 is true based on Congruence of Powers: when v = r + x, then g^v = g^(r+x) = g^r * g^x = u * f(x) (mod p), though I’m not sure that I’ve transcribed it accurately with my limited ascii representation.

If r is truly random, uniformly distributed between zero and (p-1), this exchange does not leak any information about x, which is pretty neat, yet not sufficient on its own.

In order to ensure that Alice cannot be impersonated, multiple iterations are required, along with the use of large numbers (see IIW session notes). A cheater who doesn’t know x can prepare an answer for one challenge or the other, but not both, so each round catches an impostor with probability 1/2; after n rounds the odds of fooling Bob drop to 1 in 2^n.
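Putting the whole exchange together, here is a small Python simulation of repeated rounds — my own sketch, with the same toy numbers as above, which would be hopelessly insecure in practice:

```python
import secrets

g, p = 5, 23            # public parameters (toy values)
x = 6                   # Alice's secret
fx = pow(g, x, p)       # step 1: Alice publishes f(x) = g^x mod p

def run_round():
    r = secrets.randbelow(p - 1)        # step 2: Alice picks a random r
    u = pow(g, r, p)                    # step 3: Alice sends u = g^r mod p
    e = secrets.randbelow(2)            # step 5: Bob's challenge, 0 or 1
    v = r if e == 0 else r + x          # step 6: Alice's response
    if e == 0:                          # step 7: Bob verifies
        return pow(g, v, p) == u
    return pow(g, v, p) == (u * fx) % p

# Each round catches a cheater with probability 1/2, so Bob's
# confidence grows with repetition.
assert all(run_round() for _ in range(20))
print("Alice convinced Bob over 20 rounds")
```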


An emerging pattern in server-side event-driven programming formalizes the data that might be generated by an event source; a consumer of that event source then registers for very specific events.

A declarative eventing system establishes a contract between the producer (event source) and consumer (a specific action) and allows for binding a source and action without modifying either.

Comparing this to how traditional APIs are constructed, we can think of it as a kind of reverse query: we reverse the direction of the typical request-response by registering a query and then getting called back every time there’s a new answer. This new model establishes a specific operational contract for registering these queries, which are commonly called event triggers.

This pattern requires a transport for event delivery. While systems typically support HTTP and RPC mechanisms for local events, which might be connected point-to-point in a mesh network, they also often connect to messaging or streaming data systems such as Apache Kafka and RabbitMQ, as well as proprietary offerings.

This declarative eventing pattern can be seen in a number of serverless platforms, and is typically coupled with Functions-as-a-Service offerings, such as AWS Lambda and Google Cloud Functions.
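For a concrete feel, here is a sketch in the style of a first-generation Google Cloud Functions background function; the function and bucket names are hypothetical, and the point is that the binding of source to action lives outside the code:

```python
# The code below only declares what to do with an event; the source is
# bound to this action declaratively at deploy time, e.g.:
#
#   gcloud functions deploy on_upload \
#       --trigger-resource my-uploads-bucket \
#       --trigger-event google.storage.object.finalize

def on_upload(event, context):
    """Handle one Cloud Storage event delivered by the platform."""
    # `event` carries the formalized event payload; `context` carries
    # metadata such as the event ID and event type.
    print(f"New object {event['name']} in bucket {event['bucket']}")
```

Neither the storage service nor the function needs to know about the other; the contract between them is the declared trigger.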

An old pattern applied in a new way

Binding events to actions is nothing new. We have seen this pattern in various GUI programming environments for decades, and on the server-side in many Service-Oriented Architecture (SOA) frameworks. What’s new is that we’re seeing server-side code that can be connected to managed services in a way that is almost as simple to set up as an onClick handler in HyperCard. However, the problems that we can solve with this pattern are today’s challenges of integrating data from disparate systems, often at high volume, along with custom analysis, business logic, machine learning and human interaction.

Distributed systems programming is no longer solely the domain of specialized systems engineers who create infrastructure; most applications we use every day integrate data sources from multiple systems across many providers. Distributed systems programming has become ubiquitous, providing an opportunity for interoperable systems at a much higher level.

Some people have the privilege to be recognized in our society. We’ve started to resurrect history and tell stories of people whose contributions have been studiously omitted. In America, February is Black history month. I didn’t learn Black history in school. I value this time for remedial studies, even as I feel a bit disturbed that we need to aggregate people by race to notice their impact. I had hoped that we, as a society, would have come farther along by now, in treating each other with fairness and respect. Instead, we are encoding our bias about what is noticed and who is recognized.

At the M.I.T. Media Lab, researcher Joy Buolamwini has studied facial recognition software, finding that error rates increased with darker skin (via NYT). Specifically, algorithms by Microsoft, IBM and Face++ failed to identify the gender of black women more frequently than that of white men.

When the person in the photo is a white man, the software is right 99 percent of the time.
But the darker the skin, the more errors arise — up to nearly 35 percent for images of darker skinned women

A lack of judgement in choosing a data set is cast as an error of omission, a small lapse in attention on the part of software developers, yet the persistence of these kinds of errors illustrates a systemic bias. The systems that we build (ones made of code and others made of people) lack checks and balances where we actively notice whether our peers and our software are exercising good judgement, which includes treating people fairly and with respect.

Errors made by humans are amplified by the software we create.

Google "knowledge card" for Bessie Blount Griffen, shows same photo for Marie Van Brittan Brown and Miriam Benjamin

The Google “knowledge card” that appears next to the search results for “Bessie Blount Griffen” shows “people also searched for” two other women who are identified with the same photo. It’s hard to tell when the error first appeared, but we can guess that it was amplified by Google search results and perhaps by image search.

I first discovered this error when reading web articles about two different inventors and noticed that the photos used in many of the articles were identical. This can be seen clearly in the two examples below, where the photo is composited with an image of the corresponding patent drawings. The patents were awarded to two unique humans, but somehow we, collectively, blur their individual identities, anonymizing them with a singular black female face.

I found a New York Times article about Marie Van Brittan Brown, and it seems that the oft-replicated photo is of Bessie Blount Griffen.

Additional confirmation comes from a tweet by @SamMaggs, author of “Wonder Women: 25 Innovators, Inventors, and Trailblazers Who Changed History”

photo of black woman and patent drawing of medical apparatus
source: Black Then

patent drawing with home security system, with text: Marie Van Brittan Brown invented First Home Security System in 1966
source: Circle City Alarm blog

Newspaper article with photo of black woman and man behind her, caption: Mr and Mrs Albert L Brown