Friday, November 30, 2018

Maximum Likelihood Estimation

$$ \newcommand{\THETA}{ \boldsymbol\theta } \newcommand{\x}{ \boldsymbol{x} } $$
$$ \newcommand{\pdata}{ p_{\text{data}}(\x) } \newcommand{\pmodel}[2][;\THETA]{ p_{\text{model}}(#2#1) } $$
$$ \DeclareMathOperator*{\argmax}{ arg\,max } $$

There are many ways we can define in statistics good estimators for a model, like the mean or variance. However, instead of relying on authority and tradition to provide such estimators it would be useful to have a principled approach.

Hence, we are going to have a look at the maximum likelihood estimation, which is a very commonly used principle to derive the sought after estimators.

Let’s start by examining $m$ independent samples $\mathbb{X} = \{\x^{(1)}...\x^{(m)}\}$ from a distribution $\pdata$, where the latter is actually not known. Then, we can write down the model distribution as:

$$\begin{equation} p_{\text{model}}: (\x;\THETA) \mapsto \hat{p}\in\mathbb{R} \;\hat{=}\; p_{data}(\mathbf{x}) \end{equation}$$

where $\THETA$ is a parameter over the family of probability distributions $\pmodel{\x}$. Then the maximum likelihood estimator for $\THETA$ is defined as:

$$\begin{align} \THETA_{\text{ML}} &= \argmax_{\THETA} \pmodel{\mathbb{X}} \\ &= \argmax_{\THETA} \prod_{i=1}^{m} \pmodel{\x^{(i)}} \end{align}$$

Please note, that working with the product $\prod$ is for a variety of reasons rather inconvenient. Hence, let’s transform it into a sum $\sum$ by taking the logarithm:

$$\begin{equation} \THETA_{\text{ML}} = \argmax_{\THETA} \sum_{i=1}^{m} \log\pmodel{\x^{(i)}} \end{equation}$$

This works because applying the logarithm does not change the $\argmax$ for $\THETA$. We can continue transforming by dividing by $m$, which also has no effect on the $\argmax$ for $\THETA$:

$$\begin{equation} \THETA_{\text{ML}} = \argmax_{\THETA} \frac{1}{m} \sum_{i=1}^{m} \log\pmodel{\x^{(i)}} \end{equation}$$

which we can then express as an expectation with respect to the empirical distribution $\hat{p}_\text{data}$:

$$\begin{equation} \THETA_{\text{ML}} = \argmax_{\THETA} \mathbb{E}_{{\x} \sim {\hat{p}_\text{data}}} \log\pmodel{\x^{(i)}} \end{equation}$$

Summarized: So, by maximizing the expectation $\mathbb{E}$ over the logarithm of our model distribution $p_\text{model}$ for the given samples ${\x} \sim {\hat{p}_\text{data}}$ and with respect to the parameter $\THETA$, we tend to end up with good estimators.

Friday, November 2, 2018

Consensus by Avalanche

— by Hasan Karahan #Blockchain, #Avalanche, #AVA

Introduction

Let’s assume you are watching a soccer game in a very large stadium built for such an event. Further, let’s assume that the capacity of the stadium is about one hundred thousand people. So the population $\mathcal{P}$ has a size of:

$$\begin{equation} |\mathcal{P}| = 100'000\tag{$\star$} \end{equation}$$

Now, let’s give the game a twist and ascribe to the spectators a dislike for centralized decision making: They’ll reflect such distaste equally in the soccer game and prefer to not have a referee. Or rather, they have restricted the power of the referee to not being able to issue a yellow or a red card upon a foul.

So, what happens when there is a situation where a player engages another player unfairly, but where it is not so clear if the situation deserves a yellow or a red punishment? The referee cannot issue a decision because otherwise the spectators would rip him apart for transgressing his powers.

So, they themselves have to decide: But the decision should be made almost unanimously, otherwise it does not count. To enable all of the $100'000$ people to converge quickly to the same conclusion, everybody agrees to the following consensus rules:

  1. Upon an unfair behavior make an individual decision if the culprit deserves a yellow or a red punishment and then raise your card, such that it is easily visible for everybody else.

  2. Then take a random sample of $n$ people among the other fellow spectators in the stadium and change your own card according to the majority of those people you have chosen; but stick to your card if there is a tie.

  3. Iterate the second step until the whole (or almost whole) stadium displays the same color.

Formalism

For each person $p$ among all the population $\mathcal{P}$ and a given round $r\in\mathbb{N}$ we can define a ${card}_r(p)$ as:

$$\begin{equation} {card}_{r}(p) =\begin{cases} 1_{red} & |\mathbb{S}_{1_{red}}^{(r)}(p)| > {n}\big{/}{2},\\ {card}_{r-1}(p) & |\mathbb{S}_{1_{red}}^{(r)}(p)| = {n}\big{/}{2},\\ 0_{yellow} & otherwise; \end{cases} \end{equation}$$

where we encode a value of $1$ for $red$ and $0$ for $yellow$ cards, and where the sample set $\mathbb{S}_{color}$ is defined as:

$$\begin{equation} \mathbb{S}_{color}^{(r)}(p) = \big\{ s \mid {s\in{sample_{r}(p)}}\land{s=color} \big\} \end{equation}$$

where $sample_r(p)$ is defined as:

$$\begin{equation}sample_r(p) = \big\{ {card_{r-1}(p_i) \mid p_i\hat\in{\mathcal{P}}} \land {i\in{\{1..n\}}} \big\}\end{equation}$$

where each $p_i\hat\in{\mathcal{P}}$ denotes a random selection of $p_i$ from the population $\mathcal{P}$ according to the uniform distribution, i.e. probability $\mathbb{P}(p_i) = 1\big/|\mathcal{P}|$ for each $p_i\in{\mathcal{P}}$.

Summarized: People keep influencing each other, until everybody hopefully converges onto the same opinion.

Simulation

There is now a very interesting question that needs to be clarified:

How many rounds does it usually take to reach consensus, i.e. until almost the whole stadium displays either $red$ or $yellow$ cards?

We have answered this question by running multiple simulations of the whole process as described above, and found that the number of rounds required to reach consensus is dependent on:

  • the total population $|\mathcal{P}|$, where the larger it gets the more rounds need to be iterated, and also on

  • the number of people randomly sampled in each round i.e. $|\mathbb{S}_{color}^{(r)}(p)|$, where for larger sample sizes fewer rounds need to be iterated.

The simulation is using the term avalanche, since the change of opinions is reminiscent of an avalanche, where the higher the percentage of people (in a given round $r$) thinking for example that a foul should be punished with red, the higher the probability of people changing their opinion from yellow to red (in the next round $r+1$).

Aggregate over $1'000$ simulations for $|\mathcal{P}|=100'000$:

$|\mathcal{P}|=10^5$: Consensus for sample sizes $|\mathbb{S}|\in{[2..10]}$ and $rounds\in{[1..13]}$

For a population of $100'000$ people consensus can be achieved within $13$ rounds, where it is reached faster when more fellow spectators are taken into consideration: A sample size of $|\mathbb{S}|=8$ (or more) seems to produce average and median consensus levels very close to $100\%$ with a tiny standard deviation, while having reasonably low amounts of outliers.

$|\mathcal{P}|=10^5$: Consensus for sample sizes $|\mathbb{S}|\in{[2..10]}$ and $rounds\in{[1..13]}$

The same simulation data shown here as histograms yields overall the same results as above with the box plots.

Aggregate over $1'000$ simulations for $|\mathcal{P}|=1'000'000$:

$|\mathcal{P}|=10^6$: Consensus for sample sizes $|\mathbb{S}|\in{[2..10]}$ and $rounds\in{[1..13]}$

One may expect, that by drastically increasing the population size by an order to $|\mathcal{P}|=1'000'000$ people, a similar increase of the sample size and rounds would be required, however that does not seem to be the case: While there is a small decrease in consensus, it is almost not perceptible (for the same sample size of $8$ and same number of rounds of $13$). However, the number of outliers seems to increase while still being reasonable.

$|\mathcal{P}|=10^6$: Consensus for sample sizes $|\mathbb{S}|\in{[2..10]}$ and $rounds\in{[1..13]}$

Again, using histograms for the visualization of the same simulations for a population of one million leads to the same conclusions as above with the box plots.

Conclusion

The overall approach of using the model of an avalanche seems to be an efficient process of generating consensus among a very large population. While there seems to be an engineering trade-off between a lower sample size and a higher number of rounds, sampling $8$ or more fellow spectators over $13$ rounds seems to be a good choice, given a population of $100'000$ people. Also, these parameters can be applied to a much larger crowd of one million people without almost any significant decrease in consensus.

Appendix

The avalache.py simulator

Simulates for the provided population size $|\mathcal{P}|$, maximum sample size $|\mathbb{S}|$ and maximum number of rounds the avalanche process:

#!/usr/bin/env python
from typing import Dict

from numpy.random import choice
from numpy.random import randint

from numpy import array
from numpy import maximum
from numpy import round
from numpy import zeros

from matplotlib import pyplot as pp

import argparse
import json
import numpy
import sys
def avalanche(
    population_size: int, max_sample: int, max_round: int
) -> array:
    """
    Simulates for the provided number of `max_round`, the given
    `population_size` and `max_sample` the *avalanche* process,
    where we keep the `population_size` fixed, but consider all
    sample sizes from `1` until `max_sample` (inclusive);  then
    return a matrix (array) `m` for each round and sample size.

    While technically it is not required to use the matrix `m`,
    and rely *only* on the array `p`  (to successfully simulate
    the avalanche process), it is used for book-keeping helping
    later-on to generate plots of the entire process.
    """
    m = zeros(shape=(max_round, max_sample))
    for s in range(max_sample):

        p = population(size=population_size)
        for r in range(max_round):

            m[r, s] = p.sum() / population_size
            p = resample(p, s + 1)

    return m
def population(size: int) -> array:
    """
    Returns a uniformly sampled population for the given `size`
    where a value of `0` represents *yellow* and a value of `1`
    *red* cards.
    """
    return randint(0, 2, size=size)
def resample(p: array, size: int) -> array:
    """
    Make a random choice of a sample `size` (from a given array
    `p`) for *each* element within the array `p`. Then for each
    sample calculate, if the majority of the sample indicates a
    *red* (i.e. `1`) or a *yellow* (i.e. `0`) card. However, if
    there is *no* majority (i.e. a tie) within the sample, then
    do nothing.

    Note that the returned array has the same size as the array
    `p`: It represents a state update of `p`, where elements of
    the array have switched their "opinion" with respect to the
    majority they have sampled.
    """
    ps = choice(p, size=(size, p.size)).sum(axis=0)
    eq, gt = ps == size / 2.0, ps > size / 2.0

    return p * eq + gt
if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Produces the Avalanche Matrix '
                    '[rounds x max.sample-size]')

    parser.add_argument('-p', '--population',
        type=int, help='population size '
        '(default: %(default)s)', default=100000)
    parser.add_argument('-r', '--max-round',
        type=int, help='maximum number of rounds '
        '(default: %(default)s)', default=13)
    parser.add_argument('-s', '--max-sample',
        type=int, help='maximum sample size '
        '(default: %(default)s)', default=10)

    args = parser.parse_args()

    m = avalanche(
        args.population, args.max_sample, args.max_round)
    numpy.savetxt(
        sys.stdout, m, header=json.dumps(vars(args)))

Running a single simulation for the population size of $|\mathcal{P}|=100'000$ people yields the matrix below, where the rows display the rounds in the range of $[1..13]$ and the columns iterate over the sample sizes of $[1..10]$:

$ ./avalanche.py --population=100000 --max-sample=10 --max-round=13
# {"population": 100000, "max_sample": 10, "max_round": 13}
4.988699999999999801e-01 4.966800000000000104e-01 4.984000000000000097e-01 5.022699999999999942e-01 5.014100000000000223e-01 5.012299999999999534e-01 5.022900000000000142e-01 5.009400000000000519e-01 5.020000000000000018e-01 5.022199999999999998e-01
4.988299999999999956e-01 4.967199999999999949e-01 4.961999999999999744e-01 5.048200000000000465e-01 5.040599999999999525e-01 5.027399999999999647e-01 5.079200000000000381e-01 5.044100000000000250e-01 5.049900000000000500e-01 5.068599999999999772e-01
5.008399999999999519e-01 4.946300000000000141e-01 4.975700000000000123e-01 5.091999999999999860e-01 5.078399999999999581e-01 5.056699999999999529e-01 5.165199999999999791e-01 5.125600000000000156e-01 5.125399999999999956e-01 5.180099999999999705e-01
5.001700000000000035e-01 4.928899999999999948e-01 4.971099999999999963e-01 5.179200000000000470e-01 5.152900000000000258e-01 5.101200000000000179e-01 5.370099999999999874e-01 5.320200000000000484e-01 5.312400000000000455e-01 5.497100000000000319e-01
4.999100000000000210e-01 4.890999999999999792e-01 4.946800000000000086e-01 5.317199999999999704e-01 5.306300000000000461e-01 5.204100000000000392e-01 5.826999999999999957e-01 5.774500000000000188e-01 5.754700000000000371e-01 6.302799999999999514e-01
5.019000000000000128e-01 4.824499999999999900e-01 4.934000000000000052e-01 5.604200000000000292e-01 5.594000000000000083e-01 5.454499999999999904e-01 6.734200000000000186e-01 6.862800000000000011e-01 6.799100000000000144e-01 8.149999999999999467e-01
5.020700000000000163e-01 4.750900000000000123e-01 4.914999999999999925e-01 6.127299999999999969e-01 6.083399999999999919e-01 5.992399999999999949e-01 8.385000000000000231e-01 8.855800000000000338e-01 8.746000000000000441e-01 9.923300000000000454e-01
4.981400000000000272e-01 4.623099999999999987e-01 4.866599999999999815e-01 7.039199999999999902e-01 6.959999999999999520e-01 7.061199999999999699e-01 9.845000000000000417e-01 9.983600000000000252e-01 9.976399999999999713e-01 1.000000000000000000e+00
5.013100000000000334e-01 4.423199999999999910e-01 4.833500000000000019e-01 8.430499999999999661e-01 8.304599999999999760e-01 8.831599999999999451e-01 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00
5.021499999999999853e-01 4.149599999999999955e-01 4.769399999999999751e-01 9.706099999999999728e-01 9.632399999999999851e-01 9.950499999999999901e-01 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00
5.045800000000000285e-01 3.738000000000000211e-01 4.657899999999999818e-01 9.998099999999999765e-01 9.995000000000000551e-01 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00
5.050700000000000189e-01 3.149700000000000277e-01 4.497700000000000031e-01 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00
5.029900000000000482e-01 2.368300000000000127e-01 4.264899999999999802e-01 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00 1.000000000000000000e+00

Monday, May 15, 2017

Libbitcoin: bx ec-to-address

Today, I’d like to discuss how you can derive a Bitcoin address from a public key using bx ec-to-address:

hsk81 ~ $ bx help ec-to-address

Usage: bx ec-to-address [-h] [--config value] [--version value]          
                        [EC_PUBLIC_KEY]

Info: Convert an EC public key to a payment address.                     

Options (named):

-c [--config]        The path to the configuration settings file.        
-h [--help]          Get a description and instructions for this command.
-v [--version]       The desired payment address version.                

Arguments (positional):

EC_PUBLIC_KEY        The Base16 EC public key to convert. If not         
                     specified the key is read from STDIN.               

Alright, apparently this command eats a public key, and produces an output, which we call then a Bitcoin address. Here is a way to produce one:

hsk81 ~ $ bx seed | bx ec-new | bx ec-to-public | bx ec-to-address
1CDR8xyAJ4vzAHoTBbXy1J14B8QhjZ366r

So, the 1CDR8xyAJ4vzAHoTBbXy1J14B8QhjZ366r is a bitcoin address, which you can use to ask people to send you Bitcoins to. But beware! If you create your address this way, you omit to save your private key, and hence you would have zero possibility to spend you coins again. That would be bad. Hence, let’s do this:

hsk81 ~ $ EC_PRIVATE=$(bx seed | bx ec-new)
hsk81 ~ $ echo $EC_PRIVATE 
fa86579eb7754d0ba98c96d0d180da7c10f1b9def22db603925b76fa9e8ec87c
hsk81 ~ $ EC_ADDRESS=$(echo $EC_PRIVATE | bx ec-to-public | bx ec-to-address)
hsk81 ~ $ echo $EC_ADDRESS
19vLj79rrErRK6q2GAcfYxRmWyT41MXcV

Now, we capture the private key in EC_PRIVATE and the address in EC_ADDRESS. Alright! But how does bx ec-to-address work under the hood?

Let’s examine the source code:

console_result ec_to_address::invoke(std::ostream& output, std::ostream& error)
{
    const auto& point = get_ec_public_key_argument();
    const auto version = get_version_option();

    output << payment_address(point, version) << std::endl;
    return console_result::okay;
}

Good: So, the point argument seems to be the public key we provide via the CLI, and the version is some additional argument we don’t want to care about (in this post).

Then point is forwarded to payment_address, where the result is then serialized to the output. Good! But what does payment_address do? Let’s check:

payment_address::payment_address(const ec_public& point, uint8_t version)
  : payment_address(from_public(point, version))
{
}

Above, the constructor is just putting together the payment address with the help of the (static) payment_address function:

payment_address payment_address::from_public(const ec_public& point,
    uint8_t version)
{
    if (!point)
        return payment_address();

    data_chunk data;
    return point.to_data(data) ?
        payment_address(bitcoin_short_hash(data), version) :
        payment_address();
}

Alright, this looks very promising: So the public key point is converted to some data chunk, hashed and then turned into a payment address. Let’s have a look at that bitcoin_short_hash:

short_hash bitcoin_short_hash(data_slice data)
{
    return ripemd160_hash(sha256_hash(data));
}

OK, so here we got the famous double hash where a sha256 hash is re-hashed using a ripemd160 hash! So, this is pretty much it: The result get’s saved in the internal data structures of the payment address and then the whole thing is serialized using base58 encoding:

std::string payment_address::encoded() const
{
    return encode_base58(wrap(version_, hash_));
}

Finally, we’ve managed to work through the generation of a seed with bx seed, converted it into a private address with bx ec-new, and turned that one into a public address with bx ec-to-public, which was then turned into a Bitcoin address with bx ec-to-address.

Friday, April 21, 2017

Libbitcoin: bx ec-to-public

Don’t trust a man who needs an income – except if it is minimum wage. Those in corporate captivity would do anything to “feed a family”. [NNT]

Generator G multiplied 8 times over an Elliptic curve
Generator G multiplied 8 times over an Elliptic curve

Today, I’d like to talk about how to derive a public key from a secret using Elliptic Curve Cryptography. Let’s try the following command bx ec-to-public:

hsk81 ~ $ bx ec-to-public -h

Usage: bx ec-to-public [-hu] [--config value] [EC_PRIVATE_KEY]           

Info: Derive the EC public key of an EC private key. Defaults to the     
compressed public key format.                                            

Options (named):

-c [--config]        The path to the configuration settings file.        
-h [--help]          Get a description and instructions for this command.
-u [--uncompressed]  Derive using the uncompressed public key format.    

Arguments (positional):

EC_PRIVATE_KEY       The Base16 EC private key. If not specified the key 
                     is read from STDIN. 

Alright, apparently ec-to-public takes a secret and turns it into something public, allowing other people to encrypt messages, which only you the holder of the secret can decrypt. That’s at least the working assumption under which public key cryptography operates!

In the context of Bitcoin the public key can be used as your address people can send “money” to, and only you can spend them, since as a holder of the private key only you can sign the associated transactions.

So, let’s generate such a public address for other people to send us money:

hsk81 ~ $ bx seed | bx ec-new | bx ec-to-public
03e81a84fe1d5aa4269b0faa78110549caf3873364467688dec27c94761c2b1e6e

Well, this does not really look like the traditional Bitcoin address you would expect so see – for example:

12AEJmWva5QJVYmQ3r3vdFZeFfLUeSVPBq

But that conversion from a public key to an actual address (accomplished with bx ec-to-address) will be discussed in depth in another post. Here, we want to focus on how exactly this “magic” transformation from a private key (secret) to a public key (address) is accomplished.

Let’s dive directly into the code of the bx ec-to-public command:

console_result ec_to_public::invoke(
    std::ostream& output, std::ostream& error)
{
    const auto& secret = get_ec_private_key_argument();
    const auto& uncompressed = get_uncompressed_option();

    ec_compressed point;
    secret_to_public(point, secret);

    output << ec_public(point, !uncompressed) << std::endl;
    return console_result::okay;
}

As we see, some secret is transformed to a public point, which is called so because it corresponds to an actual “point” in Elliptic curve cryptography. The short explanation of how this is achieved, is to note that the secret is like the number of multiplications of the point over Elliptic curves, and the NSA telling us that this multiplication is “secure” enough to operate a multi-billion economy. Let’s see, where we will end up…

Anyway, since there is a bunch of “independent” mathematicians telling us more or less the same story, that ECC is secure enough (under the assumption that one-way functions exist, and under the assumption that $P\neq{NP}$ in a Turing model of computation, and under the assumption that current Quantum computers are not able to brute force the secret from the public point – lot’s of assumptions), we believe what we’re told and move on with our technical discussion.

Alright, let’s have a look at the secret_to_public function:

bool secret_to_public(ec_compressed& out, const ec_secret& secret)
{
    const auto context = signing.context();
    return secret_to_public(context, out, secret);
}

So, apparently what we do is to turn the secret under a certain signing context (which I will not further elaborate) into something public – represented here by the out reference. Let’s dive deeper:

template <size_t Size>
bool secret_to_public(
    const secp256k1_context* context,
    byte_array<Size>& out, const ec_secret& secret)
{
    secp256k1_pubkey pubkey;
    return secp256k1_ec_pubkey_create(
        context, &pubkey, secret.data()) == 1 &&
        serialize(context, out, pubkey);
}

We’re getting close to the truth, and the magic seems to become more and more profane: Apparently, the secp256k1_ec_pubkey_create function seems to be doing the hard job, and if everything goes fine (== 1), we serialize the pubkey to out. Let’s dive even further:

int secp256k1_ec_pubkey_create(
    const secp256k1_context* ctx,
    secp256k1_pubkey *pubkey,
    const unsigned char *seckey)
{
    secp256k1_gej pj;
    secp256k1_ge p;
    secp256k1_scalar sec;
    int overflow;
    int ret = 0;
    VERIFY_CHECK(ctx != NULL);
    ARG_CHECK(pubkey != NULL);
    memset(pubkey, 0, sizeof(*pubkey));
    ARG_CHECK(secp256k1_ecmult_gen_context_is_built(&ctx->ecmult_gen_ctx));
    ARG_CHECK(seckey != NULL);

    secp256k1_scalar_set_b32(&sec, seckey, &overflow);
    ret = (!overflow) & (!secp256k1_scalar_is_zero(&sec));
    if (ret) {
        secp256k1_ecmult_gen(&ctx->ecmult_gen_ctx, &pj, &sec);
        secp256k1_ge_set_gej(&p, &pj);
        secp256k1_pubkey_save(pubkey, &p);
    }
    secp256k1_scalar_clear(&sec);
    return ret;
}

And welcome to the funny world of C programmers, who are not able to come up with reasonable variable names: The code above is actually not part of the libbitcoin software (which is currently maintained by Eric Voskuil using his beautiful C++).

Alright, let’s deconstruct the gibberish above and understand what’s going on: Well, this code does not do actually much except taking the input arguments, doing some sanity checks and invoking then the secp256k1_ecmult_gen function, which does the “heavy” lifting – namely executing a sec number of times an addition over Elliptic curves (modulo some prime number), which is then saved to pj, which is then turned into p, which is then finally stored in the pubkey pointer.

So, what does secp256k1_ecmult_gen really do? Let’s see:

static void secp256k1_ecmult_gen(
    const secp256k1_ecmult_gen_context *ctx,
    secp256k1_gej *r, const secp256k1_scalar *gn)
{
    secp256k1_ge add;
    secp256k1_ge_storage adds;
    secp256k1_scalar gnb;
    int bits;
    int i, j;
    memset(&adds, 0, sizeof(adds));
    *r = ctx->initial;
    /* Blind scalar/point multiplication by
       computing (n-b)G + bG instead of nG. */
    secp256k1_scalar_add(&gnb, gn, &ctx->blind);
    add.infinity = 0;
    for (j = 0; j < 64; j++) {
        bits = secp256k1_scalar_get_bits(
            &gnb, j * 4, 4
        );
        for (i = 0; i < 16; i++) {
            secp256k1_ge_storage_cmov(
                &adds, &(*ctx->prec)[j][i], i == bits
            );
        }
        secp256k1_ge_from_storage(&add, &adds);
        secp256k1_gej_add_ge(r, r, &add);
    }
    bits = 0;
    secp256k1_ge_clear(&add);
    secp256k1_scalar_clear(&gnb);
}

Mmh, more gibberish – although the main idea of this implementation is actually pretty ingenious (provided you believe the NSA as mentioned above): So, the r and gn variables seem to be important, where the former looks like the current sum and gn is of course our secret or the number of times we’re supposed to sum r over and over again (hence the name ec_mult_gen). The gen suffix looks like to stand for the generator of the Elliptic curve, which initially seems to come from the context member ctx->initial.

Two nice features, I like about this code is that the author tries very hard to write secure code by doing a so called “blind” multiplication, by initially adding a number ctx->blind to gn. What I find interesting, and don’t really understand yet is, that this initial blind number is not subtracted at the end to neutralize the initial addition. I can only explain this by speculating, that either ECC multiplication is invariant w.r.t. to an initial (blind) shift or it is auto-magically taken care of implicitly within the (nested) loops. If there is somebody among my readers, who’d like to enlighten me, please drop below a comment!

The other feature I like is, that the author seems to have tried to protect the multiplication against side-channels attacks, since as we all know measuring CPU thermodynamics can reveal under the right circumstances the very secret we utilize here (see the original source on GitHub with some corresponding comments, which I’ve omitted here for the sake of brevity).

Wow, alright! Let’s re-capitulate:

hsk81 ~ $ bx seed | bx ec-new | bx ec-to-public
03e81a84fe1d5aa4269b0faa78110549caf3873364467688dec27c94761c2b1e6e

So, an initial pseudo-random seed, which is calculated based on the clock (according to the current implementation), is turned into a private key (based on HD wallets – to be discussed in a later post), which is then morphed into a public key from which we can derive then an address.

Summarized: (a) measure the current time really really well (to a high degree of precision), while making sure nobody is watching; (b) do some deterministic magic to it and label it as your private key (and trust all your wealth upon this calculation – including your wife and children); (c) then take some publicly available number (the initial generator of an Elliptic curve), and (d) multiply your private key with that number (again trust your wife and children upon this multiplication), deriving finally (e) the public key. Ultimately (f) turn this public key into a Bitcoin address.

If it were only for the initial steps (a) till (e), I would be very suspicious of this whole construction; although admittedly, it’s pretty much the current state of the art in today’s – publicly available – cryptography.

But the last step (f) does mitigate a lot of the risks I see here, which I’m going to discuss in my next post.

Tuesday, April 18, 2017

Libbitcoin: bx ec-new

Today, I’d like to present the bx ec-new command which generates a private key from entropy. For the uninitiated, entropy means just some random noise and a private key means that it allows you to spend your Bitcoins (by allowing you to sign corresponding transactions). And the ec prefix stands for Elliptic Curves, which is an efficient way of implementing public key cryptography.

In public key cryptography you have a private and a public key pair, where the former is used to encrypt messages, and the latter is used to decrypt them. The key pair can also be used to sign and verify messages (or in our case, transactions).

Now, let’s check what the integrated help systems of libbitcoin’s explorer tells us:

hsk81 ~ $ bx help ec-new

Usage: bx ec-new [-h] [--config value] [SEED]                            

Info: Create a new Base16 EC private key from entropy.                   

Options (named):

-c [--config]        The path to the configuration settings file.        
-h [--help]          Get a description and instructions for this command.

Arguments (positional):

SEED                 The Base16 entropy for the new key. Must be at least
                     128 bits in length. If not specified the seed is    
                     read from STDIN.                   

Alright, so we apparently need some SEED which we need to feed to the ec-new monster so we get our private key (using a hexadecimal encoding). The seed can be generated for example with bx seed and piped through to ec-new:

hsk81 ~ $ bx seed | bx ec-new
172a430e125bbab14a76847db2344835783fc501d44893a0ed3c43e58d0712d4

Or if you want to avoid bx seed and read directly from a random source like urandom (available on Linux systems) you could also just do this:

hsk81 ~ $ head -c 1024 /dev/urandom | tr -dc '0-9a-f' | fold -w 48 | head -n 1 | bx ec-new
a9d0daf848e00825bbe3f2cf3a85b4316ccbe15d6207567b215e92b238de4def

Yes, it’s more tedious, but there is no dependency on the system clock like in the case of bx seed. But when you would ask me, which one is more secure, answering that would be beyond me to judge, since bx seed does some fancy twisting to the system clock, which is then fed to a uniform pseudo-random generator. However, urandom is also not be underestimated either, since it collects according to its manual page entropy from various parts of the operating system, to deliver useful randomness. I guess querying both a few billion times and running fancy statistical tests to measure their entropy would be a nice exercise for the interested reader.

But let’s focus on ec-new: As mentioned, it’s based on Elliptic Curves and uses according to the source code a new_key(seed) invocation to derive a secret from the given seed:

console_result ec_new::invoke(std::ostream& output, std::ostream& error)
{
    const data_chunk& seed = get_seed_argument();
    if (seed.size() < minimum_seed_size)
    {
        error << BX_EC_NEW_SHORT_SEED << std::endl;
        return console_result::failure;
    }

    ec_secret secret(new_key(seed));
    if (secret == null_hash)
    {
        error << BX_EC_NEW_INVALID_KEY << std::endl;
        return console_result::failure;
    }

    output << config::ec_private(secret) << std::endl;
    return console_result::okay;
}

The new_key function is based on BIP32, which describes so called hierarchical deterministic wallets (or HD Wallets): Basically a fancy way to derive multiple public and private keys based on a master key. I’ll defer a deeper look into these HD wallets to a later post, since there are specific bx commands which deal directly with them:

ec_secret new_key(const data_chunk& seed)
{
    const wallet::hd_private key(seed);
    return key.secret();
}

So, that’s it for today. I keep this post short, since it’s getting rather late, although there is so much to write about this little bx ec-new command: It’s in my opinion, pretty much the foundation stone (or one of them) upon which Bitcoin is built, producing a private key to keep your coins safe and spend them at will.

But for the complete uninitiated, I will offer a rather not so perfect analogy: You could compare bx ec-new to a bank employee creating an account for you (after asking you to come up with some random number), and then handing you a secret for the created account. This secret will allow you to access and transfer your funds. The nice part here is that there is neither a bank, nor an employee but just you and your computer (or smart phone) performing all the magic.

Sunday, April 9, 2017

Libbitcoin: bx seed

Today, I would like to write about a project I’ve been following for some time now, but had simply not the capacity to devote more attention to it: libbitcoin.org, which is a

C++ Bitcoin toolkit library for asynchronous apps.

As far as I’m informed, it was originally conceived by the now (in)famous Amir Taaki with the intention to provide an alternate implementation to Bitcoin Core.

When I’ve checked the source code on GitHub.com/libbitcoin I was simply blown away: An extremely well written piece of software, which splits the various required functionalities – to make Bitcoin work – into cleanly separated commands, adhering to the Unix philosophy of doing a single thing and doing it right!

Upon further investigation, I realized that an huge amount of work has been performed by Eric Voskuil: Based on my own research, it seems like that he took Amir’s ingenious work and turned it into a nice piece of well organized software!

Alright, enough of talking about people. Now, let’s get down to business: The point of this article is to start a series of posts about each individual command which have been implemented using libbitcoin in general and libbitcoin-explorer in particular, where the latter provides the bx binary, which in turn allows to access the various commands. Here is a list of them:

hsk81 ~ $ bx help

Usage: bx COMMAND [--help]

Version: 4.0.0

Info: The bx commands are:

address-decode
address-embed
address-encode
base16-decode
base16-encode
base58-decode
base58-encode
base58check-decode
base58check-encode
base64-decode
base64-encode
bitcoin160
bitcoin256
btc-to-satoshi
cert-new
cert-public
ec-add
ec-add-secrets
ec-multiply
ec-multiply-secrets
ec-new
ec-to-address
ec-to-ek
ec-to-public
ec-to-wif
ek-address
ek-new
ek-public
ek-public-to-address
ek-public-to-ec
ek-to-address
ek-to-ec
fetch-balance
fetch-header
fetch-height
fetch-history
fetch-public-key
fetch-stealth
fetch-tx
fetch-tx-index
fetch-utxo
hd-new
hd-private
hd-public
hd-to-ec
hd-to-public
help
input-set
input-sign
input-validate
message-sign
message-validate
mnemonic-new
mnemonic-to-seed
qrcode
ripemd160
satoshi-to-btc
script-decode
script-encode
script-to-address
seed
send-tx
send-tx-node
send-tx-p2p
settings
sha160
sha256
sha512
stealth-decode
stealth-encode
stealth-public
stealth-secret
stealth-shared
token-new
tx-decode
tx-encode
tx-sign
uri-decode
uri-encode
validate-tx
watch-address
watch-stealth
watch-tx
wif-to-ec
wif-to-public
wrap-decode
wrap-encode

Bitcoin Explorer home page:

https://github.com/libbitcoin/libbitcoin-explorer

As you see there are around 80 commands, hence I’ve some decent amount of work about understanding, dissecting and writing about them. The very first command I’d like to talk about is bx seed:

hsk81 ~ $ bx help seed

Usage: bx seed [-h] [--bit_length value] [--config value]                

Info: Generate a pseudorandom seed.                                      

Options (named):

-b [--bit_length]    The length of the seed in bits. Must be divisible by
                     8 and must not be less than 128, defaults to 192.   
-c [--config]        The path to the configuration settings file.        
-h [--help]          Get a description and instructions for this command.

So, it generates a pseudorandom seed, which means it returns a random looking number of a given bit-length:

hsk81 ~ $ bx seed
a6943c12a9e7fabd8b96ad15f6b1a24a2b7fba2d434cbbba

Each time you run it, you should get something else. Let’s have a deeper look into the code at seed.cpp:

console_result seed::invoke(std::ostream& output, std::ostream& error)
{
    const auto bit_length = get_bit_length_option();

    if (bit_length < minimum_seed_size * byte_bits ||
        bit_length % byte_bits != 0)
    {
        error << BX_SEED_BIT_LENGTH_UNSUPPORTED << std::endl;
        return console_result::failure;
    }

    const auto seed = new_seed(bit_length);

    output << base16(seed) << std::endl;
    return console_result::okay;
}

So, we see that upon doing some checks, the new_seed function with the bit_length argument is invoked delivering our seed, which then in turn is encoded using based16 and send to the output. That was the easy part! Let’s dissect new_seed in utility.cpp:

data_chunk new_seed(size_t bit_length)
{
    size_t fill_seed_size = bit_length / byte_bits;
    data_chunk seed(fill_seed_size);
    random_fill(seed);
    return seed;
}

Alright, so apparently random_fill is our work horse here, which in turn delegates to pseudo_random_fill in random.cpp:

void pseudo_random_fill(data_chunk& chunk)
{
    std::uniform_int_distribution<uint16_t> distribution(0, max_uint8);

    for (auto& byte: chunk)
        byte = static_cast<uint8_t>(distribution(get_twister()));
}

Now, it’s getting interesting: Apparently a uniform distribution over uint8 is used to query the random numbers and to fill the chunk. The get_twister function seems to deliver a seed:

static std::mt19937& get_twister()
{
    const auto deleter = [](std::mt19937* twister)
    {
        delete twister;
    };

    static boost::thread_specific_ptr<std::mt19937> twister(deleter);

    if (twister.get() == nullptr)
    {
        // Seed with high resolution clock.
        twister.reset(new std::mt19937(get_clock_seed()));
    }

    return *twister;
}

Ah, we are getting closer to what’s really going on: Apparently a high resolution clock is used to seed the uniform distribution. The rest around it is just some technical detail w.r.t. pseudo-random number generation. And what does get_clock_seed do?

static uint32_t get_clock_seed()
{
    const auto now = high_resolution_clock::now();
    return static_cast<uint32_t>(now.time_since_epoch().count());
}

Alright, here we have it: The current time in high resolution is the seed! So this means, we take the current time which has elapsed since about 1970 measure it really really well, use that number as a seed for a pseudo-random generator and pick multiple uint8 numbers uniformly at random till we have our final seed of desired length.

What does that mean? Well it means, that you should rather treat your current clock as a secret, since otherwise people could guess the result of bx seed, which might actually be not very difficult, if your system uses the ntp time synchronization protocol.

But before you start jumping around and start screaming security hole, please realize that that’s the nature of pseudo-random generators: If you enter the same seed then you will get the same result.

Hence my suggestion would be to make it really hard for outsiders to determine when exactly your bx seed commands are invoked, which in all practicality should be the case anyway when you disallow unauthorized access to your machine.

You could also run bx seed in batch at some point in time un-guessable by a potential adversary and store the results securely and safely for later retrieval. But you should asses the risk of the seeds being stolen versus the risk of some all observing adversary guessing the exact time of invocation. My gut tells me that pre-calculating the seeds would actually be less secure, than asking for them on demand.

So how does bx seed scale? It should take about 10 times more time, if you run it 10 times in a row, hence a linear dependency:

As you see this is indeed the case. Here is the code I used to produce the corresponding data:

hsk81 ~ $ time for i in $(seq 1) ; do bx seed > /dev/null ; done

real 0m0.013s
user 0m0.010s
sys 0m0.000s

hsk81 ~ $ time for i in $(seq 10) ; do bx seed > /dev/null ; done

real 0m0.142s
user 0m0.107s
sys 0m0.017s

hsk81 ~ $ time for i in $(seq 100) ; do bx seed > /dev/null ; done

real 0m1.225s
user 0m0.927s
sys 0m0.190s

hsk81 ~ $ time for i in $(seq 1000) ; do bx seed > /dev/null ; done

real 0m13.423s
user 0m10.227s
sys 0m1.763s

hsk81 ~ $ time for i in $(seq 10000) ; do bx seed > /dev/null ; done

real 1m53.335s
user 1m23.380s
sys 0m11.900s

So, we’re at then end of our investigation: I hope, that I could give you a rather in depth technical view on bx seed and I’m looking forward to talk about bx ec-new in my next post.

Monday, March 13, 2017

About philosophers and illogical men

Xkcd.com: Philosophy
Xkcd.com: Philosophy

Just recently, I’ve been reading a beautiful book written by Charles Petzold titled Code: I’m still at chapter eleven “Gates (not Bill)”, but just the chapter before, namely “Logic and Switches” had an intriguing problem described:

$\eqref{eq:dcb6}$ All philosophers are logical; and $\eqref{eq:e6e9}$ an illogical man is always obstinate (stubborn).

Well, these are two premises, but what would the conclusion be? According to Charles it is:

$\eqref{eq:e1c5}$ Some obstinate people are not philosophers.

This is quite counter-intuitive, isn’t it? Well, I wanted see the proof! So I sat down, and translated all statements into predicate logic. Well, the first premise goes like this:

$$\begin{equation}\label{eq:dcb6} \forall{p}\in\mathbb{P}:\textbf{logical}(p) \end{equation} $$

You can read the above statement $\eqref{eq:dcb6}$ like this: “For all $p$ in $\mathbb{P}$ it holds that $p$ is logical”, where $p$ stands for a single philosopher and $\mathbb{P}$ stands for the set of all philosophers. This means we can read the whole thing more naturally like this: “For each philosopher, it holds that he (or she) is logical”, or shorter: “All philosophers are logical.”

Now, let’s have a look at the second premise:

$$\begin{equation}\label{eq:e6e9} \forall{m}\in\mathbb{M}:\neg\textbf{logical}(m) \implies\textbf{obstinate}(m) \end{equation} $$

So, let’s read it again: For all $m$ in $\mathbb{M}$ it holds, that if $m$ is illogical, then $m$ is obstinate, simply meaning that “all illogical men are always obstinate.” Now, let’s translate the conclusion:

$$\begin{equation}\label{eq:e1c5} \neg\forall{m}\in\mathbb{M}:\textbf{obstinate}(m) \land m\not\in\mathbb{P} \end{equation} $$

which means that “not for all $m$ among all men $\mathbb{M}$, it holds that $m$ is obstinate and $m$ does not belong to the set of philosophers $\mathbb{P}$”. This rather very complicated statement can be simplified to: “Some obstinate men are not philosophers!”

Good, now that we have translated perfectly understandable English into completely incomprehensible predicate logic, let’s do some magic to derive from the premises $\eqref{eq:dcb6}$ and $\eqref{eq:e6e9}$ the conclusion $\eqref{eq:e1c5}$. But before that we need to state another rather obvious fact, namely that “all philosophers are men, but not all men are philosophers:”

$$\begin{align}\label{eq:49fe} \mathbb{P}\subset\mathbb{M}\tag{$\star$} \end{align} $$

Above, we say that the set of philosophers $\mathbb{P}$ is a subset of all men $\mathbb{M}$, which is the same as declaring that all philosophers are men (but not necessarily the other way around).

Derivation of the conclusion

Based on the premises $\eqref{eq:49fe}$ that all philosophers are men, and $\eqref{eq:dcb6}$ that they are logical, we can deduce that there are some men who are not philosophers, hence illogical (where we assume based on $\eqref{eq:dcb6}$ that only philosophers can be logical):

$$\begin{equation}\label{eq:95af} \exists{m}\in\mathbb{M}:m\not\in\mathbb{P} \vdash \neg\textbf{logical}(m) \end{equation} $$

Further, since all illogical men tend to be obstinate, we can derive that there exists also an illogical one, who is indeed obstinate:

$$\begin{equation}\label{eq:aa48} \exists{m}\in\mathbb{M}:\neg\textbf{logical}(m) \implies\textbf{obstinate}(m) \end{equation} $$

which we can reformulate by using the fact, that “$\neg{a}$ implying $b$” ($\neg{a}\implies{b}$) is equivalent to “$a$ or $b$” ($a\lor{b}$):

$$\begin{equation}\label{eq:1c8c} \exists{m}\in\mathbb{M}:\textbf{logical}(m) \lor\textbf{obstinate}(m) \end{equation} $$

Hence apparently, there is a man who is logical or obstinate! Now, if we combine $\eqref{eq:95af}$ and $\eqref{eq:1c8c}$ we get:

$$\begin{equation}\label{eq:a4ba} \exists{m}\in\mathbb{M}:\neg\textbf{logical}(m)\land\bigg{(} \textbf{logical}(m)\lor\textbf{obstinate}(m)\bigg{)} \end{equation} $$

which is equal to — derived by pulling the left hand side into the parentheses and then simplifying $\neg\textbf{logical}(m)\land\textbf{logical}(m)$ to $\bot$, namely false:

$$\begin{equation}\label{eq:c038} \exists{m}\in\mathbb{M}:\bot\lor\bigg{(} \neg\textbf{logical}(m)\land\textbf{obstinate}(m)\bigg{)} \end{equation} $$

which is equal to — derived by dropping $\bot$ in the or statement:

$$\begin{equation}\label{eq:8a48} \exists{m}\in\mathbb{M}: \neg\textbf{logical}(m)\land\textbf{obstinate}(m) \end{equation} $$

which is equal to — according to $\eqref{eq:dcb6}$:

$$\begin{equation}\label{eq:b3b1} \exists{m}\in\mathbb{M}: \textbf{obstinate}(m)\land m\not\in\mathbb{P} \end{equation} $$

So $\eqref{eq:b3b1}$ means that “some obstinate man are not philosophers!” Quod erat demonstrandum. $\blacksquare$