Tidy things up for cargo-deb
Add a bunch of info to Cargo.toml, and the README in markdown
julianandrews committed Sep 22, 2022
1 parent 33f0904 commit 80fa869
Showing 3 changed files with 76 additions and 45 deletions.
4 changes: 3 additions & 1 deletion Cargo.lock


34 changes: 33 additions & 1 deletion Cargo.toml
@@ -1,7 +1,11 @@
[package]
name = "markovpass"
-version = "1.0.1"
+version = "1.0.3"
authors = ["Julian Andrews <[email protected]>"]
license = "BSD-3-Clause"
description = "Markov chain based passphrase generator"
readme = "README.md"
repository = "https://github.com/julianandrews/markovpass"
edition = "2018"

[dependencies]
@@ -11,6 +15,34 @@ rand = "0.7.3"
[features]
benchmarks = []

[package.metadata.deb]
extended-description = """\
A Markov chain based passphrase generator.
Generates randomized passphrases based on a Markov chain along with the \
total Shannon entropy of the nodes traversed. Long random sequences of \
characters are difficult to remember. Shorter, or less random sequences \
are bad passphrases. Long sequences of words (xkcd style passphrases) \
are relatively easy to remember but take a long time to type. Markovpass \
generates human sounding phrases, which aim to strike a balance between \
ease of memorization, length, and passphrase quality. The passphrases \
produced look something like
qurken ret which bettle nurence
or
facupid trible taxed partice
Markovpass requires a corpus of human language to work from - the longer the
better. If you want a quick and easy way to try it out run:
curl -s http://www.gutenberg.org/files/1342/1342.txt | markovpass
to download "Pride and Prejudice" from Project Gutenberg and use that as a
corpus."""


[[bin]]
path = "src/main.rs"
name = "markovpass"
83 changes: 40 additions & 43 deletions README.rst → README.md
@@ -1,70 +1,67 @@
Markovpass
==========

-.. image:: https://github.com/julianandrews/markovpass/workflows/Continuous%20integration/badge.svg
+![Continuous integration](https://github.com/julianandrews/markovpass/workflows/Continuous%20integration/badge.svg)

A Markov chain based passphrase generator with entropy estimation.
-``markovpass`` generates randomized passphrases based on a Markov chain along
+`markovpass` generates randomized passphrases based on a Markov chain along
with the total Shannon entropy of the nodes traversed. Long random sequences of
characters are difficult to remember. Shorter, or less random sequences are bad
-passphrases. Long sequences of words are relatively `easy to remember
-<https://xkcd.com/936/>`_ but take a long time to type. ``markovpass``
+passphrases. Long sequences of words are relatively [easy to
+remember](https://xkcd.com/936/) but take a long time to type. `markovpass`
generates human sounding phrases, which aim to strike a balance between ease of
memorization, length, and passphrase quality. The passphrases produced look
-something like::
+something like

qurken ret which bettle nurence

-or::
+or

facupid trible taxed partice
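
The generation scheme described above (walk a character n-gram Markov chain, summing the Shannon entropy of the nodes visited) can be sketched roughly as follows. This is an illustrative Python sketch, not the actual Rust implementation; the function names, parameters, and corpus handling are invented for the example:

```python
import math
import random
from collections import defaultdict

def build_chain(corpus_words, n=3):
    """Map each character n-gram to a frequency table of observed next characters."""
    chain = defaultdict(lambda: defaultdict(int))
    text = " " + " ".join(corpus_words) + " "
    for i in range(len(text) - n):
        chain[text[i:i + n]][text[i + n]] += 1
    return chain

def generate(chain, n=3, min_entropy=60.0, max_len=200, rng=random):
    """Walk the chain until the summed Shannon entropy of the nodes visited
    reaches min_entropy (or the walk dead-ends or grows too long)."""
    gram = rng.choice(list(chain))
    phrase, entropy = gram, 0.0
    while entropy < min_entropy and len(phrase) < max_len:
        freqs = chain.get(gram)
        if not freqs:
            break  # no successor observed for this n-gram
        total = sum(freqs.values())
        # Shannon entropy of this node's outgoing distribution, in bits.
        entropy += -sum(c / total * math.log2(c / total) for c in freqs.values())
        chars, weights = zip(*freqs.items())
        phrase += rng.choices(chars, weights=weights)[0]
        gram = phrase[-n:]
    return phrase.strip(), entropy
```

With a reasonable corpus this produces pronounceable nonsense in the same spirit as the examples above, together with an entropy estimate for the walk.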

Installation
------------

-Check for binary releases here_.
+Check for binary releases
+[here](https://github.com/julianandrews/markovpass/releases/).

-.. _here: https://github.com/julianandrews/markovpass/releases/

-Alternativey, assuming you have ``rustc`` and ``cargo`` installed, you should
-be able to build ``markovpass`` with ``cargo build --release``. ``markovpass``
+Alternatively, assuming you have `rustc` and `cargo` installed, you should be
+able to build `markovpass` with `cargo build --release`. `markovpass`
is just a standalone binary, and you can put it wherever you like.

Usage
-----

-::
-
-    Usage: markovpass [FILE] [options]
-
-    Options:
-        -n NUM            Number of passphrases to generate (default 1)
-        -e MINENTROPY     Minimum entropy (default 60)
-        -l LENGTH         NGram length (default 3)
-        -w LENGTH         Minimum word length for corpus (default 5)
-        -h, --help        Display this help and exit
-        --show-entropy    Print the entropy for each passphrase
+    Usage: markovpass [FILE] [options]
+
+    Options:
+        -n NUM            Number of passphrases to generate (default 1)
+        -e MINENTROPY     Minimum entropy (default 60)
+        -l LENGTH         NGram length (default 3)
+        -w LENGTH         Minimum word length for corpus (default 5)
+        -h, --help        Display this help and exit
+        --show-entropy    Print the entropy for each passphrase

-``markovpass`` requires a corpus to work with. The corpus can be provided via
-the ``FILE`` argument. Alternatively, ``markovpass`` will look for data on
-``STDIN`` if no ``FILE`` argument is provided. It can take pretty much any text
+`markovpass` requires a corpus to work with. The corpus can be provided via
+the `FILE` argument. Alternatively, `markovpass` will look for data on
+`STDIN` if no `FILE` argument is provided. It can take pretty much any text
input and will strip the input of any non-alphabetic characters (and discard
words containing non-alphabetic characters, but keep words sandwiched by
non-alphabetic characters). The larger and more varied the corpus the greater
the entropy, but you'll hit diminishing returns fairly quickly. I recommend
-using `Project Guttenberg <https://www.gutenberg.org/>`_; personally I like a
-mix of H.P. Lovecraft and Jane Austen. The ``-w`` option can be used to remove
+using [Project Gutenberg](https://www.gutenberg.org/); personally I like a
+mix of H.P. Lovecraft and Jane Austen. The `-w` option can be used to remove
short words from the corpus which will increase the average length of words in
your passphrase, but not guarantee a minimum length (the minimum word length
-will be the lesser of the ``-w`` and ``-l`` options). Obviously increasing the
+will be the lesser of the `-w` and `-l` options). Obviously increasing the
minimum word length will lead to longer passphrases for the same entropy.

-If you want a quick easy way to try it out (and you have ``curl`` installed)::
+If you want a quick, easy way to try it out (and you have `curl` installed)

curl -s http://www.gutenberg.org/files/1342/1342.txt | markovpass

-should download "Pride and Predjudice" from Project Gutenberg and use it as
-your corpus. I keep a folder with a bunch of text files in it and use::
+should download "Pride and Prejudice" from Project Gutenberg and use it as
+your corpus. I keep a folder with a bunch of text files in it and use

cat corpus/*.txt | markovpass -e 80

@@ -75,37 +72,37 @@ Shannon Entropy and Guesswork

Shannon entropy provides a good estimate of the lower bound of the average
guesswork required to guess a passphrase (to within an equivalent of a little
-over 1.47 bits) [1]_, but average guesswork is not necessarily a reliable proxy
-for difficulty in guessing a passphrase [2]_. Consider the following
+over 1.47 bits) [^1], but average guesswork is not necessarily a reliable proxy
+for difficulty in guessing a passphrase [^2]. Consider the following
passphrase generation method: I choose 500 characters and for each character
there is a 0.999 chance I choose 'a' and a 0.001 chance I choose 'b'. The
-Shannon entropy for this process is about 5.7 bits, which based on [1]_ should
+Shannon entropy for this process is about 5.7 bits, which based on [^1] should
give an average number of guesses needed of at least 17.9. Yet an adversary who
knows my method will guess 'aaaaa...' and get my passphrase right on the first
guess 60.6% of the time. So you should treat Shannon entropy estimates with
caution.
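
The numbers in this example are easy to verify; a quick illustrative check (not part of markovpass):

```python
import math

p = 0.001    # probability of choosing 'b' for any given character
n = 500      # number of characters chosen

# Shannon entropy of one biased character choice, in bits.
h = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
total_entropy = n * h          # about 5.7 bits for the whole passphrase

# The adversary's single guess 'aaaa...' succeeds iff no 'b' was ever chosen.
success = (1 - p) ** n         # about 0.61
```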

-That said, I suspect that for moderately long ``markovpass`` passphrases using
+That said, I suspect that for moderately long `markovpass` passphrases using
a representative corpus of language, Shannon entropy is probably a good proxy
for difficulty in guessing. The fundamental problem with average guesswork is
that the distribution of passphrase probabilities isn't necessarily flat. If
the distribution has a strong peak (or multiple peaks) and a long tail of lower
probability passphrases then average guesswork is going to be a poor proxy for
the strength of the passphrase generation method. In the case of
-``markovpass``, if trained on a reasonably representative corpus of language,
+`markovpass`, if trained on a reasonably representative corpus of language,
over a large enough series of decisions the probability distribution of
-passphrases should look more or less gaussian (some variant of the `Central
-limit theorem <https://en.wikipedia.org/wiki/Central_limit_theorem>`_ should
-apply). While a gaussian distribution isn't a flat distribution, it's also a
+passphrases should look more or less Gaussian (some variant of the [Central
+limit theorem](https://en.wikipedia.org/wiki/Central_limit_theorem) should
+apply). While a Gaussian distribution isn't a flat distribution, it's also a
long way from the pathological example above. The Shannon entropy given is
definitely an overestimate of the difficulty in guessing, but probably not a
-terrible one. Still, use ``markovpass`` at your own risk - I can make no
+terrible one. Still, use `markovpass` at your own risk - I can make no
guarantees!

-.. [1] J. L. Massey, “Guessing and entropy,” in Proc. IEEE Int. Symp.
-   Information Theory, 1994, p. 204.
-.. [2] D. Malone and W.G. Sullivan, “Guesswork and Entropy,” IEEE Transactions
-   on Information Theory, vol. 50, 525-526, March 2004.
+[^1]: J. L. Massey, “Guessing and entropy,” in Proc. IEEE Int. Symp. Information
+Theory, 1994, p. 204.
+[^2]: D. Malone and W.G. Sullivan, “Guesswork and Entropy,” IEEE Transactions
+on Information Theory, vol. 50, 525-526, March 2004.

Acknowledgements
----------------
