Why you need a knowledge management system

What’s a knowledge management system? It’s basically your notes, organized. It can be a Google doc, paper, Evernote, or in my case, emacs and org-mode.

It started when I was trading crypto. I’d open a position, and several days later it’d seem like a bad idea, and I couldn’t remember why i had entered that position in the first place. So I started to log my thoughts. This helped me watch my thought process and actually improve.

Plus, it just pisses me off when I have to re-Google something I already learnt a long time ago. This happens a lot when wrestling with Linux for example.

But hopefully, dear reader, I don’t need to convince you why this is a good idea (although I lived without such a system for many years).

What I do want to discuss in this post, though, is using such systems to shape your mindset.

Dance

I learn so many things in dance class that I can’t remember all of them. So once I get home, I write them down in a text file.

Actually revising the notes is something I haven’t systematized. It already takes a few weeks to internalize a Zouk movement so even then, many ideas in the notes go to waste because you can’t remember them on the dance floor.

Computers

It is so easy to wonder: “how much does (thing I want to buy) cost again?” And waste an hour being distracted on eBay.

Or “how do I do that in Linux again?” And waste time Googling that site you knew you went to ages ago.

My computer notes are just about factual information but it is no understatement to say they have improved my productivity 5x at least by preventing me from being distracted.

The only problem is if I have so many notes in this category that I forget that I already made a note for something.

Books

To remember what a specific book said, I summarize it so that it won’t take me much brain power to understand the important parts in the future.

This might not imprint its insights into my brain that deeply, but it does dig up the associated thoughts I had when reading that passage again, without having to actually spend time re-reading the passage. Repeat this over time and it becomes internalized.

Spaced Repetition with Anki

Using Anki to remember facts is straightforward enough. I can even use it to get me to see things in a new light, using carefully written questions and making sure that I don’t just answer the question correctly, I actually hold that thought in my head for a few minutes before finally answering.

However I haven’t figured out how to use Anki to totally change my mindset. Perhaps that only comes naturally after having seen enough subtopics in a new light.

Repeated visualization of a certain situation, and rehearsing how I should think vs. how I would think in that situation helps a lot more.

A simple guide to Winternitz OTS signatures

WOTS is a way of generating a public/private keypair, and using it for signing messages. In other words, it’s a signature scheme. Importantly, it only depends on having a good hash function, which makes it ‘quantum resistant’ because Shor’s algorithm can make cracking elliptic curves easier on quantum computers (????). XMSS uses a tweaked version called WOTS+, which improves some cryptographical aspects which I don’t quite understand. A lot of what I learnt came from this page at Cryptography Services.

Suppose we have a message ‘1234’. Let’s sign it with WOTS to prove that we created/sent this message.

1. Generate a Secret Key

Since our message is 4 characters long, we need to generate 4 random collections of bits (let’s call them words, because calling them bytes would imply they’re 8 bits long, which doesn’t have to be the case). So let’s say these words should be 6 bits long.

secretkey = ['011001', '010110', '100001', '001000']

2. Calculate the Public Key from the Secret Key

There are many hash functions out there, like SHA-1, SHA-2, SHA-3, Whirlpool, and the one everybody knows from Bittorrent, MD5.

The idea is you hash each word in the secretkey /n/ times, so let’s say /n=8/.

publickey = [sha2(sha2(sha2(sha2...('011001')))), sha2(sha2(sha2(sha2...('010110')))) ... ]

It is now quite impossible to calculate the secret key from the public key. It’s impossible for 1 iteration of SHA2, let alone 8.

3. Signing the message

The signature of '1' is sha2('011001')
The signature of '2' is sha2(sha2('010110'))
The signature of '3' is sha2(sha2(sha2('100001')))
The signature of '4' is sha2(sha2(sha2(sha2('001000'))))

Once you release your public key, everyone can see that if you hash sha2(‘011001′) 7 more times, and sha2(sha2(‘010110′)) 6 more times etc., they will have calculated your public key, and thus you have proved that you were the one who actually signed the message ‘1234′.

Obviously, don’t use this method to sign ‘2345′! Or any other message. It’s called a OTS because it’s a One Time Signature. Generate a new secret key each time you want to sign something.

WOTS+

The secret key now includes a random XOR bitmask for each word.

secretkey = [('011001', '111111'), ('010110', '001001'), ('100001', '000000'), ('001000', '101010')]

When calculating the public key/signature, after calculating the sha2(word), you XOR the result with the word’s secret bitmask. This is supposed to make signature sizes smaller/harder to crack (don’t ask me).

How does Bitcoin pooled mining work?

Economics Behind Mining

To understand how mining really works, let’s first understand the economics behind it.

The network of computers running the coin software (let’s say Bitcoin) wants history (of transactions) to be recorded in the form of blocks, and it rewards those who do so with 12.5 BTC.

Bitcoin wants history to be recorded in a new block every 10 minutes. Additionally, since anybody could record history in a new block, it has to make sure that those who invested the most (electricity, stake, capacity etc) have a better chance of recording a new block, because those who invested the most are probably more interested in seeing Bitcoin work properly, as opposed to those who are trying to spend their Bitcoin twice.

Since recording a new block has a reward associated with it, and a new block mustn’t appear too often, and those who write history may have nefarious intents, there is a need to make recording a new block artificially difficult. This is mining/Proof of Work.

Continue reading How does Bitcoin pooled mining work?

Generating Jules Verne novels with Torch-RNN

Torch-RNN is a rewrite of Andrej Karpathy’s char-rnn. You train it on some text, and then it can generate ‘similar’ text. I loved reading Jules Verne novels, so being able to just crank out some new novels whenever I feel like it sounds like a great idea right?

The hardest part was setting the environment up.

The text preprocessor was written in Python 2, and you’d think “hey it’s just wrangling some text what requirements does it need”. And it pulled in Cython, which numpy requires. Compiling Cython is always a bitch. I hate wading through compile scripts that I didn’t write myself that break.

The NN model itself is written in Lua and needs something called LuaJIT, which sounds like a faster variant of a Lua interpreter. Whatever, not interested in learning the language. Setting that up required a lot of compiling too.

In the end I managed to get everything setup, and realized that I couldn’t run the neural network on my GPU because everybody only writes for CUDA (thanks guys) and I have a Radeon HD 6870 (note: OpenCL won’t work out of the box with the Radeon Crimson 16.x beta drivers, the last ones to be released for Barts. You need Catalyst 15.7.1 WHQL for proper OpenCL 1.2 support).

Anyway. I took From the Earth to the Moon, Eight Hundred Leagues on the Amazon (I wanna read that!), and. The Secret of the Island (I just found out that this was written by someone else, and only translated by Verne!) and put them all in one huge
text file.

Training the neural network took all of my CPU. Since I was running in the Bash shell for Windows 10 there was no way I could’ve gotten it to run on the GPU anyway.

After a day or two of training (and about 20K iterations) the virtual Jules Verne spat this out:

& thon tole, by seemed profufess and metal stations requirements of Judge Ribeiro, which degrees. They have been descended some caber. “Imlet in “the struggled a pressure, we must attracted by Recthman. That is more principants of the amazing, nothing the course were so craw of the same peculiable perpetibilitions of the life of Judge Jarriquez frave to given arrive to violence. On the forests, with the /Tapperto/”_ The companion, and it like a close stretter had the traveler considerable than to the earth where the quality. On thick the Gun Club; what they disarsed of the idea of twenty-keeping them. But to this day as to me, fix _“jussiba!” “Aboats of do. During the darkment is recovertaken a despaited that public Street!” The long poblen the: dembnoit at langual paralle _ther of the Amazon?” “What a compressed to Project Gutenberg-textected to this apparent for hourbascure was no doubt, when he was with in its finished, “there would return with all, or rather journey. The step of the document seen to ask supering and have been liness; it would true. Not the diamobas writable intomarier of great a previous; after less mean them. And which simple with more than the villal work of the projectile would have indeed the topped the moon? Work of the pounds to proceed, at them in at over Joam Dacosta dadled for the refund of Sateltences comply up the certain soon in the right of nigmon. The Sound the projectile, and without retrew of the best conquernts misceldt. He as reply, the loud, the two gran! Unifer-“it is inches that the mass approach it, and we branches-that I cabinged that a comes ank, low. They reprisonation of the metal plantant mashed which his destity proper profession of a large feeting his none-wall of the gas, free cupiness, through frittle for the jokes Donselver, putting scars of carrying an edgars a fear one of the Rodroats as soverlocks at the Chaboy, you, she not have the syriy the coupo
Clearly it seems I should’ve just removed the Project Gutenberg prefaces from the training text.

Anyway it’s doing kinda well for a neural network that doesn’t understand English, and besides, doesn’t even understand the concepts behind the words.

Don’t add Django migrations to version control

test_models.py, test_manager.py, at least.

And here’s the one thing that might be useful, might not: committing your migrations to version control.

Here’s the situation: you’re working on the models for a Django app.

class Dog(models.Model):
    owner = models.ForeignKey(Person, on_delete=models.CASCADE)
    name = models.CharField(max_length=255)
    age = models.IntegerField()

Hmm, perhaps age should be date-of-birth. After all, you don’t want to have to write some script to be updating the age value for every dog every year. No, perhaps it should be named dob after all.

So you make the change in models.py and add the migrations for the changes as well, they get saved as 0002_renamed_age_to_dob.py or something.

Problem is, your colleague has been working on the very same model, and he had added some other fields too:

class Dog(models.Model):
    owner = models.ForeignKey(Person, on_delete=models.CASCADE)
    name = models.CharField(max_length=255)
    dob = models.DateField()
    fav_foods = models.CharField(max_length=255, blank=True)
    potty_trained = models.BooleanField(default=False)

The migration that changes the first code snippet to the second in PostgreSQL gets saved as 0002_added_fields.py, and now your django-migrate fails after a git pull. Because 0002_added_fields.py assumes that age is still there and it’s a models.IntegerField, but the database on your machine looks different because you already renamed age to dob. Git can resolve conflicts between your models.py, but not between the Django migrations.

So it’s best to just not add those migrations to the code repository unless you’re really sure that only one guy is working on the models. Because if more than one guy is working on the models, then everybody’s databases are different and the migrations can’t resolve that. You might as well do django-manage makemigrations from scratch.

Unless, of course, you already have data in there.

From Python to Google Go and Life

Now that I’m an adult, I find that doing things on the side these days is nigh unsustainable when one has to spend most of the day making a living. Besides working out almost everyday and reading articles on entrepreneurship like I used to devour articles on dating, there’s no time left but to get a good 8 hours of sleep.

But recently I got a chance to study Golang. As an enthusiastic Python developer, Go shows up as a language that has the same philosophy, but just happens to be compiled, statically typed, and have better support for concurrency.

Documentation

The documentation is incredible. You can even do the Tour of Go on localhost by simply installing the gotour package.

go get golang.org/x/tour/gotour

But I never learned anything from that because the Go Tour is just a museum of code snippets that show you Go’s features.

As usual, the best way to learn is to implement some utility that you want for your own in Go. For this, the go doc command is incredible. For example, this is the output of go doc json, straight from the terminal:

shinichi@ayanami ~/go/src/github.com/randomshinichi/goutil $ go doc json
package json // import "encoding/json"

Package json implements encoding and decoding of JSON as defined in RFC
4627. The mapping between JSON and Go values is described in the
documentation for the Marshal and Unmarshal functions.

See "JSON and Go" for an introduction to this package:
https://golang.org/doc/articles/json_and_go.html

func Compact(dst *bytes.Buffer, src []byte) error
func HTMLEscape(dst *bytes.Buffer, src []byte)
func Indent(dst *bytes.Buffer, src []byte, prefix, indent string) error
func Marshal(v interface{}) ([]byte, error)
func MarshalIndent(v interface{}, prefix, indent string) ([]byte, error)
func Unmarshal(data []byte, v interface{}) error
type Decoder struct{ ... }
func NewDecoder(r io.Reader) *Decoder
type Delim rune
type Encoder struct{ ... }
func NewEncoder(w io.Writer) *Encoder
type InvalidUTF8Error struct{ ... }
type InvalidUnmarshalError struct{ ... }
type Marshaler interface{ ... }
type MarshalerError struct{ ... }
type Number string
type RawMessage []byte
type SyntaxError struct{ ... }
type Token interface{}
type UnmarshalFieldError struct{ ... }
type UnmarshalTypeError struct{ ... }
type Unmarshaler interface{ ... }
type UnsupportedTypeError struct{ ... }
type UnsupportedValueError struct{ ... }

And you can go deeper and ask for documentation on the functions and structs too:

shinichi@ayanami ~/go/src/github.com/randomshinichi/goutil $ go doc json.Token
package json // import "encoding/json"

type Token interface{}
A Token holds a value of one of these types:

Delim, for the four JSON delimiters [ ] { }
bool, for JSON booleans
float64, for JSON numbers
Number, for JSON numbers
string, for JSON string literals
nil, for JSON null

I only needed the internet to figure out how people usually did things in Go. For the specifics, this go doc command was incredible – never even had to leave my terminal.

The Result

A few hours took me from Hello World to a little utility that mirrors a directory structure with empty files. Why? My photos have important information in their filenames, and I want to write scripts that mess around with said filenames. Not going to do that on my real photo collection.

package main

import (
    "fmt"
    "path/filepath"
    "os"
)

func mirror(path string, info os.FileInfo, err error) error {
    relpath, _ := filepath.Rel("/Volumes/Toshiba 2TB/pictures/", path)
    fmt.Println(relpath)
    if info.IsDir() {
        os.Mkdir(relpath, 0755)
    } else {
        file , err := os.Create(relpath)
        file.Close()
        if err != nil {
            fmt.Println(err)
        }
    }
    return nil
}

func main(){
    fmt.Println("goutil starting")
    filepath.Walk("/Volumes/Toshiba 2TB/pictures/Photos", mirror)   
}

Things I noticed

Functions in Go usually return two values, the result and an error object. To receive both into variables you need := instead of =. I ran os.Create(), and some directories would have files in them but others wouldn’t, so I wanted to print the error object that os.Create() returns. However, it also returns a file object, and you can’t ignore that because the go compiler complains. That was seriously frustrating but it turns out that I did need the returned file handle because I was hitting the max open files limit. Good language design I suppose.

I have to say, error checking in returned values clutters up the code. Just look at the if statements above. This would be much more poetic in Python because of exceptions, which are implicit and bubble upwards from code below. Still, it probably doesn’t get much better than this in the compiled language world. Also, this can still be properly mitigated by keeping functions single purpose.

To ignore a return value, use _

Programming Languages and Social Issues

Afterwards I read Rob Pike’s blogpost on why people weren’t moving from C++ to Go as he had originally thought. It wasn’t about “the better tool for the job”, or productivity, or ease of maintenance. Simply put, C++ let you have control over everything, absolutely everything, and people who program in C++ like it that way, while Go has a garbage collector.

I get it, having control over absolutely everything, if only you knew enough about the language, is empowering.

However, I found that I really appreciate it when computers help me accomplish something and then get out of the way, like a tool. That’s why I use a Mac.

The choice of programming languages is now an ideology, a philosophy of life. Which brings us to the next question:

Does the inability of Lisp to gain popularity say something about the people who use it, and their life strategy?

Apparently it does, and I quickly found some articles about it. Rudolf Winestock’s The Lisp Curse is the most plausible and well explained. Mark Tarver’s The Bipolar Lisp Programmer is pithy and poetic, and it shows you what 56 years of living can do for your experience and knowledge.

The Lisp Curse also linked to Stanislav Datskovskiy, whose very writing radiates hatred, “I’m better than you-ness”, and a sense that he really is incredibly brilliant, which does no favours for his ego. I’ve been there, come back to earth, and I have just this one thing to say: he probably doesn’t get to fuck much.

And that was my day spent learning Google Go. In the end I guess I learned more about different walks of people than anything else.

The Concept of State in Blockchains

While working on the code that makes up an actual cryptocoin, I’d always come across a class or something that would refer to a ‘State’. And I never really knew what that was.

Was that the state of the coin’s network, on the latest block? The state of that node perhaps – but how could the node have a valid state when it’s still busy downloading the blockchain?

It wasn’t until I read Ethereum’s whitepaper that the concept of state was formalized: it’s the state of the coin as your node sees it, with the blockchain data it has on your computer.

Should your node not have the latest blocks, well then your node’s state is out of date, and it needs to download more blocks and recalculate the new state from that.

Bitcoin’s state is very simple. It is simply which addresses have how many coins, or the Unspent Transaction Outputs (UTXOs). The blocks don’t have any information about the state; the Bitcoin client must calculate its own state based on the private keys it has in its wallet, and from what it has downloaded of the blockchain.

Ethereum blocks include the state in them, in that each block header holds a hash of the state tree’s root, as well as hashes of the transaction and storage trees’ roots (you wouldn’t want blocks to always have a full copy of the state. That would be a waste of space. So you store them in a tree structure, where the branches can point to data (leaves) in older blocks). What’s in an Ethereum state? 1. The account’s nonce (each time you send a transaction from this account, the nonce increases by 1. This is to prevent double spending) 2. The account’s balance 3. storageRoot (data and Solidity programs go in here) 4. codeHash (what is this?)

Remember when you had to download that huge Bitcoin blockchain and wait for it to be synced up to the latest block just to know how much Bitcoin you have? In Ethereum, because the state is always stored with the block in this manner, you can just get the latest block, traverse the state tree (which points you to other blocks in the history that are relevant to your account) and you can get your balance much faster.

It seems many people like the idea of storing the state in blocks, since QRL also stores the state. The State in QRL consists of (for each account):

  1. The account’s nonce (this is everywhere. I wonder if Bitcoin accounts also have this) 2. The account’s balance 3. a list of public keys that this account used before, to prevent them from being used again (One Time Signatures are what protects us from quantum computers) 4. a list of staking accounts (this is a PoS coin)