Why you need a knowledge management system

What’s a knowledge management system? It’s basically your notes, organized. It can be a Google doc, paper, Evernote, or in my case, emacs and org-mode.

It started when I was trading crypto. I’d open a position, and several days later it’d seem like a bad idea, and I couldn’t remember why i had entered that position in the first place. So I started to log my thoughts. This helped me watch my thought process and actually improve.

Plus, it just pisses me off when I have to re-Google something I already learnt a long time ago. This happens a lot when wrestling with Linux for example.

But hopefully, dear reader, I don’t need to convince you why this is a good idea (although I lived without such a system for many years).

What I do want to discuss in this post, though, is using such systems to shape your mindset.

Dance

I learn so many things in dance class that I can’t remember all of them. So once I get home, I write them down in a text file.

Actually revising the notes is something I haven’t systematized. It already takes a few weeks to internalize a Zouk movement so even then, many ideas in the notes go to waste because you can’t remember them on the dance floor.

Computers

It is so easy to wonder: “how much does (thing I want to buy) cost again?” And waste an hour being distracted on eBay.

Or “how do I do that in Linux again?” And waste time Googling that site you knew you went to ages ago.

My computer notes are just about factual information but it is no understatement to say they have improved my productivity 5x at least by preventing me from being distracted.

The only problem is if I have so many notes in this category that I forget that I already made a note for something.

Books

To remember what a specific book said, I summarize it so that it won’t take me much brain power to understand the important parts in the future.

This might not imprint its insights into my brain that deeply, but it does dig up the associated thoughts I had when reading that passage again, without having to actually spend time re-reading the passage. Repeat this over time and it becomes internalized.

Spaced Repetition with Anki

Using Anki to remember facts is straightforward enough. I can even use it to get me to see things in a new light, using carefully written questions and making sure that I don’t just answer the question correctly, I actually hold that thought in my head for a few minutes before finally answering.

However I haven’t figured out how to use Anki to totally change my mindset. Perhaps that only comes naturally after having seen enough subtopics in a new light.

Repeated visualization of a certain situation, and rehearsing how I should think vs. how I would think in that situation helps a lot more.

A simple guide to Winternitz OTS signatures

WOTS is a way of generating a public/private keypair, and using it for signing messages. In other words, it’s a signature scheme. Importantly, it only depends on having a good hash function, which makes it ‘quantum resistant’ because Shor’s algorithm can make cracking elliptic curves easier on quantum computers (????). XMSS uses a tweaked version called WOTS+, which improves some cryptographical aspects which I don’t quite understand. A lot of what I learnt came from this page at Cryptography Services.

Suppose we have a message ‘1234’. Let’s sign it with WOTS to prove that we created/sent this message.

1. Generate a Secret Key

Since our message is 4 characters long, we need to generate 4 random collections of bits (let’s call them words, because calling them bytes would imply they’re 8 bits long, which doesn’t have to be the case). So let’s say these words should be 6 bits long.

secretkey = ['011001', '010110', '100001', '001000']

2. Calculate the Public Key from the Secret Key

There are many hash functions out there, like SHA-1, SHA-2, SHA-3, Whirlpool, and the one everybody knows from Bittorrent, MD5.

The idea is you hash each word in the secretkey /n/ times, so let’s say /n=8/.

publickey = [sha2(sha2(sha2(sha2...('011001')))), sha2(sha2(sha2(sha2...('010110')))) ... ]

It is now quite impossible to calculate the secret key from the public key. It’s impossible for 1 iteration of SHA2, let alone 8.

3. Signing the message

The signature of '1' is sha2('011001')
The signature of '2' is sha2(sha2('010110'))
The signature of '3' is sha2(sha2(sha2('100001')))
The signature of '4' is sha2(sha2(sha2(sha2('001000'))))

Once you release your public key, everyone can see that if you hash sha2(‘011001′) 7 more times, and sha2(sha2(‘010110′)) 6 more times etc., they will have calculated your public key, and thus you have proved that you were the one who actually signed the message ‘1234′.

Obviously, don’t use this method to sign ‘2345′! Or any other message. It’s called a OTS because it’s a One Time Signature. Generate a new secret key each time you want to sign something.

WOTS+

The secret key now includes a random XOR bitmask for each word.

secretkey = [('011001', '111111'), ('010110', '001001'), ('100001', '000000'), ('001000', '101010')]

When calculating the public key/signature, after calculating the sha2(word), you XOR the result with the word’s secret bitmask. This is supposed to make signature sizes smaller/harder to crack (don’t ask me).

Generating Jules Verne novels with Torch-RNN

Torch-RNN is a rewrite of Andrej Karpathy’s char-rnn. You train it on some text, and then it can generate ‘similar’ text. I loved reading Jules Verne novels, so being able to just crank out some new novels whenever I feel like it sounds like a great idea right?

The hardest part was setting the environment up.

The text preprocessor was written in Python 2, and you’d think “hey it’s just wrangling some text what requirements does it need”. And it pulled in Cython, which numpy requires. Compiling Cython is always a bitch. I hate wading through compile scripts that I didn’t write myself that break.

The NN model itself is written in Lua and needs something called LuaJIT, which sounds like a faster variant of a Lua interpreter. Whatever, not interested in learning the language. Setting that up required a lot of compiling too.

In the end I managed to get everything setup, and realized that I couldn’t run the neural network on my GPU because everybody only writes for CUDA (thanks guys) and I have a Radeon HD 6870 (note: OpenCL won’t work out of the box with the Radeon Crimson 16.x beta drivers, the last ones to be released for Barts. You need Catalyst 15.7.1 WHQL for proper OpenCL 1.2 support).

Anyway. I took From the Earth to the Moon, Eight Hundred Leagues on the Amazon (I wanna read that!), and. The Secret of the Island (I just found out that this was written by someone else, and only translated by Verne!) and put them all in one huge
text file.

Training the neural network took all of my CPU. Since I was running in the Bash shell for Windows 10 there was no way I could’ve gotten it to run on the GPU anyway.

After a day or two of training (and about 20K iterations) the virtual Jules Verne spat this out:

& thon tole, by seemed profufess and metal stations requirements of Judge Ribeiro, which degrees. They have been descended some caber. “Imlet in “the struggled a pressure, we must attracted by Recthman. That is more principants of the amazing, nothing the course were so craw of the same peculiable perpetibilitions of the life of Judge Jarriquez frave to given arrive to violence. On the forests, with the /Tapperto/”_ The companion, and it like a close stretter had the traveler considerable than to the earth where the quality. On thick the Gun Club; what they disarsed of the idea of twenty-keeping them. But to this day as to me, fix _“jussiba!” “Aboats of do. During the darkment is recovertaken a despaited that public Street!” The long poblen the: dembnoit at langual paralle _ther of the Amazon?” “What a compressed to Project Gutenberg-textected to this apparent for hourbascure was no doubt, when he was with in its finished, “there would return with all, or rather journey. The step of the document seen to ask supering and have been liness; it would true. Not the diamobas writable intomarier of great a previous; after less mean them. And which simple with more than the villal work of the projectile would have indeed the topped the moon? Work of the pounds to proceed, at them in at over Joam Dacosta dadled for the refund of Sateltences comply up the certain soon in the right of nigmon. The Sound the projectile, and without retrew of the best conquernts misceldt. He as reply, the loud, the two gran! Unifer-“it is inches that the mass approach it, and we branches-that I cabinged that a comes ank, low. They reprisonation of the metal plantant mashed which his destity proper profession of a large feeting his none-wall of the gas, free cupiness, through frittle for the jokes Donselver, putting scars of carrying an edgars a fear one of the Rodroats as soverlocks at the Chaboy, you, she not have the syriy the coupo
Clearly it seems I should’ve just removed the Project Gutenberg prefaces from the training text.

Anyway it’s doing kinda well for a neural network that doesn’t understand English, and besides, doesn’t even understand the concepts behind the words.

Don’t add Django migrations to version control

test_models.py, test_manager.py, at least.

And here’s the one thing that might be useful, might not: committing your migrations to version control.

Here’s the situation: you’re working on the models for a Django app.

class Dog(models.Model):
    owner = models.ForeignKey(Person, on_delete=models.CASCADE)
    name = models.CharField(max_length=255)
    age = models.IntegerField()

Hmm, perhaps age should be date-of-birth. After all, you don’t want to have to write some script to be updating the age value for every dog every year. No, perhaps it should be named dob after all.

So you make the change in models.py and add the migrations for the changes as well, they get saved as 0002_renamed_age_to_dob.py or something.

Problem is, your colleague has been working on the very same model, and he had added some other fields too:

class Dog(models.Model):
    owner = models.ForeignKey(Person, on_delete=models.CASCADE)
    name = models.CharField(max_length=255)
    dob = models.DateField()
    fav_foods = models.CharField(max_length=255, blank=True)
    potty_trained = models.BooleanField(default=False)

The migration that changes the first code snippet to the second in PostgreSQL gets saved as 0002_added_fields.py, and now your django-migrate fails after a git pull. Because 0002_added_fields.py assumes that age is still there and it’s a models.IntegerField, but the database on your machine looks different because you already renamed age to dob. Git can resolve conflicts between your models.py, but not between the Django migrations.

So it’s best to just not add those migrations to the code repository unless you’re really sure that only one guy is working on the models. Because if more than one guy is working on the models, then everybody’s databases are different and the migrations can’t resolve that. You might as well do django-manage makemigrations from scratch.

Unless, of course, you already have data in there.

From Python to Google Go and Life

Now that I’m an adult, I find that doing things on the side these days is nigh unsustainable when one has to spend most of the day making a living. Besides working out almost everyday and reading articles on entrepreneurship like I used to devour articles on dating, there’s no time left but to get a good 8 hours of sleep.

But recently I got a chance to study Golang. As an enthusiastic Python developer, Go shows up as a language that has the same philosophy, but just happens to be compiled, statically typed, and have better support for concurrency.

Documentation

The documentation is incredible. You can even do the Tour of Go on localhost by simply installing the gotour package.

go get golang.org/x/tour/gotour

But I never learned anything from that because the Go Tour is just a museum of code snippets that show you Go’s features.

As usual, the best way to learn is to implement some utility that you want for your own in Go. For this, the go doc command is incredible. For example, this is the output of go doc json, straight from the terminal:

shinichi@ayanami ~/go/src/github.com/randomshinichi/goutil $ go doc json
package json // import "encoding/json"

Package json implements encoding and decoding of JSON as defined in RFC
4627. The mapping between JSON and Go values is described in the
documentation for the Marshal and Unmarshal functions.

See "JSON and Go" for an introduction to this package:
https://golang.org/doc/articles/json_and_go.html

func Compact(dst *bytes.Buffer, src []byte) error
func HTMLEscape(dst *bytes.Buffer, src []byte)
func Indent(dst *bytes.Buffer, src []byte, prefix, indent string) error
func Marshal(v interface{}) ([]byte, error)
func MarshalIndent(v interface{}, prefix, indent string) ([]byte, error)
func Unmarshal(data []byte, v interface{}) error
type Decoder struct{ ... }
func NewDecoder(r io.Reader) *Decoder
type Delim rune
type Encoder struct{ ... }
func NewEncoder(w io.Writer) *Encoder
type InvalidUTF8Error struct{ ... }
type InvalidUnmarshalError struct{ ... }
type Marshaler interface{ ... }
type MarshalerError struct{ ... }
type Number string
type RawMessage []byte
type SyntaxError struct{ ... }
type Token interface{}
type UnmarshalFieldError struct{ ... }
type UnmarshalTypeError struct{ ... }
type Unmarshaler interface{ ... }
type UnsupportedTypeError struct{ ... }
type UnsupportedValueError struct{ ... }

And you can go deeper and ask for documentation on the functions and structs too:

shinichi@ayanami ~/go/src/github.com/randomshinichi/goutil $ go doc json.Token
package json // import "encoding/json"

type Token interface{}
A Token holds a value of one of these types:

Delim, for the four JSON delimiters [ ] { }
bool, for JSON booleans
float64, for JSON numbers
Number, for JSON numbers
string, for JSON string literals
nil, for JSON null

I only needed the internet to figure out how people usually did things in Go. For the specifics, this go doc command was incredible – never even had to leave my terminal.

The Result

A few hours took me from Hello World to a little utility that mirrors a directory structure with empty files. Why? My photos have important information in their filenames, and I want to write scripts that mess around with said filenames. Not going to do that on my real photo collection.

package main

import (
    "fmt"
    "path/filepath"
    "os"
)

func mirror(path string, info os.FileInfo, err error) error {
    relpath, _ := filepath.Rel("/Volumes/Toshiba 2TB/pictures/", path)
    fmt.Println(relpath)
    if info.IsDir() {
        os.Mkdir(relpath, 0755)
    } else {
        file , err := os.Create(relpath)
        file.Close()
        if err != nil {
            fmt.Println(err)
        }
    }
    return nil
}

func main(){
    fmt.Println("goutil starting")
    filepath.Walk("/Volumes/Toshiba 2TB/pictures/Photos", mirror)   
}

Things I noticed

Functions in Go usually return two values, the result and an error object. To receive both into variables you need := instead of =. I ran os.Create(), and some directories would have files in them but others wouldn’t, so I wanted to print the error object that os.Create() returns. However, it also returns a file object, and you can’t ignore that because the go compiler complains. That was seriously frustrating but it turns out that I did need the returned file handle because I was hitting the max open files limit. Good language design I suppose.

I have to say, error checking in returned values clutters up the code. Just look at the if statements above. This would be much more poetic in Python because of exceptions, which are implicit and bubble upwards from code below. Still, it probably doesn’t get much better than this in the compiled language world. Also, this can still be properly mitigated by keeping functions single purpose.

To ignore a return value, use _

Programming Languages and Social Issues

Afterwards I read Rob Pike’s blogpost on why people weren’t moving from C++ to Go as he had originally thought. It wasn’t about “the better tool for the job”, or productivity, or ease of maintenance. Simply put, C++ let you have control over everything, absolutely everything, and people who program in C++ like it that way, while Go has a garbage collector.

I get it, having control over absolutely everything, if only you knew enough about the language, is empowering.

However, I found that I really appreciate it when computers help me accomplish something and then get out of the way, like a tool. That’s why I use a Mac.

The choice of programming languages is now an ideology, a philosophy of life. Which brings us to the next question:

Does the inability of Lisp to gain popularity say something about the people who use it, and their life strategy?

Apparently it does, and I quickly found some articles about it. Rudolf Winestock’s The Lisp Curse is the most plausible and well explained. Mark Tarver’s The Bipolar Lisp Programmer is pithy and poetic, and it shows you what 56 years of living can do for your experience and knowledge.

The Lisp Curse also linked to Stanislav Datskovskiy, whose very writing radiates hatred, “I’m better than you-ness”, and a sense that he really is incredibly brilliant, which does no favours for his ego. I’ve been there, come back to earth, and I have just this one thing to say: he probably doesn’t get to fuck much.

And that was my day spent learning Google Go. In the end I guess I learned more about different walks of people than anything else.