Using GPT-4 to help refactor a complex codebase

Recently I tried using GPT-4 to analyze dove-foundation/dove in preparation for a huge refactor. The idea is similar to feeding in PDFs and asking questions about them.

llama-index and langchain were the big names in this space (loading and indexing documents), but some people said they were overly complex, so I spent some time with griptape and simpleaichat instead.

But those seemed too simplistic. As I delved deeper into langchain, I didn’t find it too complicated after all. Rather, it was very flexible, and the documentation was splendid.

I followed this tutorial.

When I asked GPT-4 about the code, it would sometimes reply that not enough context was provided, or that no code was provided at all. That was weird, because I had gotten langchain to index the data. Time to dig deeper into what was really going on.

Code Splitting vs Text Splitting

It turns out that when you feed text into langchain, it splits the text up into chunks, and by default it splits on sentence boundaries (periods). That makes sense for normal prose, but not for code. You need a splitter that understands the language’s structure and can split things up without mangling the meaning.
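langchain ships language-aware presets for this (e.g. RecursiveCharacterTextSplitter.from_language with a Go preset). The underlying idea can be sketched without the library: cut at declaration boundaries and greedily merge pieces up to the chunk size, so no function gets bisected. This is a simplified illustration, not langchain’s actual implementation, and the sample Go source is made up:

```python
import re

def split_code(text: str, boundary: str, chunk_size: int) -> list[str]:
    """Cut at the boundary pattern (kept attached via lookahead), then
    greedily merge pieces up to chunk_size so no declaration is bisected."""
    pieces = [p for p in re.split(rf"(?={boundary})", text) if p]
    chunks, current = [], ""
    for piece in pieces:
        if current and len(current) + len(piece) > chunk_size:
            chunks.append(current)
            current = piece
        else:
            current += piece
    if current:
        chunks.append(current)
    return chunks

# Hypothetical Go snippet, just for illustration.
go_src = (
    "package observer\n\n"
    "func (d *Observer) SaveAndPush() {\n\tsave()\n\tpush()\n}\n\n"
    "func helper() {\n\tdoWork()\n}\n"
)

# Blind fixed-size cuts bisect SaveAndPush mid-body...
naive = [go_src[i:i + 60] for i in range(0, len(go_src), 60)]
# ...while cutting at "\nfunc " keeps each function whole.
smart = split_code(go_src, r"\nfunc ", chunk_size=60)
```

The key point is the priority of cut points: a code-aware splitter prefers declaration boundaries and only falls back to blank lines or raw character offsets when a single declaration is itself too large.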

With a code-aware splitter, the model started to give useful output more often. Being able to inspect the chunks in Deeplake’s web UI helped a lot.

Must I use Deeplake?

As it turns out, the AI doesn’t have enough working memory (‘context’ is the technical term) to keep the whole codebase in mind. So people insert a database/index in between, which reads your data, determines which chunks of code are relevant to the question you’re asking, and sends them to the AI together with the question.
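That retrieval step can be sketched in a few lines: score every chunk against the question and stuff only the top-scoring ones into the model’s context window. A toy version, using word counts in place of a real embedding model (the chunk strings below are made up for illustration):

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts. A real index would call
    an embedding model and store dense vectors instead."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by similarity to the question; the top k are what
    gets sent to the AI alongside the question itself."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "func (d *Observer) SaveAndPush(event ev.EventObserved) { ... }",
    "func (d *Observer) reportMaliciousExecutor(...) { ... }",
    "package main // entry point",
]
top = retrieve("what does Observer.SaveAndPush do?", chunks)
```

Everything that goes wrong in the rest of this post happens inside some variant of that `retrieve` step: if it picks the wrong chunks, the model never sees the code it needs.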

The tutorial told me to use Deeplake for the database/index, but I knew Deeplake was just a cloud play for something that doesn’t really need to be in the cloud (doesn’t this sound familiar, fellow developers?). A database/index isn’t something you need a cloud account for; you should be able to run it on your own computer.

Chroma is the database/index langchain integrates out of the box. It was terrible: I asked a question about evm.go and it didn’t even hand that file to GPT-4, and only 2 of the returned results were relevant.

Then I tried an Ensemble retriever combining BM25 and FAISS. I asked “what does `Observer.SaveAndPush()` do?” and paused my script to see which supporting code snippets the database/index would send to GPT-4 to help make sense of my question.

pretty_print_docs(retriever.get_relevant_documents("what does Observer.SaveAndPush do?"))
Document 1 ({'source': '/home/shinichi/source/work/dove/bird/observer/observer.go'}):

func (d *Observer) SaveAndPush(event ev.EventObserved) {

    // Directly provide Job status data to the executor service
    // the "standard" way to do this would be to have the executor check in
    // with a contract on DoveChain that pulls data from EventStorage.sol
    // Specifically, the ExecutionQueue should use UELs to get Status data from
    // EventStorage.sol? Future ToDo
    if event.ChainName == "DoveGeth" &&
        event.ContractAddress == d.config.ExecutionQueueAddress.String() {

        d.modLog.Info("New ExecutionQueueEvent", "event", event)

        abi, err := abi.JSON(strings.NewReader(execution_queue.ExecutionQueueABI))
        if err != nil {
            // This shall not happen
            d.modLog.Info("OBS/EXC: ExecutionQueueAbi parse failed", "err", err)
Document 2 ({'source': '/home/shinichi/source/work/dove/bird/observer/observer.go'}):

package observer

import (


    drivers ""
    eventbrowserdb ""
    pool ""
    ev ""
    p2p ""
    reshare_monitor ""
    tbls ""

    logger ""
    log ""
Document 3 ({'source': '/home/shinichi/source/work/dove/bird/observer/observer.go'}):

func (d *Observer) reportMaliciousExecutor(executor common.Address, eventKey common.Hash) {
    self := d.config.MyAddress == executor
    if !self {
        tx, err := d.chainService.ReportBird(executor, chain_service.JOB_RELAY_FAULTY, eventKey)
        if err != nil {
            d.modLog.Error("OBS/EXEC: Failed to report malicious executor", "error", err)
        } else {
            d.modLog.Info("OBS/EXEC: Report malicious executor success", "txHash", tx)

// Verify the transaction hash updated by executor by checking whether
// executor called the correct contract and passed the correct parameters
// based on dispatched information from ExecutionQueue.sol
func (d *Observer) verifyJobTX_Save_Push(
    jobId [32]byte,
    txHash string,
    event ev.EventObserved) {

    d.modLog.Info("OBS/EXEC: VerifyJobTransaction",
        "JobId", common.BytesToHash(jobId[:]).Hex()[2:9])

    abi, err := abi.JSON(strings.NewReader(handle.HandleABI))
    if err != nil {
        d.modLog.Error("OBS/EXEC: Failed to parse HandleABI", "error", err)
Document 4 ({'source': '/home/shinichi/source/work/dove/bird/observer/observer.go'}):

var Exit = make(chan bool)

type Observer struct {
    chainConfigs    []config.ChainConfig
    config          config.MasterConfig
    watching        *ev.Watching
    eventPool       *pool.EventPool
    eventMiner      *pool.EventMiner
    transport       p2p.P2P
    modLog          log.Logger
    chainService    chain_service.ChainService
    executor        *executor.Executor
    executorAddress common.Address
    chainHandlers   map[int64]drivers.ChainHandler
    eventChannel    chan ev.EventObserved
    errorChannel    chan error
Document 5 ({'source': '/home/shinichi/source/work/dove/bird/observer/observer.go'}):

This Observer takes information for _one_ chain, and then looks for events on that chain.

Note that this code is not well written - e.g. we don't filter eventsConfig
Document 6 ({'source': '/home/shinichi/source/work/dove/bird/event_api/event_api_interface.go'}):

package eventbrowserdb

import (

type EventApiInterface interface {
    SaveNewEvent(ev events.EventObserved,
        coreHash common.Hash)
    SaveSubmitterEventHash(eventKey string,
        txHash string)
    SaveSubmitterError(eventKey string,
        errMsg string)
    SaveAction(observerId common.Address,
        eventKey string,
        coreHash string,
        timestamp uint64,
        actionName string)
    SaveActionWithData(observerId common.Address,
        eventKey string,
        coreHash string,
        timestamp uint64,
        actionName string,
        actionData []byte)
Document 7 ({'source': '/home/shinichi/source/work/dove/bird/cmd/root.go'}):

go ob.Start()
        <-observer.Exit // no message is ever sent to this channel, so it just serves as a block for the goroutines to run

        return nil
Document 8 ({'source': '/home/shinichi/source/work/dove/bird/contracts/events_to_watch/events_to_watch.go'}):

Document 9 ({'source': '/home/shinichi/source/work/dove/bird/drivers/bitcoin.go'}):

Document 10 ({'source': '/home/shinichi/source/work/dove/bird/drivers/bitcoin_test.go'}):

Even a novice programmer could see that only Documents 1, 3 and 4 are relevant to the question at hand. Even then, those weren’t full function bodies, and the subroutines that SaveAndPush() calls weren’t included at all, so how could GPT-4 possibly give a satisfactory answer?

What’s really needed is to integrate a language server that actually understands the programming language, like gopls, to determine what’s relevant to the question, instead of a dumb text-search method.

Ah, GitHub Copilot

Then I remembered I had a GitHub Copilot subscription. It just makes sense that my IDE would understand enough about the code to send the relevant parts to GPT.

I immediately installed Visual Studio Code and forgot about Code OSS. When it comes to cutting-edge technology, open source software tends to disappoint; it’s just the way it is, unfortunately. Where open source shines is interoperability and commoditization.

Overall, the AI cannot understand an entire codebase, which means it can’t help with design or architecture. But it can help with small, annoying problems: given small enough code snippets, it reads and analyzes code and finds problems really quickly.

Supposedly Anthropic’s Claude 2 is better than GPT-4 at coding and way cheaper (I spent about 3-4 USD on this experiment), but for some reason it’s not available outside the US.
