2024-04-17

Updating my notes via email

Charles Darwin made it a habit to immediately write down anything that conflicted with his own ideas, so his brain would not forget or ignore it. In his own words:

"I had also, during many years, followed a golden rule, namely, that whenever a published fact, a new observation or thought came across me, which was opposed to my general results, to make a memorandum of it without fail and at once; for I had found by experience that such facts and thoughts were far more apt to escape from the memory than favourable ones."

Based on this, I've also made it a habit to quickly write down new ideas or thoughts. Unlike Darwin, however, I don't carry a notebook with me. Instead, I prefer to store my notes in a Git repository. Compared to a notebook, a Git repository is more fireproof, can be searched and edited more easily, and can scale to much larger sizes. Jeff Huang, for example, wrote that he has a single text file with all his notes from 2008 to 2022. At the time of his writing, the file contained 51,690 handwritten lines of text. He wrote that the file has been his "secret weapon".

2024-03-08

Installing Forgejo with a separate runner

On the 15th of February 2024, Forgejo announced that they will be decoupling (hard forking) their project further from Gitea. I think this is great since Forgejo is the only European Git forge that I know of, and a hard fork means that the project can now grow more independently. With Forgejo, it is now possible to self-host a forge on a European cloud provider like Hetzner. This is great because it allows decoupling a bit from American Big Tech. Put differently, a self-hosted Forgejo avoids having all your eggs in one basket.

This post is a full step-by-step guide on how to set things up. The guide is based on the Gitea configuration that I ran for a year, so it works. During that year, I paid about 10 euros per month for two Hetzner servers. Using two servers separates Forgejo from the runners, which ensures that a heavy job on the runner will not slow down the Forgejo server.

2024-02-19

How much have batteries changed over time?

From time to time, batteries make the headlines because they store too little energy, because they are too expensive, or because the costs have dropped dramatically in the last year. These things could indeed be true at the same time. Or, as Hans Rosling would say, "things can be bad, and getting better." I was curious about how much better. To figure out where things are heading, let's not focus on headlines and instead look at data for multiple years.

Phone Batteries

As a first investigation, I wonder whether much is happening in the area of small batteries. Thanks to the rise of smartphones, these batteries may have improved dramatically over time. We could check out the raw battery prices, but consumers would not buy at these prices. So instead let's look at consumer smartphones. Smartphones are a mass-produced product, so they should be able to incorporate state-of-the-art battery technology. Let's therefore look at iPhone battery capacity over time.

2024-02-03

Encrypting and decrypting a secret with wasm_bindgen

Doing a round trip of encrypting and decrypting a secret should be pretty easy, right? Well, it turned out to be a bit more involved than I thought. But in the end it worked, so here is the code for anyone who wants to do the same.

I'll be going through the functions step by step first. The full example with imports is shown at the end.

First, we need to generate a key. Here, I've set extractable to false. This aims to prevent the key from being read by other scripts.

fn crypto() -> web_sys::Crypto {
    let window = web_sys::window().expect("no global `window` exists");
    window.crypto().expect("no global `crypto` exists")
}

pub fn generate_key() -> Promise {
    let sc = crypto().subtle();
    // Symmetric encryption is used, so the same key is used for both operations.
    // GCM has good performance and security according to Wikipedia.
    let algo = AesKeyGenParams::new("AES-GCM", 256);
    let extractable = false;
    let usages = js_array(&["encrypt", "decrypt"]);
    sc.generate_key_with_object(
        &algo,
        extractable,
        &usages
    ).expect("failed to generate key")
}

2024-01-28

An old solution to modern OpenAI GPTs problems

Ever since the introduction of ChatGPT, OpenAI has had a compute shortage. This might explain their current focus on GPTs, formerly known as Plugins. Simply put, you can see GPTs as a way to wrap around the base language model. In a wrapping, you can give some instructions (a prompt), 20 files, and enable Web Browsing, DALL·E Image Generation, and/or Code Interpreter. Also, you can define an Action, which allows the GPT to call an API from your own server.

At first sight, the possibilities seem limited for developers. The code interpreter will only run Python code inside their sandbox. Furthermore, the interpreter has no internet access, so installing extra tools is not possible. You could spin up your own server and interact via the Actions (API calls), but that adds some latency and requires setting up a server. Without spinning up a server, you could define a CLI script in Python and describe in the instructions how to interact with that script. Unfortunately, this does limit the capabilities. Not all Python packages are installed in the sandbox and there is only so much that can be expressed in the instructions.

2023-11-25

Triggering entr

entr is an extremely useful little tool that can watch files and run a command automatically upon a file change. So, for example, the following can be used to watch all Rust source files and run the tests:

ls src/**/*.rs | entr -s "cargo test"

This works great and I've been using it for years. However, I recently switched to a Mac, which restricts the number of files that can be watched to 256. This is a problem for large codebases. Furthermore, it can sometimes be very difficult to figure out exactly which files to watch. For instance, when watching LaTeX files, it is important not to watch the log files, or entr would go into an infinite loop.
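To make the LaTeX case concrete, here is a minimal sketch of one way to keep generated files out of the watch list. It assumes a project whose entry point is called main.tex and that latexmk is installed; both are assumptions for illustration, and the watch_list helper is a made-up name, not part of entr.

```shell
# Hypothetical helper: print only hand-written LaTeX sources below a directory.
# Generated files such as .log and .aux are never listed, so a rebuild
# cannot retrigger the watcher.
watch_list() {
    find "${1:-.}" -type f \( -name '*.tex' -o -name '*.bib' \) | sort
}

# Usage (assumes entr and latexmk are installed, and main.tex exists):
#   watch_list . | entr -s "latexmk -pdf main.tex"
```

Listing the files to include, rather than the files to exclude, also keeps the number of watched files small, which may help with the 256 file limit.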

2023-01-24

GPT versus Google

In 2019, I finished my master's thesis on the topic of Natural Language Processing (NLP) and I thought that I understood the basics of Artificial Intelligence (AI) after that. However, I've now finally tried ChatGPT and have to admit that my main conclusion was proven wrong. It is extremely likely that AI will mostly replace search engines as we know them and in this post I document Google's current responses versus the responses from recent GPT models. Google's responses will probably be fun to look back on in 20 years.

First, a bit of background. In 2019, when I did my thesis, BERT had just been released. Just like OpenAI's newest models, BERT is based on the machine learning architecture called the transformer. In my thesis, I applied BERT to the problem of automatically responding to customers. The idea was to feed BERT lots of data from customers and build a chatbot to automate the company's support center.

2022-06-25

Why I still recommend Julia (for Data Science)

Yuri Vishnevsky wrote that he no longer recommends Julia. This caused lengthy discussions on Hacker News, Reddit and the Julia forum. Yuri argues that Julia shouldn't be used in any context where correctness matters. Based on the number and the ferocity of the comments, it is natural to conclude that Julia as a whole must produce incorrect results and therefore cannot be a productive environment. However, the scope of the blog post and the discussions is narrow. In general, I still recommend Julia for data science applications because it is fundamentally productive and, with care, correct.

2022-03-19

Optimizing Julia code

I'm lately doing for the first time some optimizations of Julia code and I sort of find it super beautiful.

This is how I started a message on the Julia language Slack in response to a question about why optimising Julia code is so difficult compared to other languages. In the message, I argued against that claim. Optimising isn't hard in Julia if you compare it to Python or R, where you have to be an expert in both the language itself and C/C++. In that message, I also gave a high-level overview of how I approached optimising. The next day, Frames Catherine White, who is a true Julia veteran, suggested that I write a blog post about my overview, so here we are.

2022-02-16

Static site authentication

More and more companies are starting to provide static site hosting. For example, GitHub announced Pages in 2008, Netlify was founded in 2014, GitLab announced Pages in 2016 and Cloudflare launched a Pages beta in 2020. Nowadays, even large general cloud providers, such as Digital Ocean, Google, Microsoft or Amazon, have either dedicated products or dedicated tutorials on static site hosting.

In terms of usability, the products score similarly. Setting up hosting usually involves linking a Git account to a hoster, which allows the hoster to detect changes and update the site based on the latest state of the repository. In terms of speed, the services are also pretty similar.
