Show HN: Day 1 of trying to fit a Chatbot into a QR Code

Image for day 1: https://i.imgur.com/bQ3Oxc5.png

After trying to fit DOOM inside a QR code last time (https://news.ycombinator.com/item?id=43729683), I'm continuing this "series" by trying to get an actually decent chatbot into a QR code.

This is, of course, much harder than the former. I could always cheat and make a rule-based, ELIZA-style chatbot (which I actually dabbled with earlier), but I want to make something at least somewhat useful. I know quite little about how LLMs and Transformers fundamentally work, so this will also teach me a lot about AI. (It will also be public and open source once it turns into something somewhat cool.)

Here are our limitations: the largest standard QR code (Version 40) holds 2,953 bytes (~2.9 KB). This is tiny: a Windows sound file of 1/15th of a second is 11 KB! PLUS, we can't directly dump HTML/JS into the QR code; we need to encode it as Base64 (or a BigInt), which eats another 0.1-0.15 KB, so we have about 2.7 KB for the entire thing. Yikes!
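To make that budget concrete, here's a rough check in Node.js, assuming the whole bundle lives in a hypothetical page.html (my sketch, not part of the project):

```
// Rough QR budget check: build the data URI and compare its length
// against the Version 40 capacity of 2,953 bytes.
const fs = require("fs");

const html = fs.readFileSync("page.html"); // hypothetical bundle
const uri = "data:text/html;base64," + html.toString("base64");

console.log(`payload: ${uri.length} bytes of 2953 available`);
console.log(uri.length <= 2953 ? "fits in a Version 40 QR code" : "too big!");
```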

Here's what I did for day 1:

The first version (v0) was incredibly basic - a simple pattern-matching chatbot with predefined responses:

```
const V = "you,I,is,are,do,what,how,why,,...e".split(",");
const P = [
  [5, 2, 0, 8],  // what is you like
  [5, 4, 0, 8],  // what do you like....
  [0, 8, 15, 9]  // you like me think
];
```
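To show roughly how index-based matching like this can work, here's a self-contained sketch with a toy vocabulary (my reconstruction, not the actual v0 code; respond and these patterns are made up for illustration):

```
// Toy vocabulary and patterns; each pattern is a list of word indices
// that must match the start of the input.
const V = ["you", "I", "is", "are", "do", "what", "how", "why"];
const P = [
  { pattern: [5, 2, 0], reply: "Good question! What do YOU think?" },
  { pattern: [6, 3, 0], reply: "I'm doing fine, thanks for asking." }
];

function respond(input) {
  // Map each input word to its vocabulary index (-1 if unknown).
  const ids = input.toLowerCase().split(/\s+/).map(w => V.indexOf(w));
  // Return the reply of the first pattern matching the start of the input.
  const hit = P.find(p => p.pattern.every((id, i) => ids[i] === id));
  return hit ? hit.reply : "Tell me more.";
}

console.log(respond("what is you")); // "Good question! What do YOU think?"
```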

(v1) added better CSS (still a light theme), topic memory, sentiment analysis and transition patterns, but all of this pushed the file size a bit over 4 KB.
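Word-list scoring is about the only kind of sentiment analysis that fits a budget like this; here's an illustrative sketch (my own, not the v1 code):

```
// Tiny word-list sentiment scorer: count positive vs. negative hits.
const POS = new Set(["good", "great", "love", "nice", "thanks"]);
const NEG = new Set(["bad", "hate", "awful", "sad", "angry"]);

function sentiment(input) {
  let score = 0;
  for (const w of input.toLowerCase().split(/\W+/)) {
    if (POS.has(w)) score++;
    else if (NEG.has(w)) score--;
  }
  return score > 0 ? "positive" : score < 0 ? "negative" : "neutral";
}

console.log(sentiment("I love this, thanks!")); // "positive"
```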

(v2) was v1 with heavier compression; it lost some features but shrank to 2.8 KB.

(v3) added a retro UI because it seemed fitting, plus ASCII art and simplified text formatting with newlines, but it was still extremely dumb. (v4) and (v5) made more cuts to barely squeeze it below the limit (2.85 KB).

So I changed the approach for (v6) and went with a trie data structure for response lookups:

```
const t = {
  h: { e: { l: { l: { o: [
    "Hello! How can I help you today?",
    "Hi! What's on your mind?"
  ] } } } }
};
```

This allowed prefix matching within our constraints, and it removed the need for pattern matching entirely.
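Here's a minimal lookup sketch over a trie like the one above, walking the input one character at a time (my reconstruction, not the actual v6 code; the fallback reply is made up):

```
const t = {
  h: { e: { l: { l: { o: [
    "Hello! How can I help you today?",
    "Hi! What's on your mind?"
  ] } } } }
};

function lookup(input) {
  let node = t, responses = null;
  for (const ch of input.toLowerCase()) {
    if (!node[ch]) break;        // no deeper match; stop walking
    node = node[ch];
    if (Array.isArray(node)) {   // leaf: a bucket of canned replies
      responses = node;
      break;
    }
  }
  return responses
    ? responses[Math.floor(Math.random() * responses.length)]
    : "Hmm, tell me more.";
}

console.log(lookup("hello there")); // one of the two greetings
```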

(v7) tried to optimise this further, but it still ended up around 3.3 KB: better than before, but still not very "intelligent".

For (v8), I took a lot of time and switched to a very basic implementation of a two-layer neural network:

```
const network = {
  embeddings: new Float32Array(c.vSize * c.eDim),
  hidden: new Float32Array(c.eDim * c.hSize),
  output: new Float32Array(c.hSize * c.oSize),
  hiddenBias: new Float32Array(c.hSize),
  outputBias: new Float32Array(c.oSize)
};
```
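Here's a toy forward pass over those flat arrays, assuming row-major weights and a ReLU hidden layer (neither of which I'm claiming matches v8 exactly; the sizes in c are made up):

```
const c = { vSize: 32, eDim: 4, hSize: 8, oSize: 8 }; // toy dimensions

const network = {
  embeddings: new Float32Array(c.vSize * c.eDim),
  hidden: new Float32Array(c.eDim * c.hSize),
  output: new Float32Array(c.hSize * c.oSize),
  hiddenBias: new Float32Array(c.hSize),
  outputBias: new Float32Array(c.oSize)
};

function forward(tokenId) {
  // 1. Look up the token's embedding row.
  const e = network.embeddings.subarray(tokenId * c.eDim, (tokenId + 1) * c.eDim);
  // 2. Hidden layer: ReLU(e * W_hidden + b_hidden).
  const h = new Float32Array(c.hSize);
  for (let j = 0; j < c.hSize; j++) {
    let sum = network.hiddenBias[j];
    for (let i = 0; i < c.eDim; i++) sum += e[i] * network.hidden[i * c.hSize + j];
    h[j] = Math.max(0, sum);
  }
  // 3. Output layer: h * W_output + b_output, returned as raw logits
  //    (weights are untrained zeros here, so this is shape-checking only).
  const out = new Float32Array(c.oSize);
  for (let k = 0; k < c.oSize; k++) {
    let sum = network.outputBias[k];
    for (let j = 0; j < c.hSize; j++) sum += h[j] * network.output[j * c.oSize + k];
    out[k] = sum;
  }
  return out;
}
```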

This gives us a neural network defined in 582 characters, 8-bit quantized, but, as you would expect, the full build was huge: about 11 KB.
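For illustration, 8-bit quantization can be as simple as scaling each weight array into the int8 range; a minimal sketch, assuming symmetric per-array scaling rather than whatever scheme v8 actually uses:

```
// Quantize a Float32Array to int8 plus one scale factor per array.
function quantize8(weights) {
  const max = weights.reduce((m, w) => Math.max(m, Math.abs(w)), 0);
  const scale = max / 127 || 1; // guard against all-zero arrays
  const q = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) q[i] = Math.round(weights[i] / scale);
  return { q, scale };
}

// Recover approximate float weights at load time.
function dequantize8({ q, scale }) {
  const w = new Float32Array(q.length);
  for (let i = 0; i < q.length; i++) w[i] = q[i] * scale;
  return w;
}
```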

(v9) and (v10) were basically minifying this further, down to about 3.2 KB. Not bad!

The last version I worked on today was (v10.5). I switched to word-level processing instead of character-level, with 4-dimensional word vectors, template responses with context awareness, better state tracking and 8 output dimensions. I also added a repetition penalty (currently a little broken), but it's actually kind of good... 5.3 KB good.
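The repetition penalty idea, in sketch form: scale down the logits of recently used responses so the bot stops looping (hypothetical names throughout; the v10.5 version differs and, as noted, is a bit broken):

```
// Make recently emitted responses less likely on the next pick.
// recentIds is assumed to track the last few chosen output indices.
function penalizeRepeats(logits, recentIds, penalty = 1.5) {
  const adjusted = Float32Array.from(logits);
  for (const id of recentIds) {
    // Divide positive logits and multiply negative ones, so a repeated
    // choice always becomes less likely regardless of sign.
    adjusted[id] = adjusted[id] > 0 ? adjusted[id] / penalty : adjusted[id] * penalty;
  }
  return adjusted;
}
```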

For Day 2, I'm thinking:

1. Implement better context handling
2. Optimize the neural architecture further (maybe a tiny transformer?)
3. Maybe find a way to compress it even more?

Resources:

https://www.youtube.com/watch?v=aircAruvnKk
https://www.youtube.com/watch?v=zhxNI7V2IxM&t=275s
https://github.com/rasbt/LLMs-from-scratch
https://github.com/lionelmessi6410/Neural-Networks-from-Scra...
