This is actually really cool. I just tried it out using an AI Studio API key and was pretty impressed. One issue I noticed was that the output was a little too much "for dummies". Spending paragraphs explaining what an API is through restaurant analogies is a little unnecessary, and then it's followed up with more paragraphs on what GraphQL is. Every chapter seems to suffer from this. The generated documentation seems more suited to a slightly technical PM than to a software engineer. This can probably be mitigated by refining the prompt.
The prompt might also be better if it encouraged variety in diagrams. For some things, a flowchart would fit better than a sequence diagram (e.g., a durable state machine workflow written using AWS Step Functions).
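A rough sketch of what that prompt refinement could look like; the wording and names below are invented for illustration and are not the project's actual prompt:

    # Hypothetical prompt tweak: pin the audience level and let the model
    # choose the diagram type per concept instead of defaulting to one kind.
    AUDIENCE = "experienced software engineers who already know what APIs and GraphQL are"

    TUTORIAL_PROMPT = f"""
    Write this tutorial chapter for {AUDIENCE}.
    - Skip beginner analogies (no "an API is like a restaurant" explanations).
    - Choose the diagram that fits the concept:
      * sequence diagram for request/response interactions,
      * flowchart for state machines and branching workflows (e.g. AWS Step Functions),
      * class diagram for data models.
    """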
Exactly. It is rather impressive, but at the same time the audience is always going to be engineers, so perhaps it can be curated to still be technical to a degree? I can't imagine a scenario where I have to explain my ETL pipeline to the VP.
I built browser use. Dayum, the results for our lib are really impressive, you didn’t touch outputs at all?
One problem we have is keeping the docs in sync with the current codebase (code examples break sometimes). I wonder if I could use parts of Pocket to help with that.
As a maintainer of a different library, I think there's something here. A revised version of this tool that is also fed the docs and asked to find inaccuracies could be great. Even if false positives and false negatives are, let's say, 20% each, it would still be better than before, since the final decisions are made by a human.
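A minimal sketch of that doc-checking idea, assuming a generic LLM call; call_llm here is a stand-in for whatever client you use and is not part of this project:

    from pathlib import Path

    def call_llm(prompt: str) -> str:
        """Stand-in for your LLM client of choice (Gemini, OpenAI, ...)."""
        raise NotImplementedError

    def find_doc_inaccuracies(doc_path: str, source_paths: list[str]) -> str:
        doc = Path(doc_path).read_text()
        code = "\n\n".join(Path(p).read_text() for p in source_paths)
        prompt = (
            "Below is a documentation page and the source files it describes.\n"
            "List every statement or code example in the docs that no longer matches "
            "the code (renamed functions, changed signatures, removed options). "
            "Quote the doc passage and the conflicting code.\n\n"
            f"--- DOCS ---\n{doc}\n\n--- CODE ---\n{code}"
        )
        return call_llm(prompt)

    # A maintainer reviews the reported mismatches; even with ~20% false
    # positives/negatives, the human still makes the final call.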
Woah, this is really neat.
My first step for many new libraries is to clone the repo, launch Claude code, and ask it to write good documentation for me. This would save a lot of steps for me!
Really nice work, and thank you for sharing. These are great demonstrations of the value of LLMs, which helps push back against the negative view of their impact on junior engineers.
This helps bridge the gap left by most projects lacking up-to-date documentation.
At the top there's some neat high-level stuff, but below that it quickly turns into code-written-in-human-language.
I think it should be possible to extract some more useful usage patterns by poking into the related unit tests. How to use it should be what matters most to tutorial readers.
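One way to mine tests for usage patterns, sketched under the assumption of a pytest-style Python repo; the file and function naming conventions here are just assumptions, not something this tool already does:

    import ast
    from pathlib import Path

    def usage_snippets(repo_root: str, api_name: str) -> list[str]:
        """Collect test functions that exercise `api_name`, as raw source snippets,
        so they can be fed to the tutorial prompt as real usage examples."""
        snippets = []
        for test_file in Path(repo_root).rglob("test_*.py"):
            source = test_file.read_text()
            for node in ast.walk(ast.parse(source)):
                if isinstance(node, ast.FunctionDef) and node.name.startswith("test_"):
                    snippet = ast.get_source_segment(source, node)
                    if snippet and api_name in snippet:
                        snippets.append(snippet)
        return snippets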
Exactly, kudos to the author, because AI didn't come up with that.
But that's what they sell: that AI could do what the author did with AI.
The question is whether it's worth putting all that money and energy into AI. MS sacrificed its CO2 goals for email summaries and better autocomplete, not to mention all the useless things we do with AI.
Do you have plans to expand this to include more advanced topics like architecture-level reasoning, refactoring patterns, or onboarding workflows for large-scale repositories?
Yes! This is an initial prototype. Good to see the interest, and I'm considering digging deeper by creating more tailored tutorials for different types of projects. E.g., if we know it's web dev, we could generate tutorials based more on request flows, API endpoints, database interactions, etc. If we know it's a long-term maintained project, we can focus on identifying refactoring patterns.
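A hypothetical sketch of that tailoring step: detect the project type from its files, then pick a chapter outline to steer the prompt. The detection rules and outlines below are invented, not the project's actual logic:

    from pathlib import Path

    OUTLINES = {
        "web": ["Request flow", "API endpoints", "Database interactions", "Auth"],
        "library": ["Public API", "Core abstractions", "Extension points"],
        "long_term": ["Module boundaries", "Refactoring patterns", "Tech-debt hotspots"],
    }

    def detect_project_type(repo_root: str) -> str:
        files = {p.name for p in Path(repo_root).rglob("*") if p.is_file()}
        if {"manage.py", "routes.py", "urls.py"} & files:
            return "web"
        if {"setup.py", "pyproject.toml"} & files:
            return "library"
        return "long_term"

    chapters = OUTLINES[detect_project_type(".")]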
Have you ever seen komment.ai? If so, did you have any issues with the limitations of the product?
I haven't used it, but it looks like it's in the same space and I've been curious about it for a while.
I've tried my own homebrew solutions: creating embedding databases by having something like aider or simonw's llm produce an ingest JSON for every function, then using that as RAG in qdrant to write an architecture document, then using that for contextual inline function comments and a doxygen build, and then using all of that once again as an MCP with playwright to hook it up through roo.
It's a weird pipeline and it's been OK; not great, but OK.
I'm looking into perplexica as part of the chain, mostly as a negation tool.
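For anyone curious what the embedding step of a pipeline like that looks like, here is a minimal sketch with sentence-transformers and qdrant; the function records are made up, and this is not the commenter's actual setup:

    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, PointStruct, VectorParams
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
    client = QdrantClient(":memory:")  # or point at a running qdrant server

    client.create_collection(
        collection_name="functions",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )

    functions = [  # one record per function, e.g. produced by the ingest step
        {"name": "parse_config", "doc": "Load and validate the YAML config."},
        {"name": "run_pipeline", "doc": "Execute all pipeline stages in order."},
    ]
    client.upsert(
        collection_name="functions",
        points=[
            PointStruct(id=i, vector=encoder.encode(f["doc"]).tolist(), payload=f)
            for i, f in enumerate(functions)
        ],
    )

    hits = client.search(
        collection_name="functions",
        query_vector=encoder.encode("how is configuration loaded?").tolist(),
        limit=3,
    )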
One thing to note is that the tutorial generation depends largely on Gemini 2.5 Pro. Its code-understanding ability is very good, and combined with its large 1M context window it gets a holistic understanding of the code, which leads to very satisfactory tutorial results.
However, Gemini 2.5 Pro was released just late last month. Since Komment.ai launched earlier this year, I don't think models at that time could generate results of that quality.
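For reference, calling Gemini with a whole (small enough) repo dumped into the context looks roughly like this with the google-generativeai package; the exact model id and prompt wording here are assumptions, not necessarily what the project uses:

    import google.generativeai as genai

    genai.configure(api_key="YOUR_AI_STUDIO_KEY")
    model = genai.GenerativeModel("gemini-2.5-pro")  # model id may vary by release

    repo_text = open("repo_dump.txt").read()  # concatenated source files
    response = model.generate_content(
        "Here is the full source of a repository:\n\n"
        + repo_text
        + "\n\nWrite a chapter-by-chapter tutorial explaining its core abstractions."
    )
    print(response.text)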
I've been using llama 4 Maverick through openrouter. Gemini was my go to but I switched basically the day it came out to try it out.
I haven't switched back. At least for my use cases it's been meeting my expectations.
I haven't tried Microsoft's new 1.58 bit model but it may be a great swap out for sentencellm, the legendary all-MiniLM-L6-v2.
I've found that when I'm unfamiliar with the knowledge domain I mostly use AI, but as I dive in, the ratio of AI to human shifts until AI is at 0 and it's all human.
Basically, AI wins on day 1 but isn't any better by day 50. If that can change, then it's the next step.
Yeah, I'd recommend trying Gemini 2.5 Pro. I know the early Gemini models weren't great, but the recent one is really impressive in terms of coding ability. This project is kind of designed around that recent breakthrough.
I suppose I'm just a little bit bothered by your saying you "built an AI" when all the heavy lifting is done by a pretrained LLM. Saying you made an AI-based program or hell, even saying you made an AI agent, would be more genuine than saying you "built an AI" which is such an all-encompassing thing that I don't even know what it means. At the very least it should imply use of some sort of training via gradient descent though.
The Linux repository has ~50M tokens, which goes beyond the 1M token limit for Gemini 2.5 Pro.
I think there are two paths forward: (1) decompose the repository into smaller parts (e.g., kernel, shell, file system, etc.), or (2) wait for larger-context models with a 50M+ input limit.
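A rough sketch of option (1), assuming you split by top-level directory and estimate tokens at ~4 bytes each (a crude heuristic, not the model's real tokenizer):

    from collections import Counter
    from pathlib import Path

    def tokens_per_top_dir(repo_root: str, budget: int = 1_000_000) -> None:
        root = Path(repo_root)
        counts = Counter()
        for f in root.rglob("*"):
            if f.is_file() and f.suffix in {".c", ".h", ".py", ".rs"}:
                # Attribute the file's rough token count to its top-level directory.
                counts[f.relative_to(root).parts[0]] += f.stat().st_size // 4
        for subdir, tokens in counts.most_common():
            status = "fits" if tokens <= budget else "needs further splitting"
            print(f"{subdir}: ~{tokens:,} tokens ({status})")

    tokens_per_top_dir("linux")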
Some huge percentage of that is just drivers. The kernel is likely what would be of interest to someone in this regard; moreover, much of that is architecture specific. IIRC the x86 kernel is <1M lines, though probably not <1M tokens.
Put in the postgres or redis codebase, get a good understanding, and get going on contributing.
Like at least one other person in the comments mentioned, I would like a slightly different tone.
Perhaps a good feature would be a "style template" that can be chosen to match your preferred writing style.
I may submit a PR though not if it takes a lot of time.
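A minimal sketch of how the "style template" idea above might work as prompt prefixes; the preset names and wording are invented for illustration:

    # Hypothetical style presets prepended to the tutorial-generation prompt.
    STYLE_TEMPLATES = {
        "concise-engineer": (
            "Write for experienced engineers. No analogies, no restating basics; "
            "prefer code snippets and exact type/function names."
        ),
        "onboarding": (
            "Write for a new team member. Briefly explain domain terms and "
            "link each concept to the file where it lives."
        ),
    }

    def build_prompt(chapter_request: str, style: str = "concise-engineer") -> str:
        return f"{STYLE_TEMPLATES[style]}\n\n{chapter_request}"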
https://news.ycombinator.com/item?id=42542512
The latter is what this thread claims ^
The hype is that AI isn’t a tool but the developer.
https://news.ycombinator.com/item?id=41542497
Can you give an example of what you meant here? The author did use AI. What does "AI coming up with that" mean?
In a few years we will see complaints that it's not AI that built the power station and the datacenter, so it doesn't count either.