
Ya, vector search is certainly the most common hammer to reach for when you're trying to figure out what subset of your full dataset to share with an LLM. And you're right, it's probably a piece of the puzzle for coding with GPT against a large codebase.

But I think code has such a semantically useful structure that we should try to exploit it as much as possible before falling back to "just search for stuff that seems similar".

Check out the "future work" section near the end of the writeup I linked above. I have a few possible improvements to this basic "repo map" concept that I'm experimenting with now.
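For a rough sense of what "exploiting the structure" can mean, here is a minimal sketch that boils a Python file down to its class and function signatures. It is only an illustration using the standard `ast` module, not the approach from the writeup; `outline` and `sample` are made-up names for this example.

```python
import ast


def outline(source: str) -> list[str]:
    """Return one line per top-level function, class, and method in `source`."""
    defs = (ast.FunctionDef, ast.AsyncFunctionDef)

    def signature(node) -> str:
        # Only plain positional parameter names; defaults and annotations are dropped.
        args = ", ".join(a.arg for a in node.args.args)
        return f"def {node.name}({args})"

    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, defs):
            lines.append(signature(node))
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}")
            # Indent methods under their class so the nesting stays visible.
            for item in node.body:
                if isinstance(item, defs):
                    lines.append("    " + signature(item))
    return lines


# Hypothetical input, just to show the shape of the result.
sample = '''
class Repo:
    def __init__(self, root):
        self.root = root

    def files(self):
        return []

def main(argv):
    return 0
'''

print("\n".join(outline(sample)))
```

The outline is far smaller than the source it came from, which is the point: it leaves room in the context window for the code actually being edited.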



No, I agree the distilled map is most useful in context. However, I wonder if providing a vector store of the whole codebase would amplify the effect. You could also pull in vector stores of all the dependencies. Regardless, amazing work, and I'm looking forward to seeing the future work you outlined.



