Constructing a Search Engine for Programmers

37 comments
Hey HN, I’ve just recently started working on a side project. It’s basically a vertical search engine for programmers. You’ll be able to quickly search documentation, GitHub repos and Stack Overflow. It’ll know what language you’re using and what project you’re working on, and tailor results accordingly.

What other features would you want to see in this tool?

Here’s a use case you may or may not be interested in: security research.

As a security analyst, when I’m trying to figure out what a malicious script or executable does, it often involves searching for unusual strings I find in files that seem like they would be fairly unique. Sometimes I even search for specific combinations of strings, though that isn’t easy.

Google used to be amazing for this. It is still the best I know of, but it has gotten steadily worse over the years. Essentially, the best tool for the job is one I would describe as infuriatingly bad.

I think the problem is that it does too much to try to protect people from malicious stuff on the web. It also does too much to guess what you might actually want instead of giving you what you asked for.

Probably the biggest single thing Google has done to screw it up is that it no longer respects quotes. Maybe there’s a workaround, but I haven’t figured it out. 10-15 years ago, if I wrapped something in quotes, Google would give me exactly what I wanted. Now it’s extremely finicky, and 70% of the time it gives me what it thinks I want. These guesses are often wrong.

That being said, there is also value in being able to search for a code snippet or another string and get things that are very similar, even when they aren’t an exact match. I think having several modes would be useful.
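
To make the "several modes" idea concrete, here is a minimal sketch of an exact mode next to a similarity mode. It assumes the corpus is just a list of short strings, and it uses Python’s standard `difflib` as a stand-in for whatever similarity measure a real engine would use. The function names and the threshold are made up for illustration.

```python
from difflib import SequenceMatcher


def exact_search(query, documents):
    """Exact mode: keep documents containing the query verbatim,
    punctuation and spaces included."""
    return [doc for doc in documents if query in doc]


def similar_search(query, documents, threshold=0.6):
    """Similarity mode: keep documents whose overall similarity to the
    query is at least `threshold`, best matches first."""
    results = []
    for doc in documents:
        score = SequenceMatcher(None, query, doc).ratio()
        if score >= threshold:
            results.append((score, doc))
    return sorted(results, reverse=True)
```

The point is that the caller picks the mode explicitly, so a quoted string is never silently rewritten into a "did you mean" query.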

I think there may be some overlap between what I described above as useful for security researchers and what may be useful for programmers.

Wow, this is extremely insightful. Security research might be a really good niche starting point too.

I don’t want to be checking boxes or toggles, or retrying the query with different text over and over, to get the engine to understand these distinctions or when one class of answers is appropriate versus the other.

Literal search, including punctuation and spaces.

Absolutely this. Google strips an engineer’s search down to make it a consumer search; not doing this would be a huge value add.

Tools around search history. Things like marking results as your personal answers to a query, and trying to detect when you’re asking something you’ve asked before.

This is based on something Hillel Wayne (@hillelogram) wrote on Twitter not that long ago. The gist of it was that it’s OK that we’re all using Google as part of our programming workflows, but why on earth should we ever need to ask the same question twice? If you can find those threads, there might be more there than what I just said.
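
A rough sketch of what "never ask the same question twice" could look like, assuming the history lives in memory on the user’s own machine. The `SearchHistory` class and its methods are hypothetical names for illustration, and the normalization is deliberately naive (lowercasing and whitespace collapsing only).

```python
import re


def normalize(query):
    """Lowercase and collapse whitespace so trivially different phrasings match."""
    return re.sub(r"\s+", " ", query.strip().lower())


class SearchHistory:
    def __init__(self):
        self._pinned = {}  # normalized query -> URL the user marked as the answer
        self._counts = {}  # normalized query -> number of times it was asked

    def record(self, query):
        """Record a query and return the pinned answer for it, or None."""
        key = normalize(query)
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._pinned.get(key)

    def pin(self, query, url):
        """Mark `url` as the user's personal answer to `query`."""
        self._pinned[normalize(query)] = url

    def times_asked(self, query):
        return self._counts.get(normalize(query), 0)
```

A real tool would probably want fuzzier matching between queries, but even this much would catch the "I looked this up last week" case.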

Obviously privacy is a concern here, but while I’m leery of Google knowing everything I do, I’m a lot happier with a technical search engine knowing which bits of syntax I can’t remember.

It’s fine with me if Google sees some essentially anonymous query for “apple” from me, sees what I picked, and learns in aggregate what searches for “apple” mean, maybe even with extra context that helps distinguish different modes in the results.

It starts being a big problem if, the next time I visit, Google first identifies that it’s me, looks up a special index of my chosen results, and changes the results for my query to be those.

– It would be nice to detect the system you’re using instead of writing “Ubuntu 19.04 64 bit” before queries.

– It would be even cooler to detect the IDE, or even the error message itself (obviously being careful not to leak sensitive info).
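
As a sketch of both points, the snippet below collects coarse system details with Python’s standard `platform` module and scrubs a couple of obviously sensitive patterns from an error message before it would be sent anywhere. The `system_context` and `redact` helpers are hypothetical, and the redaction rules are only examples, nowhere near a complete privacy filter.

```python
import platform
import re


def system_context():
    """Collect coarse facts about the local machine to attach to a query."""
    return {
        "os": platform.system(),          # e.g. "Linux", "Darwin", "Windows"
        "os_release": platform.release(),
        "arch": platform.machine(),       # e.g. "x86_64"
        "python": platform.python_version(),
    }


def redact(message):
    """Replace home-directory user names and e-mail addresses with placeholders."""
    message = re.sub(r"(/home/|/Users/|C:\\Users\\)[^/\\\s]+", r"\1<user>", message)
    message = re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "<email>", message)
    return message
```

Detecting the IDE is harder to do from the outside, which is one more argument for shipping the search as an editor extension.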

I often find that obscure articles have more in-depth answers to very niche problems than SO.

Also, the market definitely exists. A lot of the time, if I am not sure how to phrase my question yet, Google really sucks. It also often fails to find projects or tricks that I know exist and am able to find through my GitHub stars or browser history after some tedious searching.

I feel this! I use GitHub stars more than I care to admit, especially when it comes to unique libraries that I’ll never find through Google.

If you include code snippets, it would be great to be able to easily export them to an online playground or other sandbox.

Often I find an example or documentation and copy it to a playground to modify it or try out whatever feature I am implementing.

Besides date ranking, which has been requested, maybe rank code samples and GH issues on some proxy for code quality? Meaning, I would prefer to see a snippet from django or numpy before some aardvark repo.
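
One way to picture that kind of ranking, as a sketch only: multiply the text relevance by a dampened popularity signal and a recency decay. Using star counts as the quality proxy, the log dampening, and the one-year half-life are all assumptions picked for illustration, not something anyone in the thread proposed.

```python
import math
from datetime import date


def rank_score(text_relevance, stars, last_updated, today, half_life_days=365):
    """Combine text relevance with a popularity proxy and a recency decay."""
    quality = math.log10(stars + 1)                # dampen huge star counts
    age_days = max((today - last_updated).days, 0)
    recency = 0.5 ** (age_days / half_life_days)   # halves every half_life_days
    # Recency never removes more than half the score, so old but
    # well-established results are demoted rather than buried.
    return text_relevance * (1 + quality) * (0.5 + 0.5 * recency)
```

With equal text relevance, a snippet from a widely starred repo scores above one from a repo with a handful of stars, and a recently updated result scores above a stale one.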

The best possible preview snippets would be essential when searching for trivial stuff: I can’t remember exactly how to do something but knew it once (Google does that for the best SO match, I think).

Exception messages might be an important thing to focus on. That is the second case where search engines usually matter to me (support fuzzy search here: abstract away the overly concrete details but keep the actual message).
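
A minimal sketch of "abstract away the concrete details but keep the message", assuming a few regular-expression rules are enough to turn an exception message into a search key. The rule list and placeholder names are illustrative, and the path rule is crude enough that it will also catch things like "and/or".

```python
import re

# Order matters: hex addresses go first so their leading "0" is not
# picked up by the plain-number rule.
_RULES = [
    (re.compile(r"0x[0-9a-fA-F]+"), "<hex>"),
    (re.compile(r"(?:[A-Za-z]:)?(?:[\\/][\w.\-]+)+"), "<path>"),
    (re.compile(r"'[^']*'|\"[^\"]*\""), "<str>"),
    (re.compile(r"\b\d+\b"), "<num>"),
]


def normalize_exception(message):
    """Replace run-specific values with placeholders, keeping the message shape."""
    for pattern, placeholder in _RULES:
        message = pattern.sub(placeholder, message)
    return message
```

Two people hitting the same error with different keys, indexes or file names would then produce the same search key.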

It would be great if you understood versioning of documentation: it always takes me a while to figure out whether the docs apply to the version I am actually using.
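
As a sketch of version-aware results, assuming the engine knows which version is installed and which documentation versions exist, it could pick the newest docs that are not newer than the installed release. Both helper names are hypothetical, and the parser only handles purely numeric versions like "2.2.7".

```python
def parse_version(text):
    """Turn '3.2.1' into (3, 2, 1). Suffixes like 'rc1' are not handled here."""
    return tuple(int(part) for part in text.split("."))


def best_docs_version(installed, available):
    """Pick the newest documented version that is not newer than `installed`."""
    target = parse_version(installed)
    candidates = [v for v in available if parse_version(v) <= target]
    if not candidates:
        return None
    return max(candidates, key=parse_version)
```

For example, with docs published for 1.11, 2.1, 2.2 and 3.0, someone running 2.2.7 would be sent to the 2.2 docs.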

Browsing for the most relevant version is painful!

Google and DuckDuckGo both seem pretty bad at surfacing newer versions of source code and documentation, so there’s a lot of room for improvement there.

I would like to be able to right-click on an error in my IDE (I use VS Code) and then run a search on that (filtered for the current language, env, etc.).

Wow, totally! Xcode kind of does this, but it would save so much time if it worked properly!

So a mix of easy search of the Wayback Machine and a search across all online books.

Wow! I never thought of books as a source of information for this. This could be super useful, something that Google doesn’t currently do.

A VS Code extension that contributes a command to search it and opens the search results in the right pane. I assume you may already have a VS Code extension in the works, as that would be the best way to learn what project and language are being worked on.

Yep, my solution was more like signing in with GitHub and choosing your current project, but a VS Code extension is a no-brainer!

I would want Google searches that have been filtered for the current language, tools, etc.

Yeah! The search engine should know what project you’re working on. Seems like such a basic thing that hasn’t been built yet, haha.
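
Even without an editor extension, a rough guess at the project’s language is cheap to get. The sketch below counts source files by extension under a project directory. The extension table is a small, assumed sample, and `guess_project_language` is a hypothetical helper, not part of any existing tool.

```python
from collections import Counter
from pathlib import Path

# Illustrative mapping only; a real tool would need a much longer table.
EXTENSION_TO_LANGUAGE = {
    ".py": "Python",
    ".js": "JavaScript",
    ".ts": "TypeScript",
    ".go": "Go",
    ".rs": "Rust",
    ".java": "Java",
    ".swift": "Swift",
}


def guess_project_language(root):
    """Return the language with the most source files under `root`, or None."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            language = EXTENSION_TO_LANGUAGE.get(path.suffix.lower())
            if language:
                counts[language] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

The result could then be attached to every query as a filter, so nobody has to type the language name in front of each search.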

I think developers spend a lot of time searching for solutions online, and making that process more intuitive will be a huge win for everyone, haha.

I could see this being really useful. If you’re looking for help, let me know. I’d like to spend time on something like this.
