Hacker News

Show HN: StartupWiki – A Free Alternative to Crunchbase

220 points by shpran yesterday at 3:59 PM | 66 comments

I've been building StartupWiki, a free startup database designed to make it easier to discover and research companies.

The original motivation was frustration with how difficult it can be to find information on early-stage startups. Most databases require accounts or subscriptions, or just feel too cluttered. I wanted a website that felt like Wikipedia: no accounts, no subscriptions, no weird metrics. You just go in, and the info is on the page.

The project is still very early, but currently includes:

- Startup profiles
- Search and filtering
- Company categorization
- Public API (in progress)

I'm especially interested in feedback on:

- What information you look for when researching startups
- Features missing from existing startup databases
- API use cases

I'd love to hear feedback.


Comments

tlb yesterday at 7:25 PM

0 for 10 on some startups (large and small, YC and not) that came to mind.

It's easy to scrape YC startups from https://www.ycombinator.com/companies. Scrape that and a dozen other investors' portfolio pages and you'll have a useful fraction of startups.
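
Once several portfolio pages have been scraped, the same company will show up under more than one investor. A minimal sketch of that merge step, assuming each page has already been reduced to (company name, website) pairs. The investor keys, the company entries, and the `domain_key` and `merge_portfolios` helpers are all hypothetical, and nothing here reflects how StartupWiki actually ingests data.

```python
from urllib.parse import urlsplit


def domain_key(url: str) -> str:
    """Normalize a website URL to a bare host so duplicates collapse."""
    # urlsplit only recognizes the host when a scheme or '//' is present.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.hostname or ""  # hostname is already lowercased
    return host[4:] if host.startswith("www.") else host


def merge_portfolios(portfolios: dict[str, list[tuple[str, str]]]) -> dict[str, dict]:
    """Merge per-investor lists into one record per company domain."""
    merged: dict[str, dict] = {}
    for investor, companies in portfolios.items():
        for name, website in companies:
            key = domain_key(website)
            if not key:
                continue  # skip entries without a usable website
            # The first name seen for a domain wins; investors accumulate.
            record = merged.setdefault(key, {"name": name, "investors": set()})
            record["investors"].add(investor)
    return merged


# Hypothetical scraped input: two investors listing the same company.
portfolios = {
    "investor-a": [("Example Labs", "https://www.example.com/about")],
    "investor-b": [("Example Labs Inc.", "example.com"), ("Other Co", "http://other.test")],
}
for domain, record in sorted(merge_portfolios(portfolios).items()):
    print(domain, record["name"], sorted(record["investors"]))
```

Keying on the website domain rather than the company name is one possible choice; names vary across portfolio pages ("Inc.", "Labs") while the domain usually does not.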

networked today at 11:55 AM

What is your content/data license? I don't see anything about this on the site. For something to feel like a community wiki, the community needs to co-own the content and be able to fork. If you think the content is in the public domain because of AI, applying a license like CC BY or CC BY-SA won't hurt, but if the content is copyrighted, not applying a license will. (This isn't legal advice.) See "WP:CRANDO" (https://en.wikipedia.org/wiki/Wikipedia:Copyrights#Contribut...) for how Wikipedia does it.

CharlesW yesterday at 5:55 PM

I expected the VERIFIED badges to link to some sort of provenance information. That seems like a must, otherwise (given the "assume everything's incorrect" disclaimers) I'm not sure why one would take that badge seriously.

adrianwaj today at 12:47 AM

It's a good idea. Why not ask startups to upload a startup.txt (as opposed to robots.txt) to their web root and collect from that? Pre-filled text forms can be downloaded. Also, as with CB, collect data on individuals through a similar opt-in. Enable users to ping your site when it's ready to collect.

You could have a "traction" stat and ask for a JS snippet to be installed on homepages or a set of pages. Old school and unreliable. Registered user counts are also a good way to assess traction. Not sure how that information could be readily obtained.

In my previous comment I mentioned attaching a crypto address to domains - you could do that too. That'd be interesting. One feature you could add long-term is crowdfunding. Either for new features, code releases, media, documents - whatever.

Crowdfunding activity on startups and individuals would be a great way to measure traction.
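
The startup.txt idea could be sketched as a simple "Key: value" file served from the web root. No such format exists as a standard; the field names, the `StartupProfile` container, and the `parse_startup_txt` helper below are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StartupProfile:
    """Self-reported fields; every key here is hypothetical."""
    fields: dict[str, str] = field(default_factory=dict)
    unknown_lines: list[str] = field(default_factory=list)


def parse_startup_txt(text: str) -> StartupProfile:
    """Parse 'Key: value' lines, ignoring blank lines and '#' comments."""
    profile = StartupProfile()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first ':' only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            profile.unknown_lines.append(line)  # keep for manual review
            continue
        profile.fields[key.strip().lower()] = value.strip()
    return profile


# Hypothetical file contents a company might publish at /startup.txt.
sample = """\
# startup.txt (hypothetical format)
Name: Example Labs
Founded: 2023
Website: https://example.com
this line has no separator
"""

profile = parse_startup_txt(sample)
print(profile.fields)
print(profile.unknown_lines)
```

Collecting unparseable lines instead of raising keeps one malformed entry from discarding an otherwise usable file, which seems a reasonable default for self-published data.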

dgrin91 yesterday at 5:33 PM

It sounds like none of the data will be reliable? With AI and community contributions, it seems like very little will be true, and I will have no idea which part is.

clapthewind today at 5:47 AM

Build trust and crowdsource the data if you want to succeed at this.

Build trust by truly making this a public good: open source it. Be the maintainer. Publish a data dump every week as a zipball/tarball. These will ensure you can't rugpull.

With this trust, offer an extension (open source, of course) to everyone, which, whenever a user browses Crunchbase, Tracxn, etc., sends any factual data (hence non-copyrightable) to you. If you gained trust, I would also do this.

You get the right to be a maintainer, and figure out if you also want to make a business with it on top.

Calgaryp today at 1:03 PM

Hello! Great project. Do you plan to make it open source, since it is already free to use? If it already is, I couldn't find the GitHub repository.

chaidhat yesterday at 8:59 PM

How about exposing an API so that users can enter the name of a startup and it goes through your AI agent pipeline to acquire an estimate? That way, you don't need to know every startup under the sun and can focus on optimizing your pipeline instead.

pi-victor yesterday at 9:46 PM

a random complaint on my part would be the log in with google. hate that. looks great, otherwise. i don't even have a problem creating an account, honestly. but i try to not use google for anything unless i have to.

zopper today at 9:09 AM

Really cool concept but so much of the data is wrong. Anthropic ARR is an order of magnitude higher, Replicate did a Series B as well which is not mentioned. There is probably a lot more.

wett today at 2:03 AM

Would you consider allowing people to log in with OpenRouter?

https://openrouter.ai/docs/guides/overview/auth/oauth

Would be a good way to have others absorb some of your inference limits and fill in missing data that they need. A call to action on a blank search would be a great flow.

shpran yesterday at 8:10 PM

just added an agent ledger, it shows exactly what the agents were doing during the pipeline, you can find it at the top of the sources tab. (it truncates part of the ledger sometimes though, working on fixing that bug)

lowkey_ today at 2:30 AM

Looks super cool and love the idea!

I saw Clerk and noticed that it says they have a verified 250 months of runway. Maybe true, but it sounds crazy high.

Maybe if there's a specific article a verification is attributed to, you could cite it?

Anyways thanks for making this.

shpran today at 1:07 AM

I just approved a whole bunch of micro businesses! I do not get paid for this, and these aren't ads. I do it so people can find under-the-radar early companies.

chirau yesterday at 7:49 PM

You may be relying too much on AI to do the heavy lifting for you. If you are sending out agents, you should have strict rules around the recency of the data they are aggregating. Otherwise, you will end up with outdated and useless data.
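
One way such a recency rule might look, as a minimal sketch: every fact an agent returns carries the publication date of its source, anything older than a cutoff is dropped, and only the newest surviving fact per field is kept. The `SourcedFact` type, the `freshest_facts` helper, and the 365-day default are assumptions for illustration, not a description of the site's pipeline.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class SourcedFact:
    field: str       # e.g. "valuation"
    value: str
    published: date  # publication date of the source the agent cited


def freshest_facts(facts, today: date, max_age_days: int = 365) -> dict[str, SourcedFact]:
    """Keep the newest fact per field, dropping anything older than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    newest: dict[str, SourcedFact] = {}
    for fact in facts:
        if fact.published < cutoff:
            continue  # source is too old to trust
        current = newest.get(fact.field)
        if current is None or fact.published > current.published:
            newest[fact.field] = fact
    return newest


# Placeholder values only; none of these are real figures.
facts = [
    SourcedFact("valuation", "older figure", date(2023, 1, 15)),
    SourcedFact("valuation", "newer figure", date(2025, 3, 1)),
    SourcedFact("headcount", "stale figure", date(2021, 6, 1)),
]
kept = freshest_facts(facts, today=date(2025, 6, 1))
for name, fact in kept.items():
    print(name, fact.value, fact.published.isoformat())
```

A field with no sufficiently recent source simply ends up absent, which lets the page show "unknown" rather than an outdated number.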

brianbreslin today at 3:25 AM

Can you add founder university affiliation? That's the only thing I begrudgingly pay Crunchbase for, and it's wildly inaccurate.

holistio yesterday at 6:15 PM

It is unclear how I can list my company here. Are small companies coming later?

sixtyj yesterday at 6:47 PM

https://news.ycombinator.com/item?id=48572472

Why do you ask again for feedback after three days?

djvdq yesterday at 6:38 PM

I see quite outdated data. Anthropic is listed with an 18B valuation and a latest round of 4B? For comparison, their real latest round was 65B at a 965B valuation.

brokensegue yesterday at 8:34 PM

You should link your data to Wikidata, which will get you a free connection back to Crunchbase and other sources, e.g. https://www.wikidata.org/wiki/Q97041185

You could even backfill some of the data from there.
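
A minimal sketch of what reading such a link might involve, assuming the entity JSON has already been downloaded from Wikidata's `Special:EntityData` endpoint. The `entity_url` and `external_ids` helpers are hypothetical, the sample payload is invented rather than real Wikidata content, and `P2088` is used on the assumption that it is the Crunchbase organization ID property, which should be verified before relying on it.

```python
import json

ENTITY_DATA_URL = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def entity_url(qid: str) -> str:
    """Build the JSON export URL for one Wikidata item."""
    return ENTITY_DATA_URL.format(qid=qid)


def external_ids(entity_json: dict, qid: str, prop: str) -> list[str]:
    """Return the string values of one property from an entity document."""
    claims = entity_json.get("entities", {}).get(qid, {}).get("claims", {})
    values = []
    for statement in claims.get(prop, []):
        # Statements marked "no value" or "unknown value" have no datavalue.
        datavalue = statement.get("mainsnak", {}).get("datavalue")
        if datavalue and isinstance(datavalue.get("value"), str):
            values.append(datavalue["value"])
    return values


# Invented payload in the shape of an entity document; "Q0" is a placeholder.
sample = json.loads("""
{"entities": {"Q0": {"claims": {"P2088": [
  {"mainsnak": {"snaktype": "value",
                "datavalue": {"value": "example-org", "type": "string"}}}
]}}}}
""")

print(entity_url("Q97041185"))
print(external_ids(sample, "Q0", "P2088"))
```

Keeping the parsing separate from the download makes the extraction easy to test offline and leaves rate limiting and caching to whatever HTTP client is used.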

androiddrew yesterday at 10:52 PM

Yeah, tried my own startup and found nothing. I don't know where your sources come from.

LewisVerstappen yesterday at 7:16 PM

Mobile view is not working on my iPhone. Scroll is messed up and the page is not properly fitting in the view.

rkwap yesterday at 6:11 PM

Nice initiative, but I am concerned about the reliability of the data. How are you going to take care of that?

physix today at 12:04 AM

How about adding some stats to your landing page?

dineshmendhe yesterday at 8:22 PM

I wonder why there are No Micro Companies Yet on the platform?

hoomanmo yesterday at 10:56 PM

The search for Luma and Saronic didn't work

anandukch yesterday at 7:22 PM

How are you going to ensure the genuineness of the data?

shpran yesterday at 7:37 PM

just added roughly 20 startups, focusing on biotech

_el1s7 today at 8:20 AM

Looks like vibe-coded slop.