Show HN: Satellite imagery object detection using text prompts

6 points • by eyasu6464 • yesterday at 7:52 AM • 0 comments • view on HN

I built a browser-based tool for detecting objects in satellite imagery using vision-language models (VLMs). You draw a polygon on the map and enter a text prompt such as "swimming pools", "oil tanks", or "buses". The system scans the selected area tile-by-tile and returns detections projected back onto the map as GeoJSON.

Pipeline: select area and zoom level, split the region into mercantile tiles, run each tile with the prompt through a VLM, convert predicted bounding boxes to geographic coordinates (WGS84), and render the results back on the map.

It works reasonably well for distinct structures in a zero-shot setting. occluded objects are still better handled by specialized detectors like YOLO models.

There is a public demo and no login required. I am mainly interested in feedback on detection quality, performance tradeoffs between VLMs and specialized detectors, and potential real-world use cases.

alt Hacker News

Show HN: Satellite imagery object detection using text prompts

Comments