Genkan
A free, English-language, map-first explorer for cheap Japanese property: ~9,900 vacant-home (akiya) listings across 46 prefectures, plus the courts' foreclosure auctions, all built from public government data. A working answer to “what's the moat of a property-listings site?”
The thesis
Akiya-mart's moat isn't its listings — those are a commodity, aggregated from the same public Japanese property portals and municipal akiya banks anyone can reach. The moat is the continuous ingest/normalize/freshness pipeline, the SEO, and a paid human concierge. So the interesting engineering question is: can you rebuild that pipeline from the same upstream sources? Genkan is the answer — and then it goes further, on the things an aggregator under-serves for foreign buyers: risk legibility, yield, and court auctions in English.
The data pipeline
Crawl
Node + Playwright
Polite, resumable crawlers (honest UA, ≥3.5s spacing, backoff, no concurrency) over Japan's national vacant-home bank and the courts' foreclosure-auction site. Facts only; photos/PDFs are linked, never re-hosted.
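The pacing rules above can be sketched in a few lines. This is a minimal illustration, not the project's crawler: the names (crawlPolitely, backoffMs, FetchPage) are hypothetical, the 60 s backoff cap and retry count are assumptions, and the page fetcher is injected so the sketch makes no claims about Playwright's API. Only the 3.5 s floor, the backoff, the one-request-at-a-time loop, and resumability come from the text.

```typescript
// Hypothetical helper types: the caller supplies the actual page fetcher.
type FetchPage = (url: string) => Promise<string>;
type Sleep = (ms: number) => Promise<void>;

const MIN_SPACING_MS = 3500; // the ">= 3.5s spacing" floor from the text

// Exponential backoff: 3.5 s, 7 s, 14 s, ... The 60 s cap is an assumption.
function backoffMs(attempt: number, capMs: number = 60000): number {
  return Math.min(capMs, MIN_SPACING_MS * Math.pow(2, attempt));
}

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Fetches URLs strictly one at a time. `done` holds results from earlier
// runs, so a restarted crawl skips what it already has (resumable).
async function crawlPolitely(
  urls: string[],
  fetchPage: FetchPage,
  done: Map<string, string>,
  sleep: Sleep = realSleep,
  maxRetries: number = 3
): Promise<Map<string, string>> {
  for (const url of urls) {
    if (done.has(url)) continue; // already crawled in a previous run
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        done.set(url, await fetchPage(url));
        break;
      } catch (err) {
        if (attempt === maxRetries) throw err; // give up after the last retry
        await sleep(backoffMs(attempt)); // wait longer after each failure
      }
    }
    await sleep(MIN_SPACING_MS); // spacing between successive requests
  }
  return done;
}
```

Because the loop awaits each request before starting the next, "no concurrency" holds by construction rather than by configuration.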
Geocode
GSI (国土地理院) via curl
Every address is resolved to a street-level lat/lng with the free government geocoder. Shelling out to curl beat Node fetch, which hung on stale sockets under WSL; that real bug is documented in the playbook.
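A rough sketch of the two pure halves of that step: building the curl argument list and parsing what comes back. The endpoint URL and the response shape (a JSON array of GeoJSON-style features with [lng, lat] coordinates) reflect my understanding of the GSI address-search service and should be verified before use; the function names and the 15 s timeout are hypothetical. The argument list is meant to be handed to execFile("curl", args, ...) from node:child_process, which is left out so the sketch has no dependencies.

```typescript
// Assumed endpoint for the GSI address search; verify before relying on it.
const GSI_ENDPOINT = "https://msearch.gsi.go.jp/address-search/AddressSearch";

interface LatLng {
  lat: number;
  lng: number;
}

// --get with --data-urlencode makes curl URL-encode the Japanese address
// and append it as a query string; --max-time bounds a hung connection.
function buildCurlArgs(address: string, timeoutSec: number = 15): string[] {
  return [
    "--silent",
    "--fail",
    "--max-time",
    String(timeoutSec),
    "--get",
    "--data-urlencode",
    `q=${address}`,
    GSI_ENDPOINT,
  ];
}

// Takes curl's stdout and returns the first hit, or null when the body
// is not JSON, is empty, or lacks usable coordinates.
function parseGsiResponse(stdout: string): LatLng | null {
  let data: unknown;
  try {
    data = JSON.parse(stdout);
  } catch {
    return null;
  }
  if (!Array.isArray(data) || data.length === 0) return null;
  const first = data[0] as { geometry?: { coordinates?: unknown } } | null;
  const coords = first?.geometry?.coordinates;
  if (!Array.isArray(coords) || coords.length < 2) return null;
  const lng = Number(coords[0]); // assumed GeoJSON order: longitude first
  const lat = Number(coords[1]);
  if (!Number.isFinite(lat) || !Number.isFinite(lng)) return null;
  return { lat, lng };
}
```

Returning null instead of throwing lets the pipeline keep a listing that failed to geocode and simply leave it off the map.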
Normalize + translate
Deterministic JA→EN
Prices are re-derived from the source text (decimal 万円 amounts converted to yen, all-9s sentinel values dropped), transport, type, and layout fields are translated by rule, and era dates (令和/平成) are converted to ISO. No LLM and no API, so it is instant and has 100% coverage.
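Two of those rules are easy to show concretely. The sketch below is illustrative only: the function names are hypothetical, the "four or more 9s" sentinel threshold is an assumption (the text only says "all-9s"), and only the two eras the text names are handled. The era offsets themselves are calendar facts: Reiwa 1 is 2019 and Heisei 1 is 1989.

```typescript
// "1,280.5万円" -> 12805000 yen. Returns null for missing or sentinel prices.
function parseManYen(text: string): number | null {
  const m = /([\d,]+(?:\.\d+)?)\s*万円/.exec(text);
  if (!m) return null;
  const digits = (m[1] ?? "").replace(/,/g, "");
  if (digits === "") return null;
  // Assumption: a run of four or more 9s is a placeholder, not a price.
  if (/^9{4,}$/.test(digits.replace(".", ""))) return null;
  const man = Number(digits);
  if (!Number.isFinite(man)) return null;
  return Math.round(man * 10000); // 1 万 = 10,000 yen
}

// Gregorian year = offset + era year.
const ERA_OFFSET: Record<string, number> = { "令和": 2018, "平成": 1988 };

// "令和6年4月1日" -> "2024-04-01"; month and day are optional.
function eraToIso(text: string): string | null {
  const m = /(令和|平成)(元|\d+)年(?:(\d+)月)?(?:(\d+)日)?/.exec(text);
  if (!m) return null;
  const offset = ERA_OFFSET[m[1] ?? ""];
  if (offset === undefined) return null;
  const eraYear = m[2] === "元" ? 1 : Number(m[2]); // 元年 means year 1
  let iso = String(offset + eraYear);
  if (m[3]) {
    iso += "-" + m[3].padStart(2, "0");
    if (m[4]) iso += "-" + m[4].padStart(2, "0");
  }
  return iso;
}
```

Both functions return null rather than guessing, so an unparseable field is surfaced as missing instead of silently becoming a wrong price or date.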
De-ad photos
Frequency rule
Listing galleries are riddled with shared ad/banner images; one ad appeared on 2,226 listings. Any photo URL seen on ≥3 listings is dropped. Real property photos are unique to one listing, so that frequency signal is the reliable filter.
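The frequency rule fits in one small function. A minimal sketch, assuming a hypothetical Listing shape with an id and a list of photo URLs; the threshold of 3 is the one stated above.

```typescript
// Hypothetical listing shape for illustration.
interface Listing {
  id: string;
  photos: string[];
}

// Drops any photo URL that appears on `threshold` or more distinct listings.
function dropSharedPhotos(listings: Listing[], threshold: number = 3): Listing[] {
  const listingCount = new Map<string, number>();
  for (const listing of listings) {
    // Count each URL once per listing, even if a gallery repeats it.
    for (const url of Array.from(new Set(listing.photos))) {
      listingCount.set(url, (listingCount.get(url) ?? 0) + 1);
    }
  }
  return listings.map((listing) => ({
    ...listing,
    photos: listing.photos.filter(
      (url) => (listingCount.get(url) ?? 0) < threshold
    ),
  }));
}
```

Counting per listing rather than per occurrence matters: a real photo repeated twice within one gallery should not be pushed toward the threshold.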
Payload split
slim + full JSON
A slim client payload of ~1.2MB gzipped (map/card fields only) plus a full server-side set for listing pages and SEO. CDN-cached on Vercel; the browser never downloads the 26MB full dataset.
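The split amounts to projecting each record down to the fields the map and cards need. The sketch below assumes hypothetical field names (titleEn, descriptionEn, thumb, and so on); the actual schema is not given in the text.

```typescript
// Hypothetical full record kept on the server.
interface FullListing {
  id: string;
  lat: number;
  lng: number;
  priceYen: number | null;
  titleEn: string;
  descriptionEn: string;
  photos: string[];
  sourceUrl: string;
}

// Hypothetical slim record: only what a map pin and a list card render.
interface SlimListing {
  id: string;
  lat: number;
  lng: number;
  priceYen: number | null;
  titleEn: string;
  thumb: string | null;
}

function toSlim(l: FullListing): SlimListing {
  return {
    id: l.id,
    lat: l.lat,
    lng: l.lng,
    priceYen: l.priceYen,
    titleEn: l.titleEn,
    thumb: l.photos[0] ?? null, // first photo doubles as the card thumbnail
  };
}

// Serializes both payloads; gzip and CDN caching happen downstream.
function splitPayload(listings: FullListing[]): { slim: string; full: string } {
  return {
    slim: JSON.stringify(listings.map(toSlim)),
    full: JSON.stringify(listings),
  };
}
```

Building the slim object field by field, rather than deleting keys from a copy, means a field added to the full record later stays server-side unless someone opts it in.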
Enrichment an aggregator skips
Risk legibility
Per-listing deep links to government hazard maps (flood/landslide), J-SHIS earthquake probability, and a clear rebuild/zoning (市街化調整区域) flag — the things foreign buyers can't easily check.
Investment snapshot
Transparent rent + gross-yield estimate. A ¥500k akiya naively computes to 172% yield; folding in a renovation budget brings it to a credible 8.6% — honest, not a fabricated AirDNA number.
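The arithmetic behind those two figures is simple enough to write out. In the sketch below the function name is hypothetical, and the inputs in the usage note (¥860,000 annual rent, ¥9.5M renovation) are back-solved so that they reproduce 172% and 8.6%; they are illustrative assumptions, not the project's actual estimates.

```typescript
// Gross yield in percent, rounded to one decimal place:
// annual rent / (purchase price + renovation budget).
function grossYieldPct(
  annualRentYen: number,
  purchaseYen: number,
  renovationYen: number = 0
): number {
  const costBasis = purchaseYen + renovationYen;
  if (costBasis <= 0) throw new RangeError("cost basis must be positive");
  return Math.round((annualRentYen * 1000) / costBasis) / 10;
}
```

With the assumed inputs, grossYieldPct(860000, 500000) gives 172 and grossYieldPct(860000, 500000, 9500000) gives 8.6. The naive figure is absurd because the purchase price is a small fraction of what it costs to make the house rentable.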
Court auctions in English
The courts' 競売 data with a plain-English "how auctions work" explainer, occupancy + deposit flags, and a minimum-bid label that's never shown like an asking price. No English competitor does this.
Map-first explorer
MapLibre + free OpenFreeMap tiles, price-label pins, viewport-synced list, distinct layers for for-sale vs auction, USD/JPY + m²/ft² toggles.
How it was built
Competitive read
Started from a question: what's Akiya-mart's moat? Conclusion — not the listings (commodity, from public upstreams) but the ingest/freshness treadmill + SEO + a human concierge. So: rebuild the pipeline from the same public sources; don't scrape the incumbent.
Legally-clean sourcing
Chose the MLIT-subsidised national akiya bank + the courts' BIT auction site — public-mission data where facts aren't copyrightable. Photos and court PDFs are hotlinked with attribution, never re-hosted. The whole posture is documented and auditable.
Two sources, one safe map
A second data source (court auctions) has a totally different shape — a session-POST crawl, minimum bids, occupancy risk. The hard part was UX safety: an auction floor price must never read like a normal asking price. Distinct badges, a mandatory explainer, a toggleable map layer.
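One way to make that safety rule hard to break is to encode it in the type, so a minimum bid cannot be formatted through the asking-price path. This is a sketch of the idea, not the site's code; the type and function names and the exact label wording are hypothetical.

```typescript
// A price always carries its kind; there is no bare "price: number".
type Price =
  | { kind: "asking"; yen: number }
  | { kind: "minimumBid"; yen: number };

// Formats 1200000 as "¥1,200,000" without relying on locale data.
function formatYen(yen: number): string {
  return "¥" + String(Math.round(yen)).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Every caller must go through here; the switch is exhaustive, so adding
// a third kind later is a compile error until it gets its own label.
function priceLabel(p: Price): string {
  switch (p.kind) {
    case "asking":
      return formatYen(p.yen);
    case "minimumBid":
      return "Minimum bid " + formatYen(p.yen) + " (court auction)";
  }
}
```

For example, priceLabel({ kind: "minimumBid", yen: 1200000 }) yields "Minimum bid ¥1,200,000 (court auction)", so the auction floor never appears as a bare number next to for-sale prices.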
Honest enrichment
Every estimate is labelled indicative and shows its assumptions. The yield estimator caught its own fantasy 172% number and corrected for renovation cost. Data-quality red flags (¥64 "land", scanned PDFs with no text) were found by probing, not assumed.
By the numbers
~9,900
Listings, 46 prefectures
99%
With real (de-ad'd) photos
2
Independent data sources
470+
SEO pages (pref + city)
What's next
- Freshness treadmill — scheduled re-crawl with new/sold/price-changed diffs
- City-level yield + short-term-rental (民泊) legality flags by municipality
- GEO (generative engine optimization) dominance: be the cited English source on Japanese akiya for LLMs and search