KwadMarket Docs
Product Roadmap

Shop Scrapers

Expanding scraper coverage, price history and product matching

Current state

Two shops configured in apps/scraper/src/scraper/site-configs.ts:

  • drone-fpv-racer.com (DFR) — PrestaShop, PLP + PDP scraping
  • studiosport.fr — OASIS Commerce, GTM dataLayer parsing

The architecture is extensible via SITE_REGISTRY — adding a shop means defining selectors + pagination config.

Target shops (FR market)

Shop                           Platform            Priority
drone-fpv-racer.com            PrestaShop          ✅ Done
studiosport.fr                 OASIS               ✅ Done
lacameraembarquee.fr           Custom/WooCommerce  High: large catalog, popular FR shop
rcmodelisme.fr                 PrestaShop          Medium: big catalog, multi-category
flashrc.com                    PrestaShop          Medium: good FPV section
team-blacksheep.com            Custom              Medium: official TBS store
getfpv.com / racedayquads.com  Shopify             Low: US-based, USD prices
dfrshop.com                    PrestaShop          Low: sub-brand of DFR

Per-shop config shape:

site-configs.ts
{
  key: 'LACAMERAEMBARQUEE',
  label: 'La Camera Embarquee',
  baseUrl: 'https://www.lacameraembarquee.fr',
  categories: [
    { url: '/fpv/chassis', category: 'FRAME' },
    { url: '/fpv/moteurs', category: 'MOTOR' },
  ],
  selectors: {
    plp: { productCard, name, price, url, image, pagination },
    pdp: { name, price, description, images, specs, sku, availability, brand }
  }
}
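
To make the shape above easier to check at compile time, it could be typed roughly as follows. This is a minimal sketch, not the actual contents of site-configs.ts: the interface names, the string-typed selectors, the placeholder selector values, the categoryUrls helper, and the Record-based shape of SITE_REGISTRY are all assumptions made for illustration.

```typescript
// Hypothetical types mirroring the config shape shown above.
// The real definitions in site-configs.ts may differ.
type Category = 'FRAME' | 'MOTOR'; // only the two categories that appear above

interface PlpSelectors {
  productCard: string; name: string; price: string;
  url: string; image: string; pagination: string;
}

interface PdpSelectors {
  name: string; price: string; description: string; images: string;
  specs: string; sku: string; availability: string; brand: string;
}

interface SiteConfig {
  key: string;
  label: string;
  baseUrl: string;
  categories: { url: string; category: Category }[];
  selectors: { plp: PlpSelectors; pdp: PdpSelectors };
}

// Placeholder CSS selectors: illustrative values, not taken from the real site.
const lacameraembarquee: SiteConfig = {
  key: 'LACAMERAEMBARQUEE',
  label: 'La Camera Embarquee',
  baseUrl: 'https://www.lacameraembarquee.fr',
  categories: [
    { url: '/fpv/chassis', category: 'FRAME' },
    { url: '/fpv/moteurs', category: 'MOTOR' },
  ],
  selectors: {
    plp: { productCard: '.card', name: '.name', price: '.price',
           url: 'a', image: 'img', pagination: '.next' },
    pdp: { name: 'h1', price: '.price', description: '.desc', images: '.gallery img',
           specs: '.specs', sku: '.sku', availability: '.stock', brand: '.brand' },
  },
};

// Assumed registry shape: one entry per shop, keyed by the config key.
const SITE_REGISTRY: Record<string, SiteConfig> = {
  [lacameraembarquee.key]: lacameraembarquee,
};

// Join each category path onto baseUrl to get the listing-page entry points.
function categoryUrls(config: SiteConfig): string[] {
  const base = config.baseUrl.replace(/\/$/, '');
  return config.categories.map((c) => base + c.url);
}
```

With a typed config, a missing selector or a misspelled category fails at compile time rather than during a scrape.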

Price history

Track price changes over time. This data powers price trends on product pages and the price-history charts roadmap item. Recording is cheap to start now and becomes more valuable as history accumulates.

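model PriceHistory {
  id               Int            @id @default(autoincrement())
  // Assumption: each row belongs to one ScrapedProduct (Int id assumed);
  // without a product reference a price point cannot be attributed to an item.
  scrapedProductId Int
  scrapedProduct   ScrapedProduct @relation(fields: [scrapedProductId], references: [id])
  shopId           Int
  shop             Shop           @relation(fields: [shopId], references: [id])
  price            Decimal
  currency         String
  available        Boolean
  scrapedAt        DateTime       @default(now())

  @@index([shopId, scrapedAt])
  @@index([scrapedProductId, scrapedAt])
}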
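
One possible recording policy is to append a row only when the price, currency, or availability differs from the last recorded point, which keeps the table small between daily scrapes. The sketch below illustrates that policy on an in-memory array for a single product at a single shop. It is an assumption, not the behaviour of syncPrice: the PricePoint type, the integer-cents representation, and both function names are hypothetical.

```typescript
// Hypothetical in-memory stand-in for PriceHistory rows of one product at one shop.
interface PricePoint {
  priceCents: number; // integer minor units, so equality checks avoid float issues
  currency: string;
  available: boolean;
  scrapedAt: Date;
}

// A point is worth recording if it is the first one or differs from the previous one.
function hasChanged(prev: PricePoint | undefined, next: PricePoint): boolean {
  if (!prev) return true;
  return (
    prev.priceCents !== next.priceCents ||
    prev.currency !== next.currency ||
    prev.available !== next.available
  );
}

// Returns a new array with `next` appended, or the same array if nothing changed.
function appendIfChanged(history: PricePoint[], next: PricePoint): PricePoint[] {
  const prev = history.length > 0 ? history[history.length - 1] : undefined;
  return hasChanged(prev, next) ? [...history, next] : history;
}
```

The trade-off is that gaps between rows no longer show when a scrape ran; if the charts need one point per scrape, recording unconditionally is the simpler choice.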

Product matching

Scraping creates ScrapedProduct records that must be matched to Product entries. Planned improvements to auto-matching: fuzzy name matching (Levenshtein distance), a combined brand + category + name score, exact SKU/EAN matching when available, and an admin UI that shows suggested matches with confidence scores.
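
As a rough illustration of how these signals could combine into a confidence score, the sketch below treats an exact SKU match as certain and otherwise scales a Levenshtein-based name similarity down when brand or category disagree. The Candidate type, the function names, and the 0.5 penalty factors are assumptions chosen for the example, not tuned values.

```typescript
// Classic Levenshtein edit distance using two rolling rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j]! + 1, curr[j - 1]! + 1, prev[j - 1]! + cost));
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Lowercase, trim, and collapse whitespace before comparing names.
function normalize(s: string): string {
  return s.toLowerCase().trim().replace(/\s+/g, ' ');
}

// 1 means identical after normalization, 0 means nothing in common.
function nameSimilarity(a: string, b: string): number {
  const x = normalize(a);
  const y = normalize(b);
  const maxLen = Math.max(x.length, y.length);
  return maxLen === 0 ? 1 : 1 - levenshtein(x, y) / maxLen;
}

// Hypothetical minimal view of a ScrapedProduct or Product for matching.
interface Candidate {
  name: string;
  brand?: string;
  category?: string;
  sku?: string; // SKU or EAN, when the shop exposes one
}

function matchConfidence(scraped: Candidate, product: Candidate): number {
  // An exact SKU/EAN match overrides everything else.
  if (scraped.sku && product.sku && scraped.sku === product.sku) return 1;
  let score = nameSimilarity(scraped.name, product.name);
  // Assumed penalties: halve the score for each known mismatch.
  if (scraped.brand && product.brand &&
      normalize(scraped.brand) !== normalize(product.brand)) score *= 0.5;
  if (scraped.category && product.category &&
      scraped.category !== product.category) score *= 0.5;
  return score;
}
```

The admin UI could then sort suggested matches by this score and auto-accept only those above a threshold chosen after reviewing real data.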

Scheduling & hygiene

  • Daily scrape for price/availability; weekly full catalog scrape for new products; immediate re-scrape on admin trigger.
  • Stagger scrapes across shops and add delays between requests (polite scraping); respect robots.txt and identify the bot. See the scraping posture note in the launch checklist.
  • Scraper health dashboard (last run, success rate, product count) + alert on repeated failures, which usually mean the DOM structure changed. Pairs with Sentry on the scraper.
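
The staggering, request delays, and repeated-failure alert above can be sketched in a few small helpers. This is a minimal sketch under stated assumptions: every function name, the even spacing of shops across a window, the injected fetchPage callback, and the "last N runs all failed" alert rule are hypothetical. robots.txt handling and bot identification are not covered here.

```typescript
// Spread shop start times evenly across a window so shops are not all scraped at once.
function staggerOffsets(shopKeys: string[], windowMs: number): Record<string, number> {
  const offsets: Record<string, number> = {};
  const step = shopKeys.length > 0 ? windowMs / shopKeys.length : 0;
  shopKeys.forEach((key, i) => {
    offsets[key] = Math.round(i * step);
  });
  return offsets;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Fetch pages one at a time, pausing delayMs between consecutive requests.
// fetchPage is injected so the loop stays independent of any HTTP client.
async function politeFetchAll<T>(
  urls: string[],
  fetchPage: (url: string) => Promise<T>,
  delayMs: number,
): Promise<T[]> {
  const results: T[] = [];
  let first = true;
  for (const url of urls) {
    if (!first) await sleep(delayMs);
    first = false;
    results.push(await fetchPage(url));
  }
  return results;
}

// Share of successful runs, for the health dashboard (0 when there are no runs).
function successRate(runs: boolean[]): number {
  return runs.length === 0 ? 0 : runs.filter((ok) => ok).length / runs.length;
}

// Alert when the most recent `threshold` runs all failed (oldest run first in `runs`).
function shouldAlert(runs: boolean[], threshold: number): boolean {
  if (threshold <= 0 || runs.length < threshold) return false;
  return runs.slice(-threshold).every((ok) => !ok);
}
```

For example, with three shops and a one-hour window the helpers would start the shops twenty minutes apart, and a threshold of 3 would stay quiet on a single flaky run but fire once a selector breaks for good.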

Tasks

  • La Camera Embarquee, rcmodelisme.fr, flashrc.com configs
  • PriceHistory model + recording on syncPrice
  • Improved auto-matching + confidence scores in the admin UI
  • Cron scheduling (daily price, weekly catalog); rate limiting
  • Health dashboard + failure alerts
