Case Study - Powerful Web Scraping App Built in Go

Supacrawler is an open-source web scraping app built with Go, Playwright, and Redis. It crawls entire websites, scrapes pages in real time, and outputs data in LLM-readable formats like JSON or Markdown. It supports screenshots, streaming scraping pipelines, and integration with Supabase/S3.

Client
Open Source / Startup
Year
Service
SaaS & Web Crawling Platform Development

Overview

Supacrawler started as an open-source project aimed at solving common challenges in web scraping: efficiency, real-time crawling, and data formatting for AI/LLM workflows. Built in Go with Playwright and Redis, it allows developers to crawl entire websites, scrape pages immediately upon discovery, and export structured data in JSON or Markdown formats.

Oh, and it's open-source! View on GitHub

Supacrawler Dashboard

Open-Source Advantage

The open-source release let development focus on three core areas:

  1. Asynchronous Job Handling with Redis: Ensuring that multiple scraping tasks run concurrently without blocking the main process.
  2. Playwright Integration: Supporting JS-heavy websites and dynamic content scraping.
  3. Pipeline Optimization: Streaming crawled links to the scraping service over Go channels the moment they are discovered, keeping the binary lean and execution efficient.
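The streaming design in item 3 can be sketched with plain Go channels: a crawler goroutine emits links as it finds them, and a small worker pool scrapes them concurrently. All names here (`crawl`, `scrapeAll`, `scrapeResult`) are illustrative, not Supacrawler's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// scrapeResult is a hypothetical result type; the real service's
// types will differ.
type scrapeResult struct {
	URL     string
	Content string
}

// crawl emits discovered links onto a channel as soon as they are
// found, instead of collecting them all before scraping begins.
func crawl(seeds []string) <-chan string {
	links := make(chan string)
	go func() {
		defer close(links)
		for _, u := range seeds {
			links <- u // in a real crawler, links come from parsed pages
		}
	}()
	return links
}

// scrapeAll consumes links concurrently with a fixed-size worker pool.
func scrapeAll(links <-chan string, workers int) <-chan scrapeResult {
	out := make(chan scrapeResult)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range links {
				// Placeholder for the Playwright-backed scrape call.
				out <- scrapeResult{URL: u, Content: "scraped:" + u}
			}
		}()
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	seeds := []string{"https://example.com/a", "https://example.com/b"}
	for r := range scrapeAll(crawl(seeds), 4) {
		fmt.Println(r.URL)
	}
}
```

Because the channel is unbuffered, scraping begins on the first discovered link rather than after a full crawl pass, which is the essence of the streaming pipeline described above.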

Startup / SaaS Deployment

When deploying Supacrawler as a SaaS platform, additional challenges emerged:

  1. VPS Hosting & CI/CD:

    • Implementing CI/CD pipelines with GitHub Actions.
    • Configuring permissions via Google Cloud and creating a dedicated service account.
    • Hosting via Docker Stack and integrating with Supabase for storage and auth.
  2. Frontend Development:

    • Built entirely in Next.js, using React Query for efficient data fetching and caching.
    • Authentication interface integrated with auth-go / supabase-go.
    • Extensive optimizations for responsiveness and performance.
  3. Backend Testing & Optimization:

    • Used k6 to smoke-test endpoints.
    • Settled on a lightweight VPS configuration: 0.2 vCPU and 1536 MB of memory per replica, which handled high load efficiently.

Supacrawler Frontend Interface

Development Journey

Challenges Overcome

  1. Integrating Asynchronous Jobs with Redis: Ensuring concurrency without race conditions.
  2. Optimizing Real-Time Pipelines: Minimizing build size and streaming discovered links for immediate scraping.
  3. Seamless Frontend & Backend Communication: Combining Next.js frontend with auth-go / supabase-go for fast and secure interactions.
  4. Efficient Hosting & CI/CD: Deploying on VPS with Docker stack while automating builds and testing.
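Challenge 1 — concurrency without race conditions — can be illustrated with a mutex-guarded visited set that guarantees each URL is claimed by exactly one goroutine. This is a simplified stand-in (the production service coordinates jobs through Redis, and all names here are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// visitedSet records URLs that have already been claimed, so
// concurrent crawler goroutines never scrape the same page twice.
type visitedSet struct {
	mu   sync.Mutex
	seen map[string]bool
}

func newVisitedSet() *visitedSet {
	return &visitedSet{seen: make(map[string]bool)}
}

// markNew returns true only for the first goroutine to claim a URL;
// the mutex makes the check-and-set atomic, preventing the race.
func (v *visitedSet) markNew(url string) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.seen[url] {
		return false
	}
	v.seen[url] = true
	return true
}

func main() {
	set := newVisitedSet()
	var wg sync.WaitGroup
	var mu sync.Mutex
	claimed := 0
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if set.markNew("https://example.com") {
				mu.Lock()
				claimed++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println("claimed:", claimed) // prints "claimed: 1"
}
```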

By solving these challenges, Supacrawler emerged as a robust, developer-friendly, and SaaS-ready web scraping platform, demonstrating efficiency and scalability at every stage.

Our Approach

  • Backend Development (Go, Redis, Playwright)
  • Frontend Development (Next.js, React Query)
  • CI/CD & Hosting (Docker, GitHub Actions, VPS)
  • Testing & Optimization (K6, Load Testing)

Supacrawler was built to empower developers with efficient scraping pipelines and SaaS-ready architecture. The open-source release allows the community to contribute while the SaaS deployment showcases its production-level readiness.

Antoine Ross
Developer of Supacrawler
Daily Active Users: 100+
Efficient Build: 0.2 vCPU
Links Crawled: 1+ Million
Community Contributions: Open Source
