  • Published on
    November 4, 2024

    The Infra to handle 10M Requests in 10 Minutes for $0.0116

    Infrastructure, Kubernetes, Terraform, Cloud-Computing, Redis, Distributed-Systems
    A comprehensive guide to setting up a highly efficient, scalable infrastructure to process 10 million requests in 10 minutes at a minimal co...
  • Published on
    October 30, 2024

    27.6% of the Top 10 Million Sites are Dead

    internet-decay, domain-analysis, inactive-websites, top-domains, web-crawler, kubernetes
    An analysis of the top 10 million websites reveals that over a quarter are inactive, highlighting the web's shifting landscape. Using a high...
  • Published on
    October 14, 2023

    Web Crawling at Scale: Navigating Billions of URLs with Efficiency

    Kubernetes, Web-Crawler, Golang, Nodejs, Distributed-System
    Dive into the world of distributed web crawling with Golang, Docker, and Redis. Learn the logic behind efficient code, use Bloom filters for...
  • Published on
    October 13, 2023

    The Architecture of a Web Crawler: Building a Google-Inspired Distributed Web Crawler. Part 1

    Kubernetes, Web-Crawler, Golang, Nodejs, Distributed-System
    Unlock the potential of the web with a Google-inspired distributed web crawler. Explore scalable solutions using Kubernetes, Golang, Python,...
  • Published on
    July 31, 2023

    How to efficiently scrape millions of Google Businesses on a large scale using a distributed crawler

    Docker, DevOps, Fabric, Distributed-System, Crawler
    Explore building a powerful distributed crawler using Crawlee, a JavaScript web-scraping library that can drive headless browsers, for efficient web scraping of Google Map...
© 2024 Built with 💖 by Tony Wang • With TypeScript, Next.js, Tailwind • Inspired by Leerob