PDX Rust Meetup — Spidering Wikipedia Politely In Async Rust

[Rescheduled after snow day]

How many pages are reachable from Wikipedia's page on the Rust programming language in two hops? Around 30,000, it turns out, including pages on wheat flour, Welsh orthography, and the zombie apocalypse.

It's super easy to do this exploration using asynchronous Rust code. Wikipedia offers a cute little REST API for querying links, and Serde makes it simple to generate requests and parse replies. And if you're feeling guilty about flooding a precious public resource with silly API requests, rate limiting is just as easy to add.

Jim Blandy will show how to wire up Tokio, Reqwest, and Serde to do the spidering, and whip up a mock server for testing using Warp. The techniques shown work nicely for all kinds of REST API scripting, including, say, GitHub.
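
For a rough feel of the shape of the problem, here is a minimal sketch of a rate-limited, two-hop link walk. It is not the code from the talk: it is synchronous and uses only the standard library, where the talk uses Tokio, Reqwest, and Serde. The names RateLimiter, reachable, and mock_graph are made up for illustration, and the tiny in-memory graph stands in for both the Wikipedia API and the Warp mock server.

```rust
use std::collections::{HashMap, HashSet};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: enforces a minimum gap between successive requests.
struct RateLimiter {
    min_interval: Duration,
    last: Option<Instant>,
}

impl RateLimiter {
    fn new(min_interval: Duration) -> Self {
        RateLimiter { min_interval, last: None }
    }

    /// Blocks until at least `min_interval` has passed since the previous call.
    /// The first call returns immediately.
    fn wait(&mut self) {
        if let Some(last) = self.last {
            let elapsed = last.elapsed();
            if elapsed < self.min_interval {
                sleep(self.min_interval - elapsed);
            }
        }
        self.last = Some(Instant::now());
    }
}

/// Collects every page reachable from `start` in at most `max_hops` link hops.
/// `fetch_links` is an assumed stand-in for the real "list links" API request.
fn reachable<F>(
    start: &str,
    max_hops: usize,
    limiter: &mut RateLimiter,
    mut fetch_links: F,
) -> HashSet<String>
where
    F: FnMut(&str) -> Vec<String>,
{
    let mut seen: HashSet<String> = HashSet::new();
    seen.insert(start.to_string());
    let mut frontier = vec![start.to_string()];

    for _ in 0..max_hops {
        let mut next = Vec::new();
        for page in &frontier {
            // Be polite: pause before every request.
            limiter.wait();
            for link in fetch_links(page) {
                // `insert` returns true only for pages not seen before.
                if seen.insert(link.clone()) {
                    next.push(link);
                }
            }
        }
        frontier = next;
    }
    seen
}

/// Made-up link graph used instead of a live server.
fn mock_graph() -> HashMap<&'static str, Vec<&'static str>> {
    HashMap::from([
        ("Rust", vec!["Mozilla", "Compiler"]),
        ("Mozilla", vec!["Firefox", "Rust"]),
        ("Compiler", vec!["Parsing"]),
        ("Firefox", vec!["Gecko"]),
    ])
}
```

Calling reachable("Rust", 2, ...) with a closure that looks pages up in mock_graph() makes three requests and returns five pages; "Gecko" is three hops away, so it is never reached. An async version would presumably swap the blocking sleep for a timer and the closure for an HTTP request, but that is outside what this sketch shows.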

Tags: programming languages, rust, computer programming, open source, rustlang, meetup:event=306270454

Imported from: http://calagator.org/events/1250481808

February 27, 2025 06:30 PM - February 27, 2025 09:00 PM
Location: Maseeh College of Engineering, Portland State University: 1930 SW 4th Ave #500, Portland OR 97201 United States