pixel (pinterface) wrote,

Parsing robots.txt Files with Common Lisp

Do you use Common Lisp? Do you scrape web pages, and need to deal with the robots.txt file to play nicely?

Well, have I got a deal for you! For the low, low price of absolutely free, you too can parse robots.txt files with my Common Lisp library machina-policy [1]. See the readme for a brief overview.

Don't forget to join the mailing list to offer ideas, submit bugs, and make requests for improvement.

For more web-scraping fun, you might try some libraries with which I am completely unaffiliated and which I haven't yet tried: cl-web-crawler, cl-kappa, and css-selectors; or my own Oh, Ducks!.

  1. machina-policy was very briefly named robots.txt. How not to name a library, ladies and gentlemen.
Tags: lisp, robots.txt, web scraping