The boilerpipe library for Java provides algorithms to detect and remove the surplus clutter (boilerplate, templates) around the main textual content of a web page.
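
To make the idea of separating clutter from main content concrete, here is a minimal sketch of a toy filter that keeps only text blocks that are reasonably long and contain little link text. It is an illustration under stated assumptions, not boilerpipe's API or its actual algorithms: the `LinkDensityDemo` class, the `Block` type, the `mainContent` method, and both thresholds are hypothetical names and values invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: NOT boilerpipe's API or its actual algorithms.
class LinkDensityDemo {

    // Hypothetical text block: its text plus how many of its words sit inside links.
    static final class Block {
        final String text;
        final int linkedWords;

        Block(String text, int linkedWords) {
            this.text = text;
            this.linkedWords = linkedWords;
        }

        // Number of whitespace-separated words in the block.
        int wordCount() {
            String trimmed = text.trim();
            return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        }

        // Fraction of the block's words that are link text (0.0 for an empty block).
        double linkDensity() {
            int words = wordCount();
            return words == 0 ? 0.0 : (double) linkedWords / words;
        }
    }

    // Keep blocks that are long enough and mostly free of link text.
    // Both thresholds are arbitrary assumptions chosen by the caller.
    static List<String> mainContent(List<Block> blocks, int minWords, double maxLinkDensity) {
        List<String> kept = new ArrayList<>();
        for (Block b : blocks) {
            if (b.wordCount() >= minWords && b.linkDensity() <= maxLinkDensity) {
                kept.add(b.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Block> page = new ArrayList<>();
        page.add(new Block("Home About Contact", 3)); // navigation bar, all links
        page.add(new Block("The storm reached the coast on Tuesday and caused flooding in several towns.", 0));
        page.add(new Block("Copyright Example Corp", 0)); // short footer
        for (String text : mainContent(page, 5, 0.3)) {
            System.out.println(text);
        }
    }
}
```

With the sample page in `main`, only the article sentence passes both checks; the navigation bar and the footer are dropped for being short, and the navigation bar also consists entirely of link text. A real page would first have to be split into blocks by an HTML parser, which this sketch leaves out.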