PHP Classes

Crawler: Extract links and images from remote Web pages

Ratings: 52% (3 stars)
Unique user downloads: 6,443 total
Download rankings: 327 all time, 488 this week (down)
Version: crawler 1.1
License: Freely Distributable
PHP version: 4.0
Categories: HTML, Web services
Description 

This class can be used to extract links and images from remote Web pages.

It can access Web pages, parse their HTML, and extract the URLs of the links and images they contain.
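
The extraction step described above can be sketched with PHP's bundled DOM extension. This is a minimal illustration, not the code in Crawler.php: the function name extractLinksAndImages is hypothetical, and the syntax assumes PHP 7 or later rather than the PHP 4.0 the package targets.

```php
<?php
// Hypothetical helper, not part of Crawler.php: collects link and image URLs
// from an HTML string using the DOM extension.
function extractLinksAndImages(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is often invalid, so collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }

    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = trim($img->getAttribute('src'));
        if ($src !== '') {
            $images[] = $src;
        }
    }

    // Drop duplicates while keeping the order of first appearance.
    return [
        'links'  => array_values(array_unique($links)),
        'images' => array_values(array_unique($images)),
    ];
}

$html = '<html><body><a href="/about.html">About</a><img src="logo.png" alt=""></body></html>';
print_r(extractLinksAndImages($html));
```

The URLs are returned exactly as written in the page, so relative ones such as /about.html would still need to be resolved against the page address before they can be fetched.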

If necessary, the class can access a login page and emulate the submission of a login form, so that subsequent accesses are made on behalf of the logged-in user.
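
One common way to emulate a login form is to POST the credentials with cURL and keep the session cookie for later requests. The sketch below assumes that approach. The function names, URLs, and form field names are all hypothetical, and nothing here is taken from the package's own API.

```php
<?php
// Hypothetical sketch, not the package's API: one cURL handle is reused so
// the session cookie set by the login response is sent on later requests.
function createSession(string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow the redirect after login
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies here when the handle closes
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies and enable cookie handling
        CURLOPT_TIMEOUT        => 30,
    ]);
    return $ch;
}

// Submits the login form as a URL-encoded POST request.
function submitLogin($ch, string $loginUrl, array $fields)
{
    curl_setopt_array($ch, [
        CURLOPT_URL        => $loginUrl,
        CURLOPT_POST       => true,
        CURLOPT_POSTFIELDS => http_build_query($fields),
    ]);
    return curl_exec($ch);
}

// Fetches a page with GET, sending whatever cookies the session holds.
function fetchPage($ch, string $url)
{
    curl_setopt_array($ch, [
        CURLOPT_URL     => $url,
        CURLOPT_HTTPGET => true, // switch the handle back from POST to GET
    ]);
    return curl_exec($ch);
}

// Usage with placeholder values; the URLs and field names are assumptions
// and must match the target site's actual login form.
// $ch = createSession(tempnam(sys_get_temp_dir(), 'cookies'));
// submitLogin($ch, 'https://example.com/login', ['username' => 'me', 'password' => 'secret']);
// $html = fetchPage($ch, 'https://example.com/members-only');
```

Many login forms also carry hidden fields, such as anti-forgery tokens, that have to be read from the login page and sent back with the credentials. This sketch does not handle that case.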

Innovation Award
PHP Programming Innovation award nominee, March 2008, number 7
Prize: One copy of Delphi for PHP

Retrieving Web pages from remote sites is a relatively easy task in PHP.

If you want to crawl a site to search for something in its pages, you only need to retrieve the site's pages, use regular expressions to extract their links, and retrieve the linked pages in turn until every page has been visited.
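
The retrieve, extract, and follow loop described above can be sketched as a breadth-first traversal. This is an illustration under stated assumptions, not the package's code: crawlSite and its parameters are hypothetical names, the regular expression is deliberately simplistic, and pages come from an injected callable so the example runs without network access.

```php
<?php
// Hypothetical sketch: breadth-first crawl of one site. $fetch is any
// callable that returns the HTML for a URL, or a non-string on failure.
function crawlSite(string $startUrl, callable $fetch, int $maxPages = 50): array
{
    $host = parse_url($startUrl, PHP_URL_HOST);
    $queue = [$startUrl];
    $visited = [];

    while ($queue !== [] && count($visited) < $maxPages) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue; // already retrieved through another link
        }
        $visited[$url] = true;

        $html = $fetch($url);
        if (!is_string($html)) {
            continue; // retrieval failed, nothing to parse
        }

        // Simplistic pattern: only matches quoted href attributes and
        // ignores fragment-only links.
        preg_match_all('/<a\s[^>]*href\s*=\s*["\']([^"\'#]+)/i', $html, $matches);
        foreach ($matches[1] as $href) {
            // Only absolute URLs on the start host are followed; relative
            // links would need resolving against $url first.
            if (parse_url($href, PHP_URL_HOST) === $host && !isset($visited[$href])) {
                $queue[] = $href;
            }
        }
    }

    return array_keys($visited);
}

// In-memory stand-in for a site, so the example needs no network access.
$pages = [
    'http://example.test/'  => '<a href="http://example.test/a">A</a> <a href="http://other.test/">X</a>',
    'http://example.test/a' => '<a href="http://example.test/">Home</a>',
];
$fetch = function (string $url) use ($pages) {
    return $pages[$url] ?? null;
};
print_r(crawlSite('http://example.test/', $fetch));
```

The visited set keeps pages that link to each other from being retrieved in an endless loop, and the $maxPages limit bounds the crawl on large sites. To crawl a real site, a function such as file_get_contents could be passed as $fetch, provided allow_url_fopen is enabled.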

However, if some pages can only be accessed by authenticated users, the problem is no longer so simple.

This package provides a more complete solution to crawling a site's pages: it authenticates automatically, so it can also access pages restricted to logged-in users.

Manuel Lemos
Name: Md. Shaiful islam <contact>
Classes: 1 package by
Country: United States
Innovation award
Nominee: 1x

Files (4)

File                       Role     Description
Crawler.php                Class    The class
ExampleCrawlImage.php      Example  Crawl images from the http://www.phpclasses.org/ site
ExampleCrawlLink.php       Example  Crawl links from the http://www.phpclasses.org/ site
ExampleLoginCrawlLink.php  Example  Log in and crawl links from a site

The three example files are accessible without login.

The PHP Classes site has supported package installation using the Composer tool since 2013, as you may verify by reading this instructions page.
Version control: 0%
Unique user downloads: 6,443 total, 0 this week
Download rankings: 327 all time, 488 this week (down)
User ratings (all time)
Utility: 75%
Consistency: 69%
Documentation: -
Examples: 76%
Tests: -
Videos: -
Overall: 52%
Rank: 2462

User comments (3)

Excellent!
Jeff Dudas, 3 years ago. Rating: 70%

Does not work for LinkedIn.
Mansoor Rana, 12 years ago. Rating: 12%

Lacking recursion, it doesn't actually crawl.
wahoo frankinson, 16 years ago. Rating: 32%