Over at SEOmoz, Dr. Pete has published an informative post with suggestions for getting around Google's 1,000-page hurdle. As the post explains, one of the obstacles facing every SEO firm that works on big websites is that Google only lets you view a small part of its search index, and Webmaster Tools has the same problem. That fraction is limited to 1,000 pages for any given search.
When you are looking at a website with more than 10,000 indexed pages, being able to view only 1,000 pages per search becomes increasingly frustrating, especially when you are trying to get pages discovered, struggling with duplicate content, confirming robots.txt changes, or doing advanced index sculpting.
So, you need something special to get past the 1,000-page limitation. Let's see what exactly can be done. For the rest of this post, I am going to use www.example.com as the example site:
The Tools - Site: and Inurl:
The "site:" command returns the indexed pages from any given domain or sub-domain. The other tool at your disposal is the "inurl:" command, which, when paired with other search terms, restricts the results to pages containing a specific keyword in the URL. So when you pair the "site:" command with the "inurl:" command, Google reveals only the indexed pages on that domain whose URLs contain those keywords.
The Process - Index Deconstruction:
Suppose example.com has 12,000 indexed pages. It would be an arduous task to find out which pages are included in that roughly 12,000-page index when we can only see those pages 1,000 at a time. But the solution to this problem lies in those last three words: "at a time." If you construct your searches in different ways, the 1,000 displayed pages won't be the same each time. This means that if you vary your index searches logically, you can break the full index up into separate, manageable slices. To do this, use "inurl:" to force the "site:" command to show the index through smaller windows.
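As a rough illustration of these "smaller windows," here is a minimal Python sketch that generates one "site:" plus "inurl:" query per URL keyword. The function name and the keywords (blog, products, forum) are hypothetical; you would substitute words that actually appear in your own site's URLs. It only builds the query strings to paste into Google and does not fetch any results.

```python
def index_windows(domain, url_keywords):
    """Return one 'site:' + 'inurl:' query string per URL keyword."""
    return [f"site:{domain} inurl:{keyword}" for keyword in url_keywords]


# Hypothetical URL keywords, chosen only for illustration.
queries = index_windows("www.example.com", ["blog", "products", "forum"])

for query in queries:
    print(query)
```

Each query should surface a different slice of the index, provided the chosen keywords really occur in the site's URLs and each slice stays under the 1,000-page cap.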
It all comes down to this: you can always use multiple "inurl:" statements (one for each word) in your search. You can also use "-inurl:" to exclude specific URL keywords from any given search. Finally, you have the option of combining "site:", "inurl:" and stand-alone keywords to target indexed pages by URL and content keywords in one statement.
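The three combinations described above can be sketched with a small Python helper. The helper name, its parameters, and the sample keywords are assumptions made for illustration, not part of Dr. Pete's post. It simply assembles the search string.

```python
def build_query(domain, include=(), exclude=(), keywords=()):
    """Assemble a Google query from site:, inurl:, -inurl: and plain keywords."""
    parts = [f"site:{domain}"]
    parts += [f"inurl:{word}" for word in include]    # URL must contain each word
    parts += [f"-inurl:{word}" for word in exclude]   # URL must not contain these
    parts += list(keywords)                           # stand-alone content keywords
    return " ".join(parts)


# Hypothetical keywords, for illustration only.
multi = build_query("www.example.com", include=["blog", "2010"])
rest = build_query("www.example.com", exclude=["blog", "products"])
mixed = build_query("www.example.com", include=["blog"], keywords=["seo"])

print(multi)
print(rest)
print(mixed)
```

The "-inurl:" form is handy for the leftovers: once you have looked at the blog and products slices, excluding both shows whatever else is in the index.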