Insideout Google Page 



 Creating a browsable, searchable copy of long threads.

Embarking on an 1176 build, I became a little frustrated with the search capability at the Lab. Usually, this was because a search returned something like the “All Things G1176 – the new ‘repost’ thread”, which is currently 59 pages long. You have no idea where in the 59 pages the bits you want are, and so have to trawl through them all…again and again and again. Or save each and every page using “file->save as” etc, which is pretty laborious and still doesn’t give you much in the way of useful searching or browsing.

Reading various threads, it became apparent that I was not the only one wanting a more accurate search of long threads.

So here is my answer, which gives you:

  • A fully browsable, text-searchable, local archive of whatever thread you choose. Enter your search word(s) and get the pages you want!
  • The ability to combine all the pages into one or more documents, which can then be printed out.
  • With a bit of Excel trickery, bulk auto download of all the pages in a Prodigy-Pro forum thread.

Required:

Firefox web browser http://www.mozilla.com

The Firefox ScrapBook extension from https://addons.mozilla.org/firefox/427/ or the homepage http://amb.vis.ne.jp/mozilla/scrapbook/

 

I won’t go into installing Firefox and ScrapBook other than to say tools->extensions is a good place to start in Firefox.

Once ScrapBook is installed, for any page open in Firefox, you can simply right click and go “Capture Page”, which makes a copy in ScrapBook. Pretty easy. You can use the "Capture As" option to define the depth you want to follow links to and the types of files you want to download.

 

To see your captured pages, in Firefox, open ScrapBook, and you will get a panel on the left hand side with the pages you have captured, which you can trivially browse and search.

 

 

Here’s a shot of my search for “polyester”, which you can see returned 4 pages from the 59 captured from the “All Things G1176” thread. You can also see on the left the index of pages that have been captured, which are browsable simply by clicking on them. Those of you who have despaired at your search always returning a long thread can now rejoice and get to what you want :-) Your search terms will be highlighted.

Also, if you consult the ScrapBook help, you will see that it is capable of boolean searches, so you can "and", "or" and "not" both words and phrases.

 

 

Worst case – with 59 manual “captures” you can have a nicely searchable, browsable archive of the thread. Read on to see how even that can be made easier.

 

Now some tricks:

 

  1. Auto downloading lots of pages

ScrapBook can grab URLs you have on the clipboard, so you can copy a list of URLs from a text file and capture the lot, avoiding the 59 manual captures needed for the G1176 thread. Here is how:

The 1176 thread URL works like this

http://www.prodigy-pro.com/forum/viewtopic.php?t=646 (page 1)

Each page's URL then carries a “start=” number that goes up by 15 per page, so you get a series of URLs like this:

 

http://www.prodigy-pro.com/forum/viewtopic.php?t=646&postdays=0&postorder=asc&start=0 (p1)

http://www.prodigy-pro.com/forum/viewtopic.php?t=646&postdays=0&postorder=asc&start=15 (p2)

http://www.prodigy-pro.com/forum/viewtopic.php?t=646&postdays=0&postorder=asc&start=30 (p3)

http://www.prodigy-pro.com/forum/viewtopic.php?t=646&postdays=0&postorder=asc&start=45 (p4)

 

In Excel it is dead easy to enter in the first few, then drag down 59 rows to increment each by 15 and you have all the URLs required. Copy that into the ScrapBook->tools->capture multiple URLs and you can grab the lot!

Other threads on Prodigy-Pro work exactly the same way, just with a different base URL.
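If you would rather not use Excel, the same list can be generated with a few lines of script. This is only a minimal sketch in Python, assuming the URL pattern shown above holds for every page; the function name `thread_page_urls` and its parameters are my own, not part of ScrapBook or the forum.

```python
from urllib.parse import urlencode

# Base address and step taken from the example URLs above.
BASE_URL = "http://www.prodigy-pro.com/forum/viewtopic.php"
POSTS_PER_PAGE = 15  # how much "start=" grows from one page to the next


def thread_page_urls(topic_id, page_count, posts_per_page=POSTS_PER_PAGE):
    """Return one URL per page of a thread, in page order (hypothetical helper)."""
    urls = []
    for page in range(page_count):
        query = urlencode({
            "t": topic_id,
            "postdays": 0,
            "postorder": "asc",
            "start": page * posts_per_page,  # 0, 15, 30, ...
        })
        urls.append(f"{BASE_URL}?{query}")
    return urls


if __name__ == "__main__":
    # 59 pages of the G1176 thread (t=646), one URL per line
    print("\n".join(thread_page_urls(646, 59)))
```

Copy the printed lines to the clipboard and carry on with ScrapBook exactly as for the Excel list. For another thread, change the topic number and the page count.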

After copying the 59 cells from Excel, ScrapBook detects them from the clipboard and loads them all up. Select your target folder and just go tools->“Capture All Tabs”:

 

  2. Saving to particular directories.

This is a bit confusing initially. If you right click->capture, you have the option of making a new folder to save to. However, this is only a new folder under the default root directory, not an arbitrary location of your choice. Initially I got around this by going to the ScrapBook->tools->settings->advanced section of ScrapBook and changing the root directory. Or, after capturing what you want, you can go ScrapBook->tools->export and move your captured pages somewhere else.

But even better - now I'm getting the hang of ScrapBook more - is to create multiple scrapbooks with their root folders in whatever location you want. Then to save there, you select the scrapbook of choice and capture to it.

But to do this, first you must go into the advanced tab of the tools->settings and "Enable Multi-ScrapBook" - it is not on by default:

When enabled, you get a new dropdown list of your different scrapbooks, plus the "configure" option which lets you add new ones in different root locations:


 

  3. Combining captured pages

I like this. For example, I have turned the 59 pages of the G1176 thread into 6 documents of about 10 pages each, which I printed out to read. Go to ScrapBook->tools->combine, select the captured pages you want to combine, and select a location to save the result to. I made another folder for combined documents, and did not tick the “delete after combine” box, in case something went wrong. It can take a while to process, but worked fine.
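To plan which captured pages go into which combined document, the split can be worked out ahead of time. This is just an illustrative Python sketch of the arithmetic, assuming consecutive groups of 10 pages; `page_groups` is a made-up helper and has nothing to do with ScrapBook itself.

```python
def page_groups(page_count, pages_per_document):
    """Split pages 1..page_count into consecutive (first, last) ranges."""
    groups = []
    for first in range(1, page_count + 1, pages_per_document):
        # The final group may be shorter than the others.
        last = min(first + pages_per_document - 1, page_count)
        groups.append((first, last))
    return groups


# 59 captured pages, 10 pages per combined document
for number, (first, last) in enumerate(page_groups(59, 10), start=1):
    print(f"Document {number}: pages {first}-{last}")
```

For 59 pages this gives six documents, the last one covering pages 51-59.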

Combining the first 10 pages of G1176:

 

  4. Refreshing ScrapBook’s cache

For a while I couldn’t figure why I was only getting results from 3 pages. This fixed it:

ScrapBook->magnifying glass in the search bar->tools->update cache for full text search.

 

ScrapBook has lots of other good features I won’t go into (e.g. inserting your own comments and notes into the pages), but my hope is that this will help those of us who want to get to the detail in large threads easily.