Recently the torrent download seems to have picked up again. So here are some hints for the other 20 users who are trying to get the full Geocities Torrent as well (all assuming you are running some sort of Unix and want to serve the Geocities copy through a web server):

Decrunching

The torrent directories UPPERCASE, LOWERCASE and NUMBERS contain a huge number of archives. Look into these directories to find out which archives don’t have files that end in “.part”: these are already completely downloaded and can be decrunched. Use the following command to create a list of these archives:

$ find . -iregex ".*geocities-[a-j].*001" > list.txt

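The regular expression matches the first volume (ending in 001) of every archive whose name starts with one of the letters a to j, ignoring case because of -iregex. Adjust the letter range to whatever has finished downloading on your machine. Then run this Perl script in the same directory: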

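use strict;
use warnings;
use IO::Handle;

open my $list, '<', 'list.txt' or die "Cannot read list.txt: $!";
open my $log, '>', 'log.txt' or die "Cannot write log.txt: $!";
$log->autoflush(1);    # write each log line immediately so tailing works

while (my $archive = <$list>) {
    chomp $archive;
    # drop the last 7 characters (e.g. ".7z.001") to get the tar file name
    my $tarname = substr($archive, 0, -7);
    print `7z x -y "$archive"`;
    print `tar -xf "$tarname"`;
    print `rm "$tarname"`;
    print $log "$archive\n";
}
close $list;
close $log;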

You can monitor the progress by tailing log.txt (tail -f log.txt); each archive is logged once it has been processed.
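The “.part” check described at the start of this section can also be scripted. The following is only a sketch, assuming your torrent client marks unfinished volumes by appending “.part” to the file name (for example geocities-a.7z.002.part); the function name list_complete is made up for illustration. It cannot detect volumes that have not started downloading at all, so still have a look at the directories yourself.

```shell
# list_complete DIR: print the first volume (*.001) of every archive
# below DIR that has no volume still carrying a ".part" suffix.
list_complete() {
    find "$1" -type f -name '*.001' | sort | while IFS= read -r first; do
        base=${first%.001}
        # If any sibling volume ends in ".part", the archive is incomplete.
        if ls "$base".*.part >/dev/null 2>&1; then
            continue
        fi
        printf '%s\n' "$first"
    done
}
```

Running list_complete . > list.txt in the torrent directory would produce a list in the same format as the find command above.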

Serving

The torrent’s directory www.geocities.com contains a lot of file links (symbolic links) into the YAHOOIDS directory decrunched before. Unfortunately, these links use absolute paths, so they make assumptions about where you put the torrent data in your file system. For example

1969bronco -> /geocities/YAHOOIDS/1/9/1969bronco

means that there should be a folder named “geocities” at the root of your file system. LOL WUT! Relative file links would have been a better option. To fix them, first cd into www.geocities.com and save the absolute file links in a list:

$ ls -l > list.txt

Then run the following perl script:

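use strict;
use warnings;

open my $list, '<', 'list.txt' or die "Cannot read list.txt: $!";
while (<$list>) {
    # only touch lines like "1969bronco -> /geocities/YAHOOIDS/1/9/1969bronco"
    if (/(\S+) -> \/geocities(\S+)/) {
        print `rm -v "$1"`;
        print `ln -sv "..$2" "$1"`;
    }
}
close $list;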

The resulting file links are relative to the www.geocities.com directory and look like this:

1969bronco -> ../YAHOOIDS/1/9/1969bronco
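If you would rather not parse the output of ls -l, the same rewrite can be sketched in plain shell by reading each link target with readlink. This is an alternative under assumptions, not the method used above: the function name relink and its optional second argument are made up for illustration, and readlink is assumed to be available (it is on common Linux and BSD systems).

```shell
# relink DIR [PREFIX]: turn absolute links in DIR that start with PREFIX
# (default: /geocities) into relative links of the form ../rest/of/path.
relink() {
    dir=$1
    prefix=${2:-/geocities}
    for link in "$dir"/*; do
        # Skip regular files and directories; only symbolic links are rewritten.
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        case $target in
            "$prefix"/*)
                rel=..${target#"$prefix"}
                # Remove first so ln never follows the old link into a directory.
                rm "$link" && ln -s "$rel" "$link"
                ;;
        esac
    done
}
```

Calling relink www.geocities.com from the directory that contains both www.geocities.com and YAHOOIDS leaves links with other targets untouched.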

A last hint: You can speed up the loading of the www.geocities.com index in your web browser by navigating to its address on your web server, waiting once for the index to be generated, and then saving the generated HTML as “index.html” in the www.geocities.com directory.

Browsing

Firefox is recommended to browse the contents of the Geocities Torrent, because it supports userscripts. Many URLs on Geocities pages point to absolute locations that no longer exist. The Geocities Torrent Link Fixer userscript changes the URLs in links, framesets, images and background images to point to your local copy. Inside the script, change the base URL “localhost/geocities/” according to your hosting setup.

??¿

If you have questions please use the comments.

