Backing up HubPages hubs to your computer

By Pcunix


I hope you have backup copies of all your hubs. You never know what could happen. While I'm sure HubPages backs up its servers regularly, it's ultimately your content that is at risk, so you need to be certain that you can resurrect your hubs if they are lost.

You might also decide at some later date that you want your hubs published somewhere other than HubPages. That's not a common decision, and not something I anticipate that I would ever do, but once again, you just never know. You've made an investment: protect it.

I'm going to present a number of ways to back up your hubs, from very simple to much more advanced. While I am dealing specifically with HubPages here, some of these methods could be useful for other sites that you write for.

We'll start with the simplest method of all: a "Save As" from your browser.



Save As


Almost all browsers offer some way to save a local copy of any web page you are viewing. Your browser may also offer multiple ways to save the page; for example, here is Firefox offering four different options after I clicked on File->Save Page. These choices are explained at "Saving a web page" in the Firefox support documents.


Save dialog in Firefox


Usually, you'd choose either the first or the second of those choices. The first stores an exact copy of the page, but will not save any referenced pictures. The second saves the pictures, but also changes the structure of the links that point at those pictures so that the local copies will be used if you view the local page (File->Open).

Your browser may only offer one or two choices - for example, Chrome, a popular alternative to Firefox, only offers the first two. Consider that any archive of your hubs is better than none at all!

Saving backups this way is handy, but it is clumsy if you want to take fresh copies of all your hubs regularly so that new comments and other changes you have made are preserved locally. If you have several hundred hubs, it would also be quite time consuming.

There is another slight disadvantage in that you may be saving much more than you need. Both Firefox and Chrome store every JavaScript file associated with a page in addition to the pictures. All of this is stored in a separate directory for each page, so those scripts and any reused pictures are duplicated. That wastes disk space, and you may not care about the JavaScript files at all as they are specific to HubPages.

It would be nice to have some automated method to save backups, wouldn't it?

Which hubs?


If we're going to automate this, the first thing we need is a list of the hubs to save. Fortunately, HubPages makes that fairly easy for us. You may never have noticed this, but your HubPages Statistics page has an "Export to CSV" option. The yellow arrow in the picture below shows where you would click to get that.


Export hubs


That will save a "Comma Separated Value" file to your computer. You can open that in any spreadsheet program (Excel, OpenOffice or another) and, as you can see in this picture, it handily includes the actual URLs of your hubs.


hubs.csv in spreadsheet

Once we have a list of URLs, automating a backup of your HubPages hubs becomes much easier. For example, on Mac OS X, I could simply open a Terminal window and run

# read one URL per line and fetch each one into the current directory
while read -r url
do
curl -OL "$url"
done < mylistofhubs


to get every URL copied locally. On Linux, I'd probably use "wget" instead (and I could install that on Mac or Windows also). Note that "mylistofhubs" is NOT the hubs.csv we downloaded - it's a separate text file, possibly created from that download or even by hand.
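
If you would rather not build "mylistofhubs" by hand, something like the following shell sketch could pull the URLs out of the export. It assumes that each data row of hubs.csv begins with the quoted hub URL (the same layout the Perl scripts in this article rely on), and it skips rows containing "Not Published" on the assumption that those hubs cannot be fetched. The function name extract_hub_urls is only an illustration; check the result against your own file before trusting it.

```shell
#!/bin/sh
# Sketch: print the hub URLs found in an exported CSV, one per line.
# Assumes each data row starts with the quoted URL, for example:
#   "http://hubpages.com/hub/Some-Hub","Some Title",...
extract_hub_urls() {
    # 1. keep only rows that contain a URL (this drops the header row)
    # 2. drop rows marked as not published
    # 3. strip the opening quote, then everything from the next quote onward
    grep 'http:' |
    grep -v 'Not Published' |
    sed -e 's/^"//' -e 's/".*$//'
}

# Usage: sh extract_hub_urls.sh ~/Downloads/hubs.csv
# (the script file name is hypothetical; output goes to ./mylistofhubs)
if [ "$#" -ge 1 ]; then
    extract_hub_urls < "$1" > mylistofhubs
fi
```

The resulting file has one URL per line, which is the form the curl loop above expects.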

Windows also has "VisualWget". You can paste your list of hubs into its Multiple Downloads list - I show that with just two URLs below, but you could put your entire list in.


VisualWget


Perl Scripts


I have mentioned Perl in some other articles here (see "Perl Scripts for Adsense", for example). Perl is included with Linux and Mac OS X, and can easily be added to Windows.

We can use Perl to automate this whole procedure from nothing more than the saved "hubs.csv" file.

Our first script only downloads the HTML files. It's short and simple:

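#!/usr/bin/perl
use LWP::Simple;
# create the output directory on the first run
mkdir "Hubpages" unless -e "Hubpages";
# change this path to wherever your hubs.csv was saved
open(HUBS,"/Users/apl/Downloads/hubs.csv") or die "No hub list!";
@hubs=<HUBS>;
close HUBS;
foreach (@hubs) {
   # skip the header and any other line without a URL
   next if not /http:/;
   # skip hubs that are not published
   next if /Not Published/;
   ($url,$title,$junk)=split /","/;
   # a little cleanup
   $title=~s/"//g;
   $url=~s/"//g;
   # strip to basename
   $out=$url;
   $out=~s?http://hubpages.com/hub/??;
   print " Fetching $title\n";
   $content = get $url;
   # get returns undef on failure; skip rather than write an empty file
   if (not defined $content) { warn "Could not fetch $url\n"; next; }
   open(OUT,">Hubpages/$out") or die "Can't create $out $!";
   print OUT $content;
   close OUT;
}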

This creates a directory "Hubpages" and simply downloads each of your hubs there.

That doesn't get our images, though, so a slightly more complicated script is what I use. This will require downloading the WWW::Mechanize module from CPAN.

The CPAN client is included with Perl. Unfortunately, Windows is not a friendly environment for this. You CAN use CPAN on Windows; it's just a little more difficult. On a Mac or Linux system, this can be as simple as typing "install WWW::Mechanize" within the CPAN shell. Yet another reason to prefer Macs over Windows.

The script is admittedly a bit more advanced:

#!/usr/bin/perl
use WWW::Mechanize;
# one agent for the pages, another for the images
my $mech = WWW::Mechanize->new();
my $pageimage = WWW::Mechanize->new();
mkdir "Hubpages" unless -e "Hubpages";
mkdir "Hubpages/Images" unless -e "Hubpages/Images";
# change this path to wherever your hubs.csv was saved
open(HUBS,"/Users/apl/Downloads/hubs.csv") or die "No hub list!";
@hubs=<HUBS>;
close HUBS;
foreach (@hubs) {
   next if not /http:/;
   # skip hubs that are not published
   next if /Not Published/;
   ($url,$title,$junk)=split /","/;
   $title=~s/"//g;
   $url=~s/"//g;
   $out=$url;
   $out=~s?http://hubpages.com/hub/??;
   open(OUT,">:utf8","Hubpages/$out") or die "Can't create $out $!";
   print " Fetching $url\n";
   $mech->get($url);
   print OUT $mech->content();
   close OUT;
   foreach my $link ($mech->images) {
      $image=$link->url;
      # only fetch images whose URL contains hubimg.com
      next if $image !~ /hubimg\.com/;
      # keep just the file name
      $imagesave=$image;
      $imagesave=~s/.*\///;
      $imagesave="Hubpages/Images/$imagesave";
      # already saved for an earlier hub or by an earlier run
      next if -e $imagesave;
      print "\tFetching $image\n";
      $pageimage->get($image);
      $pageimage->save_content($imagesave);
   }
}


This saves images, but to avoid duplication, it stores them all in one directory (Hubpages/Images).

Many of the images will not be ours, of course. You can see that in this snap from my computer. It would be possible to parse the page and only download those pictures that are in fact ours, but that's a much more complicated script and would be very specific to HubPages. As it is, this script could be used for any list of pages, not just Hubs.


Pictures from HubPages
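
After a run of either script, it can be reassuring to confirm that every hub actually arrived. The shell sketch below is one rough way to check. It assumes you have a plain list of URLs such as "mylistofhubs" and that each saved page is named after the last part of its URL, as the scripts above do. The function name list_missing_hubs and its arguments are only illustrations.

```shell
#!/bin/sh
# Sketch: print every URL from a list that has no non-empty file in the backup directory.
# list_missing_hubs LISTFILE [DIR]   (DIR defaults to Hubpages)
list_missing_hubs() {
    list=$1
    dir=${2:-Hubpages}
    while read -r url
    do
        # ignore blank lines in the list
        [ -n "$url" ] || continue
        # the saved file is named after the last path component of the URL
        name=${url##*/}
        # -s is true only when the file exists and is not empty
        [ -s "$dir/$name" ] || echo "$url"
    done < "$list"
}

# Usage: sh list_missing_hubs.sh mylistofhubs Hubpages
# (the script file name is hypothetical)
if [ "$#" -ge 1 ]; then
    list_missing_hubs "$@"
fi
```

No output means every listed hub has a saved page; any URL printed is one to fetch again.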

I hope this gives you some ideas about getting backups of your pages. There are many other things we could do that would be HubPages specific. For example, we could parse the page looking for the specific text, photo and other modules we created and only save those. Again, that's a much more complicated script, but it might be worth the trouble.
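
A much simpler addition is to keep older runs. Both Perl scripts overwrite each hub's saved page every time they run, so if you want to be able to look back at an earlier version, you might pack the directory into a dated archive after each run. This is only a sketch: the function name archive_backup and the archive naming are my assumptions, and it relies on a tar that supports gzip (-z), which the ones shipped with Mac OS X and Linux do.

```shell
#!/bin/sh
# Sketch: pack a backup directory into DESTDIR/<name>-YYYY-MM-DD.tar.gz
# archive_backup [DIR] [DESTDIR]   (defaults: Hubpages and the current directory)
archive_backup() {
    dir=${1:-Hubpages}
    dest=${2:-.}
    stamp=$(date +%Y-%m-%d)
    archive="$dest/$(basename "$dir")-$stamp.tar.gz"
    # -C switches to the parent directory so the archive stores relative paths
    tar -czf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")" &&
        echo "$archive"
}

# Usage: sh archive_backup.sh Hubpages ~/Backups
# (the script file name and ~/Backups are hypothetical)
if [ "$#" -ge 1 ]; then
    archive_backup "$@"
fi
```

Running it twice on the same day replaces that day's archive, so this keeps at most one snapshot per day.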

Some people have noted that they create their hubs in Word or Notepad or whatever and that is their backup. Perhaps so, but that may not include edits you made later and won't show you where and how you inserted pictures and so on. It also will not include comments!  Be safe: secure your pages with a local copy.





Comments

Sophia Angelique Level 6 Commenter 16 months ago

Hot hub!!! As soon as I have time, I'm going to do this!!! Thanks.

prettydarkhorse Level 2 Commenter 16 months ago

Thank You I will share this at facebook. Bookmarked it.

MrKnowledge 16 months ago

Great Hub! I write all of my hubs in notepad before I post them, so I have .txt files of all of my hubs. As for the pictures, I have local copies of them, as well, if they aren't stock images that I can just pull down at any point in time

Pandoras Box 16 months ago

Great info, PC! Thanks for posting! And thank you pdh for the link!

Pcunix Hub Author 16 months ago

I write mine in Notepad too (well, the Mac equivalent) but as I often edit in place after, I need the backup.

ns1209 16 months ago

Wow this is awesome and I will do this. I will use the export to csv option. And try and work out this Perl code!

Pcunix Hub Author 16 months ago

Good for you - are you using Windows? If so, you might want to stick with VisualWget (or the command line wget) if you aren't adventurous.

Don Simkovich Level 4 Commenter 16 months ago

I'm bookmarking this. I write all my Hubs in Word documents and save those, of course. I want to try this simply for the sake of learning new ways to use the computer.

Pcunix Hub Author 16 months ago

Keep in mind that these methods also save the comments on a page.

barryrutherford Level 5 Commenter 16 months ago

Pcunix

It will take me time to digest this, as I'm not used to backing up other than a simple Word doc. Great information, thanks, you're a gem!

vicki simms 16 months ago

This is great advice I am going to back everything up, like everyone else I have also bookmarked the page :)

Pcunix Hub Author 16 months ago

I'm glad this has served to remind a few people to do this!

Pcunix Hub Author 16 months ago

I had asked in the forums about suggested tags for this and got some good suggestions: Store, secure, archive, protect hubs, back-up (with a dash) hubs.

I explained there that this is not for outside traffic - it is strictly pro bono for hubbers.

LillyGrillzit 16 months ago

Thank you Pcunix, with your many years experience, it is sure that this advice is saving many of us tons of grief. Thank you for the explanation. Bookmarked!

GmaGoldie Level 7 Commenter 16 months ago

Pcunix,

Great Hub!

As far as tags..."Online back solution" and "online backup" come up on the keyword tool - is this competition or appropriate? Also "archiving files".

If I do this - then can I save the file to my laptop and work offline on revising it?

Pcunix Hub Author 16 months ago

No, you cannot work offline because HP has no method for you to put your file(s) back.

"Online backup" has an entirely different meaning - nothing at all to do with this. That would be you backing up your computer to some Internet storage.

simeonvisser 16 months ago

That's a good idea, I was doing it manually so far after publishing a hub. Automating it is a good idea and this also includes updates made later to the hubs. I'll have to take making backups more seriously in the long run.

tonymac04 16 months ago

I think it is a very good idea to backup my Hubs and to automate the process. Just not sure that I have the requisite skills to do it! Will have to see if I can follow your advice here.

Thanks so much.

Love and peace

Tony

Autumn Lynn 16 months ago

Great information. It will take me a few reads to figure it all out but it certainly seems to be the best way to secure our hubs. Thanks again.

mega1 Level 3 Commenter 16 months ago

thanks PC!

Manna in the wild Level 3 Commenter 16 months ago

These are good ideas and it's a very good strategy to automate the backups.

viking305 Level 6 Commenter 16 months ago

I too only use a Microsoft Word document to back up my hubs. Now that you have shown me a better way to do it in this hub, I will certainly be doing it that way.

Thanks for a very interesting and informative hub.

Tammy L Level 1 Commenter 16 months ago

Another great idea and something else I need to tinker with. Thanks a bunch. :)

rich_hayles Level 1 Commenter 16 months ago

Excellent ideas.

I never thought to back up my hub details in a spreadsheet before, off to start doing it now.

Pcunix Hub Author 16 months ago

The spreadsheet isn't a backup of the hubs. I expect you understand that, but I need to make that clear in case someone else doesn't.

Maria Cecilia Level 4 Commenter 16 months ago

I haven't realized the possibilities until I read this, you are right, we must always have a back up copy of our works.. wish I can do this soon..

oceansnsunsets Level 7 Commenter 16 months ago

I think the idea of saving a hub to some sort of back up is a wise idea. Thank you for sharing the options. Great hub.

Taleb80 Level 4 Commenter 16 months ago

Thank you for the advice.

I voted "Awesome"

Have a good day.

DzyMsLizzy Level 7 Commenter 15 months ago

Hmmm... I usually copy/paste into Word, save in a folder "Hub Pages Articles" and in the same folder, archive the photos that go with the articles.

Sometimes, I do it the other way around, and write the copy first in Word, and copy/paste it into HP capsules. ;-)

Scripts, I'm not even going to attempt--way too advanced for my skill level.

Then about once a week, I run a backup for "new and changed files" to be archived to my USB external hard drive...which is NEVER plugged in while I'm browsing the 'net.

Pcunix Hub Author 15 months ago

It's a lot easier this way :)

nicomp Level 6 Commenter 14 months ago

File / Save As is a great idea. It's a backup, after all, and if it's a little difficult to deal with, well, it's a backup, after all.

Pcunix Hub Author 14 months ago

Well, yes. Having it is better than not having it.

Howard S. Level 2 Commenter 14 months ago

OK, suppose I've backed-up pages and then want to republish one that has been unpublished and deleted. Do any of these methods circumvent redoing all the capsule work?

If you need a scenario, I changed my initial identity. I copied 15-20 hubs with Word and then did all that work again in the present identity.

Pcunix Hub Author 14 months ago via iphone

No, these don't help that.

sofs Level 7 Commenter 13 months ago

Great information here, I have bookmarked your hub for future reading and reference.

TheMagician Level 2 Commenter 6 months ago

Question: If you do the first option, assume the website were to go off the web... would you still have that web page saved even though the site may not exist anymore?

Pcunix Hub Author 6 months ago

Yes. It is saved on your computer.

onlinecashdigest Level 1 Commenter 2 months ago

You have provided a very important info for all Hubbers. Since we earn through our own hubs....any digital disaster would affect our income.

Thanks for the reminder.

RichardPac Level 2 Commenter 2 weeks ago

I see you're using a Mac, there is a much simpler way to accomplish this:

http://richardpac.hubpages.com/video/How-To-Easily

molometer Level 8 Commenter 13 days ago

Handy tips on backup options.
