
title: Bulk importing of html links into WordPress
link: https://onemoretech.wordpress.com/2011/08/14/bulk-importing-of-html-links-into-wordpress/
author: lembobro
description:
post_id: 804
created: 2011/08/14 03:30:41
created_gmt: 2011/08/14 03:30:41
comment_status: closed
post_name: bulk-importing-of-html-links-into-wordpress
status: publish
post_type: post

Bulk importing of html links into WordPress

After getting most of the "easy" stuff out of the way (converting FlatPress articles to WordPress posts in two of my blogs, this one and another), I reached the point where I had to bulk import some HTML links into a WordPress blogroll. One thing I didn't know before I started was that WordPress will only bulk import links that have been formatted in OPML (Outline Processor Markup Language). Since I'd never heard of OPML, this discovery raised my blood pressure a little. I finally dove in and, after a couple of hours of programming time (mostly interrupted by my kids), came up with the following little Perl script, which uses WWW::Mechanize to parse out the HTML link data and XML::OPML to write a file in the correct format. `

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use XML::OPML;

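my $HOME = $ENV{'HOME'};
# Build the timestamp from UTC so the hardcoded +00:00 offset is accurate.
my($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time);
my $datetime = sprintf("%04d-%02d-%02dT%02d:%02d:%02d+%02d:%02d",$year + 1900, $mon + 1, $mday, $hour, $min, $sec, 0, 0);
# $HOME already begins with "/", so "file://" plus the path gives a file:/// URL.
my $url  = "file://$HOME/phils-links.html";
# To read from a web server instead, use an http URL:
# my $url   = "http://www.domain.com/webpage.html";
my $outfile = "$HOME/phils-links.opml";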

my $mech  = WWW::Mechanize->new();
$mech->get( $url );

my $opml = XML::OPML->new(version =>"1.1");
$opml->head(
    title =>'Phils Links',
    dateCreated =>$datetime,
    dateModified =>$datetime,
    ownerName =>'Phil Lembo',
    ownerEmail =>'[email protected]',
);

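# Collect every link on the page as WWW::Mechanize::Link objects.
my @links = $mech->links();
my($text,$htmlurl);

# Add one outline entry per link; xmlUrl stays empty because these are plain links, not feeds.
foreach my $link (@links) {
   $text = $link->text();
   $htmlurl = $link->url();
   # text() can be undef (for instance, for links that do not come from <a> tags); fall back to the URL.
   $text = $htmlurl unless defined $text;
   $opml->add_outline(
       text =>$text,
       htmlUrl =>$htmlurl,
       xmlUrl =>""
   );
}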

$opml->save($outfile);

__END__

`
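
For anyone else who has never looked inside an OPML file, it is just a small XML document whose body is a list of outline elements. The following is a minimal sketch, using only core Perl, of roughly what each blogroll entry looks like once written out. The sample links and the `xml_escape` and `outline_for` helpers are made up for illustration, and the exact attribute order and whitespace that XML::OPML emits may well differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the links scraped from the HTML page.
my @links = (
    { text => 'Perl & CPAN', url => 'https://metacpan.org/' },
    { text => 'WordPress',   url => 'https://wordpress.org/' },
);

# Escape the characters that are special inside an XML attribute value.
sub xml_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Build one <outline> element with the same three attributes the script sets.
sub outline_for {
    my ($link) = @_;
    return sprintf '<outline text="%s" htmlUrl="%s" xmlUrl="" />',
        xml_escape($link->{text}), xml_escape($link->{url});
}

my @outlines = map { outline_for($_) } @links;

# Wrap the entries in a bare-bones OPML 1.1 envelope and print it.
print qq{<?xml version="1.0" encoding="UTF-8"?>\n};
print qq{<opml version="1.1">\n};
print qq{  <head><title>Sample Links</title></head>\n};
print qq{  <body>\n};
print qq{    $_\n} for @outlines;
print qq{  </body>\n};
print qq{</opml>\n};
```

Comparing this printout with the file the real script writes is a quick way to sanity-check the output before handing it to the WordPress importer.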

Copyright 2004-2019 Phil Lembo