Migrating From Blogger to Hugo

For a while now I had been meaning to migrate my (not recently updated) food blog from Blogger to a static site. This week I finally got around to doing it!

It turned out to be relatively straightforward using Blogger's XML export facility and a handy script I found on the Internet, blog2md, to convert this XML into Markdown.

This gives you a set of files that you can use immediately in Hugo, but all the image links still point at the Blogger CDN, and there were a few hard-coded links to the old Blogger site. I also wanted to remove all of the Amazon affiliate links and tracking images. Of course I wrote a Guile script to do this extra clean-up...

The rest of this post outlines the process, from the Blogger export to a published Hugo site.

Export content from Blogger

I began by following these instructions to export the blog posts to XML.

Create a new Hugo site

I already had Hugo installed, so I just ran:

hugo new site start-again-at-zero

I decided to use the fairly minimal Congo theme for the blog, which I installed as a git submodule:

cd start-again-at-zero
git init .
git submodule add -b stable https://github.com/jpanther/congo.git themes/congo

Next I removed the generated hugo.toml file and copied the Congo configuration into place:

rm hugo.toml
cp -r themes/congo/config .

...and made a few small changes to the configuration for my site. The most important thing is to add theme = "congo" to config/_default/config.toml, but I also changed the base URL and blog name, enabled recent posts on the home page, and enabled search.
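As a rough sketch of the theme setting, the snippet below adds the line only when it is missing. It assumes the copied config/_default/config.toml has no active top-level theme key yet; ensure_theme is a made-up helper name, not something Hugo or Congo provides. The line goes at the top of the file because a top-level TOML key has to appear before any [table] header.

```shell
#!/bin/sh
# ensure_theme is a hypothetical helper, not part of Hugo or Congo.
# It puts `theme = "congo"` at the top of a TOML config file unless an
# uncommented theme key is already present.
ensure_theme() {
  config_file=$1
  # An existing, uncommented theme line means there is nothing to do.
  if grep -Eq '^[[:space:]]*theme[[:space:]]*=' "$config_file"; then
    return 0
  fi
  tmp_file=$(mktemp) || return 1
  # Top-level keys must come before the first [table], so prepend.
  { printf 'theme = "congo"\n'; cat "$config_file"; } > "$tmp_file" &&
    cat "$tmp_file" > "$config_file"
  status=$?
  rm -f "$tmp_file"
  return "$status"
}

# Usage, from the site root:
#   ensure_theme config/_default/config.toml
```

Running it a second time leaves the file unchanged, so it is safe to keep in a setup script.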

Convert the Blogger export to Markdown

Install the blog2md utility:

cd ..
git clone https://github.com/palaniraja/blog2md.git
cd blog2md
# Install dependencies
npm install

Run the Markdown conversion:

node index.js b ~/Downloads/blog-08-27-2024.xml ../start-again-at-zero/content/posts/

At this point you have a working Hugo site that you can test locally by running:

cd ../start-again-at-zero
hugo serve

Additional clean-up

As I mentioned earlier, the images and links in the Markdown are still pointing to Blogger, and the content contains some Amazon affiliate links and tracking that I don't want to keep. I wrote a Guile script that updates the Markdown files to:

  • Remove Amazon affiliate tracking images.
  • Remove Amazon affiliate links.
  • Download images from Blogger to the Hugo static/ folder and replace the image source with a relative link.
  • Replace links to images in Blogger with a relative link, downloading the image if necessary.
  • Replace internal links to Blogger with relative links to the new blog.

It's a bit of an ugly script, relying a lot on regular expressions, but it got the job done.
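The Guile script itself is not reproduced here. To give a feel for the kind of regular expressions involved, here is a rough sed stand-in for two of the steps above. It is a sketch, not the script I used, and it rests on several assumptions: that blog2md emits Markdown image and link syntax of the form ](url), that the tracking images come from an amazon-adsystem.com host, that Blogger images live under bp.blogspot.com or blogger.googleusercontent.com, and that local copies will sit in /images/ (served from static/images/). The function names are invented for illustration, and nothing is downloaded.

```shell
#!/bin/sh
# Both functions are hypothetical filters: Markdown in on stdin,
# cleaned Markdown out on stdout.

# Delete Markdown images whose URL points at the assumed tracking host.
strip_tracking_images() {
  sed -E 's#!\[[^]]*\]\(https?://[^)]*amazon-adsystem\.com[^)]*\)##g'
}

# Rewrite image sources and link targets on the assumed Blogger hosts
# to /images/<file name>, keeping only the last path segment.
localise_image_urls() {
  sed -E 's#\]\(https?://[^)/]*(bp\.blogspot\.com|blogger\.googleusercontent\.com)/[^)]*/([^)/]+)\)#](/images/\2)#g'
}

# Usage:
#   strip_tracking_images < post.md | localise_image_urls > post.clean.md
```

Keeping only the file name means two different images called, say, photo.jpg would collide, so a real run would need to check for duplicates before downloading.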

I'm still relatively new to Guile and made use of a couple of new (to me) libraries in this script. The first was SRFI 26, which was recommended by someone on the #guile IRC channel. It provides a cut macro that is similar to, but more flexible than, Clojure's partial. For example, the scandir function takes an optional select? argument to filter the returned files, and we can use cut to build a select function:

(cut string-suffix? ".md" <>)

The equivalent long-hand would be:

(lambda (x) (string-suffix? ".md" x))

We can use the <> placeholder in any argument position, while Clojure's partial only allows us to specify the leading arguments. To build an equivalent function in Clojure, we would have to use the #() reader macro:

#(clojure.string/ends-with? % ".md")

The second library that came in handy was SRFI 197, which provides pipeline operators similar to Clojure's -> (thread first) and ->> (thread last) macros, but again more flexible, as you can specify where the placeholder goes. I used this to chain together a number of document transforms:

(define (process-file path)
  (format #t "process-file ~a~%" path)
  (let ((doc (chain (call-with-input-file path get-string-all)
                    (remove-amazon-tracking-images _)
                    (remove-amazon-links _)
                    (replace-images _)
                    (replace-image-links _)
                    (replace-self-links _))))
    (call-with-output-file path (cut put-string <> doc))))

This function also shows another use of cut, this time creating a lambda to pass to call-with-output-file.
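Once the clean-up has run over content/posts, a quick search can confirm that nothing still points at the old hosts. The following is a minimal sketch rather than part of the Guile script: the host patterns are guesses at what a Blogger export contains, and list_remaining_refs is a made-up name.

```shell
#!/bin/sh
# Host patterns are assumptions about what the export contains;
# extend the list to match your own content.
OLD_HOSTS='blogspot\.com|blogger\.googleusercontent\.com|amazon-adsystem\.com|amzn\.to'

# Print every Markdown file under a directory that still mentions one
# of the old hosts. No output means nothing was found.
list_remaining_refs() {
  posts_dir=$1
  find "$posts_dir" -type f -name '*.md' \
    -exec grep -lE "$OLD_HOSTS" {} + | sort
}

# Usage, from the site root:
#   list_remaining_refs content/posts
```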

Publishing the Blog

My domain and web site are already hosted by Mythic Beasts, so I simply headed over to their control panel to configure web hosting for a new subdomain. I added shell access to my hosting package so I can publish content using rsync:

# Generate the static site
hugo
# Push to the Mythic Beasts hosting server
rsync -rvct --delete public/. \
    bobcat.mythic-beasts.com:www/start-again-at-zero.1729.org.uk

In case you don't have shell access, they also support upload by SFTP or (not recommended) FTP.
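Because --delete removes anything on the server that is not in public/, a dry run before the real transfer is a cheap safety check. Here is one way the two publishing commands could be wrapped. The publish function and its "live" argument are invented for this sketch; only the hugo and rsync invocations come from the commands above, plus rsync's standard --dry-run option.

```shell
#!/bin/sh
# publish is a hypothetical wrapper around the two publishing commands.
# Usage: publish DEST [live]
publish() {
  dest=$1
  mode=${2:-dry-run}
  # Regenerate the site into public/ first; stop if the build fails.
  # Note that this step always writes to the local public/ directory.
  hugo || return 1
  if [ "$mode" = "live" ]; then
    rsync -rvct --delete public/. "$dest"
  else
    # --dry-run lists what would be copied or deleted without doing it.
    rsync -rvct --delete --dry-run public/. "$dest"
  fi
}

# Usage:
#   publish bobcat.mythic-beasts.com:www/start-again-at-zero.1729.org.uk
#   publish bobcat.mythic-beasts.com:www/start-again-at-zero.1729.org.uk live
```

Defaulting to the dry run means a mistyped destination shows up in the output before anything is deleted.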

That's it! The site is now live at https://start-again-at-zero.1729.org.uk.