Migrating WordPress to Gatsby

April 10th 2020

WordPress has dominated the web development landscape for the last 10 years, becoming the largest content management system in use and powering over 30% of websites. Although there are ways to optimize WordPress, such as paring back plugins, server-side caching, minifying files and serving front-end files through a CDN, WordPress is inherently slow compared to static site generators such as Gatsby.

There are many static site generators, but Gatsby has become a popular choice because it’s built on top of React. Its data layer also depends heavily on GraphQL, an emerging query language that is increasingly replacing traditional REST APIs. Since Gatsby sites are pre-rendered and statically generated, they are extremely fast compared to their monolithic content management system counterparts.

In this tutorial we will discuss the steps it takes to migrate your WordPress Blog to Gatsby.

Along the way we will be:

  • Creating a basic Gatsby starter site
  • Downloading JSON data from our WordPress Post JSON endpoint
  • Using gatsby-source-filesystem and gatsby-transformer-json to transform our JSON to GraphQL nodes
  • Downloading our WordPress Images to our Gatsby Project
  • Associating the Images to our GraphQL nodes
  • Using the createPages API from Gatsby to create our blog pages and defining the component we want to use for our posts

Setting Up Basic Gatsby Project

We can use the npx utility to create a basic Gatsby starter project. In the command line run:

npx gatsby new wordpress-to-gatsby

Once the install finishes, we can change into our directory and start the development server:

cd wordpress-to-gatsby && yarn develop

Now navigate to http://localhost:8000/ to see the starter site. Great! We have a basic Gatsby site installed. We can kill the development server and continue building out our site.

Creating Data Directory and Downloading WordPress JSON File

We are going to use the WordPress REST API to get the posts from our WordPress blog in JSON format and download them to our ./src/data directory. Let’s create the directory first: mkdir src/data && cd src/data. We can query the /wp-json/wp/v2/posts endpoint of our WordPress installation to get our posts as JSON. We want to call our output file Post.json because GraphQL type names are conventionally singular, and the filename is what our gatsby-transformer-json configuration will use as the type name. We can use the curl command to download our file:

curl -o Post.json "https://example.com/wp-json/wp/v2/posts?per_page=10"

Make sure you replace example.com with your WordPress URL. The URL is quoted so that the shell does not try to interpret the ? character. We are only downloading the 10 most recent posts, but you can change the per_page query parameter to download up to 100 posts per request. Now that we have our post content downloaded to ./src/data/Post.json we can install and configure the gatsby-transformer-json plugin. Make sure you are back in the project root directory (we changed into src/data earlier) and run:

yarn add gatsby-transformer-json
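
A note for larger blogs before we move on: since per_page is capped at 100, a single request cannot fetch everything on a blog with more posts than that. The WordPress REST API also accepts a page parameter and reports the number of available pages in the X-WP-TotalPages response header. Below is a minimal sketch of paging through the endpoint, assuming Node 18+ for the global fetch. The names buildPostsUrl and fetchAllPosts are hypothetical, and example.com is a placeholder.

```javascript
// Hypothetical helper: builds the URL for one page of posts.
const buildPostsUrl = (baseUrl, page, perPage = 100) =>
  `${baseUrl}/wp-json/wp/v2/posts?per_page=${perPage}&page=${page}`

// Hypothetical helper: requests pages until X-WP-TotalPages is reached.
// fetchImpl is injectable so the function can be tried without a network.
const fetchAllPosts = async (baseUrl, fetchImpl = fetch) => {
  const posts = []
  let page = 1
  let totalPages = 1
  do {
    const res = await fetchImpl(buildPostsUrl(baseUrl, page))
    if (!res.ok) throw new Error(`Request for page ${page} failed: ${res.status}`)
    // Fall back to a single page if the header is missing.
    totalPages = Number(res.headers.get('X-WP-TotalPages')) || 1
    posts.push(...(await res.json()))
    page += 1
  } while (page <= totalPages)
  return posts
}

// Possible usage (not run here):
// fetchAllPosts('https://example.com').then((posts) =>
//   require('fs').writeFileSync('src/data/Post.json', JSON.stringify(posts))
// )
```

The result is a single array, so it can be written to Post.json in the same shape the curl command produces.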

Now we can open up our gatsby-config.js to set configuration options for our transformer plugin:

module.exports = {
  ...
  plugins: [
    ...
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `images`,
        path: `${__dirname}/src/images`,
      },
    },
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `data`,
        path: `${__dirname}/src/data`,
      },
    },
    {
      resolve: `gatsby-transformer-json`,
      options: {
        typeName: ({ node }) => node.name,
      },
    },
    ...
  ],
}

We’re first adding a second instance of gatsby-source-filesystem so that Gatsby picks up our Post.json data file. We are then adding the configuration for gatsby-transformer-json, and in its options object we define typeName to be the name of the file. Essentially, this will create a GraphQL type of Post which we’ll be able to query like this:

query getPosts {
  allPost {
    edges {
      node {
        title {
          rendered
        }
      }
    }
  }
}

This will return all of our posts with the title field from our Post.json file. Now that we have our post content downloaded, we will want to download the featured images that are associated with our posts.
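
The link between the file name and the query fields can be sketched in plain JavaScript. This is only an illustration of the naming, not Gatsby’s internals: typeName is the function from our configuration, while rootFieldsFor is a hypothetical helper that mimics how a type name shows up in the query fields.

```javascript
// Same function we passed to gatsby-transformer-json:
// the File node for src/data/Post.json has the name "Post".
const typeName = ({ node }) => node.name

// Hypothetical helper, for illustration only: a type called "Post"
// is queried as `post` (one node) and `allPost` (a collection).
const rootFieldsFor = (type) => ({
  single: type.charAt(0).toLowerCase() + type.slice(1),
  collection: `all${type}`,
})

const type = typeName({ node: { name: 'Post' } })
console.log(type, rootFieldsFor(type))
```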

Downloading WordPress Featured Images

We have a field in our Post.json nodes called featured_image which is a URL to the featured image for our post. (A stock WordPress REST API response only exposes featured_media, a numeric attachment ID, so a URL field like this usually comes from the theme or a plugin; check what your own endpoint returns.) We need to download all of our images from our WordPress blog and save them to our ./src/images/ directory. We could use FTP, but here we’re going to SSH into our server, navigate to our wp-content directory and zip the uploads folder. Once we’re in the wp-content directory we can run:

zip -r uploads.zip uploads

Now that we have our uploads.zip archive we can download it to our local machine and unzip it in the ./src/images/ directory of our project folder. All of your posts’ featured images should now be available as File nodes in your GraphQL queries. If we have our development server running we can navigate to the GraphiQL interface at http://localhost:8000/___graphql and run a query to get all of our image files:

query allImage {
  allFile(filter: {sourceInstanceName: {eq: "images"}}) {
    nodes {
      name
      relativePath
    }
  }
}
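
Because the next section matches each post to an image by file name alone, it can be worth checking for posts whose image did not make it into the download. Here is a rough sketch, where nameFromUrl and findMissingImages are hypothetical helpers, posts is the parsed Post.json array, and fileNames is the list of name values returned by the query above.

```javascript
// Hypothetical helper: file name without directory or extension,
// e.g. ".../2020/04/photo.jpg" -> "photo".
const nameFromUrl = (url) => {
  const file = url.split('/').pop()
  const dot = file.lastIndexOf('.')
  return dot > 0 ? file.slice(0, dot) : file
}

// Returns the slugs of posts whose featured_image has no matching file.
// Posts without a featured_image value are skipped.
const findMissingImages = (posts, fileNames) => {
  const known = new Set(fileNames)
  return posts
    .filter((post) => post.featured_image && !known.has(nameFromUrl(post.featured_image)))
    .map((post) => post.slug)
}
```

An empty array means every post that has a featured image also has a local file with the same name.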

Associate WordPress Featured Image with Post

We need to associate all of the posts in our Post.json file with the correct image in our ./src/images/uploads directory. We can do this by creating a field definition by using the createTypes API for Gatsby and then defining a custom field resolver to populate the value of the field we created.

First we’ll use the createTypes API to add a custom field called image to our Post type. Later we will attach a custom field resolver to it to define the value of the field. In the project directory open gatsby-node.js and add:

/**
 * Implement Gatsby's Node APIs in this file.
 *
 * See: https://www.gatsbyjs.org/docs/node-apis/
 */

exports.sourceNodes = ({ actions: { createTypes } }) => {
  createTypes(`
    type Post implements Node @infer {
      image: File 
    }
  `)
}

What we’re doing here is adding a field to our GraphQL Type definition for our Post type which is being defined using the gatsby-transformer-json plugin. We are using @infer to infer the other field types in our JSON file.

@infer – run inference on the type and add fields that don’t exist on the defined type to it.

https://www.gatsbyjs.org/docs/actions/#createTypes

Next we are adding our image field and giving it the type File, which is a type already defined by the gatsby-source-filesystem plugin. With this setup we can restart our development server, navigate to our GraphiQL interface and “select” our new field in a query:

query MyQuery {
  allPost {
    nodes {
      image {
        name
      }
    }
  }
}

If we run this query now, every post node will return null for the image field since we haven’t defined its value yet. We can define the value by creating a custom field resolver with the createResolvers API. First we will create a helper function inside our createResolvers hook that takes a featured_image URL and returns its basename without the extension. In our gatsby-node.js file we can add:

const path = require('path')

exports.sourceNodes = ({ actions: { createTypes } }) => {
...
}

exports.createResolvers = ({ createResolvers }) => {
  const getName = (url) => {
    let filename = path.basename(url)
    let ext = filename.split('.').pop()
    return path.basename(filename, `.${ext}`)
  }
}

In this helper function getName we are using the Node utility path.basename to get the full filename from the URL that we’re passing in. We then split the filename on dots to get the file extension, and finally we call path.basename again, this time passing the extension, so we get the “naked” filename without the extension. That is the value our custom resolver will use to find the File that we need to associate with our post. Now we will define our custom field resolver inside the createResolvers API hook:

/**
 * Implement Gatsby's Node APIs in this file.
 *
 * See: https://www.gatsbyjs.org/docs/node-apis/
 */
const path = require('path')

exports.sourceNodes = ({ actions: { createTypes } }) => {
...
}

exports.createResolvers = ({ createResolvers }) => {
  const getName = (url) => {
    let filename = path.basename(url)
    let ext = filename.split('.').pop()
    return path.basename(filename, `.${ext}`)
  }

  createResolvers({
    Post: {
      image: {
        type: "File",
        resolve: (source, args, context) => {
          return context.nodeModel.runQuery({
            query: {
              filter: {
                name: {
                  eq: getName(source.featured_image)
                }
              }
            },
            type: "File",
            firstOnly: true,
          })
        }
      }
    }
  })
}

The important part of the resolver definition is the call to context.nodeModel.runQuery, which queries Gatsby’s node store for nodes of type File. We take source.featured_image, the featured_image value of the current Post (defined in our Post.json file downloaded from WordPress), and pass it to our helper function getName. The filter therefore ends up looking like this (note that the absolute URL and the image extension have been removed by our helper function):

filter: {
  name: {
    eq: "aws-s3-glacier-delete-vault"
  }
}
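
To make the helper’s behaviour concrete, the snippet below runs getName on the featured image URL from our example. It also shows two variations that are not part of the original code: path.parse as a shorter equivalent, and a hypothetical getRelativePath helper whose result could be matched against the File node’s relativePath field instead of name, on the assumption that the uploads folder was unzipped directly into ./src/images/.

```javascript
const path = require('path')

// The helper from gatsby-node.js, unchanged.
const getName = (url) => {
  let filename = path.basename(url)
  let ext = filename.split('.').pop()
  return path.basename(filename, `.${ext}`)
}

const url = 'https://blog.hashinteractive.com/wp-content/uploads/2020/04/aws-s3-glacier-delete-vault.jpg'

console.log(getName(url)) // aws-s3-glacier-delete-vault
console.log(path.parse(url).name) // same result via path.parse

// Hypothetical alternative: keep the year/month folders so that two
// uploads with the same file name in different months stay distinct.
const getRelativePath = (imageUrl) =>
  new URL(imageUrl).pathname.replace(/^\/wp-content\//, '')

console.log(getRelativePath(url)) // uploads/2020/04/aws-s3-glacier-delete-vault.jpg
```

Matching on name is simpler and is what this tutorial uses. Matching on the relative path is only worth the extra step if your uploads contain duplicate file names.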

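Now context.nodeModel.runQuery will resolve to the right File node and relate it to each of our Post nodes! (The code here targets the Gatsby v2 API that was current at the time of writing; on Gatsby v4 and later this call has been superseded by context.nodeModel.findOne.) We can restart the development server, open GraphiQL and run the query:

query MyQuery {
  allPost {
    nodes {
      featured_image
      image {
        publicURL
      }
    }
  }
}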

You will see that each post’s featured_image URL now has a matching image whose publicURL points to our local copy of the file.

{
  "data": {
    "allPost": {
      "nodes": [
        {
          "featured_image": "https://blog.hashinteractive.com/wp-content/uploads/2020/04/aws-s3-glacier-delete-vault.jpg",
          "image": {
            "publicURL": "/static/f1e8088dad3ab7c5dd976531d8d2cb28/aws-s3-glacier-delete-vault.jpg"
          }
        }
      ]
    }
  }
}

This is exactly what we want, as now we’ll be able to access the publicURL of our post’s image in our template. Speaking of templates, let’s use the createPages API to create a page route and define a template component for each of our posts.

Gatsby createPages API for Post Template

In Gatsby any component you put in the ./src/pages directory will automatically create a page route for you.

Gatsby core automatically turns React components in src/pages into pages

https://www.gatsbyjs.org/docs/creating-and-modifying-pages/

But what if we want to dynamically create pages for our posts that we imported from WordPress? We can use the createPages API to programmatically create pages and define the page route and template component that we want the route to use for rendering! Let’s see what that looks like.

First, we will need to hook into the createPages API by exporting a function from gatsby-node.js. From its argument we destructure two things: the graphql function for making GraphQL queries and the createPage action for programmatically creating pages. We will also make this an async function since we will be awaiting a promise from our GraphQL query:

/**
 * Implement Gatsby's Node APIs in this file.
 *
 * See: https://www.gatsbyjs.org/docs/node-apis/
 */
const path = require('path')

exports.sourceNodes = ({ actions: { createTypes } }) => {
...
}

exports.createResolvers = ({ createResolvers }) => {
...
}

exports.createPages = async ({ graphql, actions: { createPage } }) => {

}

Now we will need to make a GraphQL query to get all of the Post nodes, and we need to await the result. We then destructure the returned query data into usable variables:

/**
 * Implement Gatsby's Node APIs in this file.
 *
 * See: https://www.gatsbyjs.org/docs/node-apis/
 */
const path = require('path')

exports.sourceNodes = ({ actions: { createTypes } }) => {
...
}

exports.createResolvers = ({ createResolvers }) => {
...
}

exports.createPages = async ({ graphql, actions: { createPage } }) => {
  const { data: { allPost = {} } } = await graphql(`
    query {
      allPost {
        edges {
          node {
            id
            slug
          } 
        }
      } 
    } 
  `)
}

Next we will need to make sure that we actually have nodes within the graphql data that was returned. If we do, we will loop through all of the nodes and use the createPage function to dynamically create a post “page” with a defined URL path and component template:

/**
 * Implement Gatsby's Node APIs in this file.
 *
 * See: https://www.gatsbyjs.org/docs/node-apis/
 */
const path = require('path')

exports.sourceNodes = ({ actions: { createTypes } }) => {
...
}

exports.createResolvers = ({ createResolvers }) => {
...
}

exports.createPages = async ({ graphql, actions: { createPage } }) => {
  const { data: { allPost = {} } } = await graphql(`
    query {
      allPost {
        edges {
          node {
            id
            slug
          } 
        }
      } 
    } 
  `)
  if(Object.keys(allPost).length){
    allPost.edges.forEach(({ node: { id, slug } }) => {
      createPage({
        path: `/blog/${slug}`,
        component: require.resolve(`./src/templates/post.js`),
        context: {
          // Data passed to context is available
          // in page queries as GraphQL variables.
          id
        }
      })
    }) 
  }
}

Inside our createPage invocation we are passing in the path which will become the URL such as: http://localhost:8000/blog/aws-s3-glacier-vault-deletion which inherits the slug that was defined in our Post.json file for the respective post node. We are also defining a component to use for each post path. Finally, we are passing a context object in with the value of { id } which will become a variable available to us in our pageQuery definition that we will define within our ./src/templates/post.js template.
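
To see exactly what each createPage call receives, the body of the loop can be pulled out into a pure function. This is a sketch only: toPageArgs is a hypothetical helper, the template path is passed in as a plain string rather than through require.resolve, and the id value in the sample data is made up.

```javascript
// Hypothetical helper: turns the edges returned by the allPost query
// into the argument objects we hand to createPage.
const toPageArgs = (edges, component) =>
  edges.map(({ node: { id, slug } }) => ({
    path: `/blog/${slug}`,
    component,
    context: { id },
  }))

// Sample data shaped like the query result.
const edges = [{ node: { id: 'abc-123', slug: 'aws-s3-glacier-vault-deletion' } }]

console.log(toPageArgs(edges, './src/templates/post.js'))
```

Each object in the printed array corresponds to one page: the path is the URL, and the context is what the template’s page query receives as variables.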

The context parameter is optional, though often times it will include a unique identifier that can be used to query for associated data that will be rendered to the page. All context values are made available to a template’s GraphQL queries as arguments prefaced with $, so from our example above the slug property will become the $slug argument in our page query:

https://www.gatsbyjs.org/docs/programmatically-create-pages-from-data/

In our case the context holds id, so it becomes the $id argument in our page query. Finally, we need to create our template file post.js that all post pages will use for component rendering.

Creating Post Template

We defined the component template for our programmatically created post pages as ./src/templates/post.js, so we will need to create that file in our project directory. Inside the file we will define a React component for rendering, and we will also make a graphql query, using the context that we passed in through createPage, to fetch the content for that page. Let’s see what that template would look like:

import React from 'react'
import { graphql } from 'gatsby'

import Layout from "../components/layout"
import SEO from "../components/seo"

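export default ({ data: { post } }) => {
  const { image: { publicURL }, title: { rendered: renderedTitle }, content: { rendered: renderedContent } } = post
  return (
    <Layout>
      <SEO title={renderedTitle} />
      <h1>{ renderedTitle }</h1>
      {/* Empty alt marks the image as decorative; add real alt text if you have it */}
      <img src={publicURL} alt="" />
      {/* WordPress content already contains <p> tags, so wrap it in a <div> */}
      <div
       dangerouslySetInnerHTML={{
         __html: renderedContent
       }}>
      </div>
    </Layout>
  )
}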
export const query = graphql`
  query ($id: String!) {
    post(id: { eq: $id }) {
      id
      image {
        publicURL
      }
      title {
        rendered
      }
      content {
        rendered
      }
    }
  }
`

We are exporting a graphql query, and Gatsby automatically injects its result into our component as the data prop. We are then including the default <Layout> and <SEO> components from the starter project, and finally we’re destructuring our post data into the fields that we need for our post. Because the post content is HTML, we use the dangerouslySetInnerHTML attribute to inject it into the post template. Now we can start the development server again and navigate to a URL that does not exist, such as http://localhost:8000/not-a-url-404. In development, Gatsby’s 404 page lists all of the pages on the site, including links to all of our posts. Click one of the links to see our new template!
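
One caveat with the destructuring in the template: a post without a matching featured image has image set to null, and pulling publicURL out of null throws an error. A defensive variant could look like the sketch below, where getPostFields is a hypothetical helper and the optional chaining syntax assumes a reasonably recent Node and Babel setup.

```javascript
// Hypothetical helper: extracts the fields the template needs and
// falls back to null or empty strings instead of throwing.
const getPostFields = (post) => ({
  imageUrl: post?.image?.publicURL ?? null,
  title: post?.title?.rendered ?? '',
  html: post?.content?.rendered ?? '',
})

// A post whose image could not be resolved.
console.log(getPostFields({ image: null, title: { rendered: 'Hello' }, content: { rendered: '<p>Hi</p>' } }))
```

In the component, the image element would then only be rendered when imageUrl is not null.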

Conclusion

We’ve successfully migrated our WordPress blog to a Gatsby website! To recap what we did:

  • Created a basic Gatsby starter site.
  • Downloaded JSON data from our WordPress Post JSON endpoint.
  • Used gatsby-source-filesystem and gatsby-transformer-json to transform our Post.json file into GraphQL nodes.
  • Downloaded our WordPress images to our Gatsby project.
  • Associated the WordPress images with our GraphQL Post nodes.
  • Used the createPages API from Gatsby to create our blog pages and defined the component we want to use for our posts.
  • Created a post.js component to render our individual post pages.

Hopefully this is just the beginning and you are able to continue developing your website with Gatsby now that you have moved over from a WordPress installation! Until next time, stay curious, stay creative.