With about 40.58% of developers choosing it for building applications, React stands out as one of the most widely used technologies in modern web development. However, if you build websites with it, SEO deserves careful attention: out of the box, React often poses challenges for search engines. With the right optimizations, though, it can become an SEO-friendly option. In this blog, we will look at how SEO-friendly React websites really are and how to make a React website SEO-friendly. So, let’s dive into the details:

What is React?

At its core, ReactJS is a well-known open-source JavaScript library for building both mobile and web applications. Its distinguishing characteristics are its component-based architecture, its declarative style, and its efficient handling of DOM (Document Object Model) updates through a virtual DOM. These features are what make React so popular and useful in contemporary web development.

Why Choose React for Your Web Development?

If you’re considering React JS for your web development project, here are some compelling reasons to choose it:

  • A component-based architecture that encourages reusable, maintainable UI code.
  • A virtual DOM that keeps UI updates fast and efficient.
  • A large ecosystem of libraries, tooling, and community support.
  • Backing from Meta and proven use in many large production applications.
  • A natural path to mobile development through React Native.

How Google Crawls and Indexes Webpages

Since Google receives over 90% of all online searches, it’s worthwhile to understand its crawling and indexing process.

Here’s a simplified overview of the process from Google’s documentation:

Google Indexing Steps:

  1. Googlebot maintains a crawl queue containing all the URLs it needs to crawl and index.
  2. When the crawler is idle, it picks up the next URL, makes a request, and fetches the HTML.
  3. After parsing the HTML, Googlebot determines if it needs to fetch and execute JavaScript to render the content. If yes, the URL is added to a render queue.
  4. The renderer fetches and executes JavaScript to render the page and sends the rendered HTML back to the processing unit.
  5. The processing unit extracts all the URLs from the <a> tags on the webpage and adds them to the crawl queue.
  6. The content is added to Google’s index.
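
To make the two queues concrete, here is a rough mental model of that pipeline in JavaScript. This is illustrative pseudocode only, not Google’s actual implementation; parseHtml, renderWithJs, and index are hypothetical helpers.

```js
// Illustrative model of Googlebot's crawl and render queues.
// parseHtml, renderWithJs, and index are hypothetical helpers.
async function crawl(seedUrl) {
  const crawlQueue = [seedUrl];
  const renderQueue = [];

  while (crawlQueue.length > 0) {
    const url = crawlQueue.shift();
    const html = await (await fetch(url)).text(); // fetch raw HTML
    const { links, needsJs } = parseHtml(html);   // processing stage
    crawlQueue.push(...links);                    // <a> hrefs join the crawl queue
    if (needsJs) renderQueue.push(url);           // JS-heavy pages wait here
    index(html);                                  // whatever is in the HTML is indexed now
  }

  // Later, when rendering resources are free, queued pages are rendered
  // and their JavaScript-generated content is indexed too.
  for (const url of renderQueue) {
    index(await renderWithJs(url));
  }
}
```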

Googlebot distinguishes between the Processing stage, which parses HTML, and the Renderer stage, which executes JavaScript. It separates the two because executing JavaScript is expensive, given that Googlebot needs to look at more than 130 trillion webpages. So when Googlebot crawls a webpage, it parses the HTML immediately and queues the JavaScript to run later. Google’s documentation says a page typically stays in the render queue for a few seconds, though it may be longer.

It’s also worth mentioning the concept of crawl budget. Google’s crawling is limited by bandwidth, time, and the availability of Googlebot instances, so it allocates a specific budget of resources to crawling and indexing each website. If you are building a large, content-heavy website with thousands of pages (e.g., an e-commerce website) and those pages rely on a lot of JavaScript to render their content, Google may not be able to read as much content from your website.

You can read Google’s guidelines for managing your crawl budget in its Search Central documentation.

Why React SEO Remains Challenging

This overview of Googlebot’s crawling and indexing only scratches the surface, but it is enough to reveal the potential issues search engines face when trying to crawl and index React pages.

Here’s a closer look at the challenges and how developers can address them:

Empty First-pass Content:

React applications rely heavily on JavaScript and often face issues with search engines. React employs an app shell model by default, where the initial HTML does not contain meaningful content. A user or bot must execute JavaScript to see the page’s actual content. Googlebot detects an empty page during the first pass, and the content is seen only when the page is rendered. For websites with thousands of pages, this delays the indexing of content.
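
To see why the first pass looks empty, consider a typical client-side-rendered entry point. This is a minimal sketch; the App component and file names are placeholders.

```jsx
// index.js — a typical client-side-rendered React entry point (React 18 API).
// Until this script downloads and runs, the HTML a crawler fetches is just
// the app shell: <body><div id="root"></div></body>
import { createRoot } from 'react-dom/client';
import App from './App';

createRoot(document.getElementById('root')).render(<App />);
```

Everything a crawler would want to index is produced only after this JavaScript executes.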

Load Time and User Experience:

Fetching, parsing, and executing JavaScript takes time, and the JavaScript may also need to make network calls to fetch content, so users wait before they can view the requested information. Google uses its Core Web Vitals, which measure user experience, as part of its ranking criteria, so longer load times hurt the user experience score and can lead to lower rankings.
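
One common mitigation is code splitting, so the initial bundle stays small and heavy components load on demand. Here is a minimal sketch with React’s built-in lazy and Suspense; the Comments component and its path are hypothetical.

```jsx
import { lazy, Suspense } from 'react';

// The Comments bundle is fetched only when this subtree actually renders.
const Comments = lazy(() => import('./Comments'));

export default function Article() {
  return (
    <article>
      <h1>Article title</h1>
      <Suspense fallback={<p>Loading comments…</p>}>
        <Comments />
      </Suspense>
    </article>
  );
}
```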

Page Metadata:

Meta tags allow Google and social media websites to show appropriate titles, thumbnails, and descriptions for pages. These websites rely on the <head> tag of the fetched webpage for this information; they do not execute JavaScript for the target page. React renders all content, including meta tags, on the client. Since the app shell is the same for the entire website/application, adapting the metadata for individual pages can be challenging.
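
One common approach is to manage the <head> per page with a third-party library such as react-helmet. A minimal sketch (the product fields are placeholders):

```jsx
import { Helmet } from 'react-helmet';

export default function ProductPage({ product }) {
  return (
    <>
      {/* Helmet rewrites the document head for this page */}
      <Helmet>
        <title>{product.name} | My Store</title>
        <meta name="description" content={product.summary} />
      </Helmet>
      <h1>{product.name}</h1>
    </>
  );
}
```

Keep in mind that Helmet updates the head on the client, so crawlers and social bots that don’t execute JavaScript will only see these tags if the page is server-rendered or pre-rendered.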

Non-React SEO Considerations

These considerations relate to setting up good SEO practices in general:

  • Have an optimal URL structure to give humans and search engines a sense of what to expect on the page.
  • Optimize the robots.txt file to help search bots understand how to crawl pages on your website.
  • Use a CDN to serve all static assets like CSS, JS, and fonts, and use responsive images to reduce load times, as sketched below.
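
As an example of the last point, here is a sketch of a responsive image served from a CDN in JSX; the CDN host and image paths are hypothetical.

```jsx
export default function Hero() {
  // srcSet lets the browser pick the smallest adequate file;
  // loading="lazy" defers offscreen images.
  return (
    <img
      src="https://cdn.example.com/hero-800.jpg"
      srcSet="https://cdn.example.com/hero-400.jpg 400w,
              https://cdn.example.com/hero-800.jpg 800w,
              https://cdn.example.com/hero-1600.jpg 1600w"
      sizes="(max-width: 600px) 400px, 800px"
      alt="Hero banner"
      loading="lazy"
    />
  );
}
```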

Many of these problems can be addressed by using server-side rendering (SSR) or pre-rendering. Let’s review these approaches below.

Enter Isomorphic React

The dictionary definition of isomorphic is “corresponding or similar in form.” In React terms, this means that the server has a similar form to the client. In other words, you can reuse the same React components on the server and client.

This isomorphic approach enables the server to render the React app and send the rendered version to users and search engines, so they can view the content instantly while JavaScript loads and executes in the background. Frameworks like Next.js and Gatsby have popularized this approach. Note that isomorphic components can look substantially different from conventional React components: they can contain code that runs on the server instead of the client, and even API secrets (server-only code is stripped out before being sent to the client).

These frameworks abstract away much of the complexity but introduce an opinionated way of writing code. We will explore the performance trade-offs in another section. We will also do a matrix analysis to understand the relationship between render paths and website performance. But first, let’s review some basics of measuring website performance.

Metrics for Website Performance

Here are some factors that search engines use to rank websites:

  • Users should be able to access content without too much waiting time.
  • It should become interactive early, responding to a user’s actions.
  • It should load quickly.
  • It should not fetch unnecessary data or execute expensive code to prevent draining a user’s data or battery.

These features map roughly to the following metrics:

Time to First Byte (TTFB): The time between requesting a page (for example, by clicking a link) and the first byte of the response arriving.

Largest Contentful Paint (LCP): The time at which the largest content element (typically the requested article) becomes visible. Google recommends keeping this under 2.5 seconds.

Time to Interactive (TTI): The time at which a page becomes interactive (users can scroll, click, etc.).

Bundle Size: The total number of bytes downloaded and code executed before the page becomes fully visible and interactive.
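
Several of these metrics can be measured in the field with the open-source web-vitals library. This sketch assumes its v3+ callback API; note that TTI is not part of this library.

```js
import { onLCP, onTTFB } from 'web-vitals';

// Each callback fires once the metric is known; values are in milliseconds.
onLCP((metric) => console.log('LCP:', metric.value));
onTTFB((metric) => console.log('TTFB:', metric.value));
```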

We will revisit these metrics to better understand how various rendering paths may affect each one.

Solving the Problem

There are a couple of ways to make a React app SEO-friendly: by creating an isomorphic React app or by using prerendering.

Isomorphic React Apps:

An isomorphic JavaScript application can run on both the client side and the server side. With isomorphic JavaScript, you can run the React app on the server and capture the HTML that would normally be rendered by the browser. This HTML can then be served to everyone who requests the site (including Googlebot). On the client side, the app uses this HTML as a base and continues operating on it in the browser as if it had been rendered there. When needed, additional data is added using JavaScript, as an isomorphic app is still dynamic.

An isomorphic app handles both cases, whether or not the client can run scripts. When JavaScript is off, the code is rendered on the server, so a browser or bot gets all meta tags and content in the HTML and CSS. When JavaScript is on, only the first page is rendered on the server, so the browser gets the HTML, CSS, and JavaScript files; then JavaScript starts running, and the rest of the content is loaded dynamically. Thanks to this, the first screen is displayed faster, the app is compatible with older browsers, and user interactions are smoother than on websites rendered entirely on the client.
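
Stripped to its essentials, the flow looks like this. This is a minimal sketch assuming an Express server and a build step that compiles JSX; the App component and file names are placeholders.

```jsx
// server.js — render the same React tree to HTML on the server
import express from 'express';
import { renderToString } from 'react-dom/server';
import App from './App';

const server = express();

server.get('*', (req, res) => {
  const html = renderToString(<App />); // full markup, crawlable as-is
  res.send(`<!DOCTYPE html>
    <html>
      <head><title>My App</title></head>
      <body>
        <div id="root">${html}</div>
        <script src="/client.js"></script>
      </body>
    </html>`);
});

server.listen(3000);
```

```jsx
// client.js — the browser attaches event handlers to the existing markup
// instead of re-rendering it from scratch ("hydration", React 18 API)
import { hydrateRoot } from 'react-dom/client';
import App from './App';

hydrateRoot(document.getElementById('root'), <App />);
```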

Building an isomorphic app can be time-consuming. Luckily, there are frameworks that facilitate this process. The two most popular solutions for SEO are Next.js and Gatsby.

Next.js: A framework that helps you create React apps generated on the server side quickly and without hassle. It allows for automatic code splitting and hot code reloading. Next.js can do full-fledged server-side rendering, meaning HTML is generated for each request right when the request is made.
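
Here is a minimal sketch of a server-rendered Next.js page using the pages router; the route and API URL are hypothetical.

```jsx
// pages/posts/[id].js
export async function getServerSideProps({ params }) {
  // Runs on the server for every request, before any HTML is sent.
  const res = await fetch(`https://api.example.com/posts/${params.id}`);
  const post = await res.json();
  return { props: { post } };
}

export default function Post({ post }) {
  return (
    <article>
      <h1>{post.title}</h1>
      <p>{post.body}</p>
    </article>
  );
}
```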

Gatsby: A free, open-source framework that allows developers to build fast and powerful websites. Gatsby doesn’t offer full-fledged server-side rendering. Instead, it generates a static website ahead of time and stores the generated HTML files in the cloud or on a hosting service. Let’s take a closer look at their approaches.

Gatsby vs Next.js:

The SEO challenge for GatsbyJS is solved by generating static websites. All HTML pages are generated in advance, during the development or build phase, and are then simply loaded to the browser. They contain static data that can be hosted on any hosting service or in the cloud. Such websites are very fast since they aren’t generated at runtime and don’t wait for data from the database or API.
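
Here is a minimal sketch of a Gatsby page whose data is resolved at build time via Gatsby’s GraphQL layer; the queried fields are standard Gatsby site metadata.

```jsx
// src/pages/index.js
import * as React from 'react';
import { graphql } from 'gatsby';

export default function IndexPage({ data }) {
  return <h1>{data.site.siteMetadata.title}</h1>;
}

// Executed once at build time; the result is baked into static HTML.
export const query = graphql`
  query {
    site {
      siteMetadata {
        title
      }
    }
  }
`;
```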

But data is only fetched during the build phase. So if your web app has any new content, it won’t be shown until another build is run. This approach works for apps that don’t update data too frequently, such as blogs. But if you want to build a web app that loads hundreds of comments and posts (like forums or social networks), static generation is a poor fit, and full server-side rendering with a framework like Next.js is the better choice.