AngularJS, dotCMS and SEO integration

Before launching your website, you need to ensure that your content will be indexed by search engine crawlers (bots). If you are using JavaScript to render content, this task is not as simple as it might seem. JavaScript rendering requires a browser session to load the actual content of your site, yet search engine crawlers look directly at the source code when they visit a page to index it. For them to see content rather than JavaScript functions, you need to take server-side snapshots of your rendered pages. This requires running a dedicated service, either hosted on your own server or provided by a paid third-party service. These rendering services must be configured to work with your JavaScript pages using a server-side render list, and the result must then be checked to confirm that your content is presented correctly and is available for indexing.
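To make the snapshot idea more concrete, here is a minimal sketch of how a server could decide whether a request should receive a pre-rendered snapshot instead of the JavaScript application. The function name, the list of bot names, and the check for the `_escaped_fragment_` query parameter (used by the older AJAX crawling scheme) are illustrative assumptions, not the configuration we ran in production.

```javascript
// Illustrative, incomplete list of crawler user-agent fragments (assumption).
const BOT_PATTERN = /googlebot|bingbot|yandex|baiduspider|duckduckbot|facebookexternalhit|twitterbot|linkedinbot/i;

// Returns true when the request looks like it comes from a crawler and
// should therefore be answered with a server-side snapshot.
function shouldServeSnapshot(userAgent, url) {
  const isBot = BOT_PATTERN.test(userAgent || '');
  // Older crawlers signalled AJAX crawling with this query parameter.
  const hasEscapedFragment = /[?&]_escaped_fragment_=/.test(url || '');
  return isBot || hasEscapedFragment;
}
```

A rendering service or middleware would call a check like this on each request and fetch the stored snapshot only when it returns true; regular visitors keep getting the normal application.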

When we first adopted AngularJS as a framework for a project a few years back, we were excited about all the efficiencies such a framework promised, but were surprised by the difficulties it would introduce in terms of SEO and indexing. We use dotCMS as our content management system of choice and wanted to connect an AngularJS front end to it in a search-engine-friendly manner. Here we share the lessons we learned (the hard way) about how best to accomplish this, given the limitations of JavaScript content mentioned above.

dotCMS and AngularJS

To successfully index a page built on the dotCMS Java framework with an AngularJS JavaScript front end, the following adjustments to the application were required.

We had to update elements of the website navigation to keep proper deep linking across the site, but we wanted to avoid hashbangs in URLs in favor of pretty URLs.

In the AngularJS application, we turned on HTML5 mode:

$locationProvider.html5Mode(true)


And in the HTML template source, we added:

<base href="/">
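For context, the HTML5 mode call above normally lives inside a module configuration function. The sketch below shows one way to wire it up; the module name `app` and the function name `configureLocation` are hypothetical, and the guard around `angular` is only there so the file can be loaded without AngularJS present.

```javascript
// Config function kept separate so it can be registered and checked on its own.
function configureLocation($locationProvider) {
  // Use History API URLs (/about) instead of hash-based ones (/#/about or /#!/about).
  $locationProvider.html5Mode(true);
}

// Explicit injection annotation so the config survives minification.
configureLocation.$inject = ['$locationProvider'];

// 'app' is a hypothetical module name; it must already be defined elsewhere.
if (typeof angular !== 'undefined') {
  angular.module('app').config(configureLocation);
}
```

With HTML5 mode on, AngularJS resolves relative links against the `<base href="/">` tag, which is why both changes go together.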


To ensure proper linking, we had to exclude the native dotCMS Java application links (which use hashbangs by default), but this single-page application navigation process violated strict Java application rules and required a creative workaround.

Since we developed our project as a separate dotCMS plugin, we could set any rewrite rule in a valid manner by adding an entry to the urlrewrite-ext.xml file. Once we deployed the plugin, the new rule was present:

<rule match-type="regex">
    <condition name="host">^genb.com</condition>
    <condition type="request-uri" operator="notequal">^/(home/index.html|support|app|admin|bower_components|c/(.*)|api|(.*).js|(.*).css|(.*).jpg|(.*).gif|(.*).ico|(.*).png|(.*).woff|(.*).ttf)</condition>
    <from>^([a-z\-\/]{1,})$</from>
    <to type="proxy" last="true">https://www.genb.com/home/index.html</to>
</rule>

Rewrite rule in the urlrewrite-ext.xml file.
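To reason about which requests the rule above hands to the AngularJS entry point, it can help to mirror its three patterns in a small script. This is only a rough approximation: the regular expressions are copied from the rule, but the function name is hypothetical, and the assumption that the rewrite filter matches the exclusion pattern as a prefix search (rather than against the whole path) is ours and is not confirmed here.

```javascript
// Patterns copied from the rewrite rule; the matching semantics are an assumption.
const HOST_PATTERN = /^genb.com/;
const EXCLUDED_PATTERN = /^\/(home\/index.html|support|app|admin|bower_components|c\/(.*)|api|(.*).js|(.*).css|(.*).jpg|(.*).gif|(.*).ico|(.*).png|(.*).woff|(.*).ttf)/;
const FROM_PATTERN = /^([a-z\-\/]{1,})$/;

// Returns true when a request would be proxied to /home/index.html,
// i.e. handled by the AngularJS router rather than by dotCMS itself.
function isProxiedToApp(host, path) {
  return HOST_PATTERN.test(host)
    && !EXCLUDED_PATTERN.test(path)   // the "notequal" condition
    && FROM_PATTERN.test(path);       // the <from> pattern
}
```

Under these assumptions, a pretty URL such as /about is proxied to the application, while /api/content, /support and /main.js are left to dotCMS. Note that the prefix search also excludes any path that merely starts with one of the listed names.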

In the rule body, we proxied all links not related to the dotCMS life cycle to our website's entry page. This way, the rendering of each section and its assets is handled by the AngularJS application. We then added a sitemap.xml generator to the site so that we could submit the sitemap for indexing in the Google Search Console. Finally, we needed custom HTML titles and descriptions on our pages, so we built a custom AngularJS component that updates the HTML head whenever the page section changes as a user (or search engine bot) navigates through the site.
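The sitemap generator and the head-updating component can both be driven by a single route table. The following is a minimal sketch of that idea, not our actual implementation: the route table, its titles and descriptions, and the function names are all hypothetical placeholders.

```javascript
// Hypothetical route table; paths, titles and descriptions are placeholders.
const SITE_ROUTES = [
  { path: '/', title: 'Home', description: 'Welcome page.' },
  { path: '/services', title: 'Services', description: 'What we offer.' },
];

// Escape the characters that are not allowed verbatim in XML text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a sitemap.xml document with one <url> entry per route.
function buildSitemap(baseUrl, routes) {
  const urls = routes
    .map((route) => `  <url><loc>${escapeXml(baseUrl + route.path)}</loc></url>`)
    .join('\n');
  return '<?xml version="1.0" encoding="UTF-8"?>\n'
    + '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    + urls + '\n</urlset>\n';
}

// Look up the title and description for the current path,
// falling back to a default when the path is unknown.
function metaForPath(path, routes, fallback) {
  const match = routes.find((route) => route.path === path);
  return match ? { title: match.title, description: match.description } : fallback;
}
```

In the AngularJS application, a component or run block could call `metaForPath` on each route change (for example on `$routeChangeSuccess`, if ngRoute is the router in use) and write the result to `document.title` and the description meta tag. Keeping the sitemap and the head metadata on the same table means a new section only has to be registered once.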

Conclusions

We successfully got our pages indexed, but it took a lot more time than we initially assumed. While many pages were indexed right away, some were indexed only partially, and some were not indexed at all. Since then, many things have improved in the way search engines index dynamic sites, but the time required for a JavaScript page to be indexed is still significantly longer than for a pure HTML site. It's therefore worth keeping in mind that if you don't really have to build your application with a JavaScript framework, you should think twice before adopting one.

It is worth mentioning that there are some new solutions on the market designed to help resolve such problems, namely Angular Universal (https://universal.angular.io/). It is based on a Node.js server and the Angular 2+ framework, and it provides an intuitive way to render pages on the server side, which should resolve a lot of these headaches in most scenarios.

If you still find yourself struggling with SEO issues related to JavaScript frameworks, get in touch with us at us_office@genb.com or +1 (202) 657-4362. We are happy to help you ;)
