Content Audit Tool

The hypothesis is that Google gives greater visibility to websites whose textual content is relevant and informative to the end user. The audit aims to uncover SEO gaps caused by texts that can hurt organic positioning.

Description 💬

The content audit tool is a project I started while working at Pro Web Consulting to help the SEO team identify and correct text on clients' websites that could negatively affect their organic position on the SERP (search engine results page).

The tool was written in Python and followed this logic:

  • scrape the clients' assets
  • store the data in a database
  • clean the data
  • apply TF-IDF vectorization to the texts
  • apply K-Means clustering
  • find anomalies in the clusters
  • perform topic modeling with LDA
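
To make the vectorization, clustering and anomaly steps more concrete, here is a minimal sketch that uses only the Python standard library. It is not the project's actual code, which is not shown here: the sample pages, the whitespace tokenizer, the function names, the hand-picked seed pages and the 1.1 cutoff factor are all assumptions made for illustration. The IDF formula is the common smoothed variant, ln((1 + n) / (1 + df)) + 1. The LDA step is left out to keep the example short.

```python
import math
from collections import Counter, defaultdict
from statistics import median


def tfidf(docs):
    """Return one L2-normalised sparse TF-IDF vector (term -> weight) per document."""
    tokenised = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenised)
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vec = {t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def distance(a, b):
    """Euclidean distance between two sparse vectors."""
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in set(a) | set(b)))


def centroid(vectors):
    """Component-wise mean of a list of sparse vectors."""
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def kmeans(vectors, seeds, iterations=10):
    """Plain Lloyd iterations; `seeds` are the indices of the initial centroids."""
    centroids = [dict(vectors[i]) for i in seeds]
    labels = []
    for _ in range(iterations):
        labels = [min(range(len(centroids)), key=lambda c: distance(v, centroids[c]))
                  for v in vectors]
        for c in range(len(centroids)):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = centroid(members)
    return labels, centroids


def find_anomalies(vectors, labels, centroids, factor=1.1):
    """Flag pages lying more than `factor` times the median distance from their centroid."""
    dists = [distance(v, centroids[label]) for v, label in zip(vectors, labels)]
    flagged = []
    for c in set(labels):
        cutoff = factor * median(d for d, label in zip(dists, labels) if label == c)
        flagged += [i for i, (d, label) in enumerate(zip(dists, labels))
                    if label == c and d > cutoff]
    return sorted(flagged)


# Invented sample pages: two topics plus one page that is mostly legal boilerplate.
pages = [
    "trail running shoes cushioning",
    "road running shoes lightweight",
    "running shoes sizing guide",
    "espresso coffee machine frother",
    "espresso coffee machine cappuccino",
    "espresso coffee machine descaling",
    "shoes cookie policy privacy terms legal notice",
]
vectors = tfidf(pages)
labels, centroids = kmeans(vectors, seeds=[0, 3])
flagged = find_anomalies(vectors, labels, centroids)
print(labels, flagged)
```

In this toy data set the boilerplate page ends up in the shoes cluster because it shares one term with it, but it sits noticeably farther from the centroid than its neighbours, which is the kind of signal an anomaly step can surface for manual review. A real pipeline would more likely rely on a library for vectorization and clustering and would need a more careful tokenizer and cutoff.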

The client would receive a report that documented the findings and action items on how to resolve the issues.

This project served more than 15 customers through an automated pipeline.

Goal of the project 🎯

The goal is two-fold:

  • help clients optimize their content creation efforts
  • support the SEO team in identifying, clustering and optimizing texts

Team 🤼

  • Backend developer (me)
  • Data scientist (me)
  • SEO specialist * 2
  • Project manager

Features 🦓

The software offers the following features:

  • scraping capabilities with inlink / backlink following
  • data cleaning
  • creation of data visualizations for .pptx embedding
  • export of a comprehensive report for the client
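
As an illustration of the link-following part of the scraper, the sketch below collects the internal links found on a page, which is the basic building block for following inlinks during a crawl. It is an assumption-laden example rather than the tool's real code: the class and function names, the sample HTML and the example.com URLs are all hypothetical, and it covers only links within the same host, not backlinks from other sites.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag as an absolute URL without its fragment."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            absolute, _fragment = urldefrag(urljoin(self.base_url, href))
            self.links.append(absolute)


def internal_links(html, base_url):
    """Return the unique links that stay on the same host as base_url, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    unique = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in unique:
            unique.append(link)
    return unique


# Hypothetical page content, used only to demonstrate the function.
sample_html = """
<a href="/blog/seo-tips">Tips</a>
<a href="https://example.com/about#team">About</a>
<a href="https://other.example.org/">External</a>
<a href="/blog/seo-tips">Tips again</a>
"""
found = internal_links(sample_html, "https://example.com/")
print(found)
```

A crawler would typically push each returned URL onto a queue, fetch it, and repeat until no unseen internal pages remain.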

Want to know more?

Drop me a line on Twitter or LinkedIn, or contact me through the form on the homepage. You can also reach Pro Web Consulting's website at