For my first article on Search Engine Land, I'll start by quoting Ian Lurie:

"Log file analysis is a lost art. But it can save your SEO butt!"

Wise words. However, extracting the data we need from server log files is usually laborious:

- Huge log files require robust data ingestion pipelines, reliable cloud storage infrastructure, and a solid querying system
- Meticulous data modeling is also required to convert cryptic, raw log data into readable bits suitable for exploratory data analysis and visualization

In the first post of this two-part series,
I'll show you how to easily scale your analytics to larger datasets and extract meaningful SEO insights from your server logs. All of this with just a dash of Python and a dash of Google Cloud!

Here is our detailed action plan:

#1 - I'll start by giving you some background:
- What log files are and why they are important for SEO
- How to access them
- Why Python alone doesn't always cut it when it comes to server log analysis

#2 - Next we'll set things up:
- Create a Google Cloud Platform account
- Create a Google Cloud Storage bucket to store our log files
- Use the command line to convert our files to a compatible format for querying
- Transfer our files to Google Cloud Storage, manually and programmatically

#3 -
Finally, we'll get into the nitty-gritty of Python. We'll:
- Query our log files with BigQuery, in Colab!
- Create a data model that makes our raw logs more readable
- Create categorical columns that will improve our analyses later down the line
- Filter and export our results

In Part 2 of this series (available later this year), we'll discuss more advanced data modeling techniques in Python for evaluating:
- Bot crawl volume
- Crawl budget waste
- Duplicate URL crawling

I'll also show you how to aggregate and join log data to Search Console data, and create interactive visualizations with Plotly Dash!

Excited? Let's get cracking!

Required configuration