ALwrity On Page SEO Analyzer AI Tool - AJaySi/AI-Writer GitHub Wiki
On-Page SEO Analyzer
Overview
The on_page_seo_analyzer.py module analyzes the on-page SEO of a webpage. It uses libraries such as requests, streamlit, bs4, and cloudscraper to fetch, parse, and analyze the page content and report detailed SEO insights.
Features
Fetch and Parse HTML
fetch_and_parse_html(url): Fetches HTML content from the given URL using CloudScraper and parses it with BeautifulSoup.
Meta Data Extraction
extract_meta_data(soup): Extracts metadata such as title, description, robots directives, viewport, charset, and language from the parsed HTML.
Heading Analysis
analyze_headings(soup): Analyzes the headings (H1 to H6) on the webpage.
Readability Check
check_readability(text): Checks the readability score of the text using the textstat library.
Image Analysis
analyze_images(soup, url): Analyzes the images on the webpage, including their src and alt text.
Link Analysis
analyze_links(soup): Identifies broken internal and external links on the webpage.
Call-to-Action (CTA) Suggestions
suggest_ctas(soup): Suggests call-to-action phrases present on the webpage.
Canonical and Alternate URLs
extract_alternates_and_canonicals(soup): Extracts canonical URL, hreflangs, and mobile alternate links from the parsed HTML.
Schema Markup Extraction
extract_schema_markup(soup): Extracts schema markup data from the parsed HTML.
Content Data Extraction
extract_content_data(soup, url): Extracts content data such as text length, headers, and insights about images and links.
Open Graph Data
extract_open_graph(soup): Extracts Open Graph data from the parsed HTML.
Social Tags Extraction
extract_social_tags(soup): Extracts Twitter Card and Facebook Open Graph data from the parsed HTML.
Page Speed Check
check_page_speed(url): Fetches and analyzes page speed metrics using the Google PageSpeed Insights API.
Mobile Usability Check
check_mobile_usability(soup): Checks if the website is mobile-friendly based on viewport and other elements.
Alt Text Check
check_alt_text(soup): Checks if all images have alt text.
Fetch SEO Data
fetch_seo_data(url): Fetches SEO-related data from the provided URL and returns a dictionary with results.
CSV Download
download_csv(data, filename='seo_data.csv'): Downloads the SEO data as a CSV file.
Analyze On-Page SEO
analyze_onpage_seo(): Main function to analyze on-page SEO using Streamlit.
Usage
Installation
To use this module, you need to have the following Python packages installed:
requests
streamlit
beautifulsoup4
cloudscraper
pandas
plotly
tenacity
validators
readability
textstat
Pillow
You can install these packages using pip:
pip install requests streamlit beautifulsoup4 cloudscraper pandas plotly tenacity validators readability textstat Pillow
Example
import streamlit as st
from on_page_seo_analyzer import analyze_onpage_seo
if __name__ == "__main__":
    analyze_onpage_seo()
Detailed Function Descriptions
fetch_and_parse_html(url)
Fetches HTML content from the given URL using CloudScraper and parses it with BeautifulSoup.
extract_meta_data(soup)
Extracts metadata such as title, description, and robots directives from the parsed HTML.
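The module's own implementation is not shown on this page. As a rough sketch of the idea, the example below collects the same kinds of fields using the standard library's html.parser instead of BeautifulSoup, so it runs without extra packages. The names MetaCollector and extract_meta are hypothetical and are not part of on_page_seo_analyzer.py.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects <title>, <html lang>, and a few common <meta> tags."""

    WANTED = {"description", "robots", "viewport"}

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.meta["language"] = attrs["lang"]
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in self.WANTED:
                self.meta[name] = attrs.get("content") or ""
            elif "charset" in attrs:
                self.meta["charset"] = attrs["charset"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in several chunks, so append rather than assign.
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data.strip()


def extract_meta(html):
    """Hypothetical helper: returns a dict of the metadata found in html."""
    parser = MetaCollector()
    parser.feed(html)
    return parser.meta


sample = (
    '<html lang="en"><head><meta charset="utf-8">'
    "<title>Demo page</title>"
    '<meta name="description" content="A short demo.">'
    '<meta name="robots" content="index, follow">'
    "</head><body></body></html>"
)
print(extract_meta(sample))
```

Fields missing from the page are simply absent from the returned dictionary, which lets a caller report them as SEO gaps.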
analyze_headings(soup)
Analyzes the headings on the webpage.
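What the analysis reports is not specified here. One plausible reading, sketched below with the standard library's html.parser rather than BeautifulSoup, is to list every H1 to H6 heading, count them per level, and flag a missing or repeated H1. HeadingCollector and analyze_heading_structure are hypothetical names chosen for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Records each heading as a (tag, text) pair in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse whitespace so nested inline tags do not leave gaps.
            text = " ".join("".join(self._buffer).split())
            self.headings.append((tag, text))
            self._current = None


def analyze_heading_structure(html):
    """Hypothetical helper: headings, per-level counts, and simple H1 issues."""
    parser = HeadingCollector()
    parser.feed(html)
    counts = Counter(tag for tag, _ in parser.headings)
    issues = []
    if counts["h1"] == 0:
        issues.append("missing h1")
    elif counts["h1"] > 1:
        issues.append("multiple h1 tags")
    return {"headings": parser.headings, "counts": dict(counts), "issues": issues}
```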
check_readability(text)
Checks the readability score of the text.
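The Features section says this check relies on textstat, which offers scores such as textstat.flesch_reading_ease(text). Which score the module actually uses is not stated. To show what such a score measures, here is a dependency-free approximation of the Flesch Reading Ease formula. The syllable counter is a crude vowel-group heuristic, so results will differ somewhat from textstat's.

```python
import re


def count_syllables(word):
    """Rough estimate: one syllable per run of consecutive vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


easy = "The cat sat on the mat."
hard = "Comprehensive organizational methodologies necessitate interdisciplinary collaboration."
print(flesch_reading_ease(easy), flesch_reading_ease(hard))
```

Short sentences with short words score high, while long, polysyllabic sentences score low or even negative.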
analyze_images(soup, url)
Analyzes the images on the webpage, including their src and alt text.
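The url argument suggests that relative image paths are resolved against the page address, although the page does not confirm this. A minimal sketch under that assumption, using html.parser and urllib.parse.urljoin from the standard library in place of BeautifulSoup. ImageCollector and analyze_image_tags are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Records the absolute src and the alt text of every <img> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        src = attrs.get("src") or ""
        alt = (attrs.get("alt") or "").strip()
        self.images.append({
            "src": urljoin(self.base_url, src),  # resolve relative paths
            "alt": alt,
            # An empty alt is valid for decorative images, so treat this
            # flag as a prompt for review rather than a hard error.
            "has_alt": bool(alt),
        })


def analyze_image_tags(html, base_url):
    """Hypothetical helper: one dict per image found in html."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    return parser.images
```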
analyze_links(soup)
Identifies broken internal and external links on the webpage.
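How the module separates internal from external links, or decides that one is broken, is not described. One common approach is sketched below using only the standard library: compare each link's host with the page's host, then probe the link with a HEAD request. The names classify_links and is_broken, and the page_url parameter, are assumptions made for this example.

```python
import urllib.request
from urllib.parse import urljoin, urlparse


def classify_links(hrefs, page_url):
    """Splits href values into internal and external absolute URLs."""
    page_host = urlparse(page_url).netloc.lower()
    result = {"internal": [], "external": []}
    for href in hrefs:
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, tel:, javascript:, and similar
        kind = "internal" if parsed.netloc.lower() == page_host else "external"
        result[kind].append(absolute)
    return result


def is_broken(url, timeout=10):
    """Returns True if a HEAD request fails or yields a 4xx/5xx status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status >= 400
    except OSError:
        # URLError, HTTPError, and timeouts are all OSError subclasses.
        return True
```

Some servers reject HEAD requests, so a production check might retry with GET before reporting a link as broken.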
suggest_ctas(soup)
Suggests call-to-action phrases present on the webpage.
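The phrase list and matching rules used by suggest_ctas are not documented. As an illustration only, the sketch below scans page text for a small, made-up list of common CTA phrases, matching whole words case-insensitively. CTA_PHRASES and find_ctas are hypothetical.

```python
import re

# Illustrative phrase list, not the one used by the module.
CTA_PHRASES = [
    "sign up",
    "subscribe",
    "buy now",
    "learn more",
    "get started",
    "contact us",
]


def find_ctas(text):
    """Returns the CTA phrases that occur in text, in list order."""
    found = []
    for phrase in CTA_PHRASES:
        # \b keeps "subscribe" from matching inside "unsubscribe".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(phrase)
    return found


print(find_ctas("Sign up today and learn more about our plans."))
```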
extract_alternates_and_canonicals(soup)
Extracts canonical URL, hreflangs, and mobile alternate links from the parsed HTML.
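These values live in link elements in the page head. The sketch below shows one way to read them with the standard library's html.parser instead of BeautifulSoup. It assumes a mobile alternate is a rel="alternate" link carrying a media attribute, and LinkRelCollector and extract_link_relations are hypothetical names.

```python
from html.parser import HTMLParser


class LinkRelCollector(HTMLParser):
    """Collects canonical, hreflang, and mobile alternate <link> targets."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.hreflangs = {}
        self.mobile_alternate = None

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        href = attrs.get("href")
        if not href:
            return
        if "canonical" in rel:
            if self.canonical is None:  # keep the first canonical only
                self.canonical = href
        elif "alternate" in rel:
            if attrs.get("hreflang"):
                self.hreflangs[attrs["hreflang"]] = href
            elif attrs.get("media"):
                self.mobile_alternate = href


def extract_link_relations(html):
    """Hypothetical helper: returns the three kinds of link data as a dict."""
    parser = LinkRelCollector()
    parser.feed(html)
    return {
        "canonical": parser.canonical,
        "hreflangs": parser.hreflangs,
        "mobile_alternate": parser.mobile_alternate,
    }
```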
extract_schema_markup(soup)
Extracts schema markup data from the parsed HTML.
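Schema markup can be embedded as JSON-LD, microdata, or RDFa, and the page does not say which formats the module reads. The sketch below covers only the JSON-LD case, where structured data sits in script tags of type application/ld+json. It uses the standard library in place of BeautifulSoup, and JsonLdCollector and extract_json_ld are hypothetical names.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Parses the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the page


def extract_json_ld(html):
    """Hypothetical helper: list of decoded JSON-LD objects found in html."""
    parser = JsonLdCollector()
    parser.feed(html)
    return parser.blocks
```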
extract_content_data(soup, url)
Extracts content data such as text length, headers, and insights about images and links.
extract_open_graph(soup)
Extracts Open Graph data from the parsed HTML.
extract_social_tags(soup)
Extracts Twitter Card and Facebook Open Graph data from the parsed HTML.
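Open Graph tags conventionally use the property attribute with an og: prefix, while Twitter Card tags use the name attribute with a twitter: prefix. A minimal sketch of separating the two, written with html.parser rather than BeautifulSoup. SocialTagCollector and extract_social are hypothetical names, and the sketch accepts either attribute because real pages are not always consistent.

```python
from html.parser import HTMLParser


class SocialTagCollector(HTMLParser):
    """Sorts <meta> tags into Open Graph and Twitter Card dictionaries."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}
        self.twitter = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property") or attrs.get("name") or ""
        content = attrs.get("content") or ""
        if key.startswith("og:"):
            self.open_graph[key] = content
        elif key.startswith("twitter:"):
            self.twitter[key] = content


def extract_social(html):
    """Hypothetical helper: returns both tag groups in one dict."""
    parser = SocialTagCollector()
    parser.feed(html)
    return {"open_graph": parser.open_graph, "twitter": parser.twitter}
```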
check_page_speed(url)
Fetches and analyzes page speed metrics using the Google PageSpeed Insights API.
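Which metrics the module reads from the response is not stated. The sketch below shows the two halves of a typical call against version 5 of the API: building the request URL and pulling the Lighthouse performance score out of the JSON response. The helper names build_psi_url and performance_score are hypothetical, and the network request itself is left out so the example runs offline.

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_psi_url(page_url, strategy="mobile", api_key=None):
    """Builds the request URL; strategy is "mobile" or "desktop"."""
    params = {"url": page_url, "strategy": strategy}
    if api_key:
        params["key"] = api_key
    return PSI_ENDPOINT + "?" + urlencode(params)


def performance_score(psi_response):
    """Returns the Lighthouse performance score scaled to 0-100, or None."""
    score = (
        psi_response.get("lighthouseResult", {})
        .get("categories", {})
        .get("performance", {})
        .get("score")
    )
    return None if score is None else round(score * 100)


print(build_psi_url("https://example.com/"))
```

The response reports the score as a fraction between 0 and 1, which is why the helper multiplies by 100 before rounding.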
check_mobile_usability(soup)
Checks if the website is mobile-friendly based on viewport and other elements.
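The "other elements" the check considers are not listed. The viewport part can be illustrated on its own: a responsive page normally declares width=device-width in its viewport meta tag. The sketch below parses the tag's content string. parse_viewport and has_responsive_viewport are hypothetical names.

```python
def parse_viewport(content):
    """Turns "width=device-width, initial-scale=1" into a dict."""
    settings = {}
    for part in content.split(","):
        if "=" in part:
            key, value = part.split("=", 1)
            settings[key.strip().lower()] = value.strip().lower()
    return settings


def has_responsive_viewport(content):
    """True when the viewport width follows the device width."""
    return parse_viewport(content or "").get("width") == "device-width"


print(has_responsive_viewport("width=device-width, initial-scale=1"))
```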
check_alt_text(soup)
Checks if all images have alt text.
fetch_seo_data(url)
Fetches SEO-related data from the provided URL and returns a dictionary with results.
download_csv(data, filename='seo_data.csv')
Downloads the data as a CSV file.
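The shape of data and the download mechanism are not described. Assuming a flat dictionary of results, the sketch below serializes it to CSV text with the standard csv module. seo_data_to_csv is a hypothetical helper.

```python
import csv
import io


def seo_data_to_csv(data):
    """Serializes a flat dict to CSV text with a field/value header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["field", "value"])
    for key, value in data.items():
        writer.writerow([key, value])  # csv quotes values containing commas
    return buffer.getvalue()


print(seo_data_to_csv({"title": "Demo", "word_count": 120}))
```

In a Streamlit app, the resulting string could be passed to st.download_button along with a file_name such as 'seo_data.csv' and a mime type of 'text/csv'. Whether download_csv works this way is an assumption.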
analyze_onpage_seo()
Main function to analyze on-page SEO using Streamlit.
License
This project is licensed under the MIT License. See the LICENSE file for more details.
Contributing
Contributions are welcome! Please open an issue or submit a pull request to contribute to this project.