Tools Reference - JinsongRoh/pydoll-mcp GitHub Wiki

🛠️ Tools Reference

PyDoll MCP Server provides more than 75 tools for browser automation. This page is a reference for each tool, grouped by category, with example prompts.

🌐 Browser Management (8 tools)

start_browser - Start Browser

Start a Chrome or Edge browser with configurable options.

Usage:

Start a new Chrome browser
Start browser in headless mode
Start browser with ad blocking mode

Key Options:

  • browser_type: Choose chrome or edge
  • headless: Run without a visible window
  • block_ads: Block advertisements
  • stealth_mode: Reduce the chance of automation detection
  • window_size: Window size settings
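
As a rough sketch, an MCP client would typically wrap these options in a JSON-RPC `tools/call` request. The option names come from the list above; the value types (boolean flags, a width/height object for `window_size`) and the helper `build_tool_call` are assumptions for illustration, not the server's confirmed schema.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build an MCP `tools/call` JSON-RPC request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Option names are from the list above; the value shapes are assumed.
request = build_tool_call(
    "start_browser",
    {
        "browser_type": "chrome",
        "headless": True,
        "block_ads": True,
        "stealth_mode": True,
        "window_size": {"width": 1920, "height": 1080},  # assumed shape
    },
)

print(json.dumps(request, indent=2))
```

In normal use the AI client builds this request from a prompt such as "Start browser in headless mode", so the payload is mainly useful for debugging or scripted clients.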

stop_browser - Stop Browser

Safely terminate browser with resource cleanup.

Usage:

Stop the browser
Close all browser windows

new_tab - Create New Tab

Create isolated tabs with custom settings.

Usage:

Open a new tab
Open a new tab in background
Open a new tab to https://example.com

close_tab - Close Tab

Close specific tab and release resources.

Usage:

Close the current tab
Close the first tab

list_browsers - List Browsers

Display all browser instances and their status.

Usage:

Show list of open browsers

list_tabs - List Tabs

Display detailed tab information.

Usage:

Show a list of all tabs
Show tabs of current browser

set_active_tab - Change Active Tab

Switch the active tab to another open tab.

Usage:

Switch to the first tab
Activate the second tab

get_browser_status - Browser Status

Provide comprehensive browser status report.

Usage:

Check browser status
Show browser performance information

🧭 Navigation and Page Control (10 tools)

navigate_to - Navigate to Page

Intelligent URL navigation with load detection.

Usage:

Navigate to https://google.com
Go to Naver

refresh_page - Refresh Page

Intelligent page refresh with cache control.

Usage:

Refresh the page
Refresh page ignoring cache

go_back / go_forward - Browser History Navigation

Support navigation through browser history.

Usage:

Go back
Go forward
Go back 2 steps

wait_for_page_load - Wait for Page Load

Provide advanced page readiness detection.

Usage:

Wait until page load is complete
Wait until DOM content is loaded

get_current_url - Get Current URL

Return current page URL with validation.

Usage:

Tell me the current page URL

get_page_source - Extract Page Source

Extract complete HTML source.

Usage:

Get page source
Show HTML code

get_page_title - Page Title and Metadata

Extract page title and metadata.

Usage:

Tell me the page title

wait_for_network_idle - Monitor Network Activity

Monitor and wait for network activity.

Usage:

Wait until network activity is complete

set_viewport_size - Set Viewport Size

Provide viewport size settings for responsive design testing.

Usage:

Set viewport size to 1920x1080
Set viewport to mobile size

get_page_info - Page Information Analysis

Provide comprehensive page analysis.

Usage:

Analyze page information
Show all page metadata

🎯 Element Finding and Interaction (15 tools)

find_element - Find Element

Find elements by descriptive attributes such as text, placeholder, or accessibility label, as well as by CSS selector or XPath.

Usage:

Find the login button
Find the email input field
Find button with text "Search"

Supported Attributes:

  • text: Text content
  • id: Element ID
  • class_name: CSS class
  • tag_name: HTML tag
  • css_selector: CSS selector
  • xpath: XPath expression
  • placeholder: Input field placeholder
  • aria_label: Accessibility label

click_element - Click Element

Click an element while simulating human-like behavior.

Usage:

Click the login button
Double-click the search button
Right-click the menu

Click Types:

  • left: Left click (default)
  • right: Right click
  • double: Double click
  • middle: Middle click

type_text - Type Text

Provide realistic text input simulation.

Usage:

Type "Python programming" in search box
Slowly type "user@example.com" in email field
Clear password field and type "newpassword"

Options:

  • clear_first: Clear existing text
  • human_like: Human-like typing
  • typing_speed: Typing speed control

press_key - Press Key

Provide advanced keyboard input handling.

Usage:

Press Enter key
Press Ctrl+C
Press Tab key 3 times

Supported Keys:

  • Regular keys: a, b, 1, 2, etc.
  • Special keys: Enter, Tab, Escape, Space
  • Arrow keys: ArrowUp, ArrowDown, ArrowLeft, ArrowRight
  • Function keys: F1, F2, etc.
  • Combination keys: Ctrl+C, Ctrl+V, Alt+Tab
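
Combination keys are written as modifiers joined to a final key with `+`. How the server parses them is not documented here; the following is a minimal sketch of one way to split such a string, with `parse_key_combination` as a hypothetical helper and `Shift` included as an assumed modifier.

```python
# Ctrl and Alt appear in the list above; Shift is an assumption.
MODIFIERS = {"Ctrl", "Alt", "Shift"}


def parse_key_combination(combo):
    """Split a string such as "Ctrl+C" into (modifiers, key)."""
    *modifiers, key = combo.split("+")
    if not key:
        raise ValueError(f"missing key in combination: {combo!r}")
    unknown = [m for m in modifiers if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifiers: {unknown}")
    return modifiers, key


copy = parse_key_combination("Ctrl+C")
switch = parse_key_combination("Alt+Tab")
enter = parse_key_combination("Enter")
```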

get_element_text - Extract Element Text

Provide intelligent text extraction.

Usage:

Get text from title element
Extract text from all links

get_element_attribute - Extract Element Attribute Value

Extract attribute values from elements.

Usage:

Get href attribute from link
Show all attributes of image

wait_for_element - Wait for Element

Provide intelligent element waiting conditions.

Usage:

Wait until loading spinner disappears
Wait until login button appears

Wait Conditions:

  • visible: Element is visible
  • hidden: Element is hidden
  • enabled: Element is enabled
  • disabled: Element is disabled
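
Waiting for a condition usually comes down to polling until it holds or a timeout expires. The sketch below illustrates that general pattern only; `wait_until` and `FakeElement` are hypothetical stand-ins, and the server's actual waiting strategy and timeout defaults are not documented on this page.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.25):
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeElement:
    """Stand-in for a page element that becomes visible after a few polls."""

    def __init__(self, visible_after):
        self._polls = 0
        self._visible_after = visible_after

    def is_visible(self):
        self._polls += 1
        return self._polls >= self._visible_after


# "Wait until login button appears"
login_button = FakeElement(visible_after=3)
appeared = wait_until(login_button.is_visible, timeout=2.0, interval=0.01)
```

The `hidden`, `enabled`, and `disabled` conditions follow the same shape with a different predicate.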

scroll_to_element - Scroll to Element

Smooth scroll functionality with viewport management.

Usage:

Scroll to bottom button
Scroll to page center

hover_element - Hover Element

Provide natural mouse hover simulation.

Usage:

Hover over menu item
Hover on dropdown menu

select_option - Select Dropdown

Handle dropdown and select boxes.

Usage:

Select "South Korea" from country selection
Select "Korean" from language options

check_element_visibility - Check Element Visibility

Provide comprehensive visibility testing.

Usage:

Check if login button is visible
Check if error message is displayed

drag_and_drop - Drag and Drop

Support advanced drag and drop operations.

Usage:

Drag file to upload area
Drag item to new position

📸 Screenshots and Media (6 tools)

take_screenshot - Take Screenshot

Capture the full page or only the viewport, with format and quality options.

Usage:

Take screenshot of current page
Take full page screenshot
Take viewport-only screenshot

Options:

  • full_page: Capture full page
  • viewport_only: Capture viewport only
  • format: Choose PNG or JPEG
  • quality: JPEG quality settings
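
The options interact: `quality` only makes sense for JPEG output, and `full_page` and `viewport_only` describe opposite capture modes. A small sketch of how a client might check this before calling `take_screenshot` follows; the helper name, the 0 to 100 quality range, and the rule that the two capture flags are mutually exclusive are assumptions.

```python
def screenshot_arguments(full_page=False, image_format="png", quality=None):
    """Build `take_screenshot` arguments with basic consistency checks."""
    image_format = image_format.lower()
    if image_format not in {"png", "jpeg"}:
        raise ValueError(f"unsupported format: {image_format}")

    arguments = {
        "full_page": full_page,
        "viewport_only": not full_page,  # assumed to be mutually exclusive
        "format": image_format,
    }

    if quality is not None:
        if image_format != "jpeg":
            raise ValueError("quality applies to JPEG only")
        if not 0 <= quality <= 100:  # assumed range
            raise ValueError("quality must be between 0 and 100")
        arguments["quality"] = quality

    return arguments


# "Take full page screenshot" as a JPEG at reduced quality
full_page_jpeg = screenshot_arguments(full_page=True, image_format="JPEG", quality=80)
```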

take_element_screenshot - Element Screenshot

Provide precise element capture.

Usage:

Take screenshot of login form
Take screenshot of ad area

generate_pdf - Generate PDF

Provide professional PDF generation.

Usage:

Save current page as PDF
Generate PDF in A4 size

save_page_content - Save Page Content

Provide complete page archiving.

Usage:

Save entire page as HTML
Save page including resources

capture_video - Video Capture

Provide screen recording functionality.

Usage:

Start screen recording
Record screen for 10 seconds
Stop screen recording

extract_images - Extract Images

Provide image extraction and processing.

Usage:

Extract all images from page
Extract only PNG images

⚡ JavaScript and Advanced Scripting (8 tools)

execute_javascript - Execute JavaScript

Provide complete JavaScript execution environment.

Usage:

Change page title with JavaScript
Scroll to bottom of page

inject_script_library - Inject Script Library

Support external library injection.

Usage:

Inject jQuery library
Add Lodash to page

create_data_extractor - Create Data Extractor

Create and execute custom data extraction scripts.

Usage:

Extract all product names and prices from page
Extract news article titles and links

automate_form_filling - Automate Form Filling

Automate form completion with provided data.

Usage:

Automatically fill login form
Fill registration form with data

monitor_page_changes - Monitor Page Changes

Monitor DOM changes and trigger callbacks.

Usage:

Monitor page changes
Detect changes in specific element

execute_script_sequence - Execute Script Sequence

Execute script sequences with conditional logic.

Usage:

Execute multiple scripts sequentially
Execute different scripts based on conditions

create_custom_function - Create Custom Function

Create and register custom JavaScript functions.

Usage:

Create custom function
Register function for specific task

analyze_performance - Analyze Performance

Analyze page performance metrics and provide optimization suggestions.

Usage:

Analyze page performance
Measure loading time

🛡️ Security Bypass and Stealth (12 tools)

enable_stealth_mode - Enable Stealth Mode

Provide advanced anti-detection features.

Usage:

Enable stealth mode
Turn on detection prevention

bypass_cloudflare - Bypass Cloudflare

Provide automatic Turnstile solving.

Usage:

Bypass Cloudflare security
Solve Turnstile captcha

bypass_recaptcha - Bypass reCAPTCHA

Provide intelligent reCAPTCHA v3 bypass.

Usage:

Bypass reCAPTCHA
Automatically solve captcha

simulate_human_behavior - Simulate Human Behavior

Simulate realistic user patterns.

Usage:

Simulate human-like behavior
Set natural mouse movements

randomize_fingerprint - Randomize Browser Fingerprint

Randomize browser fingerprint.

Usage:

Randomize browser fingerprint
Generate new browser ID

handle_bot_challenges - Handle Bot Challenges

Provide common challenge solving.

Usage:

Handle bot challenges
Pass security checks

evade_detection - Evade Detection

Provide comprehensive evasion techniques.

Usage:

Evade detection
Bypass bot detection

monitor_protection_status - Monitor Protection Status

Provide real-time security analysis.

Usage:

Monitor security status
Analyze security systems

proxy_rotation - Proxy Rotation

Provide dynamic IP address changing.

Usage:

Rotate proxies
Change IP address

user_agent_rotation - User Agent Rotation

Randomize user agent.

Usage:

Change user agent
Disguise as different browser

header_spoofing - Header Spoofing

Manipulate request headers.

Usage:

Change HTTP headers
Spoof browser headers

timing_randomization - Timing Randomization

Apply human-like timing patterns.

Usage:

Randomize click timing
Apply natural delay times
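
One common way to make timing look less mechanical is to add random jitter around a base delay. The sketch below shows that idea only; `human_delay` and its default values are hypothetical, and the distribution the tool actually uses is not documented here.

```python
import random


def human_delay(base=0.4, jitter=0.5, rng=random):
    """Return a delay in seconds, uniform in base*(1 - jitter) .. base*(1 + jitter)."""
    return rng.uniform(base * (1 - jitter), base * (1 + jitter))


# A seeded generator keeps the sequence reproducible in tests.
rng = random.Random(42)
delays = [human_delay(rng=rng) for _ in range(5)]
```

Each delay would be passed to `time.sleep` between actions such as clicks or key presses.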

🌐 Network Control and Monitoring (10 tools)

network_monitoring - Network Monitoring

Provide comprehensive traffic analysis.

Usage:

Monitor network traffic
Track all HTTP requests

intercept_requests - Intercept Requests

Provide real-time request modification.

Usage:

Intercept API requests
Modify specific URL requests

extract_api_responses - Extract API Responses

Provide automatic API capture.

Usage:

Extract API response data
Capture JSON responses

modify_headers - Modify Headers

Provide dynamic header injection.

Usage:

Modify request headers
Add Authorization header

block_resources - Block Resources

Provide resource blocking for performance.

Usage:

Block ad resources
Block image loading

simulate_network_conditions - Simulate Network Conditions

Simulate throttling and latency.

Usage:

Simulate slow network
Set to mobile network speed

get_network_logs - Get Network Logs

Provide detailed activity reports.

Usage:

Show network logs
Check failed requests

monitor_websockets - Monitor WebSockets

Provide WebSocket connection tracking.

Usage:

Monitor WebSocket connections
Track real-time data

analyze_performance - Analyze Performance

Analyze page performance metrics.

Usage:

Analyze page loading performance
Measure network performance

cache_management - Cache Management

Control browser cache.

Usage:

Clear browser cache
Disable cache

📁 File and Data Management (8 tools)

upload_file - Upload File

Provide advanced file upload handling.

Usage:

Upload document.pdf to file selection button
Upload image to upload area

download_file - Download File

Provide controlled downloads with progress indication.

Usage:

Download linked file
Download PDF document

extract_page_data - Extract Page Data

Provide structured data extraction.

Usage:

Extract table data
Extract list items
Extract product information in structured format

export_data - Export Data

Provide multi-format data export.

Usage:

Export data as CSV
Save data in JSON format
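
To make the two formats concrete, here is a sketch of the same extracted rows written as CSV and as JSON using only the Python standard library. The sample rows and the helpers `to_csv` and `to_json` are illustrative; the column layout that `export_data` actually produces is an assumption.

```python
import csv
import io
import json


def to_csv(rows):
    """Serialize a list of dicts to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize a list of dicts to indented JSON text."""
    return json.dumps(rows, indent=2, ensure_ascii=False)


# Illustrative rows, as might come from extract_page_data.
rows = [
    {"name": "Keyboard", "price": "49.00"},
    {"name": "Mouse", "price": "19.50"},
]

csv_text = to_csv(rows)
json_text = to_json(rows)
```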

import_configuration - Import/Export Configuration

Provide configuration import/export.

Usage:

Import browser settings
Export current settings

manage_sessions - Session Management

Provide session state management.

Usage:

Save current session
Restore previous session

backup_browser_state - Backup Browser State

Provide complete state backup.

Usage:

Backup browser state
Save cookies and sessions

restore_browser_state - Restore Browser State

Provide state restoration.

Usage:

Restore browser state
Load backed up session

📋 Tool Usage Tips

Efficient Usage

1. Sequential Tasks

Start browser → 
Navigate to Google → 
Search for "Python" → 
Click first result
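
The sequence above maps onto one tool call per step. As a sketch, a scripted client could queue the calls like this; the tool names come from this page, while the argument names (`url`, `text`, `key`, `css_selector`), the selector value, and the `tool_call` helper are assumptions.

```python
from itertools import count

_request_ids = count(1)


def tool_call(name, **arguments):
    """Build one MCP `tools/call` JSON-RPC request with a fresh id."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names and values below are assumed, not a confirmed schema.
steps = [
    tool_call("start_browser", browser_type="chrome"),
    tool_call("navigate_to", url="https://google.com"),
    tool_call("type_text", text="Python"),
    tool_call("press_key", key="Enter"),
    tool_call("click_element", css_selector="#search a"),  # hypothetical selector
]

for step in steps:
    print(step["id"], step["params"]["name"])
```

Each request should be sent only after the previous one has returned, since later steps depend on the page state left by earlier ones.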

2. Conditional Execution

If login button is visible, click it
If error message appears, take screenshot

3. Data Collection

Extract all links from page
Export table data as CSV

Performance Optimization

1. Use Headless Mode

Start browser in headless mode

2. Block Resources

Load page with images and ads blocked

3. Network Monitoring

Load page while monitoring network traffic

Security and Stealth

1. Stealth Mode

Enable stealth mode and access site

2. Human Behavior Simulation

Click button with natural mouse movement

3. Captcha Bypass

Automatically bypass Cloudflare security

🔍 Advanced Usage Examples

E-commerce Automation

1. Navigate to shopping site
2. Search for products
3. Compare prices
4. Add to cart
5. Automate checkout process

Social Media Monitoring

1. Access social media platform
2. Search keywords
3. Collect post data
4. Extract data for sentiment analysis

Web Scraping

1. Enable stealth mode
2. Access target site
3. Bypass security
4. Extract data
5. Save in structured format

Test Automation

1. Execute test scenarios
2. Auto-fill forms
3. Verify results
4. Generate screenshots
5. Create test reports

Automate anything with 75+ powerful tools! 🚀