Basic Usage - JinsongRoh/pydoll-mcp GitHub Wiki
🎯 Basic Usage - Essential Commands and Examples
This guide collects basic natural-language commands and practical usage examples for PyDoll MCP Server integrated with Claude Desktop.
🚀 Before You Start
Configuration Check
Make sure PyDoll MCP Server is properly configured with Claude Desktop:
"Please check the PyDoll MCP Server status"
"Please verify if browser automation tools are available"
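If Claude reports that no browser tools are available, the server entry may be missing from the Claude Desktop configuration file. The following is a minimal sketch, not part of PyDoll MCP Server, that looks for such an entry. It assumes the usual `claude_desktop_config.json` locations on macOS and Windows and the `mcpServers` key that Claude Desktop reads. The Linux path and the idea of matching on the substring `pydoll` are assumptions, and `find_pydoll_servers` is a hypothetical helper name.

```python
import json
import os
import sys
from pathlib import Path


def default_config_path() -> Path:
    """Return the usual Claude Desktop config location for this OS."""
    if sys.platform == "darwin":
        return (Path.home() / "Library" / "Application Support"
                / "Claude" / "claude_desktop_config.json")
    if sys.platform.startswith("win"):
        return Path(os.environ.get("APPDATA", "")) / "Claude" / "claude_desktop_config.json"
    # Assumption: Linux-style fallback; adjust if your install differs.
    return Path.home() / ".config" / "Claude" / "claude_desktop_config.json"


def find_pydoll_servers(config: dict) -> list[str]:
    """List configured MCP servers whose name or settings mention 'pydoll'."""
    servers = config.get("mcpServers", {})
    return [
        name
        for name, entry in servers.items()
        if "pydoll" in json.dumps({name: entry}).lower()
    ]


if __name__ == "__main__":
    path = default_config_path()
    if not path.exists():
        print(f"No config found at {path}")
    else:
        config = json.loads(path.read_text(encoding="utf-8"))
        matches = find_pydoll_servers(config)
        print(f"PyDoll entries in {path}: {matches or 'none'}")
```

An empty result suggests the server was never registered; a non-empty result with tools still missing usually points to a startup problem, which restarting Claude Desktop may surface.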
🌐 Basic Browser Operations
1. Starting and Stopping Browser
Starting Browser
"Please start the browser"
"Please start Chrome browser in headless mode"
"Please open a new browser window"
Stopping Browser
"Please close the browser"
"Please safely terminate the current browser session"
2. Website Navigation
Basic Navigation
"Please navigate to https://example.com"
"Please open Google homepage"
"Please refresh the current page"
Page Navigation
"Please click the back button"
"Please click the forward button"
"Please tell me the current URL"
Page Information
"Please tell me the page title"
"Please wait until the page is fully loaded"
"Please get the source code of the current page"
🔍 Element Finding and Interaction
1. Finding Elements
Basic Element Finding
"Please find the search box"
"Please find the login button"
"Please find the link with title 'Sign Up'"
Advanced Element Finding
"Please find the input field with class 'search-input'"
"Please find the button with id 'submit-btn'"
"Please find all buttons with text 'Next'"
2. Click Actions
Basic Clicking
"Please click the login button"
"Please click the first link"
"Please click the menu icon"
Advanced Clicking
"Please double-click the search button"
"Please right-click the settings menu"
"Please hover over the link"
3. Text Input
Basic Text Input
"Please type 'Python automation' in the search field"
"Please enter '[email protected]' in the username field"
"Please type text in the password field"
Advanced Text Input
"Please clear the input field first and then type new text"
"Please type slowly like a human"
"Please press Enter after typing"
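How PyDoll MCP Server implements human-like typing is not described in this guide. As a rough illustration of the idea behind "type slowly like a human", the sketch below assigns each character its own randomized pause instead of sending the whole string at once. The function name and the delay bounds are hypothetical.

```python
import random


def keystroke_delays(text, min_delay=0.05, max_delay=0.25, seed=None):
    """Pair each character with a random pause (in seconds) to apply before typing it."""
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    return [(ch, rng.uniform(min_delay, max_delay)) for ch in text]


if __name__ == "__main__":
    for ch, pause in keystroke_delays("Python automation", seed=42):
        print(f"{ch!r}: wait {pause:.3f}s")
```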
📝 Form Automation
1. Login Form Completion
"Please fill out the login form:
- Username: [email protected]
- Password: secure123"
"Please enter information in the login form and submit"
2. Registration Form Completion
"Please fill out the registration form with the following information:
- Name: John Doe
- Email: [email protected]
- Password: password123
- Confirm Password: password123"
3. File Upload
"Please find the file selection button and upload 'document.pdf'"
"Please attach a file to the image upload field"
4. Dropdown Selection
"Please select 'United States' from the country dropdown"
"Please select '2024' from the year dropdown"
📸 Screenshots and Media
1. Taking Screenshots
"Please take a screenshot of the current page"
"Please capture a full page screenshot"
"Please take a screenshot of only the login form area"
2. PDF Generation
"Please save the current page as PDF"
"Please generate a PDF in A4 size"
3. Page Saving
"Please save the entire content of the current page"
"Please save the page as an HTML file"
🛡️ Advanced Features
1. Captcha Bypass
"Please bypass Cloudflare protection and access the site"
"Please automatically solve captcha and proceed"
"Please navigate the site while avoiding bot detection"
2. Stealth Mode
"Please enable stealth mode and access the site"
"Please navigate the site behaving like a human"
"Please collect data while avoiding detection"
3. Network Monitoring
"Please monitor all network requests for this site"
"Please capture API calls"
"Please analyze network activity"
🔧 Tab Management
1. Tab Operations
"Please open a new tab"
"Please switch to the second tab"
"Please close the current tab"
"Please list all open tabs"
2. Multi-tab Work
"Please open Google in the first tab and Naver in the second tab"
"Please perform searches simultaneously in each tab"
⌨️ Keyboard and Mouse Operations
1. Keyboard Input
"Please press the Enter key"
"Please press Ctrl+C to copy"
"Please press F5 to refresh"
"Please press Tab to move to the next field"
2. Mouse Operations
"Please scroll down the page"
"Please scroll to a specific element"
"Please drag and drop the element"
🎯 Practical Usage Examples
Example 1: Online Shopping
"Please access the shopping site and:
1. Search for 'laptop' in the search box
2. Click on the first product
3. Capture the product information
4. Add to cart"
Example 2: News Site Monitoring
"On the news site:
1. Take a screenshot of the main page
2. Navigate to the 'Technology' category
3. Collect the latest article titles
4. Get the full text of the first article"
Example 3: Social Media Automation
"On the social media site:
1. Please log in
2. Create a new post
3. Attach an image
4. Publish the post"
Example 4: Data Collection
"On the real estate site:
1. Search for apartments in Gangnam-gu, Seoul
2. Collect data from the first page of search results
3. Extract price and area information for each listing
4. Organize and display the results"
🔄 JavaScript Execution
1. Basic JavaScript Execution
"Please execute this JavaScript: alert('Hello World!')"
"Please count the number of all links on the page using JavaScript"
"Please change the current page title using JavaScript"
2. Advanced JavaScript Manipulation
"Please hide all images on the page"
"Please change the style of a specific element"
"Please add a new element to the page"
🕐 Waiting and Timing
1. Element Waiting
"Please wait until the loading icon disappears"
"Please wait until a specific element appears"
"Please wait until page loading is complete"
2. Time-based Waiting
"Please wait for 3 seconds"
"Please wait until network activity stops"
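Condition-based waits usually finish sooner than fixed pauses because they stop as soon as the condition holds, and a timeout keeps them from hanging forever. A minimal sketch of that polling pattern follows. `wait_until` is a hypothetical helper written for illustration, not a PyDoll MCP function, and the condition would be whatever check the automation layer provides (element visible, loading icon gone, and so on).

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 10.0,
               interval: float = 0.25) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A fixed wait such as "wait for 3 seconds" corresponds to a plain `time.sleep(3)`; it is simpler but always costs the full delay.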
🔍 Text and Data Extraction
1. Text Extraction
"Please extract all text from the page"
"Please get the text content of a specific element"
"Please extract data from the table"
2. Attribute Extraction
"Please extract URLs of all links"
"Please get the src attribute of images"
"Please check the name attribute of form elements"
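To make the requests above concrete, here is a small sketch of what attribute extraction amounts to when applied to page source (for example, the result of "get the source code of the current page"). It uses only Python's standard `html.parser` module and is not how PyDoll MCP Server performs extraction internally; `AttributeCollector` and `collect_attributes` are illustrative names.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collect link URLs, image sources, and form field names from HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []
        self.images: list[str] = []
        self.field_names: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])
        elif tag in {"input", "select", "textarea"} and attributes.get("name"):
            self.field_names.append(attributes["name"])


def collect_attributes(html: str) -> AttributeCollector:
    """Parse `html` and return the collector holding the extracted attributes."""
    collector = AttributeCollector()
    collector.feed(html)
    collector.close()
    return collector
```

Elements without the relevant attribute (an `<a>` with no `href`, an unnamed submit button) are skipped rather than reported as empty strings.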
💡 Practical Tips
1. Efficient Command Usage
- Provide specific and clear instructions
- Break down tasks into steps
- State the expected result along with each request
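These three tips can be applied mechanically. As a sketch, the hypothetical helper below formats a context, a list of steps, and an optional expected result into the numbered-prompt shape used in the practical examples of this guide. The "Expected result:" label is an illustrative choice, not a keyword the server requires.

```python
def build_step_prompt(context: str, steps: list[str], expected: str = "") -> str:
    """Format a multi-step request as a context line followed by numbered steps."""
    lines = [f"{context}:"]
    lines += [f"{number}. {step}" for number, step in enumerate(steps, start=1)]
    if expected:
        lines.append(f"Expected result: {expected}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_step_prompt(
        "On the news site",
        ["Take a screenshot of the main page",
         "Navigate to the 'Technology' category"],
        expected="a screenshot and the category page URL",
    ))
```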
2. Error Handling
"If an error occurs, please take a screenshot and explain the situation"
"If an element cannot be found, please try alternative methods"
3. Performance Optimization
"Please run the browser in headless mode"
"Please disable image loading and navigate quickly"
"Please perform only necessary tasks and ignore unnecessary elements"
🎉 Getting Started
You can now leverage the powerful browser automation features of PyDoll MCP Server in Claude Desktop!
Start with basic commands and gradually expand to more complex automation tasks. Every command can be expressed in natural language; Claude converts it into the appropriate browser automation tasks and executes them.
If you have questions about advanced features or specific use cases, feel free to ask anytime!