Web Scraping Using Browser Console

Learn how to use browser developer tools to inspect and extract web page elements through the JavaScript console.

Basic Techniques

1. Accessing the Console

  • Chrome/Firefox: F12, or right-click → "Inspect" → "Console" tab
  • Safari: Enable the Develop menu under Settings (Preferences in older versions) → Advanced, then press ⌥⌘C

2. Selecting Elements

// Single element (first match, or null if nothing matches)
document.querySelector('CSS_SELECTOR')

// Multiple elements (returns a static NodeList, empty if nothing matches)
document.querySelectorAll('CSS_SELECTOR')

3. Common Selector Examples

// By class
document.querySelector('.product-item')

// By ID
document.querySelector('#main-content')

// Nested selectors
document.querySelectorAll('div.list-container > ul > li')

Data Extraction Methods

1. Basic Text Extraction

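// Get visible text (innerText respects CSS and skips hidden text)
const title = document.querySelector('h1').innerText

// Get HTML content
const content = document.querySelector('.article').innerHTML

// Get an attribute (.src returns the resolved absolute URL;
// getAttribute('src') returns the value as written in the HTML)
const imageUrl = document.querySelector('img.thumbnail').src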

2. Extracting Multiple Items

// Using spread operator + map
const prices = [...document.querySelectorAll('.price')]
  .map(element => element.innerText)

// Alternative Array.from approach
const titles = Array.from(document.querySelectorAll('.item-title'))
  .map(el => el.textContent.trim())
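
Text pulled from price elements usually still carries currency symbols and thousands separators. A minimal sketch of a cleanup step follows; the parsePrice helper is hypothetical, and it assumes "." is the decimal separator and "," only groups thousands, which will not hold for every locale.

```javascript
// Convert a scraped price string such as "$1,299.00" into a number.
// Assumption: "." is the decimal separator and "," only groups thousands.
const parsePrice = (text) => {
  const cleaned = String(text).replace(/[^0-9.-]/g, '')
  const value = Number.parseFloat(cleaned)
  return Number.isNaN(value) ? null : value
}

// In the browser console, reusing the '.price' selector from above
if (typeof document !== 'undefined') {
  const numericPrices = [...document.querySelectorAll('.price')]
    .map(el => parsePrice(el.innerText))
  console.log(numericPrices)
}
```

Returning null for unparseable text keeps placeholder values such as "N/A" easy to filter out later.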

Advanced Techniques

1. Scrolling and Pagination

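// Auto-scroll to load dynamic content; stop once the page stops growing
let lastHeight = 0
let attempts = 0
const scrollInterval = setInterval(() => {
  const height = document.body.scrollHeight
  // Stop when no new content was added, or after 10 rounds as a safety cap
  if (height === lastHeight || ++attempts > 10) return clearInterval(scrollInterval)
  lastHeight = height
  window.scrollTo(0, height)
}, 1000)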
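
For click-based pagination, one possible approach is to click a "next" control repeatedly and collect items after each click. The sketch below assumes the site swaps page content in place; a full page navigation would reset the console and stop the script. The scrapeAllPages helper, its options, and the '.next-page' selector are all hypothetical.

```javascript
const sleep = (ms) => new Promise(resolve => setTimeout(resolve, ms))

// Collect text from every page by clicking a "next" control until it is
// missing or disabled. Selectors are supplied by the caller.
const scrapeAllPages = async ({
  itemSelector,
  nextSelector,
  root = document, // injectable so the function can be exercised without a page
  maxPages = 10,   // safety cap against endless loops
  delayMs = 1000   // time allowed for the next page to render
}) => {
  const results = []
  for (let page = 0; page < maxPages; page++) {
    const items = Array.from(root.querySelectorAll(itemSelector))
    results.push(...items.map(el => el.textContent.trim()))
    const next = root.querySelector(nextSelector)
    if (!next || next.disabled) break
    next.click()
    await sleep(delayMs)
  }
  return results
}

// Usage in the console ('.next-page' is a hypothetical selector)
if (typeof document !== 'undefined') {
  scrapeAllPages({ itemSelector: '.item-title', nextSelector: '.next-page' })
    .then(all => console.log(all))
}
```

A fixed delay is the simplest choice here; on slow pages it may be more reliable to wait for a specific element with waitForElement instead.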

2. Handling Dynamic Content

// Wait for element to exist; rejects if it does not appear within `timeout` ms
const waitForElement = (selector, timeout=3000) => {
  return new Promise((resolve, reject) => {
    const start = Date.now()
    const check = () => {
      const el = document.querySelector(selector)
      if (el) {
        resolve(el)
      } else if (Date.now() - start < timeout) {
        setTimeout(check, 100)
      } else {
        reject(new Error(`Timed out waiting for ${selector}`))
      }
    }
    check()
  })
}

// Usage
waitForElement('.loaded-content')
  .then(el => console.log(el))
  .catch(err => console.warn(err.message))

Data Formatting & Export

1. JSON Output

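const data = Array.from(document.querySelectorAll('.product'))
  .map(product => ({
    // ?. avoids a TypeError when a child element is missing; ?? null keeps the key in the JSON
    title: product.querySelector('.title')?.innerText ?? null,
    price: product.querySelector('.price')?.innerText ?? null,
    link: product.querySelector('a')?.href ?? null
  }))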

console.log(JSON.stringify(data, null, 2))
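
If the results are headed for a spreadsheet, CSV can be more convenient than JSON. The following is a rough sketch rather than a complete CSV writer: the toCsv helper is hypothetical, it takes its columns from the first row, and it assumes every row is a flat object. The sample rows stand in for the scraped data array.

```javascript
// Convert an array of flat objects into CSV text.
// Assumption: the keys of the first row define the columns.
const toCsv = (rows) => {
  if (rows.length === 0) return ''
  const columns = Object.keys(rows[0])
  // Quote a field if it contains a comma, quote, or line break; double any quotes
  const escapeField = (value) => {
    const text = String(value ?? '')
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text
  }
  const lines = rows.map(row => columns.map(col => escapeField(row[col])).join(','))
  return [columns.join(','), ...lines].join('\n')
}

// Sample rows standing in for scraped results
const sampleRows = [
  { title: 'Desk, oak', price: '$120', link: 'https://example.com/desk' },
  { title: 'Lamp', price: '$15', link: 'https://example.com/lamp' }
]
console.log(toCsv(sampleRows))
```

Quoting matters because scraped titles often contain commas, which would otherwise shift values into the wrong column.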

2. Copy to Clipboard

// Copy as formatted JSON (copy() is a DevTools console utility, not available to page scripts)
copy(JSON.stringify(data, null, 2))

// Table format
console.table(data)

Practical Example: E-commerce Product Scrape

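// Execute in console
const products = Array.from(document.querySelectorAll('.product-card'))
  .map(card => ({
    // ?. yields undefined instead of throwing when a card lacks the element
    name: card.querySelector('h3')?.innerText,
    price: card.querySelector('.price')?.innerText.replace('$', ''),
    sku: card.dataset.productId, // reads the data-product-id attribute
    image: card.querySelector('img')?.src,
    // Assumes the site draws stars via an inline width such as "80%"
    rating: card.querySelector('.star-rating')?.style.width
  }))

console.table(products)
copy(JSON.stringify(products, null, 2)) // Copies to clipboard as JSON (console-only utility)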
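
The rating field above comes back as a CSS width string, not a number. Assuming the site encodes the rating as an inline percentage width of a five-star bar (an assumption that must be checked per site), it could be converted as sketched below. The ratingFromWidth helper is hypothetical.

```javascript
// Convert a CSS width such as "80%" into a rating on a 0..maxStars scale.
// Assumption: the site draws stars by setting an inline percentage width.
const ratingFromWidth = (width, maxStars = 5) => {
  const match = /^(\d+(?:\.\d+)?)%$/.exec(String(width ?? '').trim())
  if (!match) return null
  // Round to one decimal place
  return Math.round((Number(match[1]) / 100) * maxStars * 10) / 10
}

console.log(ratingFromWidth('80%'))     // 4
console.log(ratingFromWidth('90%'))     // 4.5
console.log(ratingFromWidth(undefined)) // null
```

Widths given in other units, such as "64px", return null here because they cannot be mapped to a rating without knowing the full width of the bar.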

Common Errors & Solutions

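  • null results: querySelector returns null when nothing matches. Check the selector in the Elements panel, and check whether the content sits inside an iframe or shadow DOM
  • Missing dynamic content: Wait for the element to appear before extracting (see waitForElement above)
  • CORS issues: These arise only when a console script calls fetch() against another origin; data already rendered in the page can be read from the DOM directly. Otherwise, use browser extensions or proxies
  • Pagination limits: Implement scroll/click handlers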
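
When tracking down null results, it can help to count how many elements each selector matches before running a full extraction. A small diagnostic sketch follows; the countMatches helper is hypothetical, and the selectors are the ones from the practical example.

```javascript
// Report how many elements each selector matches, so typos show up as 0.
// `root` defaults to the live page but can be any object with querySelectorAll.
const countMatches = (selectors, root = document) =>
  Object.fromEntries(
    selectors.map(sel => [sel, root.querySelectorAll(sel).length])
  )

// Usage in the console
if (typeof document !== 'undefined') {
  console.table(countMatches(['.product-card', '.product-card h3', '.product-card .price']))
}
```

A count of 0 points to a wrong selector or to content that has not loaded yet. A count lower than the number of cards suggests some cards are missing that element.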