Extracting lazy loaded content using Selenium?

I'm using Selenium to

  • navigate to a URL
  • scroll down (so lazy-loaded images load)

When a human watches the automated browser, the resulting HTML does contain the lazy-loaded content, but when the browser is not on screen, the lazy-loaded content isn't present in the HTML.

Notes

  • If I watch the browser while the automation is running (i.e. Chrome is visible on my laptop screen), the resulting HTML does contain the lazy-loaded content.
  • If the browser window is not visible on screen (e.g. I'm looking at something else, like the code executing in the terminal), the lazy-loaded content is not contained in the HTML!

Question

When a human 'looks' at the browser, the lazy-loaded data is captured in the extracted HTML, but when the same script runs with the browser off screen, the lazy-loaded data is missing. How come?

R Code

library(RSelenium)
library(rvest)   # provides read_html()
library(dplyr)

# Start a Selenium session (adjust browser/port as needed)
rsD   <- rsDriver(browser = "chrome", port = 4444L)
remDr <- rsD$client

url <- "https://example.com"  # placeholder; replace with the target page
remDr$navigate(url)

webElem <- remDr$findElement("css", "body")

# Scroll down so lazy-loaded images enter the viewport
for (i in 1:50) {
  webElem$sendKeysToElement(list(key = "down_arrow"))
  Sys.sleep(0.02)
}
webElem$clickElement()  # click on the page
Sys.sleep(0.02)

# Repeats above scroll/click 4 more times to get to very bottom of page

# Get HTML
html <- remDr$getPageSource()[[1]] %>% read_html()
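For reference, I also tried an alternative scrolling approach: driving the scroll from JavaScript and waiting until the page height stops growing. This is only a sketch, assuming the same `remDr` session as above; the 0.5 s sleep is a guess at how long the lazy loader needs.

```r
# Sketch: scroll via JavaScript until the page height stops growing.
# Assumes `remDr` is the RSelenium session created above.
last_height <- 0
repeat {
  remDr$executeScript("window.scrollTo(0, document.body.scrollHeight);")
  Sys.sleep(0.5)  # give the lazy loader time to fetch images
  new_height <- remDr$executeScript("return document.body.scrollHeight;")[[1]]
  if (new_height == last_height) break  # no new content appeared
  last_height <- new_height
}
```

This avoids depending on key events reaching the `body` element, but it showed the same visible-vs-hidden difference for me.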