
Extracting all URLs on a Web Page with Chrome Developer Tools
Posted on April 1, 2021 in Google Chrome, JavaScript by Matt Jennings

Original Information

Thank you to Shan Eapen Koshy for posting a YouTube video on how to do this.

  1. In Chrome, go to the website that you want to extract links from, like https://www.codeschool.com/.
  2. Open Chrome Developer Tools by pressing Cmd + Opt + i (Mac) or F12 (Windows).
  3. Click the Console panel near the top of Chrome Developer Tools.
  4. Inside the Console panel, paste the JavaScript below and press Enter:
    var urls = document.getElementsByTagName('a');
    
    for (var url in urls) {
        console.log(urls[url].href);
    }

    Now you will see all the links from that particular web page.

  5. You can also click the Undock into a separate window button (in the upper-right of Chrome Developer Tools, just left of the X that closes Chrome Developer Tools). This opens a separate window that displays only Chrome Developer Tools along with the extracted links.
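The extraction loop above can also be written as a small indexed helper that returns the hrefs as an array instead of logging them, which avoids the extra non-numeric keys that a for...in loop visits on an HTMLCollection. This is a sketch of my own (the name collectHrefs is not from the original snippet):

```javascript
// Gather every link's href into a plain array from an array-like
// collection of anchor elements, e.g. document.getElementsByTagName('a').
function collectHrefs(anchors) {
  var hrefs = [];
  for (var i = 0; i < anchors.length; i++) {
    hrefs.push(anchors[i].href);
  }
  return hrefs;
}

// In the Chrome Console you would call it like this:
// collectHrefs(document.getElementsByTagName('a')).forEach(function (u) {
//   console.log(u);
// });
```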

Updated (April 1, 2021)

Someone in the comments asked how they can return only URLs containing “abc” or “defg”. The information below covers that, along with how to make the code compatible with older browsers if you want to use this JavaScript snippet in a website.

  1. In Chrome, go to a website you want to extract links from, like https://wordpress.org/.
  2. Follow steps 1 through 3 under the Original Information section above.
  3. Inside the Console panel, paste the JavaScript below and press Enter:
    var urls = document.getElementsByTagName('a');
    
    for (var i = 0; i < urls.length; i++) {
        console.log(urls[i].getAttribute('href'));
    }
  4. Or, if on https://wordpress.org/ you want to find all links whose URLs contain specific text (like “showcase”), use the code below; it works even in very old browsers (Internet Explorer 9 and above):
    var urls = document.getElementsByTagName('a');
    
    for (var i = 0; i < urls.length; i++) {
      if (urls[i].getAttribute('href').indexOf('showcase') > -1) {
        console.log(urls[i].getAttribute('href'));
      }
    }
  5. Or, if you want to use modern code that works in the Google Chrome browser but not in very old browsers (not in Internet Explorer at all), use the code below to find all links that contain specific text (like “showcase”) on https://wordpress.org/:
    let urls = document.getElementsByTagName('a');
    
    for (let i = 0; i < urls.length; i++) {
      if (urls[i].getAttribute('href').includes('showcase')) {
        console.log(urls[i].getAttribute('href'));
      }
    }
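Two caveats about the filtering snippets above: getAttribute('href') returns null for anchors without an href attribute, which makes indexOf or includes throw, and the original question asked for URLs containing either of two strings (“abc” or “defg”), not just one. A null-safe sketch covering both cases (the helper name matchingHrefs is mine, not part of the original post):

```javascript
// Return the hrefs (from an array of href strings) that contain at
// least one of the given substrings; null/missing hrefs are skipped.
function matchingHrefs(hrefs, substrings) {
  return hrefs.filter(function (href) {
    if (!href) return false; // getAttribute('href') can return null
    return substrings.some(function (s) {
      return href.indexOf(s) > -1;
    });
  });
}

// In the Console you would build the href list first, e.g.:
// var hrefs = [].map.call(document.getElementsByTagName('a'), function (a) {
//   return a.getAttribute('href');
// });
// matchingHrefs(hrefs, ['abc', 'defg']).forEach(function (u) { console.log(u); });
```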

14 Responses

  1. Shwetha says:

    But when the same code is written for chrome extension it gives “undefined” as the result

  2. William Pate says:

    I love your code block in this post. Thus, I’m probably gonna steal it. 🙂

  3. Calvin Gooley says:

    Hi,

    This is awesome — how would I designate only return certain urls?

    For example, I want to return only URLs containing “abc” or “defg”.

    • Matt Jennings says:

      Hi Calvin,

      See my answer under the “Updated (April 1, 2021)” heading above. That includes the information you need.

  4. Sebastian says:

    Great code, however I have a problem with it. I want to list all e-mail addresses from a website, but after replacing “showcase” to “mailto:” I’m getting an error: “Uncaught TypeError: Cannot read property ‘includes’ of null
    at :3:34”. Is there a way to make it work?

    • Sebastian says:

      OK, YouTube comment section under the original video solved it for me 😉 Below code works just fine:

      filteredString = 'mailto:';
      urls = $$('a'); for (url in urls) if (urls[url].href.toLowerCase().includes(filteredString)) console.log(urls[url].href);

      • Matt Jennings says:

        Glad it's working, Sebastian. For anyone curious about this line:
        urls = $$('a');

        $$ is a built-in Chrome Developer Tools Console utility, shorthand for document.querySelectorAll(), so no jQuery is needed when you run this in the Console.

  5. Praveen says:

    Hi Matt,

    Thanks for this code. I am trying to extract all the requests, like document requests, XHR requests, resource requests, and also the start time, complete time, and load time for these requests.
    Please let me know, if I can achieve this.

    Praveen T

    • Matt Jennings says:

      Hi Praveen,

      Unfortunately I don’t know how to do this. Good luck with a Google search on how to do this.

  6. Tamil says:

    is there is a way to change the URL from the background for example:

    link from webpage
    https://www[dot]facebook[dot]com

    need to convert with the following format
    https://example[dot]com?url=https://facebook[dot]com

  7. Kevin says:

    i know this is old but it was 1 in google serp lol

    hey how would i extract links say from tik tok comments

    from only the comment div block

  8. Drew says:

    Thank you so much for this code. You saved me a ton of time. A website linked 50+ pdf files individually. Your code, along with another fella from NOAA, helped me avoid right clicking each one to download. Instead a simple scrape of the pdf links, and a script through command prompt to download the list of links- voila! 50 pdfs downloaded in a matter of seconds.- using stock Windows nonetheless!

    Thank you again!!!!
