IMPORTXML Function: Extract XML Data from Web Pages in Google Sheets

About the IMPORTXML Function

Overview of IMPORTXML

Extract Specific Data from Web Pages (Google Sheets Function)

=IMPORTXML( URL, XPath Query )

Summary: The IMPORTXML function retrieves specific data from the HTML or XML structure of a webpage using an XPath query.

  • Highly customizable, allowing precise data extraction.
  • Suited to pulling current web data into a sheet; results are refreshed periodically rather than instantly.
  • Useful for extracting targeted information from web pages.

When to Use IMPORTXML

  • When you want to extract specific information (e.g., titles, prices, dates) from a webpage.
  • When the data you need is static, meaning it is present in the page source rather than generated by JavaScript.
  • When you want to use specific HTML elements or attributes as structured data.

How to Use IMPORTXML

The following table demonstrates the basic usage of the IMPORTXML function:

  | A | B | C
1 | Description | Formula | Result
2 | Retrieve webpage title | =IMPORTXML("https://example.com", "//title") | Page Title
3 | Extract specific links | =IMPORTXML("https://example.com", "//a/@href") | Link URLs

Results

  • Cell B2 displays the title of the specified webpage.
  • Cell B3 extracts a list of link URLs from the specified webpage.
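
As a variation on the table above, the URL and the XPath query can be kept in their own cells and referenced from the formula, which makes them easier to change later. The sketch below assumes, purely for illustration, that cell E1 holds https://example.com and cell E2 holds //title; those cell positions are hypothetical and not part of the original example. The second formula wraps the call in INDEX to keep only the first link instead of the whole list.

```
=IMPORTXML(E1, E2)
=INDEX(IMPORTXML(E1, "//a/@href"), 1)
```

Each line is a separate formula to be entered in its own cell. In spreadsheet locales that use a comma as the decimal separator, the arguments may need to be separated with semicolons instead of commas.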

Advanced IMPORTXML Examples

Using IMPORTXML, you can automate specific data retrieval. Below are some advanced examples:

  | A | B | C
1 | Use Case | Formula | Result
2 | Fetch latest currency exchange rates | =IMPORTXML("https://example.com/forex", "//rate[@id='USD-EUR']") | USD to EUR rate
3 | Extract product prices | =IMPORTXML("https://example.com/product", "//span[@class='price']") | Product Price
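
One detail worth noting about the price example: the test @class='price' matches only when the class attribute is exactly price. If a page marks the element up with several classes (for instance class="price sale", a hypothetical value), the XPath contains() function is a common workaround. A minimal sketch, reusing the placeholder URL from the table:

```
=IMPORTXML("https://example.com/product", "//span[contains(@class, 'price')]")
```

Because contains() performs a substring match, it would also pick up class names such as price-old, so the query may need further narrowing on a real page.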

Points to Note

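  • The URL must include the protocol (for example, https://); pages that are not served over HTTPS may not work.
  • Content generated dynamically by JavaScript cannot be extracted, because IMPORTXML reads only the page source as served.
  • Check the XPath query syntax carefully; an invalid query returns an error instead of data.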
  • Check the website’s terms of use to ensure compliance.
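
Because a fetch can fail or an XPath query can match nothing, it may help to wrap the call in IFERROR so the cell shows a fallback value instead of an error. A minimal sketch, reusing the placeholder URL and query from the examples above; the fallback text "Not found" is an arbitrary choice:

```
=IFERROR(IMPORTXML("https://example.com/product", "//span[@class='price']"), "Not found")
```

This hides every kind of error, including a mistyped XPath query, so it is probably best added only after the formula has been checked without it.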

Conclusion

  • The IMPORTXML function is a powerful tool for extracting specific data from webpages into Google Sheets.
  • Using XPath enables flexible and precise data extraction.
  • While effective for retrieving current data, be mindful of website structure changes, which can break an XPath query, and of each site's usage policies.