If you're here, it's probably because you need to extract variables from an XML (like SOAP) response using JMeter. Best of all, it works with HTML Too.
XPath expressions are a language to select nodes within an XML document. XPath supports different kind of expressions to select various parts of the document.
Expression
Selects
name
All nodes with name name
/
starting from the root node
//
Selects nodes in the document from the current node that match the selection no matter where they are
We can apply the following XPath Expressions to select nodes:
Path Expression
Selects
/
The document node
/root
The 'root' element
/root/actors/actor
All 'actor' elements that are direct children of the 'actors' element
//foo:singer
All 'singer' elements regardless of their positions in the document
//foo:singer/@id
The 'id' attributes of the 'singer' elements regardless of their positions in the document
//actor[1]/text()
The textual value of first 'actor' element
//actor[last()]
The last 'actor' element
//actor[position() < 3]
The first and second 'actor' elements using their position
//actor[@id]
All 'actor' elements that have an 'id' attribute
//actor[@id='3']
The 'actor' element with the 'id' attribute value of '3'
//actor[@id<=3]
All 'actor' nodes with the 'id' attribute value lower or equal to '3'
/root/foo:singers/*
All the children of the 'singers' node.
//*
All the elements in the document.
//actor|//foo:singer
All the 'actor' elements AND the 'singer' elements.
name(//*[1])
Name of the first element in the document.
number(//actor[1]/@id)
Numeric value of the 'id' attribute of the first 'actor' element.
string(//actor[1]/@id)
String representation value of the 'id' attribute of the first 'actor' element.
string-length(//actor[1]/text())
Length of the first 'actor' element's textual value.
local-name(//foo:singer[1])
Select the local name of the first 'singer' element, i.e. without the namespace.
count(//foo:singer)
Select the number of 'singer' elements.
sum(//foo:singer/@id)
Select the sum of the 'id' attributes of the 'singer' elements.
As you can see, there is nothing difficult here. It's just a matter of practice to get things working the way you want. The trickiest part is to test XPath expressions on a given response.
Need to try some Xpath Expressions quickly outside JMeter? Give a try to XPath Tester. It supports configuring both the XML input and the XPath expression.
Let's now explore JMeter's XPath Extractor settings.
JMeter's XPath Extractor has some interesting advanced features:
Use Tidy (tolerant parser): If checked use Tidy to parse HTML response into XHTML,
Use Tidy" should be checked on for HTML response. Such response is converted to valid XHTML (XML compatible HTML) using Tidy,
Use Tidy" should be unchecked for both XHTML or XML response (for example RSS).
For HTML, CSS/JQuery Extractor is the correct and performing solution. Don't use XPath for HTML extractions.
Tidy should be used when receiving invalid XML responses. Tidy is much more tolerant than the built-in XML parser.
When Tidy is selected:
Quiet: Sets the Tidy Quiet flag ,
Report Errors If a Tidy error occurs, then set the Assertion accordingly,
Show warnings: Sets the Tidy showWarnings option.
When Tidy is not selected:
Use Namespaces: If checked, then the XML parser will use namespace resolution.(see note below on NAMESPACES) Note that currently only namespaces declared on the root element will be recognised. See below for user-definition of additional workspace names,
Validate XML: Check the document against its schema,
Ignore Whitespace: Ignore Element Whitespace,
Fetch External DTDs: If selected, external DTDs are fetched,
Return entire XPath fragment instead of text content? If selected, the fragment will be returned rather than the text content.
For example //title would return <title>Apache JMeter</title> rather than Apache JMeter. In this case, //title/text() would return Apache JMeter.
Let's take an example script we've setup for this purpose.
The script is configured with the following settings:
Thread Group: very simple one with just one concurrent user simulated,
Dummy Sampler: Simulates a request whose response is the XML from the example above. The plugin is not installed? Follow our JMeter Plugins Installation Guide.
You should be aware that using XPath expressions on XML / HTML responses consumes a significant amount of CPU and Memory. That being said, it may be much more practical to use XPath expressions (and not JMeter Regexp Extractor) when regular expressions are unsuitable. Typically, it happens when the content to extract is very similar to non-related other content.
Keep in mind the XPath extractor parses the HTML / XML Response and converts it to a DOM document. That's typically what web browsers do and it runs fine for a single client. But it often hurts performance as stated in our guide Optimize JMeter for Large Scale Tests.
Got only light load to simulate? Fine! Otherwise, keep an eye on the load generator's CPU and memory usage.