...
Info: The ... is not supported by the new pico engine.
The query() operator works on HTML strings or on arrays of HTML strings. The HTML is usually loaded by the ruleset using a dataset declaration:
dataset r_html:HTML <- "http://www.htmldog.com/examples/darwin.html"
dataset q_html:HTML <- "http://www.htmldog.com/examples/tablelayout1.html"
The :HTML after the name of the dataset is a hint to KRL that it can skip the JSON parsing stage that is the default when reading data sets.
The query() operator takes one argument: a jQuery selector string, a string of comma-separated jQuery selectors, or an array of jQuery selector strings. query() supports only a subset of the jQuery selectors for now:
- element
- #id
- .class
- [attr]
- [attr=value]
I'll describe the use of each of these in the sections that follow.
element
An element selector is denoted by the element name. An element selector matches all elements of a particular type. For example:
r_html.query("h1"); // returns an array of all <h1> elements
r_html.query("caption,h1");
// returns an array of all <caption> or <h1> elements
#id
An #id selector, denoted by the ID value with the pound sign (#) prepended, matches all elements with a specific ID. For example:
q_html.query("#c_link");
// returns array of all elements like <... id="c_link">
#id selectors can be compounded with element selectors as follows:
q_html.query("a#c_link");
// returns array of all elements like <a id="c_link">
.class
A .class selector, denoted by the class value with a period (.) prepended, matches all elements with a specific class. For example:
q_html.query(".header");
// returns array of elements like <... class="header">
Again, .class selectors can be compounded with element selectors as follows:
q_html.query("p.header");
// returns array of elements like <p class="header">
Or you can combine #id and .class selectors:
q_html.query("#c_link.header");
// returns array of elements like <... id="c_link" class="header">
[attr]
An [attr], or attribute, selector is denoted by the attribute name enclosed in square brackets. The [attr] selector matches all elements with an attribute, even if the attribute value is empty.
q_html.query("[style]");
// returns array of elements like <... style="...">
Combinations of [attr] and other selectors work as you'd expect:
q_html.query("td[style]");
// returns array of elements like <td style="...">
[attr=value]
An [attr=value], or attribute value, selector is denoted by the attribute name and value (as they would appear in the HTML) enclosed in square brackets. The [attr=value] selector matches all elements with an attribute set to a specific value.
q_html.query("[align=center]");
// returns array of elements like <... align="center">
q_html.query("td[align=center]");
// returns array of elements like <td align="center">
Again, you can combine more than one [attr=value] specification:
q_html.query("[align=center][colspan=2]");
// returns array of elements like <... align="center" colspan="2">
Multiple Selectors
You can stack selectors. The examples you've seen so far had no spaces, so each one selected single elements that satisfy every part of the selector (element name, ID, class, and attributes). If you put spaces between the selectors, they select separate, nested elements: each selector must match an element nested inside the element matched by the selector before it. For example:
q_html.query("div#header span p[align=center]");
// returns array of elements like
// <div id="header">...<span>...<p align="center"/>...</span>...</div>
If query() is applied to an array of strings, the selector will be applied to each array element:
html_arr = [q_html,r_html];
combo_arr = html_arr.query("a");
// returns array of elements like <a> from both q_html and r_html
You can join multiple selectors together as one string separated by commas or as an array of selector strings:
r_html.query("caption,h1");
r_html.query(["caption","h1"]);
Note that these are different from the following:
r_html.query(["caption h1"]);
The former expressions match either <caption> or <h1>, whereas the latter matches <h1> elements enclosed within <caption> elements.
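The difference between commas, spaces, and compounded selectors can be made concrete with a small parser. The following is a minimal Python sketch of the selector grammar described above, not the KRL implementation; the names parse_compound and parse_selector are hypothetical and exist only for this illustration.

```python
import re

# Tokens that may follow the optional element name: #id, .class, [attr], [attr=value]
TOKEN = re.compile(
    r"#(?P<id>[\w-]+)|\.(?P<cls>[\w-]+)|\[(?P<attr>[\w-]+)(?:=(?P<value>[^\]]*))?\]"
)

def parse_compound(step):
    """Split one space-free selector such as 'td[align=center]' into its parts."""
    m = re.match(r"[A-Za-z][\w-]*", step)
    parts = {"element": m.group(0) if m else None, "id": None, "classes": [], "attrs": {}}
    rest = step[m.end():] if m else step
    for tok in TOKEN.finditer(rest):
        if tok.group("id"):
            parts["id"] = tok.group("id")
        elif tok.group("cls"):
            parts["classes"].append(tok.group("cls"))
        else:
            # A value of None stands for "[attr]": the attribute only has to be present.
            parts["attrs"][tok.group("attr")] = tok.group("value")
    return parts

def parse_selector(selector):
    """Return a list of alternatives; each alternative is a list of nested steps.

    Accepts a single string (possibly comma-separated) or a list of strings.
    """
    selectors = [selector] if isinstance(selector, str) else list(selector)
    alternatives = []
    for s in selectors:
        for alt in s.split(","):          # commas separate alternatives ("either/or")
            steps = alt.split()            # spaces separate nested elements
            alternatives.append([parse_compound(step) for step in steps])
    return alternatives

either = parse_selector("caption,h1")      # two alternatives, one step each
nested = parse_selector(["caption h1"])    # one alternative, two nested steps
```

In this model, "caption,h1" and ["caption","h1"] parse to the same two alternatives, while ["caption h1"] parses to a single alternative that requires an h1 step nested inside a caption step.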
query() returns an empty array if no HTML matches the selector or if the selector syntax is wrong.
query() can also be used to search structured persistent variables. The syntax is:
<persistent variable>.query(<hash path>, {
    'requires' : <join operator>,
    'conditions' : [
      { 'search_key' : <path_to_field>,
        'operator' : <mongo $ comparison op>,
        'value' : <value> },
      ...
    ]
  },
  <extended result>
);
<hash path> will be the empty array, [], if the key is at root.
search_key is the path from the <hash path> to the field that you want to compare. If that path does not exist for an entry, it will not be considered, not even as a null value.
For example, to search a Twitter timeline whose entries have been assigned unique keys, transforming the array into a hash:
ent:tweets.query([],{
  'requires' : '$and',
  'conditions' : [
    {
      'search_key' : ['retweeted_status', 'favorite_count'],
      'operator' : '$gt',
      'value' : 5
    },
    {
      'search_key' : ['retweeted_status', 'favorite_count'],
      'operator' : '$lt',
      'value' : 200
    }
  ]})
This returns an array of <hash paths> (an array of arrays), which is essentially an array of the keys to the matching values:
[
  [ 'a32' ],
  [ 'a31' ],
  [ 'a30' ]
]
You can then use these paths to get the entire element, for example ent:tweets{[ 'a32' ]}.
If the third argument to query() is not null, the actual result, rather than a path to it, is returned.
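The matching rules described above can be modeled in a few lines of Python. This is a sketch of the documented behavior, not the engine's code. Several details are assumptions that the text does not confirm: the names query, get_path, OPS, and MISSING are hypothetical; '$or' is assumed to be the alternative to '$and'; operators beyond the four used in the examples are assumed; and a returned path is assumed to be the hash path followed by the matching key.

```python
import operator

# Comparison operators named after their MongoDB counterparts.
# Only $gt, $lt, $gte and $lte appear in the examples; the rest are assumed.
OPS = {
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
    "$eq": operator.eq, "$ne": operator.ne,
}

MISSING = object()  # sentinel meaning "this path does not exist"

def get_path(data, path):
    """Follow a list of keys into nested dicts; return MISSING if a key is absent."""
    for key in path:
        if not isinstance(data, dict) or key not in data:
            return MISSING
        data = data[key]
    return data

def query(store, hash_path, spec, extended_result=None):
    """Model of <persistent variable>.query(<hash path>, {...}, <extended result>)."""
    entries = get_path(store, hash_path)
    if not isinstance(entries, dict):
        return []
    # Assumption: any join operator other than "$and" behaves like "$or".
    combine = all if spec["requires"] == "$and" else any
    results = []
    for key, entry in entries.items():
        checks = []
        for cond in spec["conditions"]:
            field = get_path(entry, cond["search_key"])
            # A missing path never matches; it is not compared as if it were null.
            checks.append(field is not MISSING
                          and OPS[cond["operator"]](field, cond["value"]))
        if combine(checks):
            # Non-null third argument: return the value instead of the path to it.
            results.append(entry if extended_result is not None
                           else list(hash_path) + [key])
    return results

tweets = {
    "a30": {"retweeted_status": {"favorite_count": 10}},
    "a31": {"retweeted_status": {"favorite_count": 500}},
    "a32": {"text": "not a retweet"},   # search_key path is missing here
}
spec = {
    "requires": "$and",
    "conditions": [
        {"search_key": ["retweeted_status", "favorite_count"], "operator": "$gt", "value": 5},
        {"search_key": ["retweeted_status", "favorite_count"], "operator": "$lt", "value": 200},
    ],
}
paths = query(tweets, [], spec)
values = query(tweets, [], spec, "return_values")
```

With this sample data, only the entry under 'a30' satisfies both conditions: 'a31' fails the '$lt' test, and 'a32' is skipped because it has no 'retweeted_status' path at all.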
The following function from Fuse uses query() to return trips by their end date, given a start and end date.
trips = function(start, end){
  utc_start = common:convertToUTC(start);
  utc_end = common:convertToUTC(end);
  ent:trip_summaries.query([],
    {
      'requires' : '$and',
      'conditions' : [
        {
          'search_key' : [ 'endWaypoint', 'timestamp' ],
          'operator' : '$gte',
          'value' : utc_start
        },
        {
          'search_key' : [ 'endWaypoint', 'timestamp' ],
          'operator' : '$lte',
          'value' : utc_end
        }
      ]},
    "return_values"
  )
};
Note that KRL stores data in persistent variables in a "flat format", so all comparisons are string or numeric comparisons. The query() operator doesn't know about dateTime and other special formats for values.
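Because timestamps are compared as plain strings, a range query on dates is only reliable when every stored value uses the same fixed-width format. The Python sketch below illustrates that general property of string comparison. The timestamp format and the helper name to_utc_string are assumptions chosen for the illustration; the format that common:convertToUTC produces is not stated here.

```python
from datetime import datetime, timezone

def to_utc_string(dt):
    """Format a timezone-aware datetime as a fixed-width UTC string (assumed format)."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

early = to_utc_string(datetime(2014, 3, 9, 23, 30, tzinfo=timezone.utc))
late = to_utc_string(datetime(2014, 10, 1, 8, 0, tzinfo=timezone.utc))

# Same fixed-width layout, most significant field first:
# string order agrees with chronological order.
fixed_width_ok = early < late

# Variable-width month/day/year strings: September 3 comes before October 1,
# but "9..." sorts after "1..." when compared character by character.
variable_width_ok = "9/3/2014" < "10/1/2014"

print(fixed_width_ok, variable_width_ok)
```

This is why a function like trips above normalizes both bounds before querying: the '$gte' and '$lte' tests only order dates correctly when the stored values and the bounds share one sortable representation.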