Discussion:
QSA, the problem with ":scope", and naming
Alex Russell
2011-10-18 16:42:04 UTC
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
short:

The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.

Discussions about a Scoped variant or ":scope" pseudo tacitly
acknowledge this, and the JS libraries are proof in their own right:
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.

Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use. On today's internet -- the
one where 50% of all websites include jQuery -- you could even go with
element.$("selector") and everyone would know what you mean: it's
clearly a search rooted at the element on the left-hand side of the
dot.

Ceteris paribus, shorter is better. When there's a tie that needs to
be broken, the more frequently used the API, the shorter the name it
deserves -- i.e., the larger the component of its meaning it will gain
through use and repetition and not naming and documentation.

I know some on this list might disagree, but all of the above is
incredibly non-controversial today. Even if there may have been
debates about scoping or naming when QSA was originally designed,
history has settled them. And QSA lost on both counts.

I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
correctly, a scoped search for multiple elements would be written as:

element.querySelectorAll(":scope > div > .thinger");

Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from. I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
above becomes:

element.findAll("> div > .thinger");

Out come the knives! You can't start a selector with a combinator!

Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
syntactic sugar, defined as:

  HTMLDocument.prototype.find =
  HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  };

  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  };

Of course, ":scope" in this case is just a special case of the ID
rooting hack, but if we're going to have it, we can kill both birds
with it.

Obvious follow up questions:

Q.) Why do we need this at all? Don't the toolkits already just do
this internally?
A.) Are you saying everyone, everywhere, all the time should need to
use a toolkit to get sane behavior from the DOM? If so, what are we
doing here, exactly?

Q.) Shorter names? Those are for weaklings!
A.) And humans. Who still constitute most of our developers. Won't
someone please think of the humans?

Q.) You're just duplicating things!
A.) If you ignore all of the things that are different, then that's
true. If not, well, then no. This is a change. And a good one for the
reasons listed above.

Thoughts?
Alex Russell
2011-10-18 16:47:26 UTC
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
Discussions about a Scoped variant or ":scope" pseudo tacitly
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use.
Sorry, this should say "meaning". APIs gain *meaning* through both use
and naming.
Matt Shulman
2011-10-18 17:25:13 UTC
I think the query selector functionality is important enough that one
could easily justify adding additional APIs to make this work
better/faster, even if they overlap with existing APIs. But, it would
be unfortunate if more APIs were added to the DOM and libraries still
weren't able to use them because the semantics didn't end up being
quite right.
It seems like the right approach would be to take jquery and rewrite
it to use this new API and then see empirically whether it gives the
same selection behavior as before and see how much of a performance or
simplicity gain there is after doing this.
(I think it's a good thing to allow selectors to start with
combinators. That seems very useful.)
Alex Russell
2011-10-19 00:00:01 UTC
Hi Matt,
Post by Matt Shulman
I think the query selector functionality is important enough that one
could easily justify adding additional APIs to make this work
better/faster, even if they overlap with existing APIs.  But, it would
be unfortunate if more APIs were added to the DOM and libraries still
weren't able to use them because the semantics didn't end up being
quite right.
It seems like the right approach would be to take jquery and rewrite
it to use this new API and then see empirically whether it gives the
same selection behavior as before and see how much of a performance or
simplicity gain there is after doing this.
No need to wait. We had something nearly identical for this in Dojo
using an ID prefix hack. It looked something like this:

(function(){
  var ctr = 0;
  query = function(query, root){
    root = root||document;
    var rootIsDoc = (root.nodeType == 9);
    // The document that owns the root (or the root itself if it is one)
    var doc = rootIsDoc ? root : (root.ownerDocument||document);

    if(!rootIsDoc || (">~+".indexOf(query.charAt(0)) >= 0)){
      // Generate an ID prefix for the selector
      root.id = root.id||("qUnique"+(ctr++));
      query = "#"+root.id+" "+query;
    }

    // Convert the static NodeList into a real Array
    return Array.prototype.slice.call(
      doc.querySelectorAll(query)
    );
  };
})();

This is exactly the same dance that ":scope" does.
Timmy Willison
2011-10-19 13:39:52 UTC
From the perspective of building a selector engine, I think all selector
engines need something like .findAll, and not something like :scope.
Post by Alex Russell
No need to wait. We had something nearly identical for this in Dojo
using an ID prefix hack. It looked something like this:
(function(){
  var ctr = 0;
  query = function(query, root){
    root = root||document;
    var rootIsDoc = (root.nodeType == 9);
    var doc = rootIsDoc ? root : (root.ownerDocument||document);
    if(!rootIsDoc || (">~+".indexOf(query.charAt(0)) >= 0)){
      // Generate an ID prefix for the selector
      root.id = root.id||("qUnique"+(ctr++));
      query = "#"+root.id+" "+query;
    }
    return Array.prototype.slice.call(
      doc.querySelectorAll(query)
    );
  };
})();
This is exactly the same dance that ":scope" does.
Sizzle and Slick do the same thing. As far as I can tell, nwmatcher doesn't
deal with it. We can't just add :scope to all selections (for many reasons)
and adding just before QSA would require the same logic that Alex has
demonstrated above.

All of the selector engines do predictions at load time on whether QSA will
work. They continue differently beyond that, but one thing every library
has in common is a try/catch around the call to QSA that falls back to
manual parsing if it throws an exception (intentionally avoiding the need
for complete parsing before calling QSA). The point is it is a
misconception that selector engines parse selectors before delegating to
QSA. The number of things libraries want to do before getting to the QSA
call is very minimal. The one that hurts us all the most is this need for
scoping and ':scope' would simply never be used in a selector engine, since
the id trick already works everywhere. The case Alex wrote above is pretty
much the only case where the selector is parsed beyond checking for tag
only, id only, or class only and it is due to what all of the JS libraries
have considered a design flaw in QSA. A method like findAll would fix that,
leaving as much parsing as possible in the hands of the browser.

PS - I should say I don't necessarily think the name 'findAll' would work. I
agree it should be short. The equivalent of querySelector would be find and
in library land 'find' selects more than one thing, but I'm not as concerned
about the name.
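
The try/catch pattern described above can be sketched roughly as follows. This is only an illustration and not code from any particular library: "select" and "manualSelect" are hypothetical names, and "manualSelect" stands in for whatever hand-written engine a library falls back to.

```javascript
// root: anything with a querySelectorAll method (an Element or Document).
// manualSelect: hypothetical stand-in for a library's own selector engine.
function select(root, selector, manualSelect) {
  try {
    // Let the browser parse the selector; copy the result into an Array.
    return Array.prototype.slice.call(root.querySelectorAll(selector));
  } catch (e) {
    // QSA throws on selectors it cannot parse, so fall back to manual work.
    return manualSelect(root, selector);
  }
}
```

The point of the shape is that the library does no parsing of its own on the fast path; it only pays for manual parsing when the browser rejects the selector.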
Brian Kardell
2011-10-20 11:21:13 UTC
So I spoke with Boris about this at some length offline yesterday and was
really shocked to discover that, in the interest of supporting docs that do
not conform, there appears to be 100% implementation agreement in CSS (and
therefore QSA) that id selectors must match against all elements with the
same id and that getElementById must always return the first (which makes it
hyper easy by comparison to match). This is in conflict with specs dating
way back in terms of whether what the spec says is normative - but it's
definitely implemented that way everywhere. We have submitted errata on
that, but if it helps shed some light on earlier comments, that's why.

He said earlier that it would be easy enough to optimize in the case that
there is only one element with a given id, but in other cases getElementById
would remain faster, though not conforming with all CSS implementations or
existing rec intents.

I'm curious to hear comments on whether that would satisfy jQuery et al. as
well as what course of action they will take re: the disparity that now
apparently exists between CSS and jQuery selector implementations of this.
Given that no current jQuery app could possibly rely on this difference and
that jQuery always returns a list, it seems like you could gain parity, but
risk performance.
Erik Arvidsson
2011-10-18 17:00:10 UTC
Post by Alex Russell
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
 HTMLDocument.prototype.find =
 HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  }
  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  }
I like the way you think. Can I subscribe to your mailing list?

One thing to point out with the desugar is that it has a bug and most
JS libs have the same bug. querySelectorAll allows multiple selectors,
separated by a comma and to do this correctly you need to parse the
selector which of course requires tons of code so no one does this.
Let's fix that by building this into the platform.
--
erik
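
To make the bug concrete, here is a small string-level illustration of what the desugaring does to a selector list. It is only a sketch: "naiveDesugar" is a hypothetical name for the ":scope " + rootedSelector step, and the plain split on "," is used just to inspect the two halves of this particular input.

```javascript
// Naive desugaring, as in the find/findAll sketch quoted above:
// prepend ":scope " to the whole selector string.
function naiveDesugar(rootedSelector) {
  return ":scope " + rootedSelector;
}

const result = naiveDesugar("> div, > span");
// result is ":scope > div, > span"

const parts = result.split(",").map(function (s) { return s.trim(); });
// parts[0] is ":scope > div"
// parts[1] is "> span": the second selector never received the prefix
```

Only the first selector in the list ends up rooted; the second still starts with a bare combinator, which querySelectorAll would reject.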
Alex Russell
2011-10-18 23:38:31 UTC
Post by Erik Arvidsson
Post by Alex Russell
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
 HTMLDocument.prototype.find =
 HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  }
  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  }
I like the way you think. Can I subscribe to your mailing list?
Heh. Yes ;-)
Post by Erik Arvidsson
One thing to point out with the desugar is that it has a bug and most
JS libs have the same bug. querySelectorAll allows multiple selectors,
separated by a comma and to do this correctly you need to parse the
selector which of course requires tons of code so no one does this.
Let's fix that by building this into the platform.
I agree. I should have mentioned it. The resolution I think is
most natural is to split on "," and assume that all selectors in the
list are ":scope" prefixed. A minor point is how the items in the
returned flattened list are ordered (document order? the natural
result of concat()?).
Brian Kardell
2011-10-18 23:45:42 UTC
Some pseudos can contain selector groups, so it would be more than just
split on comma.
Alex Russell
2011-10-19 00:13:49 UTC
Post by Brian Kardell
Some pseudos can contain selector groups, so it would be more than just
split on comma.
Yes, yes, of course. I've written one of these parsers. Just saying
that the impl would split selector groups and prefix them all with
":scope "
Boris Zbarsky
2011-10-19 01:15:28 UTC
Post by Alex Russell
The resolution I think is most natural is to split on ","
That fails with :any, with the expanded :not syntax, on attr selectors, etc.

You can split on ',' while observing proper paren and quote nesting, but
that can get pretty complicated.
Post by Alex Russell
A minor point is how the items in the returned flattened list are
ordered (document order? the natural result of concat()?).
Document order.

-Boris
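
Splitting on "," while observing paren and quote nesting might look roughly like the sketch below. It is an assumption-laden illustration, not a full CSS parser: the names "splitSelectorList" and "scopeAll" are hypothetical, and CSS comments and malformed input are not handled.

```javascript
// Split a selector list on top-level commas only. Commas inside (...),
// [...] or quoted strings are left alone.
function splitSelectorList(selectorList) {
  const parts = [];
  let current = "";
  let depth = 0;     // nesting depth of () and []
  let quote = null;  // the open quote character, if inside a string

  for (let i = 0; i < selectorList.length; i++) {
    const ch = selectorList[i];
    if (ch === "\\") {
      // Keep a backslash escape and the character it escapes together.
      current += ch;
      if (i + 1 < selectorList.length) {
        current += selectorList[++i];
      }
      continue;
    }
    if (quote) {
      if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === "(" || ch === "[") {
      depth++;
    } else if (ch === ")" || ch === "]") {
      depth--;
    } else if (ch === "," && depth === 0) {
      // A top-level comma ends the current selector.
      parts.push(current.trim());
      current = "";
      continue;
    }
    current += ch;
  }
  parts.push(current.trim());
  return parts;
}

// Prefix every top-level selector in the list with ":scope ".
function scopeAll(selectorList) {
  return splitSelectorList(selectorList)
    .map(function (s) { return ":scope " + s; })
    .join(", ");
}
```

For example, scopeAll('> div:not(.a, .b), [title="x,y"] span') leaves the commas inside :not(...) and inside the attribute value untouched and prefixes only the two top-level selectors. Even this small sketch shows why doing it in library code "can get pretty complicated" compared with letting the browser's own parser do the work.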
Ojan Vafai
2011-10-19 03:39:18 UTC
Overall, I wholeheartedly support the proposal.

I don't really see the benefit of allowing starting with a combinator. I
think it's a rare case that you actually care about the scope element and in
those cases, using :scope is fine. Instead of element.findAll("> div >
.thinger"), you use element.findAll(":scope > div > .thinger"). That said, I
don't object to considering the :scope implied if the selector starts with a
combinator.
Post by Boris Zbarsky
The resolution I think is most natural is to split on ","
That fails with :any, with the expanded :not syntax, on attr selectors, etc.
You can split on ',' while observing proper paren and quote nesting, but
that can get pretty complicated.
Can we define it as a sequence of selectors and be done with it? That way it
can be defined as using the same parsing as CSS.
Post by Boris Zbarsky
A minor point is how to order the
items in the returned flattened list are ordered (document order? the
natural result of concat()?).
Document order.
Definitely.
Alex Russell
2011-10-19 08:24:53 UTC
Post by Ojan Vafai
Overall, I wholeheartedly support the proposal.
I don't really see the benefit of allowing starting with a combinator. I
think it's a rare case that you actually care about the scope element and in
those cases, using :scope is fine. Instead of element.findAll("> div >
.thinger"), you use element.findAll(":scope > div > .thinger"). That said, I
don't object to considering the :scope implied if the selector starts with a
combinator.
Right, I think the argument for allowing a combinator start is two-fold:

1.) the libraries allow it, so should DOM
2.) we know the thing on the left, it's the implicit scope. Shorter is
better, so allowing the implicitness here is a win on that basis

I have a mild preference for argument #2. Shorter, without loss of
clarity, for common stuff should nearly always win.
Sean Hogan
2011-10-19 11:58:34 UTC
Post by Ojan Vafai
Overall, I wholeheartedly support the proposal.
I don't really see the benefit of allowing starting with a combinator.
I think it's a rare case that you actually care about the scope
element and in those cases, using :scope is fine. Instead of
element.findAll("> div > .thinger"), you use element.findAll(":scope >
div > .thinger"). That said, I don't object to considering the :scope
implied if the selector starts with a combinator.
I can think of two reasons one might ponder allowing :scope to be explicit.

1. so that the selector string can be a valid CSS selector string.
(":scope>div>.thinger" instead of ">div>.thinger"). But if this is
important then :scope should always be explicit, in which case we can
just use querySelectorAll().

2. to allow break-out behavior. e.g.

div.findAll("body div span"); // finds nothing
div.findAll("body div:scope span"); // finds spans that are descendants of div

In this scenario, the :scope pseudo allows ancestors of div to be
matched against. (No-one would use body in this context, but it is easy
to imagine them using a .class selector which matches an ancestor of div.)

But if you want break-out behavior you might not know which part of the
selector to put the :scope pseudo on. Could it be:
body div:scope span
body *:scope div span
body div *:scope span

So for break-out behavior just use querySelectorAll().

I'm pretty sure previous discussions (before this thread) have covered
this more thoroughly, and shown that it has to be all or nothing with
the :scope pseudo-class. That is, either
a) :scope MUST be explicit, in which case just use querySelectorAll()
b) :scope MUST be implied at the start of every selector chain.
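Option (b) can be sketched as a pure string rewrite. The helper name impliedScope is hypothetical, and it assumes the selector list has already been split into individual chains by something that understands CSS nesting; it does no parsing of its own.

```javascript
// Hypothetical helper: make the implied ":scope" explicit on every chain,
// producing a string that querySelectorAll() could take once ":scope" is
// supported. `chains` is an array of already-split selector chains.
function impliedScope(chains) {
  return chains
    .map((chain) => ":scope " + chain.trim())
    .join(", ");
}

console.log(impliedScope(["> div > .thinger", "span"]));
// ":scope > div > .thinger, :scope span"
```

A chain that starts with a combinator and one that does not are treated the same way here: in the second case the space after ":scope" simply acts as the descendant combinator.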

Sean
Brian Kardell
2011-10-18 19:59:51 UTC
Permalink
I know that there were discussions that crossed over into CSS about a
@global or a :context which could sort of include things outside the
scope as part of the query but not be the subject. Does any of that
relate here?

- Brian


PS
Post by Alex Russell
Out come the knives! You can't start a selector with a combinator!
Even on CSS lists this has been proposed inside of pseudos... Numerous
times and in numerous contexts. It seems to me that everyone (even
the people who disagree with the proposal) knows what it means
immediately - but you are right... That's always the response. So at
the risk of being stabbed by an angry mob: Can someone explain _why_
you can't - under absolutely any circumstances - begin a selector with
a combinator - even if there appears to be wide agreement that it
makes sense in a finite set of circumstances?
Alex Russell
2011-10-19 00:03:08 UTC
Permalink
Post by Brian Kardell
I know that there were discussions that crossed over into CSS about a
@global or a :context which could sort of include things outside the
scope as part of the query but not be the subject.  Does any of that
relate here?
I suppose it does, but only as an implementation detail. Nothing more
than the ID prefix hack or ":scope" are really necessary to get the
API we need.
Yehuda Katz
2011-10-18 20:20:05 UTC
Permalink
I agree entirely.

I have asked a number of practitioner friends about this scenario:

<div id="parent">
<p id="child"><span id="inline">Content</span></p>
</div>

document.getElementById("child").querySelectorAll("div span"); // returns
#inline

In 100% of cases, people consider this behavior *broken*. Not just
"interesting, I wouldn't have expected that", but "who came up with that!?".
In all cases involving JavaScript practitioners, people expect
querySelectorAll to operate on the element as though the element was the
root of a new document, and where combinators are relative to the element.
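The difference between the two semantics can be shown without a browser. The sketch below uses a toy tree (plain objects, not the DOM) and a matcher that only understands tag names joined by descendant combinators; node, descendants, matchesChain and select are illustrative names, not real APIs.

```javascript
// Toy stand-in for the DOM: just enough structure to compare the semantics.
function node(tag, id, children = []) {
  const n = { tag, id, children, parent: null };
  for (const child of children) child.parent = n;
  return n;
}

// All descendants of root, in document (pre-order) order.
function descendants(root) {
  const out = [];
  for (const child of root.children) {
    out.push(child, ...descendants(child));
  }
  return out;
}

// Does el match a chain of tag names joined by descendant combinators?
// Ancestors are searched up to, but not including, `boundary`
// (null means search all the way to the top of the tree).
function matchesChain(el, tags, boundary) {
  if (el.tag !== tags[tags.length - 1]) return false;
  let i = tags.length - 2;
  for (let a = el.parent; a && a !== boundary && i >= 0; a = a.parent) {
    if (a.tag === tags[i]) i--;
  }
  return i < 0;
}

// rooted = false: QSA-style, the selector is matched against the whole tree
// and results are merely filtered to descendants of root.
// rooted = true: the whole selector must match inside root.
function select(root, selector, rooted) {
  const tags = selector.trim().split(/\s+/);
  const boundary = rooted ? root : null;
  return descendants(root).filter((el) => matchesChain(el, tags, boundary));
}

const inline = node("span", "inline");
const child = node("p", "child", [inline]);
const parent = node("div", "parent", [child]);

const qsaStyle = select(child, "div span", false).map((n) => n.id);
const rootedStyle = select(child, "div span", true).map((n) => n.id);
console.log(qsaStyle);    // ["inline"]
console.log(rootedStyle); // []
```

The only difference between the two modes is where the ancestor walk is allowed to stop, which is why "div" outside #child can satisfy the selector in the QSA-style case.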

We already knew this was true since all JavaScript libraries that implement
selectors implemented them in this way.

I also agree that names like querySelectorAll (along with getElement(s)By*,
requestAnimationFrame, addEventListener, and most other DOM APIs) are
simply too long for day-to-day use. This results in the need to use
libraries for day-to-day browser development simply to reduce this
borderline-comical verbosity.

I like find and findAll, as jQuery has a `find` which invokes the selector
engine. As to whether jQuery would benefit from the improvements, the
existing jQuery implementation (Sizzle) of the selector engine when qSA is
available is at:
https://github.com/jquery/sizzle/blob/master/sizzle.js#L1150-1233

There are a few categories of extensions:

- Speeding up certain operations like `#foo` and `body`. There is *no
excuse* for it being possible to implement userland hacks that improve on
the performance of querySelectorAll. This may be the result of browsers
failing to cache the result of parsing selectors or something else, but the
fact remains that qSA can be noticeably slower than the old DOM methods, even
when Sizzle needs to parse the selector to look for fast-paths.
- Fixing the implementation mistake in qSA that we're discussing here.
- Bugs in specific browsers.

jQuery also handles certain custom pseudoselectors, and it might be nice if
it was possible to register JavaScript functions that qSA would use if it
found an unknown pseudo (this would make it possible to implement most of
jQuery's selector engine in terms of qSA), but that's a discussion for
another day.

Yehuda Katz
Boris Zbarsky
2011-10-18 20:40:37 UTC
Permalink
* Speeding up certain operations like `#foo` and `body`. There is *no
excuse* for it being possible to implement userland hacks that
improve on the performance of querySelectorAll.
Sure there is. One such "excuse", for example, is that the userland
hacks have different behavior from querySelectorAll in many cases. Now
the author happens to know that the difference doesn't matter in their
case, but the _browser_ has no way to know that.

The other "excuse" is that adding special cases (which is what you're
asking for) slows down all the non-special-case codepaths. That may be
fine for _your_ usage of querySelectorAll, where you use it with a
particular limited set of selectors, but it's not obvious that this is
always a win.
This may be the result of browsers failing to cache the result of parsing selectors
Yep. Browsers don't cache it. There's generally no reason to. I have
yet to see any real-life testcase bottlenecked on this part of
querySelectorAll performance.
or something else, but the fact remains that qSA can be noticeably
slower than the old DOM methods, even when Sizzle needs to parse the
selector to look for fast-paths.
I'd love to see testcases showing this.
jQuery also handles certain custom pseudoselectors, and it might be nice
if it was possible to register JavaScript functions that qSA would use
if it found an unknown pseudo
This is _very_ hard to do reasonably unless the browser can trust those
functions to not do anything weird. Which of course it can't. So your
options are either much slower selector matching or not having this.
Your pick.

-Boris
Brian Kardell
2011-10-18 21:01:30 UTC
Permalink
Post by Boris Zbarsky
This is _very_ hard to do reasonably unless the browser can trust those
functions to not do anything weird.  Which of course it can't.  So your
options are either much slower selector matching or not having this. Your
pick.
This too has come up in some discussions on CSS (CSSOM I think) that I
have had. In the right context - I don't think it would actually be
that hard. It would require a way to provide a sand-boxed evaluation
(read only elements) and a pattern much like jquery's where it is a
filter which can only return true or false. True enough that it would
be slower than native for a few reasons - but perhaps still useful.
Boris Zbarsky
2011-10-18 21:04:29 UTC
Permalink
Post by Brian Kardell
This too has come up in some discussions on CSS (CSSOM I think) that I
have had. In the right context - I don't think it would actually be
that hard. It would require a way to provide a sand-boxed evaluation
(read only elements)
This is not that easy. Especially because you can reach all DOM objects
from elements, so you have to lock down the entire API somehow.
Post by Brian Kardell
and a pattern much like jquery's where it is a
filter which can only return true or false. True enough that it would
be slower than native for a few reasons - but perhaps still useful.
The slowness comes from not having a way to tell whether the world has
changed under you or not and therefore having to assume that it has, not
from the actual call into JS per se.

-Boris
Brian Kardell
2011-10-18 21:23:29 UTC
Permalink
Post by Brian Kardell
This too has come up in some discussions on CSS (CSSOM I think) that I
have had.  In the right context - I don't think it would actually be
that hard.  It would require a way to provide a sand-boxed evaluation
(read only elements)
This is not that easy.  Especially because you can reach all DOM objects
from elements, so you have to lock down the entire API somehow.
Right, you would need essentially, to pass in a node list which
iterated 'lite' read-only elements. Not impossible to imagine -
right? Maybe I'm way off, but actually seems not that difficult to
imagine the implementation.
Post by Brian Kardell
and a pattern much like jquery's where it is a
filter which can only return true or false.  True enough that it would
be slower than native for a few reasons - but perhaps still useful.
The slowness comes from not having a way to tell whether the world has
changed under you or not and therefore having to assume that it has, not
from the actual call into JS per se.
I imagine that they would be implemented as filters so if you had

div .x:foo(.bar) span

The normal CSS resolution would be to get the spans, narrow by .x's
then throw what you have so far to the filter, removing anything that
returned false and carrying on as normal. The slowness as I see it
would be that the filter would, yes, call across the boundary and, yes,
have to build some intermediate, and evaluating anything too complex in
the filter would probably be very slow by comparison - but you
don't have to do "much" to be useful... Is there something in that
pattern that I am missing in terms of what you are saying about
identifying what has changed out from underneath you? As far as I can
see it doesn't invalidate anything that already exists in CSS/selector
implementations in terms of indexes or anything - but I've been
looking for an answer to this exact question so if you know something
I'd be very interested in even a pointer to some code so I can
understand myself.
Boris Zbarsky
2011-10-18 21:32:12 UTC
Permalink
Post by Brian Kardell
Post by Boris Zbarsky
This is not that easy. Especially because you can reach all DOM objects
from elements, so you have to lock down the entire API somehow.
Right, you would need essentially, to pass in a node list which
iterated 'lite' read-only elements.
So the script would not get an actual DOM tree and not run in the Window
scope? The objects would not have an ownerDocument? What other
restrictions would they need to have?
Post by Brian Kardell
Maybe I'm way off, but actually seems not that difficult to
imagine the implementation.
If we're willing to pass in some totally-not-DOM data structure and run
in some sandbox scope, then sure.
Post by Brian Kardell
div .x:foo(.bar) span
The normal CSS resolution would be to get the spans, narrow by .x's
then throw what you have so far to the filter, removing anything that
returned false and carrying on as normal.
Normal CSS selector matching examines the .x part for each span as it finds it.
Otherwise selectors like "#foo > *" would require building up a list of
all elements in the DOM, no?
Post by Brian Kardell
The slowness as I see it would be that the filter would, yes, call across the boundary and, yes,
have to build some intermediate, and evaluating anything too complex in
the filter would probably be very slow by comparison - but you
don't have to do "much" to be useful... Is there something in that
pattern that I am missing in terms of what you are saying about
identifying what has changed out from underneath you?
_If_ the filter runs JS that can touch the DOM, then in your example for
every span you find you'd end up calling into the filter, and then you
have to worry about the filter rearranging the DOM under you.
Post by Brian Kardell
As far as I can see it doesn't invalidate anything that already exists in CSS/selector
implementations in terms of indexes or anything
At least the querySelectorAll implementations I have looked at (WebKit
and Gecko) traverse the DOM and for each element they find check whether
it matches the selector. If so, they add it to the result set.
Furthermore, selector matching itself has to walk over the tree in
various ways (e.g. to handle combinators). Both operations right now
assume that the tree does NOT mutate while this is happening.
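The traverse-and-test loop described above, and what goes wrong if the test is allowed to touch the tree, can be sketched on a toy tree. Everything here (node, collectMatches, the plain-object tree) is an illustrative assumption and not how WebKit or Gecko actually structure their code.

```javascript
// Toy node: { id, children }. Not the real DOM.
function node(id, children = []) {
  return { id, children };
}

// Pre-order walk that tests every descendant with `matches`.
// Like the loop described above, it assumes `matches` never changes the tree.
function collectMatches(root, matches) {
  const results = [];
  for (const child of root.children) {
    if (matches(child)) results.push(child);
    results.push(...collectMatches(child, matches));
  }
  return results;
}

const b = node("b");
const a = node("a");
const root = node("root", [a, b]);

// A misbehaving filter: while testing "a" it detaches "b" from the tree.
const mutatingFilter = (el) => {
  if (el.id === "a") {
    root.children = root.children.filter((c) => c !== b);
  }
  return true;
};

const found = collectMatches(root, mutatingFilter).map((n) => n.id);
const stillInTree = root.children.map((n) => n.id);
console.log(found);       // ["a", "b"]
console.log(stillInTree); // ["a"]
```

The walk reports "b" even though it is no longer a descendant of the root by the time the call returns. Guarding against that means re-validating after every call into the filter, which is the cost being pointed at.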

-Boris
Brian Kardell
2011-10-18 22:05:50 UTC
Permalink
Post by Boris Zbarsky
Post by Brian Kardell
This is not that easy.  Especially because you can reach all DOM objects
from elements, so you have to lock down the entire API somehow.
Right, you would need essentially, to pass in a node list which
iterated 'lite' read-only elements.
So the script would not get an actual DOM tree and not run in the Window
scope?  The objects would not have an ownerDocument?  What other
restrictions would they need to have?
They would run in their own sandbox and they would have access to the
parameters passed into the function by way of pattern. I think that
that pattern would look a lot like jquery's selector plugin pattern
something like: The match itself, the index of the match, the
arguments to the selector itself. The 'match' in this case wouldn't be
a mutable DOM element. You can give it a smaller API by saying that
the 'lite' version of the element that is passed in has no properties
which might give you something mutable - or you can say that all
methods/properties would also return immutable shadows of themselves.
I would be happy to walk through more detailed ideas in terms of what
specifically that would look like if there were some kind of initial
"yeah, that might work - it's worth looking into some more" :)
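A rough sketch of the pattern described above, to make the shape concrete. None of this is an existing API: registerPseudo, liteView and applyPseudo are hypothetical names, the candidates are plain objects rather than DOM elements, and the "sandbox" is reduced to handing the filter a frozen, data-only copy.

```javascript
// Hypothetical registry of custom pseudo filters; not a real browser API.
const pseudoFilters = new Map();

function registerPseudo(name, filter) {
  pseudoFilters.set(name, filter);
}

// Build a frozen, data-only view of an element: no parent, no ownerDocument,
// nothing that leads back to something mutable.
function liteView(el) {
  return Object.freeze({
    tagName: el.tagName,
    id: el.id,
    classes: Object.freeze([...el.classes]),
  });
}

// Narrow candidates with a registered filter. The filter receives
// (match, index, args) and only the truthiness of its result is used.
function applyPseudo(name, args, candidates) {
  const filter = pseudoFilters.get(name);
  if (!filter) throw new Error("unknown pseudo: " + name);
  return candidates.filter((el, index) =>
    Boolean(filter(liteView(el), index, args))
  );
}

// ":foo(bar)" keeps elements that carry the class named by its argument.
registerPseudo("foo", (match, index, args) => match.classes.includes(args));

const candidates = [
  { tagName: "span", id: "one", classes: ["x", "bar"] },
  { tagName: "span", id: "two", classes: ["x"] },
];
const kept = applyPseudo("foo", "bar", candidates).map((el) => el.id);
console.log(kept); // ["one"]
```

Freezing a copy only covers the "no mutable properties" half of the idea; keeping the filter from reaching the real document through globals is the harder part and is not modelled here.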
Post by Boris Zbarsky
Post by Brian Kardell
Maybe I'm way off, but actually seems not that difficult to
imagine the implementation.
If we're willing to pass in some totally-not-DOM data structure and run in
some sandbox scope, then sure.
Post by Brian Kardell
div .x:foo(.bar) span
The normal CSS resolution would be to get the spans, narrow by .x's
then throw what you have so far to the filter, removing anything that
returned false and carrying on as normal.
Normal CSS selector matching examines the .x part for each span as it finds it.
Otherwise selectors like "#foo > *" would require building up a list of all
elements in the DOM, no?
I'm not sure that I understand the distinction of what you are saying
here or if it matters. My understanding of the webkit code was that
it walks the tree (or subtree) once (as created/modified) and optimizes
fastpath indexes on classes, ids and tags (also some other
optimizations for some slightly more complex things if I recall). I
would have expected the querySelector** stuff to re-use that
underlying code, but I don't know - it sounds like you are saying
maybe not.
Post by Boris Zbarsky
Post by Brian Kardell
The slowness as I see it would be that the filter would, yes, call across
the boundary and, yes,
have to build some intermediate, and evaluating anything too complex in
the filter would probably be very slow by comparison - but you
don't have to do "much" to be useful...  Is there something in that
pattern that I am missing in terms of  what you are saying about
identifying what has changed out from underneath you?
_If_ the filter runs JS that can touch the DOM, then in your example for
every span you find you'd end up calling into the filter, and then you have
to worry about the filter rearranging the DOM under you.
Post by Brian Kardell
As far as I can see it doesn't invalidate anything that already exists in CSS/selector
implementations in terms of indexes or anything
At least the querySelectorAll implementations I have looked at (WebKit and
Gecko) traverse the DOM and for each element they find check whether it
matches the selector.  If so, they add it to the result set. Furthermore,
selector matching itself has to walk over the tree in various ways (e.g. to
handle combinators).  Both operations right now assume that the tree does
NOT mutate while this is happening.
Yes - it absolutely can NOT mutate while this is happening, but it
shouldn't right? It would be kind of non-sensical if it did. It
doesn't have to mutate in order to be useful - even in jQuery's model,
its purpose is to determine what _should_ mutate, not to do
the mutation itself.
Boris Zbarsky
2011-10-19 00:50:45 UTC
Permalink
Post by Brian Kardell
They would run in their own sandbox and they would have access to the
parameters passed into the function by way of pattern.
OK; I think that people might have a pretty tough time with a
programming environment like that... but maybe.
Post by Brian Kardell
The 'match' in this case wouldn't be
a mutable DOM element. You can give it a smaller API by saying that
the 'lite' version of the element that is passed in has no properties
which might give you something mutable
So no properties at all?
Post by Brian Kardell
- or you can say that all
methods/properties would also return immutable shadows of themselves.
It'd have to be that...
Post by Brian Kardell
I would be happy to walk through more detailed ideas in terms of what
specifically that would look like if there were some kind of initial
"yeah, that might work - it's worth looking into some more" :)
On my part it's a "yeah, it might work, with a huge amount of effort,
probably disproportionate to the utility". At least at first blush.

-Boris
Alex Russell
2011-10-19 00:08:33 UTC
Permalink
Post by Boris Zbarsky
 * Speeding up certain operations like `#foo` and `body`. There is *no
   excuse* for it being possible to implement userland hacks that
   improve on the performance of querySelectorAll.
Sure there is.  One such "excuse", for example, is that the userland hacks
have different behavior from querySelectorAll in many cases.  Now the author
happens to know that the difference doesn't matter in their case, but the
_browser_ has no way to know that.
The other "excuse" is that adding special cases (which is what you're asking
for) slows down all the non-special-case codepaths.  That may be fine for
_your_ usage of querySelectorAll, where you use it with a particular limited
set of selectors, but it's not obvious that this is always a win.
Most browsers try to optimize what is common. Or has that fallen out
of favor while I wasn't looking?
Post by Boris Zbarsky
This may be the result of browsers failing to cache the result of parsing selectors
Yep.  Browsers don't cache it.  There's generally no reason to.  I have yet
to see any real-life testcase bottlenecked on this part of querySelectorAll
performance.
   or something else, but the fact remains that qSA can be noticeably
   slower than the old DOM methods, even when Sizzle needs to parse the
   selector to look for fast-paths.
I'd love to see testcases showing this.
jQuery also handles certain custom pseudoselectors, and it might be nice
if it was possible to register JavaScript functions that qSA would use
if it found an unknown pseudo
This is _very_ hard to do reasonably unless the browser can trust those
functions to not do anything weird.  Which of course it can't.  So your
options are either much slower selector matching or not having this. Your
pick.
-Boris
Boris Zbarsky
2011-10-19 01:26:22 UTC
Permalink
Post by Alex Russell
Post by Boris Zbarsky
The other "excuse" is that adding special cases (which is what you're asking
for) slows down all the non-special-case codepaths. That may be fine for
_your_ usage of querySelectorAll, where you use it with a particular limited
set of selectors, but it's not obvious that this is always a win.
Most browsers try to optimize what is common.
Yes, but what is common for Yehuda may well not be globally common.

There's also the question of premature optimization. Again, I'd love to
see a non-synthetic situation where any of this matters. That would be
a much more useful point to reason from than some sort of hypothetical
faith-based optimization.

-Boris
Alex Russell
2011-10-19 08:22:46 UTC
Permalink
Post by Boris Zbarsky
Post by Alex Russell
Post by Boris Zbarsky
The other "excuse" is that adding special cases (which is what you're asking
for) slows down all the non-special-case codepaths.  That may be fine for
_your_ usage of querySelectorAll, where you use it with a particular limited
set of selectors, but it's not obvious that this is always a win.
Most browsers try to optimize what is common.
Yes, but what is common for Yehuda may well not be globally common.
Yehuda is representing jQuery. I'll take his opinion as the global
view unless he chooses to say he's representing a personal opinion.
Post by Boris Zbarsky
There's also the question of premature optimization.  Again, I'd love to see
a non-synthetic situation where any of this matters.  That would be a much
more useful point to reason from than some sort of hypothetical faith-based
optimization.
The jQuery team did look to see which selectors are "hottest" against
their engine at some point and explicitly optimize short selectors as
a result. The simple forms seem to be the most common.

Regards
Anne van Kesteren
2011-10-19 08:29:03 UTC
Permalink
Post by Alex Russell
Yehuda is representing jQuery. I'll take his opinion as the global
view unless he chooses to say he's representing a personal opinion.
You misunderstand. Boris is contrasting with CSS. Selectors are used in
more than just querySelectorAll() and their usage differs wildly.
--
Anne van Kesteren
http://annevankesteren.nl/
Alex Russell
2011-10-19 10:55:52 UTC
Permalink
Post by Alex Russell
Yehuda is representing jQuery. I'll take his opinion as the global
view unless he chooses to say he's representing a personal opinion.
You misunderstand. Boris is contrasting with CSS. Selectors are used in more
than just querySelectorAll() and their usage differs wildly.
Sure, of course, but suggesting that the optimizations for both need
to be the same is also a strange place to start the discussion from.
The QSA or find() implementation *should* differ to the extent that it
provides developer value and is a real-world bottleneck.
Boris Zbarsky
2011-10-19 14:47:20 UTC
Permalink
On Wed, 19 Oct 2011 17:22:46 +0900, Alex Russell
Post by Alex Russell
Yehuda is representing jQuery. I'll take his opinion as the global
view unless he chooses to say he's representing a personal opinion.
You misunderstand. Boris is contrasting with CSS.
No, I'm talking purely about querySelector. The fact that at least
Gecko and WebKit implement querySelector in a braindead way because that
lets them reuse their selector matching code is a somewhat separate
kettle of fish.

What we're discussing here, in particular, are optimizations that make
use of the differences in use case between CSS selector matching (match
one node to a bazillion selectors) and querySelector (match one
selector to possibly a bazillion nodes). There are ways to optimize
the latter by examining the structure of the selector and making use of
existing cached information in the browser that make no sense in the CSS
context and would be implemented as a preprocessing pass before falling
back on actual selector matching. WebKit does a few of these, of
varying utility. I've considered doing some in Gecko, but again want to
have hard data that they're actually needed before adding complexity.
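
A sketch of what such a preprocessing pass might look like: inspect the selector string and pick a cheaper lookup only when the shape is trivially simple. The function name `classifySelector` and the regular expressions are assumptions for illustration; real identifier syntax is broader (escapes, non-ASCII characters), and this is not how WebKit or Gecko implement it.

```javascript
// Classify a selector string so a caller could pick a cheaper lookup.
// The patterns are deliberately narrow; anything else is "general" and
// would fall back on ordinary selector matching.
function classifySelector(selector) {
  const s = selector.trim();
  let m;
  if ((m = /^#([\w-]+)$/.exec(s))) return { kind: "id", name: m[1] };
  if ((m = /^\.([\w-]+)$/.exec(s))) return { kind: "class", name: m[1] };
  if ((m = /^([A-Za-z][\w-]*)$/.exec(s))) return { kind: "tag", name: m[1] };
  return { kind: "general" };
}

console.log(classifySelector("#foo").kind);     // id
console.log(classifySelector(".thinger").kind); // class
console.log(classifySelector("div").kind);      // tag
console.log(classifySelector("#foo > *").kind); // general
```

The classification itself runs on every call, including the ones that end up on the general path, which is the overhead the message above wants hard data to justify.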

-Boris
Yehuda Katz
2011-10-19 09:41:33 UTC
Permalink
Post by Alex Russell
Post by Boris Zbarsky
Post by Alex Russell
Post by Boris Zbarsky
The other "excuse" is that adding special cases (which is what you're asking
for) slows down all the non-special-case codepaths. That may be fine for
_your_ usage of querySelectorAll, where you use it with a particular limited
set of selectors, but it's not obvious that this is always a win.
Most browsers try to optimize what is common.
Yes, but what is common for Yehuda may well not be globally common.
Yehuda is representing jQuery. I'll take his opinion as the global
view unless he chooses to say he's representing a personal opinion.
Right. I'm representing the position of jQuery. Sizzle (John's selector
engine, used by jQuery) chose to optimize certain common selectors after an
analysis of selectors used by jQuery found that a large percentage of all
selectors used were a few simple forms that were amenable to
getElement(s)By* optimizations. I provided a link to the code that
implements this earlier in this thread.
Post by Alex Russell
Post by Boris Zbarsky
There's also the question of premature optimization. Again, I'd love to see
a non-synthetic situation where any of this matters. That would be a much
more useful point to reason from than some sort of hypothetical faith-based
optimization.
The jQuery team did look to see which selectors are "hottest" against
their engine at some point and explicitly optimize short selectors as
a result. The simple forms seem to be the most common.
Yep.
Post by Alex Russell
Regards
Boris Zbarsky
2011-10-19 15:17:48 UTC
Permalink
Post by Yehuda Katz
Right. I'm representing the position of jQuery. Sizzle (John's selector
engine, used by jQuery) chose to optimize certain common selectors after
an analysis of selectors used by jQuery found that a large percentage of
all selectors used were a few simple forms that were amenable to
getElement(s)By* optimizations. I provided a link to the code that
implements this earlier in this thread.
OK, so the code is at
https://github.com/jquery/sizzle/blob/master/sizzle.js#L1150-1233 and
does the following optimizations:

1) If the selector is a vanilla tag, use getElementsByTagName instead.
This works, at least in Gecko, from userland due to the fact that
getElementsByTagName results are cached until GCed. If I test with a
different argument for every call, the performance of querySelectorAll
and getElementsByTagName is basically identical.[1] Doing a similar
optimization in the C++ code would be somewhat difficult, since you
don't know when to drop your cache, but would be doable. I would expect
the common case here to be repeated queries for the same tag name....
In any case, in WebKit the caching effect is not present and the
getElementsByTagName codepath is maybe 10% faster than querySelector.

2) As above, for class names. Works for the same reasons. WebKit
seems to have a getElementsByClassName cache too.

3) Mapping Sizzle("body") to document.body. This isn't a valid
optimization for querySelector, since there can in fact be multiple
<body> tags and since furthermore document.body can be a <frameset>. A
UA could try to optimize this case by keeping track of the <body> tags
and such, at some cost on every DOM mutation.

4) Mapping Sizzle("#id") with document as a context to
getElementById("id"). This isn't a valid optimization for querySelector
because there can be multiple elements with the same id; in fact that's
pretty common. A UA can work around this in various ways, though (e.g.
WebKit only takes a fast path when the id is unique, last I checked).
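
The caching effect in point 1 and the "cost on every DOM mutation" in point 3 can be sketched together. This assumes a toy tree of plain objects and a hypothetical `CachedTagLookup` class; it illustrates the trade-off only and is not a description of how Gecko or WebKit cache getElementsByTagName results.

```javascript
// Cache per-tag results over a toy tree; every mutation must drop the cache.
class CachedTagLookup {
  constructor(root) {
    this.root = root;
    this.cache = new Map();
    this.walks = 0; // counts full traversals, for illustration only
  }

  byTag(tag) {
    if (!this.cache.has(tag)) {
      this.walks++;
      const found = [];
      // Explicit stack, children pushed in reverse so pops are in document order.
      const stack = [...this.root.children].reverse();
      while (stack.length) {
        const node = stack.pop();
        if (node.tag === tag) found.push(node);
        for (let i = node.children.length - 1; i >= 0; i--) {
          stack.push(node.children[i]);
        }
      }
      this.cache.set(tag, found);
    }
    return this.cache.get(tag);
  }

  // The bookkeeping cost: any mutation invalidates what was cached.
  appendChild(parent, child) {
    parent.children.push(child);
    this.cache.clear();
  }
}

const root = { tag: "html", children: [{ tag: "div", children: [] }] };
const lookup = new CachedTagLookup(root);
lookup.byTag("div");
lookup.byTag("div"); // served from the cache, no second traversal
const walksBeforeMutation = lookup.walks;
lookup.appendChild(root, { tag: "div", children: [] });
const divsAfterMutation = lookup.byTag("div").length;
console.log(walksBeforeMutation, divsAfterMutation, lookup.walks);
```

Repeated queries for the same tag are cheap, which matches the "same argument" timings in the footnote, while a query with a fresh argument on every call never benefits from the cache.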

That sound about right?

-Boris

[1] I tested by running the script below against
http://www.whatwg.org/specs/web-apps/current-work/ and got these results
in Chrome:

querySelectorAll('div'): 915
Array.prototype.slice.call(getElementsByTagName('div'), 0): 948
querySelectorAll('div'+i): 938
Array.prototype.slice.call(getElementsByTagName('div' + i), 0): 847
querySelectorAll('.div'): 889
Array.prototype.slice.call(getElementsByClassName('div'), 0): 8
querySelectorAll('.div'+i): 910
Array.prototype.slice.call(getElementsByClassName('div' + i), 0): 824

and these in Firefox:

querySelectorAll('div'): 767
Array.prototype.slice.call(getElementsByTagName('div'), 0): 20
querySelectorAll('div'+i): 773
Array.prototype.slice.call(getElementsByTagName('div' + i), 0): 749
querySelectorAll('.div'): 817
Array.prototype.slice.call(getElementsByClassName('div'), 0): 8
querySelectorAll('.div'+i): 812
Array.prototype.slice.call(getElementsByClassName('div' + i), 0): 809

Script is:

// Prevent dead-code optimizations
var holder;
// Run f(i) 100 times and return the elapsed wall-clock time in milliseconds.
function time(f) {
  var start = new Date;
  for (var i = 0; i < 100; ++i)
    holder = f(i);
  return (new Date) - start;
}
// Each pair compares querySelectorAll with the matching getElementsBy* call;
// the "+ i" variants pass a different argument on every call.
// Note: "return" and its expression must stay on one line, otherwise
// automatic semicolon insertion makes the function return undefined.
var tests = [
  { description: "querySelectorAll('div')",
    func: function() { return document.querySelectorAll("div"); } },
  { description: "Array.prototype.slice.call(getElementsByTagName('div'), 0)",
    func: function() {
      return Array.prototype.slice.call(document.getElementsByTagName("div"), 0); } },
  { description: "querySelectorAll('div'+i)",
    func: function(i) { return document.querySelectorAll("div" + i); } },
  { description: "Array.prototype.slice.call(getElementsByTagName('div' + i), 0)",
    func: function(i) {
      return Array.prototype.slice.call(document.getElementsByTagName("div" + i), 0); } },
  { description: "querySelectorAll('.div')",
    func: function() { return document.querySelectorAll(".div"); } },
  { description: "Array.prototype.slice.call(getElementsByClassName('div'), 0)",
    func: function() {
      return Array.prototype.slice.call(document.getElementsByClassName("div"), 0); } },
  { description: "querySelectorAll('.div'+i)",
    func: function(i) { return document.querySelectorAll(".div" + i); } },
  { description: "Array.prototype.slice.call(getElementsByClassName('div' + i), 0)",
    func: function(i) {
      return Array.prototype.slice.call(document.getElementsByClassName("div" + i), 0); } }
];
function runTest() {
  var results = [];
  for (var i = 0; i < tests.length; ++i)
    results.push(tests[i].description + ": " + time(tests[i].func));
  document.write(results.join("<br>"));
}
Brian Kardell
2011-10-19 15:52:23 UTC
Permalink
Just - for what it's worth, something that's been on my mind about
some of these discussions... Isn't there a delicate balance to
consider here about what can and cannot reasonably be inferred and how
that should impact designs? For example, when CSS was released:

- Machines were slower
- Browsers themselves were much slower and there were really no optimizations
- The language itself was pretty limited

So people learned and used simple selectors because that is what we
had. jQuery came along and added some new selectors/concepts that
people really liked. CSS added some new stuff too... Users currently
decide how and when to use them - in my opinion, pretty well. However
- add to that that people are always inclined to push the envelope as
far as they can and those people look at selector performance in
determining how to write things - so they generally try to stick with
as simple and fast a selector as possible. Then we go out and scrape
up information on what people are using and make those things more
efficient. That's not a bad model toward actually getting somewhere -
but at some level it seems important to balance that somehow with what
people _would like to use_ if it were efficient enough. Sort of like
- the current performance tail wags the future API/performance dog a
bit here... Again, not that that is 100% negative - but it's worth
thinking about.
Boris Zbarsky
2011-10-19 16:30:01 UTC
Permalink
4) Mapping Sizzle("#id") with document as a context to
getElementById("id"). This isn't a valid optimization for querySelector
because there can be multiple elements with the same id;
And just as a note, since someone asked me off-list how this can
possibly be true... Given this markup:

<div id="x">
<div id="y"></div>
<div id="y"></div>
</div>

calling

jQuery.find("#y")

returns an array with one element in it while calling

jQuery.find("#y", document.getElementById("x"))

returns an array with two elements. I have no idea whether this is
purposeful behavior or just a bug in Sizzle brought on by the
optimization listed above.
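
A sketch of why the two calls can disagree, assuming a toy tree of plain objects instead of a real DOM. `firstById` stands in for a getElementById-style fast path and `allById` for general "#y" selector matching; both names are hypothetical and this is not Sizzle's code.

```javascript
// Toy version of the markup above: #x containing two elements with id "y".
const x = {
  tag: "div", id: "x", children: [
    { tag: "div", id: "y", children: [] },
    { tag: "div", id: "y", children: [] }
  ]
};

// getElementById-like lookup: stop at the first match in document order.
function firstById(root, id) {
  for (const child of root.children) {
    if (child.id === id) return child;
    const hit = firstById(child, id);
    if (hit) return hit;
  }
  return null;
}

// Selector-like "#id" matching: collect every matching descendant.
function allById(root, id) {
  const out = [];
  for (const child of root.children) {
    if (child.id === id) out.push(child);
    out.push(...allById(child, id));
  }
  return out;
}

const fastPath = [firstById(x, "y")].filter(Boolean);
const general = allById(x, "y");
console.log(fastPath.length, general.length);
```

The fast path yields one element and the general path two, so substituting one for the other is only safe when the id is known to be unique.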

-Boris
Boris Zbarsky
2011-10-19 14:43:45 UTC
Permalink
Post by Alex Russell
Yehuda is representing jQuery. I'll take his opinion as the global
view unless he chooses to say he's representing a personal opinion.
Global jQuery view, yes? I stand by a slightly modified statement that
what is common and needs to be fast for Yehuda may not be common or need
to be fast in general.

In particular, lots of jQuery selector usage is not in fact
performance-sensitive. Some obviously is. Again, I'd love to see data
on the cases where performance matters, both when jQuery is involved and
when it's not.

I should note that the larger and more complicated a web app the less
likely it is to use jQuery from what I've seen....

I'm absolutely sure that simple selectors dominate complicated ones in
all contexts, but again I'd really like to have data on what _sort_ of
simple selectors really need optimizing.
Post by Alex Russell
The jQuery team did look to see which selectors are "hottest" against
their engine
Yes. See above.

-Boris
Sean Hogan
2011-10-18 23:46:46 UTC
Permalink
Post by Yehuda Katz
I agree entirely.
<div id="parent">
<p id="child"><span id="inline">Content</span></p>
</div>
document.getElementById("child").querySelectorAll("div span"); //
returns #inline
In 100% of cases, people consider this behavior *broken*. Not just
"interesting, I wouldn't have expected that", but "who came up with
that!?". In all cases involving JavaScript practitioners, people
expect querySelectorAll to operate on the element as though the
element was the root of a new document, and where combinators are
relative to the element.
It matches the definition of CSS selectors, so I don't think it can be
called broken. For this case, node.querySelectorAll("div span") finds
all spans (in document order) which are contained within the invoking
node and checks that they match the selector expression, in this case
simply checking they are a descendant of a div.

The new definition being promoted is:
- start at the containing node
- find all descendant divs
- for every div, find all descendant spans
- with the list of spans, remove duplicates and place in document order
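
The four steps above can be sketched as follows, assuming a toy tree of plain objects and hypothetical helper names. Nested divs are used so that the duplicate-removal step actually has something to do.

```javascript
function el(tag, children) {
  return { tag: tag, children: children || [] };
}

// All descendants of `node` with the given tag, in document order.
function descendantsByTag(node, tag) {
  let out = [];
  for (const c of node.children) {
    if (c.tag === tag) out.push(c);
    out = out.concat(descendantsByTag(c, tag));
  }
  return out;
}

// "div span" under the rooted definition: divs below the scope, then spans
// below each div, then de-duplicate and restore document order.
function findDivSpans(scope) {
  const inOrder = descendantsByTag(scope, "span");
  const seen = new Set();
  let rawCount = 0;
  for (const div of descendantsByTag(scope, "div")) {
    for (const span of descendantsByTag(div, "span")) {
      rawCount++;
      seen.add(span);
    }
  }
  const result = inOrder.filter(function (s) { return seen.has(s); });
  return { rawCount: rawCount, result: result };
}

const span = el("span");
const scope = el("section", [el("div", [el("div", [span])])]);
const found = findDivSpans(scope);
console.log(found.rawCount, found.result.length);
```

The same span is reached once through each of the two nested divs, so the raw count is higher than the size of the final result.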

Once you understand the proper definition it is hard to see this new
definition as more logical.
To me, the problem here is some (not all) Javascript practitioners not
learning the proper definition of CSS selectors.
Post by Yehuda Katz
We already knew this was true since all JavaScript libraries that
implement selectors implemented them in this way.
To me, this indicates that there's no problem here. If you want to use
an alternative definition of selectors then you use a JS lib that
supports them. If you want to use the DOM API then you learn how CSS
selectors work.

I don't see JS libs ever calling the browser's querySelectorAll (or even
a new findAll) without parsing the selector string first because:
- JS libs support selectors that haven't been implemented on all browsers
- JS libs support selectors that are never going to be part of the standard

Since JS libs will always parse selector strings and call qSA, etc as
appropriate, I can't see much benefit in creating DOM methods that
accept non-standard selector strings.

Sean
Tab Atkins Jr.
2011-10-18 23:58:20 UTC
Permalink
Post by Sean Hogan
Post by Yehuda Katz
I agree entirely.
<div id="parent">
<p id="child"><span id="inline">Content</span></p>
</div>
 document.getElementById("child").querySelectorAll("div span"); // returns
#inline
In 100% of cases, people consider this behavior *broken*. Not just
"interesting, I wouldn't have expected that", but "who came up with that!?".
In all cases involving JavaScript practitioners, people expect
querySelectorAll to operate on the element as though the element was the
root of a new document, and where combinators are relative to the element.
It matches the definition of CSS selectors, so I don't think it can be
called broken. For this case, node.querySelectorAll("div span") finds all
spans (in document order) which are contained within the invoking node and
checks that they match the selector expression, in this case simply checking
they are a descendant of a div.
- start at the containing node
- find all descendant divs
- for every div, find all descendant spans
- with the list of spans, remove duplicates and place in document order
Once you understand the proper definition it is hard to see this new
definition as more logical.
To me, the problem here is some (not all) Javascript practitioners not
learning the proper definition of CSS selectors.
Not at all. I'm not sure why you think this is somehow an "improper"
way to think about things.

There are two ways you can "scope" a selector. The first is to filter
the results of a selector match to only those under a certain element
(what QSA does today). The second is to scope the entire selector to
only apply underneath the scoping element, which is what Alex is
proposing.

An alternative view of this is that current QSA restricts the final
compound selector in a selector to match only elements in the scope,
while allowing the rest of the selector to match elements anywhere in
the document. Alex's proposal (and every other JS selector engine)
restricts all of the selector's components to matching only elements in
the scope.
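
The two scoping behaviours can be compared on the markup quoted earlier in the thread (div#parent > p#child > span#inline). This is a sketch over a toy tree with parent links, handling only chains of tag names joined by descendant combinators; `filtered` and `rooted` are hypothetical names for the current QSA behaviour and the proposed one.

```javascript
// Toy element with parent links (not a real DOM).
function el(tag, id, children) {
  const node = { tag: tag, id: id, parent: null, children: children || [] };
  node.children.forEach(function (c) { c.parent = node; });
  return node;
}

function descendants(node) {
  let out = [];
  node.children.forEach(function (c) {
    out = out.concat([c], descendants(c));
  });
  return out;
}

// Does `node` match tags[0] ... tags[n-1] joined by descendant combinators?
// Ancestors are examined up to, but not including, `limit` (null = no limit).
function matchesChain(node, tags, limit) {
  if (node.tag !== tags[tags.length - 1]) return false;
  let i = tags.length - 2;
  for (let a = node.parent; a !== limit && a !== null && i >= 0; a = a.parent) {
    if (a.tag === tags[i]) i--;
  }
  return i < 0;
}

// Current QSA: only the final compound is confined to the scope.
function filtered(scope, tags) {
  return descendants(scope).filter(function (n) {
    return matchesChain(n, tags, null);
  });
}

// Proposed behaviour: every compound must match inside the scope.
function rooted(scope, tags) {
  return descendants(scope).filter(function (n) {
    return matchesChain(n, tags, scope);
  });
}

const inline = el("span", "inline");
const child = el("p", "child", [inline]);
const parent = el("div", "parent", [child]);

const filteredIds = filtered(child, ["div", "span"]).map(function (n) { return n.id; });
const rootedIds = rooted(child, ["div", "span"]).map(function (n) { return n.id; });
console.log(filteredIds, rootedIds);
```

With #child as the scope, the filtered form still returns #inline because the div ancestor outside the scope is allowed to satisfy "div", while the rooted form returns nothing.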

There is nothing unnatural or improper about this. The fact that
every JS selector engine works in the latter fashion, and that JS devs
are regularly surprised by the former behavior, suggests strongly that
the latter behavior is the better default behavior.

Based on discussion on the mailing list, <style scoped> will be
changing to the latter behavior as well, with the ability to invoke
the former behavior in the rare circumstances when you explicitly want
it.

~TJ
Sean Hogan
2011-10-19 00:34:53 UTC
Permalink
Post by Tab Atkins Jr.
Post by Sean Hogan
Post by Yehuda Katz
I agree entirely.
<div id="parent">
<p id="child"><span id="inline">Content</span></p>
</div>
document.getElementById("child").querySelectorAll("div span"); // returns
#inline
In 100% of cases, people consider this behavior *broken*. Not just
"interesting, I wouldn't have expected that", but "who came up with that!?".
In all cases involving JavaScript practitioners, people expect
querySelectorAll to operate on the element as though the element was the
root of a new document, and where combinators are relative to the element.
It matches the definition of CSS selectors, so I don't think it can be
called broken. For this case, node.querySelectorAll("div span") finds all
spans (in document order) which are contained within the invoking node and
checks that they match the selector expression, in this case simply checking
they are a descendant of a div.
- start at the containing node
- find all descendant divs
- for every div, find all descendant spans
- with the list of spans, remove duplicates and place in document order
Once you understand the proper definition it is hard to see this new
definition as more logical.
To me, the problem here is some (not all) Javascript practitioners not
learning the proper definition of CSS selectors.
Not at all. I'm not sure why you think this is somehow an "improper"
way to think about things.
There are two ways you can "scope" a selector. The first is to filter
the results of a selector match to only those under a certain element
(what QSA does today). The second is to scope the entire selector to
only apply underneath the scoping element, which is what Alex is
proposing.
An alternative view of this is that current QSA restricts the final
compound selector in a selector to match only elements in the scope,
while allowing the rest of the selector to match elements anywhere in
the document. Alex's proposal (and every other JS selector engine)
restricts all of the selector's components to matching only elements in
the scope.
There is nothing unnatural or improper about this. The fact that
every JS selector engine works in the latter fashion, and that JS devs
are regularly surprised by the former behavior, suggests strongly that
the latter behavior is the better default behavior.
Based on discussion on the mailing list, <style scoped> will be
changing to the latter behavior as well, with the ability to invoke
the former behavior in the rare circumstances when you explicitly want
it.
If it becomes part of the standard definition of CSS selectors then it
can be supported by a query method. At that point the only discussion is
for the name.

However, in reading that thread I don't see any mention of selectors
such as "> div > span". Did I miss something?

Sean
Lachlan Hunt
2011-10-19 12:17:14 UTC
Permalink
Post by Tab Atkins Jr.
Based on discussion on the mailing list, <style scoped> will be
changing to the latter behavior as well, with the ability to invoke
the former behavior in the rare circumstances when you explicitly want
it.
Despite some similarities in appearance, the proposed changes to <style
scoped> will still use selectors differently from that proposed here for
a new findAll() method.

1. Syntax

In <style scoped>, selectors still can't begin with a combinator, but in
the proposed API, they can.

The @global at-rule was proposed to


2. Matching the Context Element

In scoped stylesheets, the context element itself can be the subject of
a selector. But the proposed API will never return the element itself in
the result.

div.findAll("div") // Does not match the element itself

(same as querySelectorAll() in this case)

<div>
<style scoped>
div { ... } /* Matches the context element */
</style>
</div>


3. The Subject of Selectors

In scoped stylesheets, the potential matches of a selector will only
include:
* The context element itself
* Descendants of the context element

In the proposed API, the potential matches will include:
* Descendants of the context element
* Siblings of the context element

In the existing API, the potential matches include:
* Descendants of the context element only


div.findAll("+p") // Matches sibling p elements

div.querySelectorAll(":scope+p") // Matches nothing
document.querySelectorAll(":scope+p", div) // Matches sibling p elements

<div>
<style scoped>
:scope+p { ... } /* Matches nothing */
</style>
</div>
<p>...</p>
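
A sketch of the difference in reachable elements, assuming a toy tree with parent links that contains element nodes only. `nextSiblingMatching` approximates a rooted "+p" search and `descendantsMatching` a descendants-only search; both names are hypothetical and neither is the proposed findAll() itself.

```javascript
function el(tag, children) {
  const node = { tag: tag, parent: null, children: children || [] };
  node.children.forEach(function (c) { c.parent = node; });
  return node;
}

// Rooted "+tag": the element immediately following `scope` under its parent.
function nextSiblingMatching(scope, tag) {
  if (!scope.parent) return [];
  const siblings = scope.parent.children;
  const next = siblings[siblings.indexOf(scope) + 1];
  return next && next.tag === tag ? [next] : [];
}

// Descendants-only search: can never reach a sibling of `scope`.
function descendantsMatching(scope, tag) {
  let out = [];
  scope.children.forEach(function (c) {
    if (c.tag === tag) out.push(c);
    out = out.concat(descendantsMatching(c, tag));
  });
  return out;
}

const div = el("div", [el("style")]);
const p = el("p");
const body = el("body", [div, p]);

const siblingHits = nextSiblingMatching(div, "p");
const descendantHits = descendantsMatching(div, "p");
console.log(siblingHits.length, descendantHits.length);
```

The sibling search finds the p element and the descendants-only search finds nothing, mirroring the split between the proposed API and the existing one.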
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Tab Atkins Jr.
2011-10-19 19:51:29 UTC
Permalink
Post by Lachlan Hunt
1. Syntax
In <style scoped>, selectors still can't begin with a combinator, but in the
proposed API, they can.
I agree with Lachy here. I think it's valuable to have consistency
with <style scoped>, so that a selector passed to el.findAll() and one
put in a <style scoped> that's a child of el return the same results.

You already have to explicitly add :scope if you want to do some
additional selecting of the scoping element anyway.

This breaks consistency with jQuery, but it maintains consistency with
the rest of the platform. I think this is important enough to justify
the slight loss in terseness in the situations where you want a child
or reference combinator off of the scoping element.
I'll make a reasonable assumption about what Lachy was planning to say
here, and say that QSA seems to already solve the "consistency with
@global within <style scoped>" issue. At least as far as I can tell,
it acts the same.
Post by Lachlan Hunt
2. Matching the Context Element
In scoped stylesheets, the context element itself can be the subject of a
selector. But the proposed API will never return the element itself in the
result.
div.findAll("div") // Does not match the element itself
(same as querySelectorAll() in this case)
<div>
 <style scoped>
   div { ... } /* Matches the context element */
 </style>
</div>
While I think we should match <style scoped> here, I believe the
conflict should be resolved by changing <style scoped>. A few people
in the last discussion preferred selectors to automatically match the
scoping element, but I still think that's a bad decision. The scoping
element should only be returned if a selector is a single compound
selector containing :scope. It makes selectors a little bit more
complex to understand, but in an intuitive way.

Regardless of what ends up happening in <style scoped>, I agree with
the API choice here to make div.find("div") not match the calling
element. The common case is that I'm descending into the element and
wouldn't expect the calling element to match. I'd like to write naive
algorithms that don't need to either manually check the results
against the calling element or defensively write
div.find(":not(:scope) div"). I'm okay with using the presence of
:scope in the selector as a declaration of intent here, and switch
behavior accordingly.
Post by Lachlan Hunt
3. The Subject of Selectors
In scoped stylesheets, the potential matches of a selector will only
* The context element itself
* Descendants of the context element
* Descendants of the context element
* Siblings of the context element
* Descendants of the context element only
div.findAll("+p") // Matches sibling p elements
div.querySelectorAll(":scope+p") // Matches nothing
document.querySelectorAll(":scope+p", div) // Matches sibling p elements
<div>
 <style scoped>
   :scope+p { ... } /* Matches nothing */
 </style>
</div>
<p>...</p>
I am okay with this behavioral split from <style scoped>, and believe
it's both useful and intuitive.

(Note that the function can actually return elements from *anywhere*
given the current Selectors 4 draft, as it can follow a reference
combinator which can point to an arbitrary position in the doc.)

~TJ
Alex Russell
2011-10-19 00:14:50 UTC
Post by Sean Hogan
Post by Yehuda Katz
I agree entirely.
<div id="parent">
<p id="child"><span id="inline">Content</span></p>
</div>
 document.getElementById("child").querySelectorAll("div span"); // returns #inline
In 100% of cases, people consider this behavior *broken*. Not just
"interesting, I wouldn't have expected that", but "who came up with that!?".
In all cases involving JavaScript practitioners, people expect
querySelectorAll to operate on the element as though the element was the
root of a new document, and where combinators are relative to the element.
It matches the definition of CSS selectors, so I don't think it can be
called broken. For this case, node.querySelectorAll("div span") finds all
span's (in document order) which are contained within the invoking node and
checks that they match the selector expression, in this case simply checking
they are a descendant of a div.
- start at the containing node
- find all descendant div's
- for every div, find all descendant span's.
- with the list of span's, remove duplicates and place in document-order
Once you understand the proper definition it is hard to see this new
definition as more logical.
To me, the problem here is some (not all) Javascript practitioners not
learning the proper definition of CSS selectors.
I'm just going to assume you're trolling and not respond to anything
else you post here.
Post by Sean Hogan
Post by Yehuda Katz
We already knew this was true since all JavaScript libraries that
implement selectors implemented them in this way.
To me, this indicates that there's no problem here. If you want to use an
alternative definition of selectors then you use a JS lib that supports
them. If you want to use the DOM API then you learn how CSS selectors work.
I don't see JS libs ever calling the browsers querySelectorAll (or even a
- JS libs support selectors that haven't been implemented on all browsers
- JS libs support selectors that are never going to be part of the standard
Since JS libs will always parse selector strings and call qSA, etc as
appropriate, I can't see much benefit in creating DOM methods that accept
non-standard selector strings.
Sean
Alex Russell
2011-10-19 00:06:45 UTC
Post by Yehuda Katz
I agree entirely.
  <div id="parent">
    <p id="child"><span id="inline">Content</span></p>
  </div>
  document.getElementById("child").querySelectorAll("div span"); // returns #inline
In 100% of cases, people consider this behavior *broken*. Not just
"interesting, I wouldn't have expected that", but "who came up with that!?".
In all cases involving JavaScript practitioners, people expect
querySelectorAll to operate on the element as though the element was the
root of a new document, and where combinators are relative to the element.
We already knew this was true since all JavaScript libraries that implement
selectors implemented them in this way.
Great example.
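To make the difference concrete without a browser, here is a minimal sketch of the two readings of "div span" against the markup above. The node(), descendants() and hasAncestor() helpers are hypothetical stand-ins for the DOM, not real APIs, and only a descendant combinator between two tag names is modelled.

```javascript
// Minimal stand-in for DOM nodes so the sketch runs without a browser.
function node(tag, id, children = []) {
  const n = { tag, id, parent: null, children };
  for (const child of children) child.parent = n;
  return n;
}

// All descendants of `root` in document order (root itself excluded).
function descendants(root) {
  const out = [];
  for (const child of root.children) {
    out.push(child, ...descendants(child));
  }
  return out;
}

// True if some ancestor of `n` has the given tag. The walk stops just
// below `stopAt`; pass null to walk all the way to the top of the tree.
function hasAncestor(n, tag, stopAt) {
  for (let p = n.parent; p !== null && p !== stopAt; p = p.parent) {
    if (p.tag === tag) return true;
  }
  return false;
}

// querySelectorAll-style "A B": B must be inside root, A may be anywhere above B.
function qsaStyle(root, ancestorTag, targetTag) {
  return descendants(root)
    .filter((n) => n.tag === targetTag && hasAncestor(n, ancestorTag, null))
    .map((n) => n.id);
}

// Rooted "A B": both A and B must be inside root.
function rootedStyle(root, ancestorTag, targetTag) {
  return descendants(root)
    .filter((n) => n.tag === targetTag && hasAncestor(n, ancestorTag, root))
    .map((n) => n.id);
}

// <div id="parent"><p id="child"><span id="inline">Content</span></p></div>
const span = node("span", "inline");
const para = node("p", "child", [span]);
const outerDiv = node("div", "parent", [para]);

console.log(qsaStyle(para, "div", "span"));    // [ 'inline' ]
console.log(rootedStyle(para, "div", "span")); // []
```

The only difference between the two functions is where the ancestor walk is allowed to stop, which is the whole disagreement in a nutshell.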
Post by Yehuda Katz
I also agree that the name querySelectorAll (like getElement(s)By*,
requestAnimationFrame, addEventListener, and most other DOM APIs), are
simply too long to use in day-to-day usage. This results in the need to use
libraries for day-to-day browser development simply to reduce this
borderline-comical verbosity.
I like find and findAll, as jQuery has a `find` which invokes the selector
engine. As to whether jQuery would benefit from the improvements, the
existing jQuery implementation (Sizzle) of the selector engine when qSA is
available is
at: https://github.com/jquery/sizzle/blob/master/sizzle.js#L1150-1233
Speeding up certain operations like `#foo` and `body`. There is *no excuse*
for it being possible to implement userland hacks that improve on the
performance of querySelectorAll. This may be the result of browsers failing
to cache the result of parsing selectors or something else, but the fact
remains that qSA can be noticeably slower than the old DOM methods, even when
Sizzle needs to parse the selector to look for fast-paths.
This is likely implementation issues. Not, perhaps, apropos to this.
And in any case, if *today's* QSA is slower, then you're right. IIRC,
one of the (busted) justifications for the current API was that it
would allow re-use of the existing in-browser infrastructure, which
should indeed be fast.
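As a rough illustration of the kind of userland fast path being described (this is not Sizzle's actual code; classifySelector and fastQueryAll are hypothetical names), a library can recognise trivial selectors with a regular expression and route them to the older DOM methods, falling back to querySelectorAll for everything else. The `doc` argument is assumed to be a Document or anything with the same three members.

```javascript
// A lone ID selector such as "#foo". IDs starting with a digit are left
// to the general path, since "#3" is not a valid CSS selector as written.
const ID_ONLY = /^#([A-Za-z_][\w-]*)$/;

// Decide which lookup strategy a selector string qualifies for.
function classifySelector(selector) {
  const s = selector.trim();
  if (s === "body") return { path: "body" };
  const m = ID_ONLY.exec(s);
  if (m) return { path: "id", value: m[1] };
  return { path: "general" };
}

// Dispatch to the cheapest lookup available and always return an array.
// Caveat: getElementById yields at most one element, so this differs from
// querySelectorAll("#foo") on documents that contain duplicate IDs.
function fastQueryAll(doc, selector) {
  const kind = classifySelector(selector);
  if (kind.path === "id") {
    const el = doc.getElementById(kind.value);
    return el ? [el] : [];
  }
  if (kind.path === "body") {
    return doc.body ? [doc.body] : [];
  }
  return Array.from(doc.querySelectorAll(selector));
}

console.log(classifySelector("#foo").path);    // id
console.log(classifySelector("div > p").path); // general
```

In a page this would be called as fastQueryAll(document, "#foo"). Whether the shortcut is actually faster than the native call depends on the browser, which is the point being argued above.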
Post by Yehuda Katz
Fixing the implementation mistake in qSA that we're discussing here.
Bugs in specific browsers.
jQuery also handles certain custom pseudoselectors, and it might be nice if
it was possible to register JavaScript functions that qSA would use if it
found an unknown pseudo (this would make it possible to implement most of
jQuery's selector engine in terms of qSA), but that's a discussion for
another day.
Yeah, I'm afraid it will have to be.
Post by Yehuda Katz
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
Discussions about a Scoped variant or ":scope" pseudo tacitly
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use. On today's internet -- the
one where 50% of all websites include jQuery -- you could even go with
element.$("selector") and everyone would know what you mean: it's
clearly a search rooted at the element on the left-hand side of the
dot.
Ceteris paribus, shorter is better. When there's a tie that needs to
be broken, the more frequently used the API, the shorter the name it
deserves -- i.e., the larger the component of its meaning it will gain
through use and repetition and not naming and documentation.
I know some on this list might disagree, but all of the above is
incredibly non-controversial today. Even if there may have been
debates about scoping or naming when QSA was originally designed,
history has settled them. And QSA lost on both counts.
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
  element.querySelectorAll(":scope > div > .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from. I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
  element.findAll("> div > .thinger");
Out come the knives! You can't start a selector with a combinator!
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
 HTMLDocument.prototype.find =
 HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  }
  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  }
Of course, ":scope" in this case is just a special case of the ID
rooting hack, but if we're going to have it, we can kill both birds
with it.
Q.) Why do we need this at all? Don't the toolkits already just do
this internally?
A.) Are you saying everyone, everywhere, all the time should need to
use a toolkit to get sane behavior from the DOM? If so, what are we
doing here, exactly?
Q.) Shorter names? Those are for weaklings!
A.) And humans. Who still constitute most of our developers. Won't
someone please think of the humans?
Q.) You're just duplicating things!
A.) If you ignore all of the things that are different, then that's
true. If not, well, then no. This is a change. And a good one for the
reasons listed above.
Thoughts?
Lachlan Hunt
2011-10-19 12:54:02 UTC
Post by Alex Russell
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names.
I know the names suck. The names we ended up with certainly weren't the
first choice of names we were going for, but sadly ended up with after a
long drawn out naming debate and a misguided consensus poll to override
what should have been an editorial decision. So, if we do introduce new
methods, personally I'd be happy to use sensible names for any of them, if
the rest of the group will allow it this time.
Post by Alex Russell
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
element.querySelectorAll(":scope> div> .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from.
The current design is capable of handling many more use cases than the
single use case that you are trying to optimise for here.
Post by Alex Russell
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
HTMLDocument.prototype.find =
HTMLElement.prototype.find = function(rootedSelector) {
return this.querySelector(":scope " + rootedSelector);
}
HTMLDocument.prototype.findAll =
HTMLElement.prototype.findAll = function(rootedSelector) {
return this.querySelectorAll(":scope " + rootedSelector);
}
This is an incomplete way of dealing with the problem, as it doesn't
correctly handle comma separated lists of selectors, so the parsing
problem cannot be as trivial as prepending ":scope ". It would also
give a strange result if the author passed an empty string

findAll("");

":scope " + "" => ":scope" => meaning to return itself.
Post by Alex Russell
The resolution I think is most natural is to split on "," and assume
that all selectors in the list are ":scope" prefixed and that.
Simple string processing to split on "," is also ineffective as it
doesn't correctly deal with commas within functional notation
pseudo-classes, attribute selectors, etc.
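
To illustrate why a bare split(",") falls short, here is a minimal sketch of a splitter that only breaks on top-level commas. It is not the algorithm from the draft; splitSelectorList is a hypothetical name, and the function tracks only parentheses, brackets, quoted strings and backslash escapes rather than tokenising selectors properly.

```javascript
// Split a selector list on commas that are not nested inside (), [] or a
// quoted string. Malformed input is not diagnosed; it is split best-effort.
function splitSelectorList(selectorList) {
  const groups = [];
  let current = "";
  let depth = 0;    // nesting level of () and []
  let quote = null; // the currently open quote character, if any

  for (let i = 0; i < selectorList.length; i++) {
    const ch = selectorList[i];
    if (ch === "\\") {
      // Keep an escaped character (e.g. "\,") attached to its backslash.
      current += ch + (selectorList[i + 1] || "");
      i++;
      continue;
    }
    if (quote) {
      if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === "(" || ch === "[") {
      depth++;
    } else if (ch === ")" || ch === "]") {
      depth = Math.max(0, depth - 1);
    } else if (ch === "," && depth === 0) {
      groups.push(current.trim());
      current = "";
      continue;
    }
    current += ch;
  }
  groups.push(current.trim());
  return groups;
}

const input = ':not(a, b) > span, [title="x,y"]';
console.log(input.split(",").length);  // 4
console.log(splitSelectorList(input)); // [ ':not(a, b) > span', '[title="x,y"]' ]
```

The naive split cuts the :not() argument and the attribute value in half, so each resulting piece would fail to parse once prefixed.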

I have attempted to address this problem before and the algorithm for
parsing a *scoped selector string* (basically what you're calling a
rootedSelector) existed in an old draft [1].

That draft also allowed the flexibility of including an explicit :scope
pseudo-class in the selector, which allows for conditional expressions
to be built into the selector itself that can be used to check the state
of the scope element or any of its ancestors.

(But that draft isn't perfect. It has a few known bugs in the
definition, including one that would also make it return the context
node itself under certain circumstances where an explicit :scope
selector is used.)

[1]
http://dev.w3.org/cvsweb/~checkout~/2006/webapi/selectors-api2/Overview.html?rev=1.29;content-type=text%2Fhtml#processing-selectors
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Alex Russell
2011-10-19 14:08:22 UTC
Post by Alex Russell
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names.
I know the names suck.  The names we ended up with certainly weren't the
first choice of names we were going for, but sadly ended up with after a
long drawn out naming debate and a misguided consensus poll to override what
should have been an editorial decision.  So, if we do introduce new methods,
personally I'd be happy to use sensible names for any of them, if the rest of
the group will allow it this time.
It should *still* be an editorial decision. Shorter is better. This is
well-trod ground. We have plenty of evidence for what JS devs really
want. Let's get on with it.
Post by Alex Russell
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
   element.querySelectorAll(":scope>  div>  .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from.
The current design is capable of handling many more use cases than the
single use case that you are trying to optimise for here.
That's OK. I'm not stoning the current design. See below. I'm
suggesting we build on it and provide the API people are making heavy
use of today. This cow path deserves not just paving, but
streetlights, wide shoulders, and a bike lane.
Post by Alex Russell
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
  HTMLDocument.prototype.find =
  HTMLElement.prototype.find = function(rootedSelector) {
     return this.querySelector(":scope " + rootedSelector);
   }
   HTMLDocument.prototype.findAll =
   HTMLElement.prototype.findAll = function(rootedSelector) {
     return this.querySelectorAll(":scope " + rootedSelector);
   }
This is an incomplete way of dealing with the problem, as it doesn't
correctly handle comma separated lists of selectors, so the parsing problem
cannot be as trivial as prepending ":scope ".  It would also give a strange
result if the author passed an empty string
 findAll("");
 ":scope " + "" => ":scope" => meaning to return itself.
Yes, yes. Pseudo-code. I snipped other code I posted to not handle
obvious corner cases to prevent posting eye-watering walls of code as
well. Happy to draft a longer/more-complete straw-man, but nobody's
*actually* going to implement it this way in any case. As an aside,
it's shocking how nit-picky and anti-collaborative this group is.
*sigh*
Post by Alex Russell
The resolution I think is most natural is to split on "," and assume
that all selectors in the list are ":scope" prefixed and that.
Simple string processing to split on "," is also ineffective as it doesn't
correctly deal with commas within functional notation pseudo-classes,
attribute selectors, etc.
See, again, subsequent follow-ups.
I have attempted to address this problem before and the algorithm for
parsing a *scoped selector string* (basically what you're calling a
rootedSelector) existed in an old draft [1].
That draft also allowed the flexibility of including an explicit :scope
pseudo-class in the selector, which allows for conditional expressions to be
built into the selector itself that can be used to check the state of the
scope element or any of its ancestors.
We could accommodate that by looking at the passed selector and trying
to determine if it includes a ":scope" term. If so, avoid prefixing.
That'd allow this sort of flexibility for folks who want to write
things out long-hand or target the scope root in the selector,
possibly returning itself. I'd also support a resolution for this
sort of power-tool that forces people to use document.qsa("...",
scopeEl) to get at that sort of thing.
(But that draft isn't perfect.  It has a few known bugs in the definition,
including one that would also make it return the context node itself under
certain circumstances where an explicit :scope selector is used.)
[1]
http://dev.w3.org/cvsweb/~checkout~/2006/webapi/selectors-api2/Overview.html?rev=1.29;content-type=text%2Fhtml#processing-selectors
Lachlan Hunt
2011-10-19 18:01:30 UTC
Post by Alex Russell
Post by Lachlan Hunt
I have attempted to address this problem before and the algorithm for
parsing a *scoped selector string* (basically what you're calling a
rootedSelector) existed in an old draft [1].
That draft also allowed the flexibility of including an explicit :scope
pseudo-class in the selector, which allows for conditional expressions to be
built into the selector itself that can be used to check the state of the
scope element or any of its ancestors.
We could accommodate that by looking at the passed selector and trying
to determine if it includes a ":scope" term. If so, avoid prefixing.
Yes, that's exactly what the draft specified.
Post by Alex Russell
That'd allow this sort of flexibility for folks who want to write
things out long-hand or target the scope root in the selector,
possibly returning itself.
I don't see a use case for wanting the proposed method to be able to
return the element itself. The case where it's useful for elements
matching :scope to be the subject of a selector is where you're trying
to filter a list of elements.

e.g.
document.querySelectorAll(".foo:scope", list);
// Returns all elements from list that match.

But this wouldn't make sense

el.find(".foo:scope") // Return itself if it matches.

That result seems effectively like a less efficient boolean check that
is already handled by el.matchesSelector(".foo").
Post by Alex Russell
I'd also support a resolution for this sort of power-tool that
forces people to use document.qsa("...",scopeEl) to get at that sort
of thing.
If there was no special handling to check for an explicit :scope, that
would mean that any selector that does include :scope explicitly would
not match anything at all.

e.g. el.findAll(":scope>p");

That would be equivalent to:

document.querySelectorAll(":scope :scope>p", el);

Which won't match anything.

That might keep things simpler from an implementation perspective and
doesn't sacrifice any functionality being requested.
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Alex Russell
2011-10-20 10:39:23 UTC
Post by Lachlan Hunt
Post by Alex Russell
Post by Lachlan Hunt
I have attempted to address this problem before and the algorithm for
parsing a *scoped selector string* (basically what you're calling a
rootedSelector) existed in an old draft [1].
That draft also allowed the flexibility of including an explicit :scope
pseudo-class in the selector, which allows for conditional expressions to be
built into the selector itself that can be used to check the state of the
scope element or any of its ancestors.
We could accommodate that by looking at the passed selector and trying
to determine if it includes a ":scope" term. If so, avoid prefixing.
Yes, that's exactly what the draft specified.
Great! So if we specify this behavior for .find() too, I think we're
in good shape.
Post by Lachlan Hunt
Post by Alex Russell
That'd allow this sort of flexibility for folks who want to write
things out long-hand or target the scope root in the selector,
possibly returning itself.
I don't see a use case for wanting the proposed method to be able to return
the element itself.  The case where it's useful for elements matching :scope
to be the subject of a selector is where you're trying to filter a list of
elements.
e.g.
 document.querySelectorAll(".foo:scope", list);
 // Returns all elements from list that match.
But this wouldn't make sense
 el.find(".foo:scope") // Return itself if it matches.
Ok, I'm fine with not allowing that.
Post by Lachlan Hunt
That result seems effectively like a less efficient boolean check that is
already handled by el.matchesSelector(".foo").
"matchesSelector"...really? We've gotta get a better name for that = )
Post by Lachlan Hunt
Post by Alex Russell
I'd also support a resolution for this sort of power-tool that
forces people to use document.qsa("...",scopeEl) to get at that sort
of thing.
If there was no special handling to check for an explicit :scope, that would
mean that any selector that does include :scope explicitly would not match
anything at all.
e.g. el.findAll(":scope>p");
yeah, that occurred to me after sending the last mail.
Post by Lachlan Hunt
 document.querySelectorAll(":scope :scope>p", el);
Which won't match anything.
That might keep things simpler from an implementation perspective and
doesn't sacrifice any functionality being requested.
Eh, I'm not sure it's sane though. Putting in checking for :scope in
the selector and not prefixing if it occurs seems the only reasonable
thing. There's a corner case I haven't formed an opinion on though:

el.find("div span :scope .whatevs");

...does what? I think it's an error. ":scope" will need to occur in
the first term or not at all for .find().
Tab Atkins Jr.
2011-10-20 11:40:11 UTC
  el.find("div span :scope .whatevs");
...does what? I think it's an error. ":scope" will need to occur in
the first term or not at all for .find().
Disagree. If :scope appears in the selector, just match across the
whole document. It's simple and useful. (It's equivalent to
document.querySelector("div span :scope .whatevs", el), except shorter
and amenable to chaining.)

~TJ
Lachlan Hunt
2011-10-19 18:17:50 UTC
Post by Alex Russell
HTMLDocument.prototype.find =
HTMLElement.prototype.find = function(rootedSelector) {
return this.querySelector(":scope " + rootedSelector);
}
HTMLDocument.prototype.findAll =
HTMLElement.prototype.findAll = function(rootedSelector) {
return this.querySelectorAll(":scope " + rootedSelector);
}
What exactly does it mean to have a "rootedSelector" applied to the
Document object? As I understand it, the scoping problem explained only
seems to apply to running the query on elements, whereas the existing
document.qsa already behaves as expected by authors. It doesn't seem to
make sense to try and prepend :scope to selectors in that case.

e.g. document.find("html") shouldn't be equivalent to
document.querySelector(":scope html");

So, either we introduce the new method only for elements, or we use a
similarly named method on document for a similar, but slightly different
purpose.

A previous use case discussed on this list is the ability to take a
collection of elements, and execute the same selector on all of
them, as if iterating the list, collecting the results and returning a
single merged collection.

The current API handles this use case with document.querySelectorAll,
explicitly specifying :scope and passing a collection of refNodes.

e.g.
var list = ...; // Elements (Array, NodeList or indexed object)

// Find the sibling p elements of all elements in the list
document.querySelectorAll(":scope+p", list);

Thus, if we do introduce the proposed method, should it behave
similarly, but with the implied rather than explicit :scope?

e.g.
document.findAll("+p", list);
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Jonas Sicking
2011-10-20 02:07:39 UTC
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
Discussions about a Scoped variant or ":scope" pseudo tacitly
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use. On today's internet -- the
one where 50% of all websites include jQuery -- you could even go with
element.$("selector") and everyone would know what you mean: it's
clearly a search rooted at the element on the left-hand side of the
dot.
Ceteris paribus, shorter is better. When there's a tie that needs to
be broken, the more frequently used the API, the shorter the name it
deserves -- i.e., the larger the component of its meaning it will gain
through use and repetition and not naming and documentation.
I know some on this list might disagree, but all of the above is
incredibly non-controversial today. Even if there may have been
debates about scoping or naming when QSA was originally designed,
history has settled them. And QSA lost on both counts.
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
  element.querySelectorAll(":scope > div > .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from. I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
  element.findAll("> div > .thinger");
Out come the knives! You can't start a selector with a combinator!
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
 HTMLDocument.prototype.find =
 HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  }
  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  }
Of course, ":scope" in this case is just a special case of the ID
rooting hack, but if we're going to have it, we can kill both birds
with it.
Q.) Why do we need this at all? Don't the toolkits already just do
this internally?
A.) Are you saying everyone, everywhere, all the time should need to
use a toolkit to get sane behavior from the DOM? If so, what are we
doing here, exactly?
Q.) Shorter names? Those are for weaklings!
A.) And humans. Who still constitute most of our developers. Won't
someone please think of the humans?
Q.) You're just duplicating things!
A.) If you ignore all of the things that are different, then that's
true. If not, well, then no. This is a change. And a good one for the
reasons listed above.
Thoughts?
I like the general idea here. And since we're changing behavior, I
think it's a good opportunity to come up with shorter names. Naming is
really hard. The shorter names we use, the more likely it is that
we're going to break webpages which are messing around with the
prototype chain and it increases the risk that we'll regret it later
when we come up with even better functions which should use those
names. Say that we come up with an even better query language than
selectors, at that point .find will simply not be available to us.

However, it does seem like selectors are here to stay. And as much as
they have shortcomings, people seem to really like them for querying.

So with that out of the way, I agree that the CSS working group
shouldn't be what is holding us back. However we do need a precise
definition of what the new function does. Is prepending ":scope " and
then parsing as a normal selector always going to give the behavior we
want? This is actually what I think we got stuck on when the original
querySelector was designed.

So let's get into specifics about how things should work. According to
your proposal of simply prepending a conceptual ":scope" to each
selector group, for the following DOM:

<body id="3">
<div id="context" foo=bar>
<div id=1></div>
<div class="class" id=2></div>
<div class="withChildren" id=3><div class=child id=4></div></div>
</div>
</body>

you'd get the following behavior:

.findAll("div") // returns ids 1,2,3,4
.findAll("") // returns the context node itself. This was indicated undesirable
.findAll("body > :scope > div") // returns nothing
.findAll("#3") // returns id 3, but not the body node
.findAll("> div") // returns ids 1,2,3
.findAll("[foo=bar]") // returns nothing
.findAll("[id=1]") // returns id 1
.findAll(":first-child") // returns id 1

Is this desired behavior in all cases except the empty string? If so
this seems very doable to me. We can easily make an exception for the
case when the passed in string contains no selectors and make that an
error or some such.

I do however like the idea that if :scope appears in the selector,
then this removes the prepending of ":scope " to that selector group.
Is there a reason not to do that?
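
For discussion purposes, the rule as stated so far (error on an empty selector, prepend ":scope " to each selector group unless that group already mentions :scope) could be sketched like this. toScopedGroups is a hypothetical helper, not anything from a draft, and it assumes the selector list has already been split into groups by something smarter than split(",").

```javascript
// Matches ":scope" as a whole pseudo-class name, not e.g. ":scoped".
// Limitation: this is a plain text check, so ":scope" inside a quoted
// attribute value would also count as an explicit :scope.
const SCOPE_TOKEN = /:scope(?![\w-])/;

// Turn the groups of a rooted selector into ordinary selector groups that
// could be handed to querySelectorAll with the context node as :scope.
function toScopedGroups(groups) {
  if (groups.length === 0 || groups.some((g) => g.trim() === "")) {
    // findAll("") would otherwise become ":scope" and return the context node.
    throw new SyntaxError("empty selector");
  }
  return groups.map((g) => {
    const group = g.trim();
    return SCOPE_TOKEN.test(group) ? group : ":scope " + group;
  });
}

console.log(toScopedGroups(["div", "> div"]));
// [ ':scope div', ':scope > div' ]
console.log(toScopedGroups(["body > :scope > div"]));
// [ 'body > :scope > div' ]
```

With the exemption in place, "body > :scope > div" is passed through untouched instead of becoming ":scope body > :scope > div", so against the DOM above it would select ids 1,2,3 rather than nothing.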

Additionally it seems to me that we could allow the same syntax for
<style scoped>. But maybe others disagree?

I think appropriate optimizations as well as extensible functions
should be out-of-scope for this thread. They are both big subjects on
their own and we're approaching 50 emails in this thread.

/ Jonas
Ojan Vafai
2011-10-20 02:22:23 UTC
Post by Alex Russell
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are
mis-designed.
Post by Alex Russell
Discussions about a Scoped variant or ":scope" pseudo tacitly
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use. On today's internet -- the
one where 50% of all websites include jQuery -- you could even go with
element.$("selector") and everyone would know what you mean: it's
clearly a search rooted at the element on the left-hand side of the
dot.
Ceteris paribus, shorter is better. When there's a tie that needs to
be broken, the more frequently used the API, the shorter the name it
deserves -- i.e., the larger the component of its meaning it will gain
through use and repetition and not naming and documentation.
I know some on this list might disagree, but all of the above is
incredibly non-controversial today. Even if there may have been
debates about scoping or naming when QSA was originally designed,
history has settled them. And QSA lost on both counts.
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
element.querySelectorAll(":scope > div > .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from. I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
element.findAll("> div > .thinger");
Out come the knives! You can't start a selector with a combinator!
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
HTMLDocument.prototype.find =
HTMLElement.prototype.find = function(rootedSelector) {
return this.querySelector(":scope " + rootedSelector);
}
HTMLDocument.prototype.findAll =
HTMLElement.prototype.findAll = function(rootedSelector) {
return this.querySelectorAll(":scope " + rootedSelector);
}
Of course, ":scope" in this case is just a special case of the ID
rooting hack, but if we're going to have it, we can kill both birds
with it.
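
To make the string handling concrete, here is a minimal sketch that works on selector text only, without a DOM. The helper names are hypothetical and not part of any draft. The first function is the literal concatenation from the proposal; the second is one possible variant that roots every selector in a comma-separated list, assuming no commas appear inside parentheses, brackets, or strings.

```javascript
// Hypothetical helpers for illustration; pure string handling, no DOM needed.

// Literal form of the proposal: a single ":scope " in front of the whole string.
function prependScopeNaive(rootedSelector) {
  return ":scope " + rootedSelector;
}

// Variant that roots each selector of a comma-separated list separately.
// Assumption: commas only ever separate top-level selectors.
function prependScopePerSelector(rootedSelector) {
  return rootedSelector
    .split(",")
    .map(function (part) { return ":scope " + part.trim(); })
    .join(", ");
}
```

With the naive form, only the first selector of a list such as "div, p" ends up rooted at the element; the per-selector variant roots both.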
Q.) Why do we need this at all? Don't the toolkits already just do
this internally?
A.) Are you saying everyone, everywhere, all the time should need to
use a toolkit to get sane behavior from the DOM? If so, what are we
doing here, exactly?
Q.) Shorter names? Those are for weaklings!
A.) And humans. Who still constitute most of our developers. Won't
someone please think of the humans?
Q.) You're just duplicating things!
A.) If you ignore all of the things that are different, then that's
true. If not, well, then no. This is a change. And a good one for the
reasons listed above.
Thoughts?
I like the general idea here. And since we're changing behavior, I
think it's a good opportunity to come up with shorter names. Naming is
really hard. The shorter names we use, the more likely it is that
we're going to break webpages which are messing around with the
prototype chain and it increases the risk that we'll regret it later
when we come up with even better functions which should use those
names. Say that we come up with an even better query language than
selectors, at that point .find will simply not be available to us.
However, it does seem like selectors are here to stay. And as much as
they have shortcomings, people seem to really like them for querying.
So with that out of the way, I agree that the CSS working group
shouldn't be what is holding us back. However we do need a precise
definition of what the new function does. Is prepending ":scope " and
then parsing as a normal selector always going to give the behavior we
want? This is actually what I think we got stuck on when the original
querySelector was designed.
So let's get into specifics about how things should work. According to
your proposal of simply prepending a conceptual ":scope" to each
<body id="3">
<div id="context" foo=bar>
<div id=1></div>
<div class="class" id=2></div>
<div class="withChildren" id=3><div class=child id=4></div></div>
</div>
</body>
.findAll("div") // returns ids 1,2,3,4
.findAll("") // returns the context node itself. This was
indicated undesirable
.findAll("body > :scope > div") // returns nothing
Wouldn't this return ids 1,2,3 if we're not prepending :scope as you say
below?
Post by Alex Russell
.findAll("#3") // returns id 3, but not the body node
.findAll("> div") // returns ids 1,2,3
.findAll("[foo=bar]") // returns nothing
.findAll("[id=1]") // returns id 1
.findAll(":first-child") // returns id 1
Is this desired behavior in all cases except the empty string? If so
this seems very doable to me. We can easily make an exception for the
case when the passed in string contains no selectors and make that an
error or some such.
I do however like the idea that if :scope appears in the selector,
then this removes the prepending of ":scope " to that selector group.
Is there a reason not to do that?
Additionally it seems to me that we could allow the same syntax for
<style scoped>. But maybe others disagree?
Sounds good to me. A sticky case you left out is parent, sibling and
reference combinators.

.findAll("+ div")

Assuming the context node has siblings, should that return them? If so,
should it match siblings when using <style scoped>?

IMO, it shouldn't match anything in either case. We should assert that only
descendants of the scope element will ever be returned. This would also make
it naturally match <style scoped> where only descendants of the scope
element are ever affected.

I think appropriate optimizations as well as extensible functions
Post by Alex Russell
should be out-of-scope for this thread. They are both big subjects on
their own and we're approaching 50 emails in this thread.
/ Jonas
Tab Atkins Jr.
2011-10-20 05:08:29 UTC
Post by Ojan Vafai
.findAll("body > :scope > div")  // returns nothing
Wouldn't this return ids 1,2,3 if we're not prepending :scope as you say
below?
Yes, but he was answering those questions based on the assumption of
always prepending :scope.
Post by Ojan Vafai
Additionally it seems to me that we could allow the same syntax for
<style scoped>. But maybe others disagree?
Sounds good to me. A sticky case you left out is parent, sibling and
reference combinators.
.findAll("+ div")
Assuming the context node has siblings, should that return them? If so,
should it match siblings when using <style scoped>.
IMO, it shouldn't match anything in either case. We should assert that
only descendants of the scope element will ever be returned. This would also
make it naturally match <style scoped> where only descendants of the scope
element are ever affected.
I disagree. It's extremely useful and natural for .find(":scope +
div") to match a sibling of the context node. Basically, the presence
of :scope would turn off *all* the limitations; the only thing that
the context node still does is match the :scope pseudo. The selector
should match across and return elements from anywhere in the document.

This is where I think that .find and <style scoped> should diverge in behavior.

.find should have two cases:

1. Selector without :scope - run the selector only across the
descendants of the context node. (No need to explicitly filter, since
the results will only contain descendants of the context node
already.)
2. Selector with :scope - run the selector across the entire document,
with :scope matching the context node. (No filtering here, either.)

<style scoped> should (I think) have three cases:

1. Selector without :scope - same as .find
2. Selector with :scope - Same as #1, but also including the context node.
3. Selector in @global - run the selector across the entire document,
filter the results to only be the context node and its descendants.

(Some people disagree with me on this, and think that #1 and #2 should
be merged to always include the context node. That's acceptable, but
I don't like it as much.)

I think it's perfectly okay that these two APIs have different cases.

~TJ
Jonas Sicking
2011-10-20 05:52:59 UTC
Post by Tab Atkins Jr.
Post by Ojan Vafai
.findAll("body > :scope > div")  // returns nothing
Wouldn't this return ids 1,2,3 if we're not prepending :scope as you say
below?
Yes, but he was answering those questions based on the assumption of
always prepending :scope.
Exactly.
Post by Tab Atkins Jr.
Post by Ojan Vafai
Additionally it seems to me that we could allow the same syntax for
<style scoped>. But maybe others disagree?
Sounds good to me. A sticky case you left out is parent, sibling and
reference combinators.
.findAll("+ div")
Assuming the context node has siblings, should that return them? If so,
should it match siblings when using <style scoped>.
IMO, it shouldn't match anything in either case. We should assert that
only descendants of the scope element will ever be returned. This would also
make it naturally match <style scoped> where only descendants of the scope
element are ever affected.
I disagree.  It's extremely useful and natural for .find(":scope +
div") to match sibling of the context node.  Basically, the presence
of :scope would turn off *all* the limitations; the only thing that
the context node still does is match the :scope pseudo.  The selector
should match across and return elements from anywhere in the document.
This is where I think that .find and <style scoped> should diverge in behavior.
1. Selector without :scope - run the selector only across the
descendants of the context node.  (No need to explicitly filter, since
the results will only contain descendants of the context node
already.)
2. Selector with :scope - run the selector across the entire document,
with :scope matching the context node.  (No filtering here, either.)
1. Selector without :scope - same as .find
2. Selector with :scope - Same as #1, but also including the context node.
filter the results to only be the context node and its descendants.
(Some people disagree with me on this, and think that #1 and #2 should
be merged to always include the context node.  That's acceptable, but
I don't like it as much.)
I think it's perfectly okay that these two APIs have different cases.
I'm not sure I understand what you are proposing here. Are you saying that

<div>
<style scoped>
:scope {
background: green;
}
</style>
</div>

should set the background of the <div> green? This does seem intuitive
I agree, but it might also lead to strange behavior since the
rendering of the <div> will change once the stylesheet is parsed. In
other words, it's very easy to get flash-of-unstyled-content behavior.

/ Jonas
Roland Steiner
2011-10-20 08:09:14 UTC
Post by Jonas Sicking
Post by Tab Atkins Jr.
1. Selector without :scope - same as .find
2. Selector with :scope - Same as #1, but also including the context
node.
Post by Tab Atkins Jr.
filter the results to only be the context node and its descendants.
(Some people disagree with me on this, and think that #1 and #2 should
be merged to always include the context node. That's acceptable, but
I don't like it as much.)
I think it's perfectly okay that these two APIs have different cases.
I'm not sure I understand what you are proposing here. Are you saying that
<div>
<style scoped>
:scope {
background: green;
}
</style>
</div>
should set the background of the <div> green? This does seem intuitive
I agree, but it might also lead to strange behavior since the
rendering of the <div> will change once the stylesheet is parsed. In
other words, it's very easy to get flash-of-unstyled-content behavior.
Hixie's - again valid IMHO - counterargument for this was that, with the
above proposal:

div { background-color: green }

would not color the scoping element, while the more specific (!)

div:scope { background-color: green }

would. I.e., a more specific selector suddenly selecting MORE elements than
a not so specific one.


- Roland
Lachlan Hunt
2011-10-20 09:13:56 UTC
Post by Jonas Sicking
I'm not sure I understand what you are proposing here. Are you saying that
<div>
<style scoped>
:scope {
background: green;
}
</style>
</div>
should set the background of the<div> green? This does seem intuitive
I agree, but it might also lead to strange behavior since the
rendering of the<div> will change once the stylesheet is parsed. In
other words, it's very easy to get flash-of-unstyled-content behavior.
In the majority of cases, that's a very easy problem for authors to
avoid by always putting <style scoped> as the first child of the
element. Since a <div> is invisible in most cases without any content
or other styles, any change in rendering from invisible to visible
wouldn't be any different from normal incremental rendering.
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Jonas Sicking
2011-10-20 09:35:10 UTC
Post by Jonas Sicking
I'm not sure I understand what you are proposing here. Are you saying that
<div>
<style scoped>
:scope {
  background: green;
}
</style>
</div>
should set the background of the<div>  green? This does seem intuitive
I agree, but it might also lead to strange behavior since the
rendering of the<div>  will change once the stylesheet is parsed. In
other words, it's very easy to get flash-of-unstyled-content behavior.
In the majority of cases, that's a very easy problem for authors to avoid by
always putting <style scoped> as the first child of the element.  Since a
<div> is invisible in most cases without any content or other styles, any
change in rendering from invisible to visible wouldn't be any different from
normal incremental rendering.
You'd also get the same effect since earlier siblings of the <style
scoped> would be affected, right?

I.e. in the following markup, both <span>s would be green and the <div>
would be red.
<div>
<span>text here</span>
<style scoped>
:scope span { background: green }
:scope { background: red }
</style>
<span>more text here</span>
</div>

Another problem though is one of performance. Allowing <style>
elements which are "later" in the DOM to affect earlier nodes means that
you have to walk significantly more nodes to look for which sheet
could apply. You'll have to look at both the element's children and
all following siblings.

If <style scoped> elements only affect nodes which are later in DOM
order, it's much easier to keep a list of all currently applying
<style scoped> elements as you walk through the DOM tree.
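
A rough sketch of that bookkeeping, on a toy tree rather than a real DOM. The node shape ({ name, scopedStyle, children }) and the function name are made up for illustration, and the sketch assumes the restricted model in which a <style scoped> only affects nodes after it in document order within its parent.

```javascript
// Hypothetical tree shape: { name, scopedStyle?: true, children?: [...] }.
// Returns a map from node name to the names of the scoped sheets applying
// to it, assuming a sheet only affects nodes that follow it in DOM order.
function collectApplyingSheets(root) {
  var applying = {};
  function walk(node, active) {
    applying[node.name] = active.slice();
    // Copy so sheets met among these children never leak to other subtrees.
    var activeForChildren = active.slice();
    (node.children || []).forEach(function (child) {
      if (child.scopedStyle) {
        activeForChildren.push(child.name); // applies from this point onward
      } else {
        walk(child, activeForChildren);
      }
    });
  }
  walk(root, []);
  return applying;
}

var exampleTree = {
  name: "div",
  children: [
    { name: "span-before" },
    { name: "sheet", scopedStyle: true },
    { name: "span-after" }
  ]
};
var exampleResult = collectApplyingSheets(exampleTree);
```

In this model only "span-after" picks up the sheet; the <div> and the earlier <span> are never revisited, which is what keeps the walk single-pass.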

However it's possible that this can be optimized satisfactorily. But
it's something that we need implementation feedback on.

/ Jonas
Roland Steiner
2011-10-20 08:06:12 UTC
Post by Tab Atkins Jr.
1. Selector without :scope - same as .find
2. Selector with :scope - Same as #1, but also including the context node.
filter the results to only be the context node and its descendants.
(Some people disagree with me on this, and think that #1 and #2 should
be merged to always include the context node. That's acceptable, but
I don't like it as much.)
The - very valid IMHO - main argument for <style scoped> to always include
the scoping element was to allow for easy migration. I.e., where currently
you'd use

<style>
#menu .foo { color: green }
</style>

<div id="menu">
<div class=foo>
Will be green
</div>
</div>
<div class=foo>
Will NOT be green
</div>

You could just stick the stylesheet under the div and add 'scoped':

<div id="menu">
<style scoped>
#menu .foo { color: green }
</style>
<div class=foo>
Will be green
</div>
</div>
<div class=foo>
Will NOT be green
</div>

In browsers that don't support 'scoped', this would still work. Where
'scoped' is supported, this doesn't change much per se, except that those
style rules don't need to be checked outside the scope. Once a majority of
browsers support <style scoped> one can then proceed to simplify the rules
and remove '#menu' (admitted caveat: where this then doesn't create an
ambiguity with the scoping <div>).


- Roland
Boris Zbarsky
2011-10-20 14:23:30 UTC
Post by Tab Atkins Jr.
I disagree. It's extremely useful and natural for .find(":scope +
div") to match sibling of the context node.
I really don't think it is. If you want that, use document.find(":scope
+ div", context).
Post by Tab Atkins Jr.
Basically, the presence of :scope would turn off *all* the limitations
That's a _really_ bizarre behavior. So in this case:

foo.find(":scope + div, div")

what, all divs in the document would be found? Or is the "oh, ignore the
reference node except for matching :scope" meant to only apply on a
per-selector basis inside the selector list? That has its own issues,
especially with performance (e.g. merging nodesets while preserving DOM
order).

-Boris
Lachlan Hunt
2011-10-20 15:01:00 UTC
I disagree. It's extremely useful and natural for .find(":scope +
div") to match sibling of the context node.
I really don't think it is. If you want that, use document.find(":scope
+ div", context).
Basically, the presence of :scope would turn off *all* the limitations
foo.find(":scope + div, div")
what all divs in the document would be found? Or is the "oh, ignore the
reference node except for matching :scope" meant to only apply on a
per-selector basis inside the selector list? That has its own issues,
especially with performance (e.g. merging nodesets while preserving DOM
order).
As it was specified in the old draft of queryScopedSelector (which is
the definition I would start with if find/findAll get introduced), it was
done on a per-selector basis, so the above would be equivalent to:

document.querySelector(":scope + div, :scope div", foo);
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Tab Atkins Jr.
2011-10-20 20:32:11 UTC
Post by Tab Atkins Jr.
I disagree.  It's extremely useful and natural for .find(":scope +
div") to match sibling of the context node.
I really don't think it is.  If you want that, use document.find(":scope +
div", context).
Why do that, when the previous one is shorter and simpler, and unambiguous?

The whole point is that we *know* the behavior I suggest is well-known
and easy to use, because jQuery (and probably other selector engines?)
does it already. I know for a fact that I've appreciated that
behavior in my own coding. It would have been very annoying to me had
I been forced to break my chaining just to select a sibling, when the
exact same style works fine to select a child. It's intuitive and
useful.

The behavior is useful in jQuery because it lets me evaluate a
selector, do some work to the matched elements, and then just
"continue" the selector to grab more, regardless of what form the
continuation takes. Forcing me to think about the continuation's form
(and even worse, completely rearrange the call structure) is just
mean. ^_^
Post by Tab Atkins Jr.
Basically, the presence of :scope would turn off *all* the limitations
 foo.find(":scope + div, div")
what all divs in the document would be found?  Or is the "oh, ignore the
reference node except for matching :scope" meant to only apply on a
per-selector basis inside the selector list?  That has its own issues,
especially with performance (e.g. merging nodesets while preserving DOM
order).
Per-selector basis; we're not talking about naive string manipulation
here. Your example would return divs that are descendants or an
adjacent sibling of the scoping element.

I don't really see the performance issues. If you use + or ~ off of
:scope, you know for a fact that all the nodes come *after* any
selectors that don't have :scope. If you use the subject indicator or
the reference combinator that's not necessarily true, but those
selectors will be slow already. Even then, sorting them into DOM
order should be relatively easy:

1. Run the :scope-carrying selectors across the document together,
automatically yielding a dom-ordered list.
2. Run the :scope-absent selectors together, automatically yielding a
dom-ordered list.
3. Find where the scoping element would be inserted in the #1 list,
and insert the entire #2 list there.

There's no further interleaving that could cause trouble.
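
The three steps can be sketched with document order modelled as plain integers (a node's position in a preorder walk). This is an illustration under that assumption, not how an engine would store nodes, and the function name is hypothetical. It also relies on no :scope-carrying result falling inside the context element's own subtree.

```javascript
// scopedResults:   positions matched by the :scope-carrying selectors (sorted)
// unscopedResults: positions matched by the :scope-absent selectors (sorted),
//                  all of them descendants of the scoping element
// scopePosition:   position of the scoping element itself
function spliceAtScope(scopedResults, unscopedResults, scopePosition) {
  // Step 3: find where the scoping element would sit in list #1.
  var insertAt = 0;
  while (insertAt < scopedResults.length &&
         scopedResults[insertAt] <= scopePosition) {
    insertAt++;
  }
  // Insert the whole of list #2 at that point.
  return scopedResults.slice(0, insertAt)
    .concat(unscopedResults, scopedResults.slice(insertAt));
}
```

For example, with the scoping element at position 10, scoped matches at 2 and 40, and descendant matches at 11 and 15, the result is 2, 11, 15, 40.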

~TJ
Boris Zbarsky
2011-10-20 21:04:39 UTC
Post by Tab Atkins Jr.
I don't really see the performance issues. If you use + or ~ off of
:scope, you know for a fact that all the nodes come *after* any
selectors that don't have :scope.
Yes.
Post by Tab Atkins Jr.
1. Run the :scope-carrying selectors across the document together,
automatically yielding a dom-ordered list.
2. Run the :scope-absent selectors together, automatically yielding a
dom-ordered list.
3. Find where the scoping element would be inserted in the #1 list,
and insert the entire #2 list there.
foo.find(":scope + div, :scope div")

begs to differ.

So does:

foo.find("span :scope ~ div, span > :scope div")

(which is not quite as trivial to analyze).

You could try to look at the combinator following the part(s) that have
:scope, but that can get tricky.

And worse yet, the current :scope proposals allow an arbitrary nodeset
to be specified as matching :scope, at which point this whole thing is
out the window.

And yes, if you use a subject indicator then performance goes out the
window too; you basically have to search the whole DOM.

As long as you're ok with searching the whole DOM any time anything
funny is happening, of course, there's no other performance issue here.
But then I suspect this will be slow to start with....

-Boris
Tab Atkins Jr.
2011-10-20 21:15:55 UTC
Post by Tab Atkins Jr.
1. Run the :scope-carrying selectors across the document together,
automatically yielding a dom-ordered list.
2. Run the :scope-absent selectors together, automatically yielding a
dom-ordered list.
3. Find where the scoping element would be inserted in the #1 list,
and insert the entire #2 list there.
 foo.find(":scope + div, :scope div")
begs to differ.
Well, that would only cause a problem if there were also a
:scope-absent selector in the list, like:

foo.find(":scope + div, :scope div, span")

If they *all* carry :scope, then you can just run them all over the
whole tree and get the ordered set in the normal fashion.
 foo.find("span :scope ~ div, span > :scope div")
(which is not quite as trivial to analyze).
Same here.

Yeah, it's possible to make it necessary to interleave the
:scope-carrying and :scope-absent sets, if you run the two separately
(perhaps because you can optimize the :scope-absent ones to fail when
the search would escape the subtree).
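
For the interleaving case, a plain two-way merge is enough once both lists are already in document order. The sketch below again models document order as integers, which is an assumption made for illustration; the function name is hypothetical.

```javascript
// Merge two document-ordered position lists into one, dropping duplicates
// (a node may match both a :scope-carrying and a :scope-absent selector).
function mergeDocumentOrder(a, b) {
  var merged = [];
  var i = 0;
  var j = 0;
  while (i < a.length || j < b.length) {
    var next;
    if (j >= b.length || (i < a.length && a[i] <= b[j])) {
      next = a[i++];
    } else {
      next = b[j++];
    }
    if (merged.length === 0 || merged[merged.length - 1] !== next) {
      merged.push(next);
    }
  }
  return merged;
}
```

Take foo.find(":scope + div, :scope div, span") with a descendant div at position 12, a descendant span at 13, and the sibling div at 20. The scoped list is 12, 20 and the unscoped list is 13, so the merged order is 12, 13, 20. The cost is linear in the combined size of the two lists.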

If you internally modify the :scope-absent selectors to start with
:scope and a descendant combinator (in other words, do the "add a
unique id" trick that selector engines do in this situation), and then
run it with all the rest, though, that disappears. You might lose
some possible optimization (based on knowing you only need to search a
subtree), but not necessarily, and you're avoiding a potential
slowdown from having to interleave.
You could try to look at the combinator following the part(s) that have
:scope, but that can get tricky.
Yeah, I can see some possibilities, but they're not exhaustive.
And worse yet, the current :scope proposals allow an arbitrary nodeset to be
specified as matching :scope, at which point this whole thing is out the
window.
QSA allows that (or plans to?). Alex's find() proposal does not. The
scoping element is solely the 'this' in .find.
And yes, if you use a subject indicator then performance goes out the window
too; you basically have to search the whole DOM.
As long as you're ok with searching the whole DOM any time anything funny is
happening, of course, there's no other performance issue here.  But then I
suspect this will be slow to start with....
It was good enough for jQuery in the pre-QSA days, and it's still good
enough for jQuery now when it can't use QSA, I don't see why it's not
good enough for the rest of us. We can do at least as well, and
probably still better.

~TJ
Boris Zbarsky
2011-10-20 21:43:00 UTC
Post by Tab Atkins Jr.
If they *all* carry :scope, then you can just run them all over the
whole tree and get the ordered set in the normal fashion.
You can just prepend :scope to the ones missing it and run them over the
whole tree too.

But that means that now you're doing work proportional to the size of
your whole DOM, not the subtree rooted at the context element, which is
a pretty big difference.
Post by Tab Atkins Jr.
And worse yet, the current :scope proposals allow an arbitrary nodeset to be
specified as matching :scope, at which point this whole thing is out the
window.
QSA allows that (or plans to?). Alex's find() proposal does not. The
scoping element is solely the 'this' in .find.
I was assuming we were discussing find() in preference to the QSA
extensions. Maybe I was confused?
Post by Tab Atkins Jr.
It was good enough for jQuery in the pre-QSA days, and it's still good
enough for jQuery now when it can't use QSA, I don't see why it's not
good enough for the rest of us.
jQuery takes some shortcuts we can't take (note the getElementById
comments elsewhere in this thread).

Maybe it'll be ok. Maybe not. I'd rather not paint ourselves into the
"not" corner if we can avoid it....

-Boris
Lachlan Hunt
2011-10-20 21:33:25 UTC
Post by Tab Atkins Jr.
Post by Boris Zbarsky
Post by Tab Atkins Jr.
Basically, the presence of :scope would turn off *all* the limitations
foo.find(":scope + div, div")
what all divs in the document would be found? Or is the "oh, ignore the
reference node except for matching :scope" meant to only apply on a
per-selector basis inside the selector list? That has its own issues,
especially with performance (e.g. merging nodesets while preserving DOM
order).
Per-selector basis; we're not talking about naive string manipulation
here. Your example would return divs that are descendants or an
adjacent sibling of the scoping element.
Not necessarily. It depends what exactly it means for a selector to
contain :scope for determining whether or not to enable the implied
:scope behaviour. Consider:

foo.find(":not(:scope)");

If that is deemed to contain :scope and turn off the prepending of
scope, making it equivalent to:

document.querySelectorAll(":not(:scope)", foo);

Then it matches every element in the document except the context node.

Otherwise, if we decide that containing :scope means that it contains
a :scope selector that is not within a functional notation
pseudo-class, then it would prepend :scope, equivalent to:

document.querySelectorAll(":scope :not(:scope)", foo)

Then it matches all descendants of the context element.
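
The two readings of "contains :scope" can be compared on selector text alone. The sketch below is an assumption-laden illustration: the function names are hypothetical, parentheses are assumed to be balanced, and parentheses inside quoted attribute values are not handled.

```javascript
// Drop the arguments of every functional notation, so that
// ":not(:scope) > a" becomes ":not() > a". Nested parentheses are handled.
function stripFunctionalArguments(selector) {
  var depth = 0;
  var out = "";
  for (var i = 0; i < selector.length; i++) {
    var ch = selector[i];
    if (ch === "(") {
      if (depth === 0) out += ch;
      depth++;
    } else if (ch === ")") {
      depth--;
      if (depth === 0) out += ch;
    } else if (depth === 0) {
      out += ch;
    }
  }
  return out;
}

// countFunctional = true:  any occurrence of :scope counts.
// countFunctional = false: :scope inside (...) is ignored.
function containsScope(selector, countFunctional) {
  var text = countFunctional ? selector : stripFunctionalArguments(selector);
  return text.indexOf(":scope") !== -1;
}

// Prepend ":scope " only when the selector is judged not to contain :scope.
function rewriteForFind(selector, countFunctional) {
  return containsScope(selector, countFunctional)
    ? selector
    : ":scope " + selector;
}
```

Under the first reading, rewriteForFind(":not(:scope)", true) leaves the selector alone; under the second, rewriteForFind(":not(:scope)", false) yields ":scope :not(:scope)".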

In the latter case, then it would only ever be possible for matches to
be found as descendants, siblings or descendants of siblings of the
context element.

That would even be true in cases like:

foo.find("section:scope+div, div, ~p span, .x :scope>h1+span")

With the selector pre-processing, that selector becomes

"section:scope+div, :scope div, :scope~p span, .x :scope>h1+span"
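
A minimal sketch of that pre-processing step, operating on the selector string only. The function name is hypothetical, "contains :scope" is approximated by a substring test, and the list is assumed to be split by top-level commas only; a real implementation would work on the parsed selector instead.

```javascript
// Rewrite each selector in a comma-separated list:
//  - already contains :scope     -> left untouched
//  - starts with a combinator    -> ":scope" attached directly in front
//  - anything else               -> ":scope " (descendant combinator) prepended
function preprocessRootedSelector(selectorList) {
  return selectorList
    .split(",")
    .map(function (raw) {
      var selector = raw.trim();
      if (selector.indexOf(":scope") !== -1) {
        return selector;
      }
      if (/^[>+~]/.test(selector)) {
        return ":scope" + selector;
      }
      return ":scope " + selector;
    })
    .join(", ");
}
```

Applied to "section:scope+div, div, ~p span, .x :scope>h1+span", this produces the rewritten selector given above.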
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Tab Atkins Jr.
2011-10-20 21:41:30 UTC
Post by Tab Atkins Jr.
Post by Tab Atkins Jr.
Basically, the presence of :scope would turn off *all* the limitations
 foo.find(":scope + div, div")
what all divs in the document would be found?  Or is the "oh, ignore the
reference node except for matching :scope" meant to only apply on a
per-selector basis inside the selector list?  That has its own issues,
especially with performance (e.g. merging nodesets while preserving DOM
order).
Per-selector basis; we're not talking about naive string manipulation
here.  Your example would return divs that are descendants or an
adjacent sibling of the scoping element.
Not necessarily.  It depends what exactly it means for a selector to contain
:scope for determining whether or not to enable the implied :scope
 foo.find(":not(:scope)");
If that is deemed to contain :scope and turn off the prepending of scope,
 document.querySelectorAll(":not(:scope)", foo);
Then it matches every element in the document except the context node.
This seems perfectly fine, since if you just want all the elements
*underneath* the scoping element, you can instead do the much simpler:

foo.find("*")
Otherwise, if we decide that containing :scope means that it contains a
:scope selector that is not within a functional notation pseudo-class,
 document.querySelectorAll(":scope :not(:scope)", foo)
Then it matches all descendants of the context element.
This prevents us from doing things like ":matches(:scope, #foo)",
which seems potentially useful. (Plus, :matches(X) should always be
equivalent to just X, possibly modulo specificity differences.)
In the latter case, then it would only ever be possible for matches to be
found as descendants, siblings or descendants of siblings of the context
element.
 foo.find("section:scope+div, div, ~p span, .x :scope>h1+span")
With the selector pre-processing, that selector becomes
 "section:scope+div, :scope div, :scope~p span, .x :scope>h1+span"
Unless you use the reference combinator or the subject indicator, or
something else we come up with in the future that lets us do more
complicated searching.

~TJ
Jonas Sicking
2011-10-20 05:40:13 UTC
Post by Ojan Vafai
Post by Jonas Sicking
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
Discussions about a Scoped variant or ":scope" pseudo tacitly
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use. On today's internet -- the
one where 50% of all websites include jQuery -- you could even go with
element.$("selector") and everyone would know what you mean: it's
clearly a search rooted at the element on the left-hand side of the
dot.
Ceteris paribus, shorter is better. When there's a tie that needs to
be broken, the more frequently used the API, the shorter the name it
deserves -- i.e., the larger the component of its meaning it will gain
through use and repetition and not naming and documentation.
I know some on this list might disagree, but all of the above is
incredibly non-controversial today. Even if there may have been
debates about scoping or naming when QSA was originally designed,
history has settled them. And QSA lost on both counts.
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
correctly, a scoped search for multiple elements would be written as:
  element.querySelectorAll(":scope > div > .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from. I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
  element.findAll("> div > .thinger");
Out come the knives! You can't start a selector with a combinator!
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
 HTMLDocument.prototype.find =
 HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  }
  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  }
Of course, ":scope" in this case is just a special case of the ID
rooting hack, but if we're going to have it, we can kill both birds
with it.
Q.) Why do we need this at all? Don't the toolkits already just do
this internally?
A.) Are you saying everyone, everywhere, all the time should need to
use a toolkit to get sane behavior from the DOM? If so, what are we
doing here, exactly?
Q.) Shorter names? Those are for weaklings!
A.) And humans. Who still constitute most of our developers. Won't
someone please think of the humans?
Q.) You're just duplicating things!
A.) If you ignore all of the things that are different, then that's
true. If not, well, then no. This is a change. And a good one for the
reasons listed above.
Thoughts?
I like the general idea here. And since we're changing behavior, I
think it's a good opportunity to come up with shorter names. Naming is
really hard. The shorter names we use, the more likely it is that
we're going to break webpages which are messing around with the
prototype chain and it increases the risk that we'll regret it later
when we come up with even better functions which should use those
names. Say that we come up with an even better query language than
selectors, at that point .find will simply not be available to us.
However, it does seem like selectors are here to stay. And as much as
they have shortcomings, people seem to really like them for querying.
So with that out of the way, I agree that the CSS working group
shouldn't be what is holding us back. However we do need a precise
definition of what the new function does. Is prepending ":scope " and
then parsing as a normal selector always going to give the behavior we
want? This is actually what I think we got stuck on when the original
querySelector was designed.
So let's get into specifics about how things should work. According to
your proposal of simply prepending a conceptual ":scope" to each
<body id="3">
 <div id="context" foo=bar>
   <div id=1></div>
   <div class="class" id=2></div>
   <div class="withChildren" id=3><div class=child id=4></div></div>
 </div>
</body>
.findAll("div")  // returns ids 1,2,3,4
.findAll("")      // returns the context node itself. This was indicated undesirable
.findAll("body > :scope > div")  // returns nothing
Wouldn't this return ids 1,2,3 if we're not prepending :scope as you say
below?
Post by Jonas Sicking
.findAll("#3")  // returns id 3, but not the body node
.findAll("> div") // returns ids 1,2,3
.findAll("[foo=bar]") // returns nothing
.findAll("[id=1]") // returns id 1
.findAll(":first-child") // returns id 1
Is this desired behavior in all cases except the empty string? If so
this seems very doable to me. We can easily make an exception for the
case when the passed in string contains no selectors and make that an
error or some such.
I do however like the idea that if :scope appears in the selector,
then this removes the prepending of ":scope " to that selector group.
Is there a reason not to do that?
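A rough sketch of the rule as described here, assuming a single selector with no commas. The helper name toScopedSelector is hypothetical, and the ":scope" check is a plain substring test rather than a real selector parse, so it is only an approximation of what an implementation would do:

```javascript
// Hypothetical helper, not part of any spec or library.
function toScopedSelector(selector) {
  const trimmed = selector.trim();
  if (trimmed === "") {
    // The thread suggests treating "no selectors" as an error or some such.
    throw new SyntaxError("empty selector");
  }
  // Naive check: a substring test also fires on ":scope" inside an attribute
  // value or string, which a real parser would not treat as the pseudo-class.
  if (trimmed.includes(":scope")) {
    return trimmed; // explicit :scope, so nothing is prepended
  }
  return ":scope " + trimmed;
}

console.log(toScopedSelector("> div"));               // :scope > div
console.log(toScopedSelector("body > :scope > div")); // body > :scope > div
```

With this rule "body > :scope > div" is passed through untouched, which is the case Ojan asks about just above.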
Additionally it seems to me that we could allow the same syntax for
<style scoped>. But maybe others disagree?
Sounds good to me. A sticky case you left out is parent, sibling and
reference combinators.
.findAll("+ div")
Assuming the context node has siblings, should that return them?
Indeed, this is a very sticky case. Depending on how you interpret the
selector, this would either never return anything, or it would return
all following siblings to the context node with localName "div".

I think that most authors who use jQuery today would expect all
siblings to be returned. But it's arguably either significantly harder
to implement, or significantly slower to execute.

I.e. implementations would have to completely change their
implementation strategy, which currently is "test all nodes that could
possibly match and see if they match against the selector", to
"evaluate each step of the selector as an expression which returns a
set of nodes, use that set of nodes as input into the next step of the
selector".

It's definitely implementable, but will require significantly more
work for implementations. But given how commonly selector-querying is
done, it just might be worth doing.
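To make the difference concrete, here is a toy sketch of the two strategies for "+ div". It uses plain objects instead of real DOM nodes, and the function names (node, candidateTesting, setBased) are made up for illustration. It treats "+" as the adjacent-sibling combinator, so at most one sibling can come back:

```javascript
// Plain-object tree standing in for real DOM nodes (an assumption of this sketch).
function node(name, children = []) {
  const n = { name, children, parent: null };
  for (const child of children) child.parent = n;
  return n;
}

function descendants(n) {
  const out = [];
  for (const child of n.children) {
    out.push(child, ...descendants(child));
  }
  return out;
}

// Strategy 1: test each candidate (descendants only) against ":scope + div".
function candidateTesting(context) {
  return descendants(context).filter((c) => {
    const siblings = c.parent.children;
    const previous = siblings[siblings.indexOf(c) - 1];
    return c.name === "div" && previous === context;
  });
}

// Strategy 2: evaluate the "+ div" step over the input set [context].
function setBased(context) {
  const out = [];
  for (const n of [context]) {
    const siblings = n.parent ? n.parent.children : [];
    const next = siblings[siblings.indexOf(n) + 1];
    if (next && next.name === "div") out.push(next);
  }
  return out;
}

const context = node("div", [node("div")]);
const sibling = node("div");
node("body", [context, sibling]);

console.log(candidateTesting(context).length); // 0
console.log(setBased(context)[0] === sibling); // true
```

The first strategy can never return the sibling because it only ever looks at descendants of the context node, while the second follows the combinator wherever it leads.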
Post by Ojan Vafai
If so,
should it match siblings when using <style scoped>?
IMO, it shouldn't match anything in either case. We should assert that
only descendants of the scope element will ever be returned. This would also
make it naturally match <style scoped> where only descendants of the scope
element are ever affected.
Indeed, in scoped stylesheets it seems very awkward to match siblings
of the stylesheet scope. I think I agree that in that context
selectors like "+ div" (or ":scope + div") shouldn't match anything.

I'm less convinced that that is a good idea for .find/.findAll.

Would love input from Alex and Yehuda here.

/ Jonas
Boris Zbarsky
2011-10-20 02:31:10 UTC
Permalink
Post by Jonas Sicking
I like the general idea here. And since we're changing behavior, I
think it's a good opportunity to come up with shorter names. Naming is
really hard. The shorter names we use, the more likely it is that
we're going to break webpages which are messing around with the
prototype chain
Not just the proto chain. Every method you add on Element or Document
will break any inline event handler attributes that happen to use that
name as a bareword. We had some amount of that with the "list" property
on inputs, and that only added a property on HTMLInputElement....
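A simplified model of why this happens, as a sketch only: barewords in an inline event handler attribute are looked up on the element first, then the document, then the global object (the form owner is left out here, and a failed lookup returns undefined instead of throwing). The resolveBareword helper and the plain objects are stand-ins, not real DOM:

```javascript
// Toy model of bareword lookup in an inline event handler attribute.
function resolveBareword(name, element, documentObj, globalObj) {
  for (const scope of [element, documentObj, globalObj]) {
    if (name in scope) return scope[name];
  }
  return undefined;
}

const pageList = ["a", "b"];                 // the page's own global "list"
const globalObj = { list: pageList };
const documentObj = {};
const inputBefore = {};                      // input without a "list" property
const inputAfter = { list: null };           // input once "list" exists on it

console.log(resolveBareword("list", inputBefore, documentObj, globalObj) === pageList); // true
console.log(resolveBareword("list", inputAfter, documentObj, globalObj));               // null
```

Once the element itself has a property with that name, the handler silently stops seeing the page's global.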

Again, in this case a shorter name may make sense, but in general there
are good reasons for not using short names all the time.

-Boris
Sean Hogan
2011-10-20 06:14:12 UTC
Permalink
Post by Jonas Sicking
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
I like the general idea here. And since we're changing behavior, I
think it's a good opportunity to come up with shorter names. Naming is
really hard. The shorter names we use, the more likely it is that
we're going to break webpages which are messing around with the
prototype chain and it increases the risk that we'll regret it later
when we come up with even better functions which should use those
names. Say that we come up with an even better query language than
selectors, at that point .find will simply not be available to us.
However, it does seem like selectors are here to stay. And as much as
they have shortcomings, people seem to really like them for querying.
So with that out of the way, I agree that the CSS working group
shouldn't be what is holding us back.
I don't agree with Selectors API supporting invalid selectors, but I
guess the discussion is more appropriate here than there.
Post by Jonas Sicking
However we do need a precise
definition of what the new function does. Is prepending ":scope " and
then parsing as a normal selector always going to give the behavior we
want? This is actually what I think we got stuck on when the original
querySelector was designed.
So let's get into specifics about how things should work. According to
your proposal of simply prepending a conceptual ":scope" to each
<body id="3">
<div id="context" foo=bar>
<div id=1></div>
<div class="class" id=2></div>
<div class="withChildren" id=3><div class=child id=4></div></div>
</div>
</body>
.findAll("div") // returns ids 1,2,3,4
.findAll("") // returns the context node itself. This was indicated undesirable
.findAll("body> :scope> div") // returns nothing
.findAll("#3") // returns id 3, but not the body node
.findAll("> div") // returns ids 1,2,3
.findAll("[foo=bar]") // returns nothing
.findAll("[id=1]") // returns id 1
.findAll(":first-child") // returns id 1
Is this desired behavior in all cases except the empty string? If so
this seems very doable to me. We can easily make an exception for the
case when the passed in string contains no selectors and make that an
error or some such.
I know everyone knows this, but...

These specific examples (where the selector is not a comma separated
list) plus most instances of selector lists (e.g. "th, td", "> ul > li,
ol > li") can be trivially supported by a tiny wrapper around
querySelectorAll() as defined in Selectors API v2. In fact, I've never
seen a selector list that couldn't be successfully split on "," and I
wouldn't be surprised if they are never used outside of stylesheets.
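For illustration, such a wrapper might look roughly like the sketch below. It assumes an object with a querySelectorAll(selector) method that understands ":scope" as in the Selectors API v2 draft. The findAll function here is a hypothetical free function, and the comma split is deliberately naive:

```javascript
// Hypothetical wrapper; `el` only needs a querySelectorAll(selector) method.
function findAll(el, selectorList) {
  const scoped = selectorList
    .split(",") // naive: breaks on commas inside :not(...) or quoted strings
    .map((s) => ":scope " + s.trim())
    .join(", ");
  return el.querySelectorAll(scoped);
}

// Stand-in element that just echoes the selector it receives.
const fakeElement = { querySelectorAll: (selector) => selector };
console.log(findAll(fakeElement, "> ul > li, > ol > li"));
// :scope > ul > li, :scope > ol > li
```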
Post by Jonas Sicking
I do however like the idea that if :scope appears in the selector,
then this removes the prepending of ":scope " to that selector group.
Is there a reason not to do that?
1. Already supported (in the draft spec) by querySelectorAll().
2. Not supported by JS libs.
3. No use cases requiring it.
Post by Jonas Sicking
Additionally it seems to me that we could allow the same syntax for
<style scoped>. But maybe others disagree?
Surely it is both or neither. You don't want to set a precedent for DOM
selectors not matching CSS selectors.
Post by Jonas Sicking
I think appropriate optimizations as well as extensible functions
should be out-of-scope for this thread. They are both big subjects on
their own and we're approaching 50 emails in this thread.
Jonas Sicking
2011-10-20 06:41:11 UTC
Permalink
Post by Sean Hogan
Post by Jonas Sicking
I do however like the idea that if :scope appears in the selector,
then this removes the prepending of ":scope " to that selector group.
Is there a reason not to do that?
1. Already supported (in the draft spec) by querySelectorAll().
2. Not supported by JS libs.
3. No use cases requiring it.
It's annoying if querying engines have to work with two different
query methods (.findAll and .querySelectorAll) and know when to call
which. So I don't think 1 is a particularly good point.

However 3 is a very good point. If there aren't use cases, then we
shouldn't support it. And 2 is a good indicator that there aren't use
cases.

But if someone knows of use cases then I'm all ears.

It's also something that can be added at a later point if use cases arise.

/ Jonas
Sean Hogan
2011-10-20 07:01:50 UTC
Permalink
Post by Jonas Sicking
Post by Sean Hogan
Post by Jonas Sicking
I do however like the idea that if :scope appears in the selector,
then this removes the prepending of ":scope " to that selector group.
Is there a reason not to do that?
1. Already supported (in the draft spec) by querySelectorAll().
2. Not supported by JS libs.
3. No use cases requiring it.
It's annoying if querying engines have to work with two different
query methods (.findAll and .querySelectorAll) and know when to call
which. So I don't think 1 is a particularly good point.
I don't follow that.
If you want <style scoped> behavior you call findAll().
If not you call querySelectorAll().

Sean
Sean Hogan
2011-10-20 08:14:23 UTC
Permalink
Post by Jonas Sicking
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
element.findAll("> div> .thinger");
I like the general idea here.
I think appropriate optimizations as well as extensible functions
should be out-of-scope for this thread. They are both big subjects on
their own and we're approaching 50 emails in this thread.
If find / findAll are added to the spec there should also be an
equivalent of matchesSelector that handles implicitly scoped selectors,
e.g. "> div > .thinger". To aid discussion I will call this matches(),
but I don't think it is a good final choice.

The primary use-case for matchesSelector() has been event-delegation,
and this is the same for matches(). More specifically, consider the
following scenario:

jQuery adds a new event registration method that uses event delegation
to mimic the behavior of:
$(elem).find("> div > .thinger").bind(eventType, fn);
The new method is called proxybind(), and the equivalent of the above is:
$(elem).proxybind("> div > .thinger", eventType, fn);

The event handling for proxybind() would invoke matches("> div >
.thinger", [elem]) on elements between the event target and elem to find
matching elements.
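A minimal sketch of that lookup, with plain objects standing in for DOM nodes. findDelegateTarget is a made-up name, and the matchesScoped callback is a stand-in for the proposed matches("> div > .thinger", [elem]) call, hand-coded here because no such method exists yet:

```javascript
// Walk from the event target up to (but not including) `root` and return the
// first node for which the scoped match succeeds, or null if none does.
function findDelegateTarget(target, root, matchesScoped) {
  for (let n = target; n && n !== root; n = n.parentNode) {
    if (matchesScoped(n, root)) return n;
  }
  return null;
}

// Hand-written equivalent of "> div > .thinger" relative to `root`.
function isThinger(n, root) {
  const parent = n.parentNode;
  return n.className === "thinger" &&
    parent !== null && parent.tag === "div" &&
    parent.parentNode === root;
}

const root = { tag: "section", className: "", parentNode: null };
const div = { tag: "div", className: "", parentNode: root };
const thinger = { tag: "a", className: "thinger", parentNode: div };
const clicked = { tag: "span", className: "", parentNode: thinger };

console.log(findDelegateTarget(clicked, root, isThinger) === thinger); // true
```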

Sean
Jonas Sicking
2011-10-20 08:32:08 UTC
Permalink
Post by Jonas Sicking
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
 I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
  element.findAll(">  div>  .thinger");
I like the general idea here.
I think appropriate optimizations as well as extensible functions
should be out-of-scope for this thread. They are both big subjects on
their own and we're approaching 50 emails in this thread.
If find / findAll are added to the spec there should also be an equivalent
of matchesSelector that handles implicitly scoped selector, e.g. "> div >
.thinger". To aid discussion I will call this matches(), but I don't think
it is a good final choice.
How would .matches() work? For .findAll we basically prepend a ":scope
" selector step where the :scope pseudo-class matches the element on
which .findAll was called.

If we did the same for .matches() then

elem.matches("foo")

would try to match elem against the selector ":scope foo" where
":scope" only matches elem and thus the selector only matches elements
which are descendants of the element on which .matches() was called.

In other words, .matches() would never match anything.

Clearly you must have some other behavior in mind as a function which
always returns false isn't particularly interesting.

/ Jonas
Sean Hogan
2011-10-20 09:20:34 UTC
Permalink
Post by Jonas Sicking
Post by Jonas Sicking
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
element.findAll("> div> .thinger");
I like the general idea here.
I think appropriate optimizations as well as extensible functions
should be out-of-scope for this thread. They are both big subjects on
their own and we're approaching 50 emails in this thread.
If find / findAll are added to the spec there should also be an equivalent
of matchesSelector that handles implicitly scoped selector, e.g. "> div>
.thinger". To aid discussion I will call this matches(), but I don't think
it is a good final choice.
How would .matches() work? For .findAll we basically prepend a ":scope
" selector step where the :scope pseudo-class matches the element on
which .findAll was called.
If we did the same for .matches() then
elem.matches("foo")
would try to match elem against the selector ":scope foo" where
":scope" only matches elem and thus the selector only matches elements
which are descendants of the element on which .matches() was called.
In other words, .matches() would never match anything.
Clearly you must have some other behavior in mind as a function which
always returns false isn't particularly interesting.
/ Jonas
See the definition of matchesSelector(selector, [ refNodes ]) in the spec:
http://www.w3.org/TR/selectors-api2/#matchtesting
Lachlan Hunt
2011-10-20 09:42:11 UTC
Permalink
Post by Sean Hogan
The primary use-case for matchesSelector() has been event-delegation,
and this is the same for matches(). More specifically, consider the
jQuery adds a new event registration method that uses event delegation
$(elem).find("> div > .thinger").bind(eventType, fn);
$(elem).proxybind("> div > .thinger", eventType, fn);
The event handling for proxybind() would invoke matches("> div >
.thinger", [elem]) on elements between the event target and elem to find
matching elements.
It may not be too late to introduce that behaviour into matchesSelector,
with a switch based on the presence or absence of the
refNodes/refElement parameter.

As currently specified, calling the following doesn't and shouldn't
prepend :scope.

el.matchesSelector("div .foo");

This one also matches the prefixed implementations in browsers, since
most haven't started supporting :scope yet, and I don't believe
Mozilla's experimental implementation [1] has landed yet.

As currently specified, calling this:

el.matchesSelector("div .foo", ref);

Also doesn't prepend :scope automatically, but in that case, the ref
nodes do nothing useful. Authors have to use :scope explicitly for them
to be useful as in something like:

el.matchesSelector(":scope div .foo", ref);

Or

el.matchesSelector("div:scope .foo", ref);

One thing we could possibly do is define that if ref nodes are passed,
and the selector doesn't explicitly use :scope, then effectively prepend
":scope ". This would be exactly the same behaviour as that discussed
for .findAll();

That wouldn't break compatibility with anything, optimises for a common
case and avoids introducing two separate match methods.

e.g.
el.matchesSelector("div .foo"); // No ref, no magic :scope

el.matchesSelector("div .foo", ref); // Implied, magic :scope
el.matchesSelector("+.foo", ref); // Implied, magic :scope

el.matchesSelector(":scope div .foo", ref); // Explicit, no magic :scope
el.matchesSelector("div:scope .foo", ref); // Explicit, no magic :scope
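The switch could be sketched as a pure string rewrite, assuming the ":scope" check is a simple substring test (a real implementation would work on the parsed selector). effectiveSelector is a hypothetical name used only for this illustration:

```javascript
// Returns the selector that would actually be matched under the proposed rule.
function effectiveSelector(selector, refNodes) {
  const hasRefs = refNodes !== undefined && refNodes !== null;
  if (hasRefs && !selector.includes(":scope")) {
    return ":scope " + selector; // implied, magic :scope
  }
  return selector; // no ref nodes, or :scope used explicitly
}

const ref = {}; // stand-in for a reference element
console.log(effectiveSelector("div .foo"));             // div .foo
console.log(effectiveSelector("div .foo", ref));        // :scope div .foo
console.log(effectiveSelector("+.foo", ref));           // :scope +.foo
console.log(effectiveSelector("div:scope .foo", ref));  // div:scope .foo
```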

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=648722
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Sean Hogan
2011-10-20 10:29:03 UTC
Permalink
Post by Lachlan Hunt
Post by Sean Hogan
The primary use-case for matchesSelector() has been event-delegation,
and this is the same for matches(). More specifically, consider the
jQuery adds a new event registration method that uses event delegation
$(elem).find("> div > .thinger").bind(eventType, fn);
$(elem).proxybind("> div > .thinger", eventType, fn);
The event handling for proxybind() would invoke matches("> div >
.thinger", [elem]) on elements between the event target and elem to find
matching elements.
It may not be too late to introduce that behaviour into
matchesSelector, with a switch based on the presence or absence of the
refNodes/refElement parameter.
As currently specified, calling the following doesn't and shouldn't
prepend :scope.
el.matchesSelector("div .foo");
This one also matches the prefixed implementations in browsers, since
most haven't started supporting :scope yet, and I don't believe
Mozilla's experimental implementation [1] has landed yet.
el.matchesSelector("div .foo", ref);
Also doesn't prepend :scope automatically, but in that case, the ref
nodes do nothing useful.
But this selector can still match elements. Admittedly I can't think of
a use-case for this, but it is conceivable for someone to expect this to
work without an implied :scope.
Post by Lachlan Hunt
Authors have to use :scope explicitly for them to be useful as in
el.matchesSelector(":scope div .foo", ref);
Or
el.matchesSelector("div:scope .foo", ref);
One thing we could possibly do is define that if ref nodes are passed,
and the selector doesn't explicitly use :scope, then effectively
prepend ":scope ". This would be exactly the same behaviour as that
discussed for .findAll();
That wouldn't break compatibility with anything, optimises for a
common case and avoids introducing two separate match methods.
I don't see the need for findAll(), but if it is added I think it should
always imply ":scope " at the start of a selector, and I think a
separate match method that does the same should be added. To do
otherwise seems too ambiguous for a DOM API.
Post by Lachlan Hunt
e.g.
el.matchesSelector("div .foo"); // No ref, no magic :scope
el.matchesSelector("div .foo", ref); // Implied, magic :scope
el.matchesSelector("+.foo", ref); // Implied, magic :scope
el.matchesSelector(":scope div .foo", ref); // Explicit, no magic :scope
el.matchesSelector("div:scope .foo", ref); // Explicit, no magic :scope
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=648722
Alex Russell
2011-10-20 10:46:46 UTC
Permalink
Post by Jonas Sicking
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
Discussions about a Scoped variant or ":scope" pseudo tacitly
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use. On today's internet -- the
one where 50% of all websites include jQuery -- you could even go with
element.$("selector") and everyone would know what you mean: it's
clearly a search rooted at the element on the left-hand side of the
dot.
Ceteris paribus, shorter is better. When there's a tie that needs to
be broken, the more frequently used the API, the shorter the name it
deserves -- i.e., the larger the component of its meaning it will gain
through use and repetition and not naming and documentation.
I know some on this list might disagree, but all of the above is
incredibly non-controversial today. Even if there may have been
debates about scoping or naming when QSA was originally designed,
history has settled them. And QSA lost on both counts.
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
  element.querySelectorAll(":scope > div > .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from. I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
  element.findAll("> div > .thinger");
Out come the knives! You can't start a selector with a combinator!
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
 HTMLDocument.prototype.find =
 HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  }
  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  }
Of course, ":scope" in this case is just a special case of the ID
rooting hack, but if we're going to have it, we can kill both birds
with it.
Q.) Why do we need this at all? Don't the toolkits already just do
this internally?
A.) Are you saying everyone, everywhere, all the time should need to
use a toolkit to get sane behavior from the DOM? If so, what are we
doing here, exactly?
Q.) Shorter names? Those are for weaklings!
A.) And humans. Who still constitute most of our developers. Won't
someone please think of the humans?
Q.) You're just duplicating things!
A.) If you ignore all of the things that are different, then that's
true. If not, well, then no. This is a change. And a good one for the
reasons listed above.
Thoughts?
I like the general idea here. And since we're changing behavior, I
think it's a good opportunity to come up with shorter names. Naming is
really hard. The shorter names we use, the more likely it is that
we're going to break webpages which are messing around with the
prototype chain and it increases the risk that we'll regret it later
when we come up with even better functions which should use those
names.
So long as the slots are still writable, no loss. Their patches into
the prototype chain still exist. Being afraid of this when we're "on
top" seems really, *REALLY* strange to me.
Post by Jonas Sicking
Say that we come up with an even better query language than
selectors, at that point .find will simply not be available to us.
Premature optimization. And "$" is still available ;-)
Post by Jonas Sicking
However, it does seem like selectors are here to stay. And as much as
they have shortcomings, people seem to really like them for querying.
So with that out of the way, I agree that the CSS working group
shouldn't be what is holding us back. However we do need a precise
definition of what the new function does. Is prepending ":scope " and
then parsing as a normal selector always going to give the behavior we
want? This is actually what I think we got stuck on when the original
querySelector was designed.
So let's get into specifics about how things should work. According to
your proposal of simply prepending a conceptual ":scope" to each
<body id="3">
 <div id="context" foo=bar>
   <div id=1></div>
   <div class="class" id=2></div>
   <div class="withChildren" id=3><div class=child id=4></div></div>
 </div>
</body>
.findAll("div")  // returns ids 1,2,3,4
.findAll("")      // returns the context node itself. This was indicated undesirable
And, in follow-up mail, we talked extensively about why I didn't
*really* mean "just prepend the string ':scope '".

I think empty string is a special case that we should treat as "return
an empty list".
Post by Jonas Sicking
.findAll("body > :scope > div")  // returns nothing
I suggest we treat ":scope" occurring after the first term of the
selector as an error.
Post by Jonas Sicking
.findAll("#3")  // returns id 3, but not the body node
Correct. Assuming the query is
document.find("#context").findAll("#3"), which is what I think you
mean for the root to be in these examples?
Post by Jonas Sicking
.findAll("> div") // returns ids 1,2,3
Yep.
Post by Jonas Sicking
.findAll("[foo=bar]") // returns nothing
Right.
Post by Jonas Sicking
.findAll("[id=1]") // returns id 1
Right.
Post by Jonas Sicking
.findAll(":first-child") // returns id 1
Agreed.
Post by Jonas Sicking
Is this desired behavior in all cases except the empty string? If so
this seems very doable to me. We can easily make an exception for the
case when the passed in string contains no selectors and make that an
error or some such.
I do however like the idea that if :scope appears in the selector,
then this removes the prepending of ":scope " to that selector group.
Is there a reason not to do that?
Hmm. I think I might like that better than an exception. Worried about
it being too magical, but I don't have a strong opinion either way.
Post by Jonas Sicking
Additionally it seems to me that we could allow the same syntax for
<style scoped>. But maybe others disagree?
I think appropriate optimizations as well as extensible functions
should be out-of-scope for this thread. They are both big subjects on
their own and we're approaching 50 emails in this thread.
Agreed.
Jonas Sicking
2011-10-20 18:55:49 UTC
Permalink
Post by Alex Russell
Post by Jonas Sicking
However, it does seem like selectors are here to stay. And as much as
they have shortcomings, people seem to really like them for querying.
So with that out of the way, I agree that the CSS working group
shouldn't be what is holding us back. However we do need a precise
definition of what the new function does. Is prepending ":scope " and
then parsing as a normal selector always going to give the behavior we
want? This is actually what I think we got stuck on when the original
querySelector was designed.
So let's get into specifics about how things should work. According to
your proposal of simply prepending a conceptual ":scope" to each
<body id="3">
 <div id="context" foo=bar>
   <div id=1></div>
   <div class="class" id=2></div>
   <div class="withChildren" id=3><div class=child id=4></div></div>
 </div>
</body>
.findAll("div")  // returns ids 1,2,3,4
.findAll("")      // returns the context node itself. This was indicated undesirable
And, in follow-up mail, we talked extensively about why I didn't
*really* mean "just prepend the string ':scope '".
I think empty string is a special case that we should treat as "return
an empty list".
Sounds reasonable (sorry to bring this case up again, I just wanted to
be comprehensive, though I failed at that, see below)
Post by Alex Russell
Post by Jonas Sicking
.findAll("body > :scope > div")  // returns nothing
I suggest we treat ":scope" occurring after the first term of the
selector as an error.
So how should it work in the first term?

I.e. what should

.findAll(":scope")
.findAll("div:scope")
.findAll("[foo=bar]:scope")
.findAll(":scope div")
.findAll("div:scope div")
.findAll("div:scope #3")

return?

Also, why should :scope appearing in the first term be different from
appearing in any other term?

What is the use case?

Do libraries have anything equivalent today?
Post by Alex Russell
Post by Jonas Sicking
.findAll("#3")  // returns id 3, but not the body node
Correct. Assuming the query is
document.find("#context").findAll("#3"), which is what I think you
mean for the root to be in these examples?
Yup.

/ Jonas
Jonas Sicking
2011-10-20 05:55:27 UTC
Permalink
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
Discussions about a Scoped variant or ":scope" pseudo tacitly
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use. On today's internet -- the
one where 50% of all websites include jQuery -- you could even go with
element.$("selector") and everyone would know what you mean: it's
clearly a search rooted at the element on the left-hand side of the
dot.
Ceteris paribus, shorter is better. When there's a tie that needs to
be broken, the more frequently used the API, the shorter the name it
deserves -- i.e., the larger the component of its meaning it will gain
through use and repetition and not naming and documentation.
I know some on this list might disagree, but all of the above is
incredibly non-controversial today. Even if there may have been
debates about scoping or naming when QSA was originally designed,
history has settled them. And QSA lost on both counts.
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
  element.querySelectorAll(":scope > div > .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from. I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
  element.findAll("> div > .thinger");
Out come the knives! You can't start a selector with a combinator!
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
 HTMLDocument.prototype.find =
 HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  }
  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  }
Of course, ":scope" in this case is just a special case of the ID
rooting hack, but if we're going to have it, we can kill both birds
with it.
Q.) Why do we need this at all? Don't the toolkits already just do
this internally?
A.) Are you saying everyone, everywhere, all the time should need to
use a toolkit to get sane behavior from the DOM? If so, what are we
doing here, exactly?
Q.) Shorter names? Those are for weaklings!
A.) And humans. Who still constitute most of our developers. Won't
someone please think of the humans?
Q.) You're just duplicating things!
A.) If you ignore all of the things that are different, then that's
true. If not, well, then no. This is a change. And a good one for the
reasons listed above.
Thoughts?
Oh, and as a separate issue. I think .findAll should return a plain
old JS Array. Not a NodeList or any other type of host object. One of
the use cases is being able to mutate the returned value. This is
useful if you're for example doing multiple .findAll calls (possibly
with different context nodes) and want to merge the resulting lists
into a single list.

/ Jonas
Alex Russell
2011-10-20 10:50:35 UTC
Permalink
Post by Jonas Sicking
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
Discussions about a Scoped variant or ":scope" pseudo tacitly
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use. On today's internet -- the
one where 50% of all websites include jQuery -- you could even go with
element.$("selector") and everyone would know what you mean: it's
clearly a search rooted at the element on the left-hand side of the
dot.
Ceteris paribus, shorter is better. When there's a tie that needs to
be broken, the more frequently used the API, the shorter the name it
deserves -- i.e., the larger the component of its meaning it will gain
through use and repetition and not naming and documentation.
I know some on this list might disagree, but all of the above is
incredibly non-controversial today. Even if there may have been
debates about scoping or naming when QSA was originally designed,
history has settled them. And QSA lost on both counts.
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
  element.querySelectorAll(":scope > div > .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from. I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
  element.findAll("> div > .thinger");
Out come the knives! You can't start a selector with a combinator!
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
  HTMLDocument.prototype.find =
  HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  };
  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  };
Of course, ":scope" in this case is just a special case of the ID
rooting hack, but if we're going to have it, we can kill both birds
with it.
Q.) Why do we need this at all? Don't the toolkits already just do
this internally?
A.) Are you saying everyone, everywhere, all the time should need to
use a toolkit to get sane behavior from the DOM? If so, what are we
doing here, exactly?
Q.) Shorter names? Those are for weaklings!
A.) And humans. Who still constitute most of our developers. Won't
someone please think of the humans?
Q.) You're just duplicating things!
A.) If you ignore all of the things that are different, then that's
true. If not, well, then no. This is a change. And a good one for the
reasons listed above.
Thoughts?
Oh, and as a separate issue. I think .findAll should return a plain
old JS Array. Not a NodeList or any other type of host object.
I strongly agree that it should be an Array *type*, but I think just
returning a plain Array is the wrong resolution to our NodeList
problem. WebIDL should specify that DOM List types *are* Array types.
It's insane that we even have a NodeList type which isn't a real array
at all. Adding a parallel system when we could just fix the one we
have (and preserve the value of a separate prototype for extension) is
wonky to me.

That said, I'd *also* support the ability to have some sort of
decorator mechanism before return on .find() or a way to re-route the
prototype of the returned Array.

+heycam to debate this point.
Post by Jonas Sicking
One of
the use cases is being able to mutate the returned value. This is
useful if you're for example doing multiple .findAll calls (possibly
with different context nodes) and want to merge the resulting lists
into a single list.
Agreed. An end to the Array.slice() hacks would be great.
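
For readers who have not met the hack being referred to, here is a rough sketch in plain JavaScript. It uses a hand-built array-like object as a stand-in for a NodeList (an assumption, so that it runs without a DOM); fakeNodeList and toArray are illustrative names only.

```javascript
// Stand-in for a static NodeList: indexed properties plus length, no Array methods.
function fakeNodeList(items) {
  const list = { length: items.length };
  items.forEach((item, i) => { list[i] = item; });
  return list;
}

// The "slice hack": borrow Array.prototype.slice to copy an array-like into a real Array.
function toArray(listLike) {
  return Array.prototype.slice.call(listLike);
}

const first = fakeNodeList(["div#1", "div#2"]);
const second = fakeNodeList(["div#5"]);

// Array-likes have no concat of their own, so every result needs converting first.
console.log(typeof first.concat); // undefined
const merged = toArray(first).concat(toArray(second));
console.log(Array.isArray(merged), merged.length); // true 3
```

If .findAll returned a real Array, the two toArray calls would presumably be unnecessary and the merge would reduce to a single concat.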
Lachlan Hunt
2011-10-20 11:05:53 UTC
Permalink
Post by Alex Russell
Post by Jonas Sicking
Oh, and as a separate issue. I think .findAll should return a plain
old JS Array. Not a NodeList or any other type of host object.
I strongly agree that it should be an Array *type*, but I think just
returning a plain Array is the wrong resolution to our NodeList
problem. WebIDL should specify that DOM List types *are* Array types.
We need NodeList separate from Array where they are live lists. I
forget the reason we originally opted for a static NodeList rather than
Array when this issue was originally discussed a few years ago.
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Alex Russell
2011-10-20 11:18:32 UTC
Permalink
Post by Lachlan Hunt
Post by Alex Russell
Post by Jonas Sicking
Oh, and as a separate issue. I think .findAll should return a plain
old JS Array. Not a NodeList or any other type of host object.
I strongly agree that it should be an Array *type*, but I think just
returning a plain Array is the wrong resolution to our NodeList
problem. WebIDL should specify that DOM List types *are* Array types.
We need NodeList separate from Array where they are live lists.
No we don't. The fact that there's someone else who has a handle to
the list and can mutate it underneath you is a documentation issue,
not a question of type...unless the argument is that the slots should
be non-configurable, non-writable except by the browser that's also
holding a ref to it.
Post by Lachlan Hunt
 I forget
the reason we originally opted for a static NodeList rather than Array when
this issue was originally discussed a few years ago.
Lachlan Hunt
2011-10-20 11:37:42 UTC
Permalink
Post by Alex Russell
Post by Lachlan Hunt
We need NodeList separate from Array where they are live lists.
No we don't. The fact that there's someone else who has a handle to
the list and can mutate it underneath you is a documentation issue,
not a question of type...unless the argument is that the slots should
be non-configurable, non-writable except by the browser that's also
holding a ref to it.
The author cannot be allowed to directly modify a live list, as such it
must be an immutable object from the script's perspective. Otherwise,
things would get really complicated if this happened:

var p = document.getElementsByTagName("p");
p.reverse();
p.push("x");
p.shift();
document.body.insertBefore(document.createElement("p"), p[2]);

Where in the array would that new P element get added?

NodeLists are supposed to be live and in document order. Ordinarily,
that new P element would be inserted into the document and the change
reflected in the NodeList. By allowing an author to modify the list in
some way, that completely breaks the way NodeLists are defined to work.
Now while it's arguable that live node lists were a mistake and that
it would have been better if static Arrays were returned, we are stuck
with them and cannot change that.
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Erik Arvidsson
2011-10-20 16:05:41 UTC
Permalink
Post by Alex Russell
Post by Lachlan Hunt
We need NodeList separate from Array where they are live lists.
No we don't. The fact that there's someone else who has a handle to
the list and can mutate it underneath you is a documentation issue,
not a question of type...unless the argument is that the slots should
be non-configurable, non-writable except by the browser that's also
holding a ref to it.
The author cannot be allowed to directly modify a live list, as such it must
be an immutable object from the script's perspective.  Otherwise, things
var p = document.getElementsByTagName("p");
p.reverse();
Just define [[Put]] to throw (by only having a getter). Since reverse
is defined using [[Put]] things would work as expected.
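
A small sketch of that idea, under the assumption that the list is modelled as an ordinary object whose indexed properties and length are accessors with getters only. readOnlyView is a hypothetical name, and the backing array stands in for state that only the browser would hold.

```javascript
// Hypothetical read-only view over a backing array held by the "browser" side.
function readOnlyView(backing) {
  const view = {};
  Object.defineProperty(view, "length", { get: () => backing.length });
  backing.forEach((_, i) => {
    Object.defineProperty(view, i, { get: () => backing[i], enumerable: true });
  });
  return view;
}

const getterView = readOnlyView(["p1", "p2", "p3"]);

// Generic Array methods that only read still work.
const upper = Array.prototype.map.call(getterView, (p) => p.toUpperCase());
console.log(upper.join(",")); // P1,P2,P3

// reverse() has to write to the indexed properties, so it throws.
let reverseError = null;
try {
  Array.prototype.reverse.call(getterView);
} catch (e) {
  reverseError = e;
}
console.log(reverseError instanceof TypeError); // true
```

The write fails because a property with a getter and no setter rejects assignment, and the generic Array methods treat a rejected write as an error.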
--
erik
Boris Zbarsky
2011-10-20 14:14:41 UTC
Permalink
Post by Alex Russell
No we don't. The fact that there's someone else who has a handle to
the list and can mutate it underneath you
There is no sane way to mutate the list on the part of the browser if
someone else is also messing with it, because the someone else can
violate basic invariants the browser's behavior needs to maintain.
Post by Alex Russell
unless the argument is that the slots should
be non-configurable, non-writable except by the browser that's also
holding a ref to it.
"Yes".

Though I don't know what "slots" you're talking about; the only sane JS
implementation of live nodelists is as a proxy. There's no way to get
the behaviors that browsers have for them otherwise.

-Boris
Alex Russell
2011-10-20 14:23:24 UTC
Permalink
Post by Boris Zbarsky
Post by Alex Russell
No we don't. The fact that there's someone else who has a handle to
the list and can mutate it underneath you
There is no sane way to mutate the list on the part of the browser if
someone else is also messing with it, because the someone else can violate
basic invariants the browser's behavior needs to maintain.
Right. So you need to vend an apparently-immutable Array, one which
can only be changed by the browser. I think that could be accomplished
in terms of Proxies. But it's still an Array type.
Post by Boris Zbarsky
Post by Alex Russell
unless the argument is that the slots should
be non-configurable, non-writable except by the browser that's also
holding a ref to it.
"Yes".
Though I don't know what "slots" you're talking about; the only sane JS
implementation of live nodelists is as a proxy.  There's no way to get the
behaviors that browsers have for them otherwise.
But it can be a Proxy to an *Array*, not to some weird non-Array type.
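
As a rough illustration of "a Proxy to an Array", the sketch below uses the Proxy constructor as it was later standardized, which is an assumption relative to the proxy proposals under discussion at the time. vendLiveList is a hypothetical name; the target array plays the role of the browser-held list.

```javascript
// Hypothetical: the "browser" keeps `target`; script only ever sees `view`.
function vendLiveList() {
  const target = [];
  const reject = () => false; // refuse every mutation attempted through the proxy
  const view = new Proxy(target, {
    set: reject,
    defineProperty: reject,
    deleteProperty: reject,
  });
  return { target, view };
}

const { target: browserSide, view: scriptSide } = vendLiveList();

browserSide.push("p1", "p2");           // browser-side update
console.log(scriptSide.length);         // 2
console.log(Array.isArray(scriptSide)); // true

// Script-side mutation is rejected by the set trap.
let pushError = null;
try {
  scriptSide.push("x");
} catch (e) {
  pushError = e;
}
console.log(pushError instanceof TypeError); // true
console.log(browserSide.length);             // 2
```

Whether an object like this counts as "an Array" in the sense people expect is the point under dispute: it passes Array.isArray, yet the mutating methods fail.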
Boris Zbarsky
2011-10-20 14:29:47 UTC
Permalink
Post by Alex Russell
Right. So you need to vend an apparently-immutable Array, one which
can only be changed by the browser. I think that could be accomplished
in terms of Proxies. But it's still an Array type.
I have no problem with Array being on the prototype chain or whatnot.
But it's not "an array" in the sense that you can't do a bunch of things
with it that people do with arrays.
Post by Alex Russell
Post by Boris Zbarsky
Though I don't know what "slots" you're talking about; the only sane JS
implementation of live nodelists is as a proxy. There's no way to get the
behaviors that browsers have for them otherwise.
But it can be a Proxy to an *Array*, not to some weird non-Array type.
Why does it matter what it's a proxy to? The whole point of being a
proxy is that you can't tell what it's proxying.

Case in point, in Gecko it's a proxy to something that's not a JS object
at all and not even implementable in JS (because it uses internal engine
information that's not available to JS).

So what exactly do you want here other than nodelists having
Array.prototype on their prototype chain, which is discussed elsewhere?

And again, for static nodelists none of this applies; there's absolutely
no reason I can think of to not make them arrays, unless you really want
a .item() on them or unless you really think the length getter should be
hookable.

-Boris
Jonas Sicking
2011-10-20 19:09:00 UTC
Permalink
Post by Alex Russell
Post by Boris Zbarsky
Post by Alex Russell
No we don't. The fact that there's someone else who has a handle to
the list and can mutate it underneath you
There is no sane way to mutate the list on the part of the browser if
someone else is also messing with it, because the someone else can violate
basic invariants the browser's behavior needs to maintain.
Right. So you need to vend an apparently-immutable Array, one which
can only be changed by the browser. I think that could be accomplished
in terms of Proxies. But it's still an Array type.
Post by Boris Zbarsky
Post by Alex Russell
unless the argument is that the slots should
be non-configurable, non-writable except by the browser that's also
holding a ref to it.
"Yes".
Though I don't know what "slots" you're talking about; the only sane JS
implementation of live nodelists is as a proxy.  There's no way to get the
behaviors that browsers have for them otherwise.
But it can be a Proxy to an *Array*, not to some weird non-Array type.
Let's do the general discussion about how live and non-live NodeLists
should behave in a separate thread.

The immediate question here is how should the returned object from
.findAll behave? Should it be mutable? Should you be able to insert
non-Nodes into it? Should it have all of the functions of
Array.prototype or just some subset? Should it have any additional
functions?

Since .findAll is a new function we have absolutely no constraints as
far as how NodeLists behave, we can simply return something that isn't
a NodeList.

/ Jonas
Tab Atkins Jr.
2011-10-20 20:34:25 UTC
Permalink
Post by Jonas Sicking
Let's do the general discussion about how live and non-live NodeLists
should behave in a separate thread.
Yes, let's. ^_^
Post by Jonas Sicking
The immediate question here is how should the returned object from
.findAll behave? Should it be mutable? Should you be able to insert
non-Nodes into it? Should it have all of the functions of
Array.prototype or just some subset? Should it have any additional
functions?
Since .findAll is a new function we have absolutely no constraints as
far as how NodeLists behave, we can simply return something that isn't
a NodeList.
It should absolutely have all the Array functions. I know that I want
to be able to slice, append, forEach, map, and reduce the list
returned by .find.

~TJ
Willison, Timothy
2011-10-20 20:43:55 UTC
Permalink
Post by Tab Atkins Jr.
Post by Jonas Sicking
Let's do the general discussion about how live and non-live NodeLists
should behave in a separate thread.
Yes, let's. ^_^
Post by Jonas Sicking
The immediate question here is how should the returned object from
.findAll behave? Should it be mutable? Should you be able to insert
non-Nodes into it? Should it have all of the functions of
Array.prototype or just some subset? Should it have any additional
functions?
Since .findAll is a new function we have absolutely no constraints as
far as how NodeLists behave, we can simply return something that isn't
a NodeList.
It should absolutely have all the Array functions. I know that I want
to be able to slice, append, forEach, map, and reduce the list
returned by .find.
IMHO, the most useful thing would be to just return an Array of nodes so no further adjustment of the return value is required in selector engines.
Post by Tab Atkins Jr.
~TJ
Erik Arvidsson
2011-10-20 16:02:10 UTC
Permalink
Post by Alex Russell
No we don't. The fact that there's someone else who has a handle to
the list and can mutate it underneath you is a documentation issue,
not a question of type...unless the argument is that the slots should
be non-configurable, non-writable except by the browser that's also
holding a ref to it.
That is an ES violation. A non configurable, non writable data
property is not allowed to change its value.

var descr = Object.getOwnPropertyDescriptor(object, name);
if (!descr.configurable && !descr.writable && ('value' in descr)) {
  var value = descr.value;
  setInterval(function() {
    // Must never change
    assert(object[name] === value);
  });
}

Therefore there is no such thing as an immutable live NodeList.

There are ways around this.

1. Use a getter
2. Make it configurable
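
A short sketch of the difference between the two kinds of property, using a plain array as a stand-in for the underlying collection (an assumption; the names are illustrative only):

```javascript
const backing = ["a", "b"];

// A non-configurable, non-writable data property: its value is fixed for good.
const fixed = {};
Object.defineProperty(fixed, "length", {
  value: backing.length,
  writable: false,
  configurable: false,
});

// Option 1 from the list above: an accessor property with only a getter.
const viaGetter = {};
Object.defineProperty(viaGetter, "length", {
  get: () => backing.length,
  configurable: false,
});

backing.push("c"); // the underlying collection changes

console.log(fixed.length);     // 2
console.log(viaGetter.length); // 3

// Script still cannot assign to the accessor, since it has no setter.
console.log(Reflect.set(viaGetter, "length", 10)); // false
```

The getter form stays read-only from script while still reporting a changing value, which is what a live list needs.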
--
erik
Boris Zbarsky
2011-10-20 16:28:40 UTC
Permalink
Post by Erik Arvidsson
That is an ES violation. A non configurable, non writable data
property is not allowed to change its value.
It's not clear what that means in proxy-land; esp. since it's not clear
whether proxies can even have non-configurable properties... or did that
discussion come to a conclusion?

-Boris
Erik Arvidsson
2011-10-20 18:42:13 UTC
Permalink
Post by Boris Zbarsky
Post by Erik Arvidsson
That is an ES violation. A non configurable, non writable data
property is not allowed to change its value.
It's not clear what that means in proxy-land; esp. since it's not clear
whether proxies can even have non-configurable properties... or did that
discussion come to a conclusion?
We have a solution to that:

http://wiki.ecmascript.org/doku.php?id=strawman:direct_proxies
--
erik
Sean Hogan
2011-10-20 11:35:10 UTC
Permalink
Post by Lachlan Hunt
Post by Alex Russell
Post by Jonas Sicking
Oh, and as a separate issue. I think .findAll should return a plain
old JS Array. Not a NodeList or any other type of host object.
I strongly agree that it should be an Array *type*, but I think just
returning a plain Array is the wrong resolution to our NodeList
problem. WebIDL should specify that DOM List types *are* Array types.
We need NodeList separate from Array where they are live lists. I
forget the reason we originally opted for a static NodeList rather
than Array when this issue was originally discussed a few years ago.
I wonder if anyone is relying on querySelectorAll() returning a
StaticNodeList?
Lachlan Hunt
2011-10-20 11:49:42 UTC
Permalink
Post by Sean Hogan
I wonder if anyone is relying on querySelectorAll() returning a
StaticNodeList?
Only if there are people out there using list.item(n) instead of
list[n], or people extending the NodeList interface and expecting such
methods to be available on the result. Though I suspect the former is
very rare, and the latter doesn't work in all browsers.
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Sean Hogan
2011-10-20 12:18:13 UTC
Permalink
Post by Lachlan Hunt
Post by Sean Hogan
I wonder if anyone is relying on querySelectorAll() returning a
StaticNodeList?
Only if there are people out there using list.item(n) instead of
list[n], or people extending the NodeList interface and expecting such
methods to be available on the result. Though I suspect the former is
very rare, and the latter doesn't work in all browsers.
And I wonder if one of the browser vendors would be willing to silently
change the behavior and see if they get any bug reports.
Erik Arvidsson
2011-10-20 16:12:08 UTC
Permalink
Post by Sean Hogan
I wonder if anyone is relying on querySelectorAll() returning a
StaticNodeList?
Only if there are people out there using list.item(n) instead of list[n], or
people extending the NodeList interface and expecting such methods to be
available on the result.  Though I suspect the former is very rare, and the
latter doesn't work in all browsers.
Both are rare but they do happen

http://codesearch.google.com/#search/&q=%5CsNodeList%5C.prototype%5C.(%5Cw%2B)%5Cs*=&type=cs

What is funny is that code search only found one instance of item
being used directly after querySelectorAll(...). Of course, that
search does not tell the whole story.

http://codesearch.google.com/#SjGak5n5VAM/trunk/parsehtml_util.py&q=querySelectorAll%5C(%5B%5E)%5D%2B%5C)%5C.item%20-file:layouttests&type=cs&l=24
--
erik
Boris Zbarsky
2011-10-20 14:12:38 UTC
Permalink
Post by Alex Russell
Post by Jonas Sicking
Oh, and as a separate issue. I think .findAll should return a plain
old JS Array. Not a NodeList or any other type of host object.
I strongly agree that it should be an Array *type*, but I think just
returning a plain Array is the wrong resolution to our NodeList
problem. WebIDL should specify that DOM List types *are* Array types.
You missed the point of Jonas's suggestion.

DOM NodeLists are not mutable via direct manipulation of indexed
properties. Jonas is saying that we want the return value here to be
thus mutable.

Making all DOM NodeLists mutable won't really work because a bunch of
them are live; there's no sane way to combine liveness and mutability.

The non-live lists (e.g. the one involved here) could be made mutable,
but then why make them nodelists at all? What's the point?
Post by Alex Russell
It's insane that we even have a NodeList type which isn't a real array
at all.
It's not at all insane for the live lists. See previous discussion
about this.

I think any solution here that tries to treat live and non-live lists
identically is doomed to failure.

-Boris
Jonas Sicking
2011-10-20 18:58:54 UTC
Permalink
Post by Alex Russell
Post by Jonas Sicking
Post by Alex Russell
Lachlan and I have been having an...um...*spirited* twitter discussion
regarding querySelectorAll, the (deceased?) queryScopedSelectorAll,
and ":scope". He asked me to continue here, so I'll try to keep it
The rooted forms of "querySelector" and "querySelectorAll" are mis-designed.
Discussions about a Scoped variant or ":scope" pseudo tacitly
no major JS library exposes the QSA semantic, instead choosing to
implement a rooted search.
Related and equally important, that querySelector and querySelectorAll
are often referred to by the abbreviation "QSA" suggests that its name
is bloated and improved versions should have shorter names. APIs gain
use both through naming and through use. On today's internet -- the
one where 50% of all websites include jQuery -- you could even go with
element.$("selector") and everyone would know what you mean: it's
clearly a search rooted at the element on the left-hand side of the
dot.
Ceteris paribus, shorter is better. When there's a tie that needs to
be broken, the more frequently used the API, the shorter the name it
deserves -- i.e., the larger the component of its meaning it will gain
through use and repetition and not naming and documentation.
I know some on this list might disagree, but all of the above is
incredibly non-controversial today. Even if there may have been
debates about scoping or naming when QSA was originally designed,
history has settled them. And QSA lost on both counts.
I therefore believe that this group's current design for scoped
selection could be improved significantly. If I understand the latest
draft (http://www.w3.org/TR/selectors-api2/#the-scope-pseudo-class)
  element.querySelectorAll(":scope > div > .thinger");
Both the name and the need to specify ":scope" are punitive to
readers and writers of this code. The selector is *obviously*
happening in relationship to "element" somehow. The only sane
relationship (from a modern JS hacker's perspective) is that it's
where our selector starts from. I'd like to instead propose that we
shorten all of this up and kill both stones by introducing a new API
pair, "find" and "findAll", that are rooted as JS devs expect. The
  element.findAll("> div > .thinger");
Out come the knives! You can't start a selector with a combinator!
Ah, but we don't need to care what CSS thinks of our DOM-only API. We
can live and let live by building on ":scope" and specifying find* as
  HTMLDocument.prototype.find =
  HTMLElement.prototype.find = function(rootedSelector) {
    return this.querySelector(":scope " + rootedSelector);
  };
  HTMLDocument.prototype.findAll =
  HTMLElement.prototype.findAll = function(rootedSelector) {
    return this.querySelectorAll(":scope " + rootedSelector);
  };
Of course, ":scope" in this case is just a special case of the ID
rooting hack, but if we're going to have it, we can kill both birds
with it.
Q.) Why do we need this at all? Don't the toolkits already just do
this internally?
A.) Are you saying everyone, everywhere, all the time should need to
use a toolkit to get sane behavior from the DOM? If so, what are we
doing here, exactly?
Q.) Shorter names? Those are for weaklings!
A.) And humans. Who still constitute most of our developers. Won't
someone please think of the humans?
Q.) You're just duplicating things!
A.) If you ignore all of the things that are different, then that's
true. If not, well, then no. This is a change. And a good one for the
reasons listed above.
Thoughts?
Oh, and as a separate issue. I think .findAll should return a plain
old JS Array. Not a NodeList or any other type of host object.
I strongly agree that it should be an Array *type*, but I think just
returning a plain Array is the wrong resolution to our NodeList
problem. WebIDL should specify that DOM List types *are* Array types.
It's insane that we even have a NodeList type which isn't a real array
at all. Adding a parallel system when we could just fix the one we
have (and preserve the value of a separate prototype for extension) is
wonky to me.
That said, I'd *also* support the ability to have some sort of
decorator mechanism before return on .find() or a way to re-route the
prototype of the returned Array.
+heycam to debate this point.
How would this new Array-type be different from an Array? Would it
be mutable (your answer below seems to indicate 'yes')? Would it allow
inserting things that aren't Nodes?
Post by Alex Russell
Post by Jonas Sicking
One of
the use cases is being able to mutate the returned value. This is
useful if you're for example doing multiple .findAll calls (possibly
with different context nodes) and want to merge the resulting lists
into a single list.
Agreed. An end to the Array.slice() hacks would be great.
Yup.

/ Jonas
Bjoern Hoehrmann
2011-10-20 20:42:54 UTC
Permalink
Post by Alex Russell
I strongly agree that it should be an Array *type*, but I think just
returning a plain Array is the wrong resolution to our NodeList
problem. WebIDL should specify that DOM List types *are* Array types.
It's insane that we even have a NodeList type which isn't a real array
at all.
It is quite normal to consider lists and arrays to be different things.
In Perl for instance you can use list operations like `grep` on arrays,
but you cannot use array operations like `push` on lists. For JavaScript
programmers it actually seems common to confuse the two, like with

var node_list = document.getElementsByTagName('example');
for (var ix = 0; ix < node_list.length; ++ix)
  node_list[ix].parentNode.removeChild(node_list[ix]);

which would remove all the children if node_list was an array like any
other. Pretending node lists are arrays in nomenclature would likely add
to that.
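
The effect can be reproduced without a browser. The sketch below is a simulation under stated assumptions: makeParent and liveList are hypothetical stand-ins for a DOM parent and a live node list, and removeAll is the same loop as above.

```javascript
// Tiny stand-in for a DOM parent with children (no real DOM is used).
function makeParent(names) {
  const parent = {
    children: [],
    removeChild(child) {
      this.children.splice(this.children.indexOf(child), 1);
      child.parentNode = null;
      return child;
    },
  };
  for (const name of names) parent.children.push({ name, parentNode: parent });
  return parent;
}

// Live view: length and indices always reflect the parent's current children.
function liveList(parent) {
  return new Proxy({}, {
    get: (_, key) => (key === "length" ? parent.children.length : parent.children[key]),
  });
}

// The loop from the message above, applied to any list-like object.
function removeAll(list) {
  for (let ix = 0; ix < list.length; ++ix) {
    list[ix].parentNode.removeChild(list[ix]);
  }
}

const liveParent = makeParent(["a", "b", "c", "d"]);
removeAll(liveList(liveParent));
console.log(liveParent.children.map((c) => c.name).join(",")); // b,d

const staticParent = makeParent(["a", "b", "c", "d"]);
removeAll(staticParent.children.slice()); // snapshot taken before the loop
console.log(staticParent.children.length); // 0
```

With the live list, each removal shifts the remaining entries down while the index moves up, so every second node is skipped; the snapshot is unaffected by the removals.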
--
Björn Höhrmann · mailto:***@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Jonas Sicking
2011-10-21 07:41:53 UTC
Permalink
Post by Lachlan Hunt
Not necessarily. It depends what exactly it means for a selector to
contain :scope for determining whether or not to enable the implied :scope
foo.find(":not(:scope)");
Ooh, this is an interesting case too. So here's the full list of cases which
we need defined behavior for (again looking at Alex and Yehuda here).

In the following DOM

<body id="3">
<div id=0></div>
<div id="context" foo=bar>
<div id=1></div>
<div class="class" id=2></div>
<div class="withChildren" id=3><div class=child id=4></div></div>
</div>
<div id=5></div>
<div id=6></div>
</body>

What would each of the following .findAll calls return? I've included my
guesses based on the discussions so far:

var e = document.getElementById('context');

e.findAll("div") // returns ids 1,2,3,4
e.findAll("") // returns an empty list
e.findAll("#3") // returns id 3, but not the body node
e.findAll("> div") // returns ids 1,2,3
e.findAll("[foo=bar]") // returns nothing
e.findAll("[id=1]") // returns id 1
e.findAll(":first-child") // returns id 1
e.findAll("+ div") // returns id 5
e.findAll("~ div") // returns id 5, 6
e.findAll(":scope")
e.findAll("div:scope")
e.findAll("[foo=bar]:scope")
e.findAll(":scope div")
e.findAll("div:scope div")
e.findAll("div:scope #3")
e.findAll("body > :scope > div")
e.findAll("div, :scope")
e.findAll("body > :scope > div, :scope")
e.findAll(":not(:scope)")
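
One way to keep track of which of these cases hinge on the open questions is to classify the selector strings. The sketch below is a naive, string-level illustration only: describeRootedSelector is a hypothetical helper, it is not a selector parser, and it does not decide what any of the calls should return.

```javascript
// Hypothetical classifier; naive string checks, not a real selector parser.
function describeRootedSelector(selector) {
  const trimmed = selector.trim();
  return {
    empty: trimmed === "",
    leadingCombinator: /^[>+~]/.test(trimmed), // would be invalid in plain CSS
    mentionsScope: /:scope\b/.test(trimmed),   // explicit :scope appears anywhere
  };
}

const cases = ["div", "", "> div", "+ div", ":scope div", "div, :scope", ":not(:scope)"];
for (const selector of cases) {
  console.log(JSON.stringify(selector), describeRootedSelector(selector));
}
```

Whether an explicit :scope should switch off the implied one, and whether that is decided per selector in a comma-separated list, is what the unannotated cases above leave open.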

/ Jonas