I don’t normally post about boring work topics, but I wanted to talk about this because it’s a gigantic WTF, and because it might come in handy for someone else who’s stuck on a Javascript project.
It turns out that the thing getElementsByName and getElementsByTagName return isn’t actually an array. It looks like an array, it walks like an array, and it quacks like a duck, but it’s not an array at all. It’s actually a “dispHTMLElementCollection”.
I’ve been doing tons and tons of Javascript work for years, and I’ve actually never come up against this particular quirk before. The only reason I figured it out is that I tried to push() an element into a dispHTMLElementCollection, and it turns out that dispHTMLElementCollections don’t have a push() method. Why doesn’t it have a push() method? Who knows.
Oh, and to make it worse: it’s not documented. Anywhere. This forum post is all MSDN (Microsoft’s developer site, maker of the most popular web browser on Earth) has. Mozilla (makers of the second-most popular web browser on Earth) has absolutely nothing on it, or at least nothing Google’s indexed. Nor does w3.org, the maintainers of the DOM standards (most relevant to this issue.)
What. The. Fuck.
Here’s a test page to demonstrate the issue:
<html>
<head>
<title>Test</title>
</head>
<body>
<p>h</p>
<p>e</p>
<hr />
<p>l</p>
<p>l</p>
<hr />
<p>o</p>
<script type="text/javascript">
    // Call the "broken" version of CombinedElementList
    // This function fails in both IE and Firefox, even
    // though at first glance it looks fine. Reason?
    // getElementsByTagName *doesn't* return an array,
    // instead it returns a "dispHTMLElementCollection"
    // which looks and acts exactly like an array, but
    // has no .push() method.
    //var combo = CombinedElementListBroken();
    // The "fixed" version uses a Javascript array to
    // store the results of the two
    // getElementsByTagName calls.
    var combo = CombinedElementListWorks();
    alert(combo.length); // Expect: 7
    function CombinedElementListBroken()
    {
        // Create two "arrays" of HTML elements
        var paras = document.getElementsByTagName('P');
        var hrs = document.getElementsByTagName('HR');
        // Attempt to combine the two using a simple FOR loop
        for (var i = 0; i < hrs.length; i++)
        {
            // IE: Object doesn't support this property or method
            // Firefox: paras.push is not a function
            paras.push(hrs[i]);
        }
        return (paras);
    }
    function CombinedElementListWorks()
    {
        // Create two "arrays" of HTML elements
        var paras = document.getElementsByTagName('P');
        var hrs = document.getElementsByTagName('HR');
        // Create a third, blank, array to store the combined list
        var combinedArr = new Array();
        // Puts elements from the first "array" into the combined array
        for (var i = 0; i < paras.length; i++)
        {
            combinedArr.push(paras[i]);
        }
        // And the second
        for (var i = 0; i < hrs.length; i++)
        {
            combinedArr.push(hrs[i]);
        }
        return (combinedArr);
    }
</script>
</body>
</html>
Ok, so I posted this to TheDailyWTF, thinking it’d be a laugh: it’s not. Don’t do that, ever. You’d never know it from the front page, but the WTF forums are full, apparently, of programmers with psychic or telekinetic powers. To them, it’s my own fault that I couldn’t tell with only my mind that a dispHTMLElementCollection is actually the same thing as a NodeList as documented in the DOM2 standards.
Read the thread if you like.
One useful piece of information I did glean from this, though, is the reason that getElementsByTagName (and similar functions) return something other than an array: the list they return is “live”, meaning it updates automatically as elements are added to or removed from the page. I don’t see this as being particularly useful, but, hey, at least it explains why it’s not an array.
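To make the “live” part concrete, here’s a rough sketch. The helper name toArray is something I made up for illustration; it copies any array-like list into a real array, which gives you a frozen snapshot that has push() and friends. The block at the bottom only does anything in a browser, and it assumes the script sits inside the page body.

```javascript
// Hypothetical helper: copy an array-like list (such as the result of
// getElementsByTagName) into a genuine Array. The copy is a snapshot and
// does not change when the page does.
function toArray(list) {
    var result = [];
    for (var i = 0; i < list.length; i++) {
        result.push(list[i]);
    }
    return result;
}

// Browser-only demonstration of "live" versus snapshot.
if (typeof document !== 'undefined' && document.body) {
    var live = document.getElementsByTagName('p');
    var frozen = toArray(live);
    var before = live.length;

    // Add one more paragraph to the page.
    document.body.appendChild(document.createElement('p'));

    // The live list sees the new paragraph; the snapshot does not.
    // In a browser this should show "1 / 0".
    alert((live.length - before) + ' / ' + (frozen.length - before));
}
```

The plain loop is deliberate: it only relies on length and numeric indexing, which is all these lists promise to have in common with arrays.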
Several non-useful pieces of information I received: link after link after link to documentation that doesn’t have the terms “dispHTMLElementCollection” and “NodeList” on the same page, and thus has absolutely nothing to do with the WTF I reported.
Apparently, to the WTF posters, this is all “common knowledge” that I should have gotten based on vague comments in a Javascript library I don’t even use. Or I was supposed to look up getElementsByTagName in the DOM, then assume that the type returned (NodeList) just happens to be the same thing as a dispHTMLElementCollection even though there’s nothing to indicate that that is the case.
To the WTF posters, writing a simple page on either Mozilla or Microsoft’s site saying, “oh BTW, dispHTMLElementCollection is the interface we use to DOM2’s NodeList, here’s a link” is a horrible burden that should never be inflicted on anybody.
BTW, kudos to tgape who not only agrees with me that the lack of documentation is a WTF, but who wasn’t a jerk about it.
Anyway, some good came out of all of this: the next person to search for this completely undocumented class (or interface, or whatever the hell it is) will find either the WTF post or this one, and hopefully won’t waste as much time and energy on it as I have.
It brings up a question, though, that I’m too lazy to test on a Sunday morning (but maybe I will tomorrow): Since the DOM NodeList can be a different class in a different browser, how the holy hell are you supposed to use typeof(x) to find whether something is a NodeList in a cross-browser way? Alternatively, if it’s represented internally to IE as a dispHTMLElementCollection, but typeof(x) returns NodeList (which is what I suspect happens), then why the holy hell would the debugger show dispHTMLElementCollection instead of NodeList?
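For what it’s worth, typeof probably can’t settle this either way: it only reports broad categories (normally just 'object' for something like this), never a class name. A hedged alternative is to check what the thing can do instead of what it’s called. The function name looksLikeNodeList below is hypothetical, and the check is a heuristic I’m assuming is good enough, not a guarantee: anything with a numeric length and an item member will pass.

```javascript
// Hypothetical helper: guess whether x is a DOM-style list by looking at its
// capabilities rather than its class name. Heuristic only.
function looksLikeNodeList(x) {
    if (x === null || x === undefined) {
        return false;
    }
    // Real arrays are excluded: they are the thing we are trying to tell apart.
    if (x instanceof Array) {
        return false;
    }
    // DOM lists expose a numeric length and an item() accessor. The item check
    // is kept loose (anything but 'undefined') on the assumption that some
    // browsers may not report DOM methods as 'function'.
    return typeof x.length === 'number' && typeof x.item !== 'undefined';
}
```

In a page you’d call it as looksLikeNodeList(document.getElementsByTagName('p')), and fall back to treating the value as a plain array when it returns false.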
There’s WTFs all around.