Cheerio and `.find()` return too many elements compared to jQuery
I am trying to parse an HTML string and extract some information using Cheerio.js.
My HTML is as follows (reduced and simplified, of course):

    <html>
      <head></head>
      <body>
        <div>
          <table>
            <tr>
              <td>
                <a href="/link_1.php">Link 1</a>
              </td>
              <td>
                <a href="/link_2.php">Link 2</a>
                <a href="/link_3.php">Link 3</a>
              </td>
              <td>
                <a href="/link_4.php">Link 4</a>
                <a href="/link_5.php">Link 5</a>
              </td>
            </tr>
          </table>
        </div>
      </body>
    </html>

My code is this:

    var cheerio = require("cheerio");
    var $ = cheerio.load(html);
    var page = $.root();
    var tr = page.find("tr");
    console.log(tr.find("> :nth-child(2) a").length);

I would expect the code to return `2`, because there are two links in the second direct child of the `tr` element. However, it returns `5`: every link inside the `tr` is returned.
I tried the same thing with jQuery, and there the result is as it should be.
I also noticed that removing the `<html>` tag makes it work correctly, but I do not know why.
Am I doing something wrong or should I report this to developers as a bug?
Edit: I just opened an issue on GitHub.
One solution:
Either of the following fixes your issue. It helps to select the items with `.children()`, or to split the selector into two `.find()` calls, as opposed to one general `.find()` statement:

    var $ = cheerio.load(html);
    var page = $.root();
    var tr = page.find("tr");
    console.log(tr.children("td:nth-child(2)").children("a").length);

or

    console.log(tr.find("> :nth-child(2)").find("a").length);

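
To see why narrowing to the second cell first gives 2 rather than 5, here is a plain-JavaScript sketch of the two traversals. It is only an illustration and does not use or reproduce Cheerio's implementation: the `{ tag, children }` node shape and the helpers `el`, `childrenOf` and `descendantsOf` are made up for this example.

```javascript
// Hypothetical minimal tree model: each node is { tag, children }.
// This is not Cheerio's API, only an illustration of the two traversals.
function el(tag, ...children) {
  return { tag, children };
}

// Direct children only (the idea behind .children()).
function childrenOf(node, tag) {
  return node.children.filter((c) => !tag || c.tag === tag);
}

// All descendants at any depth (the idea behind .find()).
function descendantsOf(node, tag) {
  const out = [];
  for (const c of node.children) {
    if (!tag || c.tag === tag) out.push(c);
    out.push(...descendantsOf(c, tag));
  }
  return out;
}

// Same shape as the row in the question: 1, 2 and 2 links per cell.
const tr = el("tr",
  el("td", el("a")),
  el("td", el("a"), el("a")),
  el("td", el("a"), el("a"))
);

// Second direct child of the row (every child here is a td).
const secondCell = childrenOf(tr, "td")[1];
const linksInSecondCell = descendantsOf(secondCell, "a").length;
const linksInRow = descendantsOf(tr, "a").length;

console.log(linksInSecondCell, linksInRow); // 2 5
```

Searching from the row itself reaches all five links, while first stepping to the second direct child limits the search to that cell, which is what both snippets above do.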