What's up, everybody!

I was away for a long time, deep in projects and shit, but I am back and will keep putting out courses and more freebies.

Anyway, here is a quick domain crawler + email extractor I wrote in Node.js using the roboto library, which is cool and easy to use. Here it goes, step by step:

1.- Create a directory.
2.- Go inside it.
3.- Install roboto and htmlstrip-native with npm.
4.- Create a crawler.js file inside the directory you created.
5.- Paste the source code into it:

Then just run it with node, like this:
node crawler.js domain.com

That's it. It will create a domain.com.txt file with all the emails.

Your console will look like this:

And emails grabbed like this:


Obviously, replace domain.com with whatever domain you want to crawl for emails.
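
The email-grabbing step can be sketched on its own. This is a minimal sketch, not the original script: extractEmails and outputFileName are hypothetical helper names that are not part of roboto, and the regular expression is a deliberately simple assumption that will miss some valid addresses and accept some invalid ones.

```javascript
// Hypothetical helpers, not part of roboto or the original crawler.
// A deliberately simple pattern: local part, "@", domain, dot, TLD of 2+ letters.
var EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return the unique, lower-cased email addresses found in a block of text.
function extractEmails(text) {
    var matches = String(text).match(EMAIL_PATTERN) || [];
    var seen = {};
    var unique = [];
    matches.forEach(function (email) {
        var key = email.toLowerCase();
        if (!seen[key]) {
            seen[key] = true;
            unique.push(key);
        }
    });
    return unique;
}

// Build the output file name from the domain given on the command line.
function outputFileName(domain) {
    return domain + '.txt';
}
```

You would feed extractEmails the stripped text of every crawled page and append the results to outputFileName(process.argv[2]).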

IRC was the best 10 years ago. Other software has eclipsed it since, but many savvy people still use it for proper communication with special individuals. So, if you have never heard of or configured an eggdrop, this is something similar.

First you need Node.js installed, and I will assume you already have it, which means you should have npm as well. Then just install the irc library:

npm install irc

Once you have that, require the module and then set the connection options: the channels the bot will join automatically, and the port. The port used here, 8001, is not the usual IRC port (6667), but you want something more stable for a bot.

var irc = require('irc');
var bot = new irc.Client('chat.freenode.net','w0bot', {
    channels: ['#botplace', '#w0b'],
    port: 8001,
    debug: true,
    userName: 'wizard', // on the host is like wizard@host/ip
    realName: 'Im a bot from Wizard of Bots ;)',  // real name on whois
    showErrors: true, 
    autoRejoin: true, // auto rejoin channel when kicked
    autoConnect: true, // connect as soon as the client is created
});

As you can see, first we set the server, then the nick, and then we open an options object to add the channels, the port, and a debug mode that shows what is going on in your console.

NOTE: You might get some errors when connecting; just try again. Once you are connected, even if your local machine goes to sleep, the bot will keep reconnecting. That said, this is meant to be left running on a VPS or shell so you have an always-online bot.
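
Since connection errors are expected, it helps to log them instead of letting them crash the process (a Node EventEmitter throws when an 'error' event has no listener). Here is a minimal sketch, assuming the client emits 'error' events like any EventEmitter; logBotErrors is a hypothetical helper name, and the shape of the error message object is not assumed.

```javascript
// Hypothetical helper: attach a logger for the client's 'error' event.
// `bot` is assumed to be an irc.Client (or any EventEmitter);
// `log` defaults to console.error.
function logBotErrors(bot, log) {
    log = log || console.error;
    bot.addListener('error', function (message) {
        // The exact shape of `message` is not assumed, so it is just serialized.
        log('IRC error: ' + JSON.stringify(message));
    });
    return bot;
}
```

Call logBotErrors(bot); right after creating the client.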

But now that you have your bot online, what’s next? You need to know wtf is going on in the channels that you join or the PM you receive, and for that we have listeners.

Before getting into that, I will explain the functions available, so we can use them when the listeners fire events:

bot.join('#yourchannel yourpass'); // join a channel, with a pass (or leave the pass out)
bot.part('#yourchannel'); // part a channel
bot.say('#yourchannel', "I'm a bot for w0b!"); // send a message to a channel
bot.whois(nickname, function (whois) {
    console.log(whois); // the callback logs the results once the whois has finished
});
bot.notice(target, "your message for the notice"); // target is either a nickname or a channel
bot.send('MODE', '#yourchannel', '+o', 'yournick'); // gives OP to yournick in #yourchannel

Now that you know the commands to use in the events, here are the listeners for this stuff:

bot.addListener('pm', function (from, message) {
    console.log(from + ' => ME: ' + message); // when you get a PM, log it to the console
    bot.say(from, 'Hello I am a bot from Wizard of Bots '); // you can also respond automatically with the say command
});
bot.addListener('message#yourchannel', function (from, message) {
    console.log(from + ' => #yourchannel: ' + message); // when someone sends a message to a specific channel
});
bot.addListener('join', function(channel, who) {
    console.log('%s has joined %s', who, channel);
    bot.say(who, 'Hello and welcome to ' + channel); // when someone joins a channel, automatically welcome them
});
bot.addListener('kick', function(channel, who, by, reason) {
    console.log('%s was kicked from %s by %s: %s', who, channel, by, reason); // when someone is kicked, log what happened
});
bot.addListener('part', function(channel, who, reason) {
    console.log('%s has left %s: %s', who, channel, reason); // when someone parts
    // you can also send this guy a PM to convince him to stay.
});
bot.addListener('message', function(from, to, message) {
    if(  message.indexOf('Know any good jokes?') > -1
      || message.indexOf('good joke') > -1
    ) {
        bot.say(to, 'Knock knock!');
    }
}); // like other eggdrops, if someone says those words, it answers "Knock knock!"
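
The last listener mixes two decisions: what to answer and where to answer. A minimal sketch that splits them apart could look like this. The names replyFor, replyTarget and makeJokeListener are hypothetical helpers, not part of the irc library, and the case-insensitive matching is my assumption (the listener above is case-sensitive). The sketch also assumes that in a PM the `to` argument is the bot's own nick.

```javascript
// Hypothetical helpers for the 'message' listener; not part of the irc library.

// Decide what to answer: returns a reply string, or null to stay silent.
function replyFor(message) {
    var text = String(message).toLowerCase();
    if (text.indexOf('know any good jokes?') > -1 || text.indexOf('good joke') > -1) {
        return 'Knock knock!';
    }
    return null;
}

// Decide where to answer: in the channel, or back to the sender for a PM.
// In a PM, `to` is assumed to be the bot's own nick, so replying to `to`
// would make the bot message itself.
function replyTarget(botNick, from, to) {
    return to === botNick ? from : to;
}

// Wire both helpers into a listener; `bot` is assumed to expose say() like irc.Client.
function makeJokeListener(bot, botNick) {
    return function (from, to, message) {
        var reply = replyFor(message);
        if (reply !== null) {
            bot.say(replyTarget(botNick, from, to), reply);
        }
    };
}
```

You would register it with bot.addListener('message', makeJokeListener(bot, 'w0bot'));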

 

Hey fellas, sorry for being absent for a long time; mainly it was lots of work on other projects.

In this post I am going to teach you how to screen scrape using Node.js and cheerio, a jQuery-like library. It's relatively easy; here is the code:

var request = require('request'); // we need the request library
var cheerio = require('cheerio'); // and the cheerio library (jQuery-like)
// set some defaults
var req = request.defaults({
  jar: true,                 // save cookies to jar
  rejectUnauthorized: false, // accept invalid TLS certificates (insecure)
  followAllRedirects: true   // allow redirections
});
// scrape the page
req.get({
    url: "http://www.whatsmyip.org/",
    headers: {
        'User-Agent': 'Google' // You can put the user-agent that you want
     }
  }, function(err, resp, body) {
  // stop here if the request failed, since body would be undefined
  if (err) {
    return console.error(err);
  }

  // load the html into cheerio
  var $ = cheerio.load(body);
  
  // get the data and output to console
  console.log( 'IP: ' + $('#ip').text() );  // scrape using a CSS selector
  console.log( 'Host: ' + $('#hostname').text() );
  console.log( 'User-Agent: ' + $('#useragent').text() );
});
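
Before handing the body to cheerio it is worth checking that the request actually worked. Here is a minimal sketch of that check. checkResponse is a hypothetical helper name, and treating only 2xx status codes as success is my assumption, not something the request library enforces.

```javascript
// Hypothetical helper: decide whether a response is safe to hand to cheerio.
// `err`, `resp` and `body` are the three arguments the request callback receives.
// Returns a problem description, or null when the body can be parsed.
function checkResponse(err, resp, body) {
    if (err) {
        return 'request failed: ' + err.message;
    }
    if (!resp || resp.statusCode < 200 || resp.statusCode >= 300) {
        return 'unexpected status: ' + (resp ? resp.statusCode : 'none');
    }
    if (typeof body !== 'string' || body.length === 0) {
        return 'empty body';
    }
    return null;
}
```

Inside the callback you would call it first, log the problem if it returns one, and only call cheerio.load(body) when it returns null.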

 

Hell yeah, Osmosis also deserves a place alongside NightmareJS. It doesn't use Electron to simulate a browser; instead it is a headless parser built on native libxml C bindings, and it does a great job.

To start, make sure you have Node.js installed along with npm. If you get an error for the libxmljs library, make sure that you install these:

npm install node-gyp 
npm install libxmljs 
npm install osmosis

After that it should work properly if you create a file.js and run the example script.

Here you have a bunch of examples to copy and paste to test it. If you want me to explain it further, ask your questions or request video tutorials. I might open a premium spot for it, so you better fucking invite me a cup of oil so I can keep my commitment to you.

Craigslist example:

var osmosis = require('osmosis');

osmosis
.get('www.craigslist.org/about/sites')
.find('h1 + div a')
.set('location')
.follow('@href')
.find('header + div + div li > a')
.set('category')
.follow('@href')
.paginate('.totallink + a.button.next:first')
.find('p > a')
.follow('@href')
.set({
    'title':        'section > h2',
    'description':  '#postingbody',
    'subcategory':  'div.breadbox > span[4]',
    'date':         'time@datetime',
    'latitude':     '#map@data-latitude',
    'longitude':    '#map@data-longitude',
    'images':       ['img@src']
})
.data(function(listing) {
    // do something with listing data
})
.log(console.log)
.error(console.log)
.debug(console.log)

This is the official repo for Osmosis: https://github.com/rchipka/node-osmosis

It works like this: