So I am going to start coding my own custom script, both to learn how to do this properly and to keep it as a personal tool for growth; along the way I will bring AI into the process.

This script is free and serves as a preview of my next private script. If you are interested, just subscribe to the newsletter and I will notify you when it is in production, offered at a lifetime price including setup.

Without further ado, here is the script:

import requests

# Replace with your own access token
access_token = 'YOUR_ACCESS_TOKEN'

# Get the list of users you follow (legacy Instagram API v1 endpoint)
url = 'https://api.instagram.com/v1/users/self/follows'
response = requests.get(url, params={'access_token': access_token})
follows = response.json()['data']

# Get the list of users who follow you
url = 'https://api.instagram.com/v1/users/self/followed-by'
response = requests.get(url, params={'access_token': access_token})
followers = response.json()['data']

# Build sets of usernames for quick comparison
follows_set = {f['username'] for f in follows}
followers_set = {f['username'] for f in followers}

# Users you follow who don't follow you back
not_following_back = follows_set - followers_set
print(not_following_back)
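One caveat: the calls above only fetch the first page of results. The legacy v1 API paginated through a pagination.next_url field in each JSON response; here is a minimal sketch that follows those links (the response shape is an assumption, since that API is long retired, and the get parameter is injectable purely so the logic can be tested without network access):

```python
def fetch_all(url, get=None):
    """Collect every item from a paginated legacy endpoint.

    Assumes each JSON response looks like
    {"data": [...], "pagination": {"next_url": "..."}} -- the old Instagram v1 shape.
    `get` defaults to requests.get but can be swapped out for testing.
    """
    if get is None:
        import requests
        get = requests.get
    items = []
    while url:
        payload = get(url).json()
        items.extend(payload.get("data", []))
        # no "next_url" key means we reached the last page
        url = payload.get("pagination", {}).get("next_url")
    return items
```

You would then build follows and followers with fetch_all(url) instead of a single requests.get.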

BTW, the featured image is from the crazy MidJourney.

What’s up guys, here I present a way of scraping all of a user’s tweets into a CSV; you can feed it into an AI or use it however you want. It’s your choice.

First, you will need to install the tweepy library for Python 3 (`pip install tweepy`).

Then you need to get your API credentials by creating an app on the Twitter developer portal:

#Twitter API credentials 
consumer_key = "" 
consumer_secret = "" 
access_key = "" 
access_secret = ""

Here is the full source:
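Below is a sketch of such a script using Tweepy’s classic user_timeline pagination (the screen name, helper names, and CSV column layout are my own choices; the standard API caps the history you can fetch at roughly the most recent 3,200 tweets):

```python
import csv

# Twitter API credentials (fill in with your own app's keys)
consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""

def get_all_tweets(screen_name):
    """Page backwards through a user's timeline with Tweepy."""
    import tweepy  # imported here so the CSV helpers below work without tweepy installed

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)

    all_tweets = []
    new_tweets = api.user_timeline(screen_name=screen_name, count=200)
    while new_tweets:
        all_tweets.extend(new_tweets)
        oldest = all_tweets[-1].id - 1  # only ask for tweets older than the last one we have
        new_tweets = api.user_timeline(screen_name=screen_name, count=200, max_id=oldest)
    return all_tweets

def tweets_to_rows(tweets):
    """Flatten tweet objects into [id, created_at, text] rows for csv.writer."""
    return [[t.id_str, t.created_at, t.text] for t in tweets]

def save_csv(rows, path):
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "created_at", "text"])
        writer.writerows(rows)

# Usage (needs valid credentials):
# save_csv(tweets_to_rows(get_all_tweets("some_user")), "some_user_tweets.csv")
```

The max_id trick is what makes the pagination work: each request asks only for tweets strictly older than the oldest one already collected.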

I made an article about this because it took me about two hours to solve. It seems like all the sources are outdated: many people recommended the html2text module, which is now deprecated; many others recommended nltk; but in the end BeautifulSoup does a better job than either of them. Buuuut…

All the resources mention a function called get_text(), which is not so much incorrect as BeautifulSoup-4-only: with the old BeautifulSoup 3 library, only the camel-case getText() works, while in BeautifulSoup 4 get_text() is the standard name and getText() is kept as an alias. So if the snake_case call blows up, try the camel case.

Why would I need to do this? Well, especially because some networks encode their pages with strange HTML DOM characters that make scraping a nightmare. Here is the chunk of code, explained.
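As a quick illustration of those “strange characters”: entity-encoded text can be decoded with the standard library’s html module (the sample string here is made up):

```python
import html

raw = 'Scraping is &lt;b&gt;fun&lt;/b&gt; &amp; &quot;easy&quot;'
clean = html.unescape(raw)
print(clean)  # Scraping is <b>fun</b> & "easy"
```

html.unescape handles both named entities (&amp;amp;) and numeric ones (&amp;#38;), which covers most of the junk you meet in the wild.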

 

from bs4 import BeautifulSoup as Soup  # bs4 replaces the old BS3 `from BeautifulSoup import BeautifulSoup`
from urllib.request import urlopen     # Python 3 replacement for urllib2

url = 'http://google.com'
html = urlopen(url).read()  # make the request to the url
soup = Soup(html, 'html.parser')  # parse the response body
for script in soup(["script", "style"]):  # <script> and <style> tags hold no visible text
    script.extract()  # strip them out of the tree
text = soup.getText()  # the method that cost me 40 minutes; get_text() is the bs4 spelling
print(text)

So now you know how to get plain text out of a response, and from there it is easy to pull out data using regular expressions 🙂
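For instance, pulling email addresses or links out of the extracted text takes only a couple of lines (the sample text and patterns are illustrative, not exhaustive):

```python
import re

text = "Contact support@example.com or visit https://example.org/docs for help."

# naive but practical patterns for emails and http(s) links
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
urls = re.findall(r"https?://\S+", text)

print(emails)  # ['support@example.com']
print(urls)    # ['https://example.org/docs']
```

These patterns are deliberately loose; for anything serious you would tighten them to your data.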