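I've been busy with testing work on a company project lately and hadn't written any code for over a week, so today I worked through Yoyo's (悠悠) blog to learn how to log in to Lagou (拉勾网) through its login API.
Here's a quick write-up of what I learned and how it went, even though I know hardly anyone will read it!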
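First, log in on the Lagou website: https://www.lagou.com/
After logging in, capture the traffic with Fiddler. It looks like this:
Logging in a second time shows that the request headers carry a code and a token that change randomly on every login (marked by the arrows in the screenshot), so the next step is to find out where these two values come from. Looking through the front-end source of the login page, it turns out that the comment shown in the image below gives them away, haha.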
Next, use BeautifulSoup to scrape these two parameters from the login page.
Here's the code; the comments explain it clearly enough.
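As a minimal sketch of the extraction step, the snippet below pulls the two values out of a hard-coded copy of the script block that the login page exposes, so it runs without touching the network. Two things here are my own assumptions rather than part of the original code: the helper name `extract_anti_forge` is made up for illustration, and it searches the whole page with a regular expression instead of using BeautifulSoup to pick the second script node.

```python
import re

# Sample markup based on the script block in the login page source.
# The token and code values are just the example values from that page.
SAMPLE_HTML = """
<script>
    window.X_Anti_Forge_Token = '286fd3ae-ef82-4019-89c4-9408947a0e26';
    window.X_Anti_Forge_Code = '74603111';
</script>
"""


def extract_anti_forge(html):
    """Return the token and code found in the page, or empty strings if missing.

    Hypothetical helper: the name and the whole-page regex search are
    illustrative choices, not taken from the original code.
    """
    result = {}
    for key in ("X_Anti_Forge_Token", "X_Anti_Forge_Code"):
        # Match e.g.  window.X_Anti_Forge_Code = '74603111';
        match = re.search(key + r"\s*=\s*'([^']*)'", html)
        result[key] = match.group(1) if match else ""
    return result


if __name__ == "__main__":
    print(extract_anti_forge(SAMPLE_HTML))
```

Searching the whole page means the result does not depend on the script block staying at a fixed position, which is the most likely thing to break if the page layout changes.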
Once the two parameters have been scraped from the page, use them to log in through the API:
The complete code is as follows::
    # coding: utf-8
    import hashlib
    import re

    import requests
    import urllib3
    from bs4 import BeautifulSoup

    urllib3.disable_warnings()


    class Lagou(object):

        def __init__(self, s):
            self.s = s

        def get_token(self):
            '''
            Fetch the dynamic token and code embedded in the login page:

            </script>
            <!-- 页面样式 -->
            <!-- 动态token,防御伪造请求,重复提交 -->
            <script>
                window.X_Anti_Forge_Token = '286fd3ae-ef82-4019-89c4-9408947a0e26';
                window.X_Anti_Forge_Code = '74603111';
            </script>

            (The page comments read "page styles" and "dynamic token, guards
            against forged requests and duplicate submissions".)
            '''
            url = 'https://passport.lagou.com/login/login.html'
            header = {
                "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36"
            }
            self.s.headers.update(header)
            data = self.s.get(url, headers=header)
            soup = BeautifulSoup(data.content, 'html.parser', from_encoding='utf-8')
            token_code = {}
            try:
                # First locate the second <script> node
                t = soup.find_all('script')[1].get_text()
                # print(t)
                # Then pull the values out with regular expressions
                token_code['X_Anti_Forge_Token'] = re.findall(r"Token = '(.+?)'", t)[0]
                token_code['X_Anti_Forge_Code'] = re.findall(r"Code = '(.+?)'", t)[0]
                return token_code
            except Exception as e:
                print('Failed to get the token and code:', e)
                token_code['X_Anti_Forge_Token'] = ''
                token_code['X_Anti_Forge_Code'] = ''
                # print(token_code)
                return token_code

        '''
        def encryptPwd(self, passwd):
            # The password is hashed with MD5 twice
            passwd = hashlib.md5(passwd.encode('utf-8')).hexdigest()
            # 'veenike' is a hard-coded value found in a JS file
            passwd = 'veenike' + passwd + 'veenike'
            passwd = hashlib.md5(passwd.encode('utf-8')).hexdigest()
            return passwd
        '''

        def login(self, user, password):
            '''
            Log in to the Lagou website using the session passed to __init__.
            :param user: account name
            :param password: password, already encrypted (copied from a captured request)
            :return: the JSON response
            '''
            gtoken = self.get_token()
            url = 'https://passport.lagou.com/login/login.json'
            h = {
                "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36",
                "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
                "X-Requested-With": "XMLHttpRequest",
                # Values come from get_token(); the "Anit" spelling of the
                # header names is kept exactly as in the original code
                "X-Anit-Forge-Token": gtoken['X_Anti_Forge_Token'],
                "X-Anit-Forge-Code": gtoken['X_Anti_Forge_Code'],
                "Referer": "https://passport.lagou.com/login/login.html",
            }
            self.s.headers.update(h)  # update the session headers
            d = {
                "isValidate": 'true',
                "username": user,
                "password": password,
                "request_form_verifyCode": "",
                "submit": ""
            }
            r2 = self.s.post(url, headers=h, data=d, verify=False)
            # print(r2.text)
            result = r2.json()
            print(result)
            return result


    if __name__ == '__main__':
        s = requests.session()
        la = Lagou(s)
        la.login('xxxxx', 'xxxxxxx')
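The Fiddler capture also shows that the password is not sent in plain text. I don't know the exact scheme, but from what I've read online, the password is hashed with MD5, then salted and hashed with MD5 a second time.
I won't go into that here; just copy the already-encrypted password from your own captured request.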
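For anyone curious, the double-MD5 scheme described above could be sketched like this. It mirrors the commented-out `encryptPwd` in the full code. The salt `'veenike'` is taken from that snippet and I haven't verified it against the site, so treat it as an assumption; the function name `encrypt_pwd` is just my own.

```python
import hashlib

# Assumption: salt copied from the commented-out encryptPwd in the full code;
# it is reported to be hard-coded in a JS file and has not been verified here.
SALT = "veenike"


def encrypt_pwd(passwd, salt=SALT):
    """Hash the password with MD5, wrap the hex digest in the salt, hash again."""
    first = hashlib.md5(passwd.encode("utf-8")).hexdigest()
    salted = salt + first + salt
    return hashlib.md5(salted.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Prints a 32-character hex string to compare with the captured request.
    print(encrypt_pwd("my-password"))
```

If the output doesn't match the password value in your own Fiddler capture, the scheme or the salt is different from what I assumed, and copying the captured value is still the safer route.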
This is a fairly simple write-up; if anything is unclear, you can ask me on QQ: 970185127