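For packet capture you can use HTTP Analyzer or Fiddler, but both look rather complicated. Firefox's built-in network tools are easier to work with (Chrome is far less convenient than Firefox for this).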
First, once the username has been entered, the page performs a pre-login request to this URL:
http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=ZW5nbGFuZHNldSU0MDE2My5jb20%3D&rsakt=mod&checkpin=1&client=ssologin.js(v1.4.18)&_=1443156845536
The response looks like this:
sinaSSOController.preloginCallBack({"retcode":0,"servertime":1443156842,"pcid":"gz-e88b75a929252baec7c12c741985eaa45627","nonce":"2L4IZ3","pubkey":"EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245A87AC253062882729293E5506350508E7F9AA3BB77F4333231490F915F6D63C55FE2F08A49B353F444AD3993CACC02DB784ABBB8E42A9B1BBFFFB38BE18D78E87A0E41B9B8F73A928EE0CCEE1F6739884B9777E4FE9E88A1BBE495927AC4A799B3181D6442443","rsakv":"1330428213","showpin":0,"exectime":16})
From it we get four useful values: servertime, nonce, pubkey, and rsakv.
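The pre-login response is JSONP rather than plain JSON, so the callback wrapper has to be stripped before the values can be read. A minimal sketch, assuming the body always has the form callback({...}). The helper name parse_prelogin is hypothetical, and the sample string is the response shown above with pubkey shortened for readability.

```python
import json
import re


def parse_prelogin(body):
    """Extract the JSON object from a JSONP body such as callback({...})."""
    match = re.search(r"\((\{.*\})\)", body, re.S)
    if match is None:
        raise ValueError("unexpected pre-login response: %r" % body[:80])
    return json.loads(match.group(1))


# Sample taken from the response above; pubkey is shortened here for readability.
sample = (
    'sinaSSOController.preloginCallBack({"retcode":0,"servertime":1443156842,'
    '"pcid":"gz-e88b75a929252baec7c12c741985eaa45627","nonce":"2L4IZ3",'
    '"pubkey":"EB2A3856","rsakv":"1330428213","showpin":0,"exectime":16})'
)

data = parse_prelogin(sample)
print(data["servertime"], data["nonce"], data["rsakv"], data["showpin"])
```

Parsing with json.loads avoids evaluating server-supplied text as Python code.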
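Sina Weibo currently encodes the username with Base64 (an encoding, not real encryption), while the login password is encrypted with RSA (sent as pwencode=rsa2). This is the crux of simulating the login. You first need to build an RSA public key, and Sina Weibo supplies fixed values for both of its parameters: the first is the pubkey from the pre-login step, and the second is '10001' from the JavaScript encryption file (update in response to a reader's question: it is in the response for ssologin.js). Both values are hexadecimal and must be converted to decimal; 10001 in hexadecimal is 65537 in decimal. The password is then combined with servertime and nonce, and the combined string is encrypted with that key.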
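The two transformations described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the article's own code: the full script uses the third-party rsa package, whose rsa.encrypt applies PKCS#1 v1.5 padding, and the padding is written out by hand here only so the example runs without dependencies. The names encode_username, pkcs1_v15_pad, and encrypt_password are hypothetical.

```python
import base64
import binascii
import os
from urllib.parse import quote_plus


def encode_username(username):
    """URL-quote the username, then Base64-encode it (the `su` field)."""
    quoted = quote_plus(username)
    return base64.b64encode(quoted.encode("utf-8")).decode("ascii")


def pkcs1_v15_pad(message, key_len):
    """Build 0x00 0x02 <nonzero random bytes> 0x00 <message>."""
    pad_len = key_len - len(message) - 3
    if pad_len < 8:
        raise ValueError("message too long for this key size")
    padding = b""
    while len(padding) < pad_len:
        # Drop zero bytes: the padding itself must not contain 0x00.
        chunk = os.urandom(pad_len - len(padding)).replace(b"\x00", b"")
        padding += chunk
    return b"\x00\x02" + padding + b"\x00" + message


def encrypt_password(password, servertime, nonce, pubkey_hex, exponent_hex="10001"):
    """Encrypt servertime, nonce and password and return the result as hex."""
    n = int(pubkey_hex, 16)    # modulus, given in hexadecimal by pre-login
    e = int(exponent_hex, 16)  # 0x10001 == 65537
    key_len = (n.bit_length() + 7) // 8
    # Plaintext layout: servertime, tab, nonce, newline, password.
    message = ("%s\t%s\n%s" % (servertime, nonce, password)).encode("utf-8")
    padded = int.from_bytes(pkcs1_v15_pad(message, key_len), "big")
    cipher = pow(padded, e, n)
    return binascii.hexlify(cipher.to_bytes(key_len, "big")).decode("ascii")


print(encode_username("englandseu@163.com"))  # -> ZW5nbGFuZHNldSU0MDE2My5jb20=
print(int("10001", 16))                       # -> 65537
```

The first printed value matches the su parameter in the pre-login URL above once its trailing %3D is unescaped to "=". Because the padding is random, encrypt_password returns a different hex string on every call for the same input.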
The data to submit is:
postdata = {
    'entry': 'weibo',
    'gateway': '1',
    'from': '',
    'savestate': '7',
    'useticket': '1',
    'pagerefer': "http://login.sina.com.cn/sso/logout.php?entry=miniblog&r=http%3A%2F%2Fweibo.com%2Flogout.php%3Fbackurl",
    'vsnf': '1',
    'su': su,
    'service': 'miniblog',
    'servertime': servertime,
    'nonce': nonce,
    'pwencode': 'rsa2',
    'rsakv': rsakv,
    'sp': password_secret,
    'sr': '1366*768',
    'encoding': 'UTF-8',
    'prelt': '115',
    'url': 'http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack',
    'returntype': 'META'
}
After the data is submitted, the response redirects, so the redirect URL must also be extracted and requested.
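Judging by the regular expression used in the full script, the redirect arrives as a location.replace(...) call inside the response body rather than as an HTTP redirect, so it has to be pulled out of the text by hand. A minimal sketch: the helper name extract_redirect_url and the sample body are hypothetical, and only the pattern comes from the script.

```python
import re

# Same pattern as in the full script: captures the URL passed to location.replace().
REDIRECT_PATTERN = re.compile(r'location\.replace\([\'"](.*?)[\'"]\)')


def extract_redirect_url(html):
    """Return the URL passed to location.replace(), or None if there is none."""
    match = REDIRECT_PATTERN.search(html)
    return match.group(1) if match else None


# Hypothetical response body, used only to exercise the pattern.
sample_html = '<script>location.replace("http://example.com/next?ticket=abc");</script>'
print(extract_redirect_url(sample_html))
print(extract_redirect_url("<html>no redirect here</html>"))
```

Returning None instead of indexing into an empty list makes a failed login easier to tell apart from a parsing bug.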
The code uses Python 3.
import base64
import binascii
import json
import random
import re
import time

import requests
import rsa

try:
    from PIL import Image
except ImportError:
    pass
try:
    from urllib.parse import quote_plus
except ImportError:
    from urllib import quote_plus

# If login protection is not enabled, you can log in without a captcha.
# If login protection is enabled, a captcha must be entered.

# Build the request headers
agent = 'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0'
headers = {
    'User-Agent': agent
}

session = requests.session()

# Visit the initial page first so the session picks up its cookies
index_url = "http://weibo.com/login.php"
try:
    session.get(index_url, headers=headers, timeout=2)
except requests.RequestException:
    session.get(index_url, headers=headers)
try:
    input = raw_input
except NameError:
    pass


def get_su(username):
    """
    Encode an email address or phone number.
    JavaScript's encodeURIComponent corresponds to
    urllib.parse.quote_plus in Python 3; the quoted value is then
    Base64-encoded and decoded back to a str.
    """
    username_quote = quote_plus(username)
    username_base64 = base64.b64encode(username_quote.encode("utf-8"))
    return username_base64.decode("utf-8")


# Pre-login: obtain servertime, nonce, pubkey, rsakv
def get_server_data(su):
    pre_url = "http://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su="
    pre_url = pre_url + su + "&rsakt=mod&checkpin=1&client=ssologin.js(v1.4.18)&_="
    pre_url = pre_url + str(int(time.time() * 1000))
    pre_data_res = session.get(pre_url, headers=headers)
    # The body is JSONP, callback({...}); parse the JSON inside the
    # parentheses instead of eval()-ing text received from the server.
    text = pre_data_res.content.decode("utf-8")
    server_data = json.loads(re.search(r"\((.*)\)", text, re.S).group(1))
    return server_data


def get_password(password, servertime, nonce, pubkey):
    rsa_publickey = int(pubkey, 16)
    key = rsa.PublicKey(rsa_publickey, 65537)  # build the public key
    # Plaintext layout, taken from the JavaScript encryption file
    message = str(servertime) + '\t' + str(nonce) + '\n' + str(password)
    message = message.encode("utf-8")
    passwd = rsa.encrypt(message, key)  # encrypt
    passwd = binascii.b2a_hex(passwd)  # convert the ciphertext to hexadecimal
    return passwd


def get_cha(pcid):
    cha_url = "http://login.sina.com.cn/cgi/pin.php?r="
    cha_url = cha_url + str(int(random.random() * 100000000)) + "&s=0&p="
    cha_url = cha_url + pcid
    cha_page = session.get(cha_url, headers=headers)
    with open("cha.jpg", 'wb') as f:
        f.write(cha_page.content)
    try:
        im = Image.open("cha.jpg")
        im.show()
        im.close()
    except Exception:
        # Pillow is missing or the image could not be displayed
        print(u"Open cha.jpg in the current directory, then enter the captcha")


def login(username, password):
    # su is the encoded username
    su = get_su(username)
    server_data = get_server_data(su)
    servertime = server_data["servertime"]
    nonce = server_data['nonce']
    rsakv = server_data["rsakv"]
    pubkey = server_data["pubkey"]
    showpin = server_data["showpin"]
    password_secret = get_password(password, servertime, nonce, pubkey)

    postdata = {
        'entry': 'weibo',
        'gateway': '1',
        'from': '',
        'savestate': '7',
        'useticket': '1',
        'pagerefer': "http://login.sina.com.cn/sso/logout.php?entry=miniblog&r=http%3A%2F%2Fweibo.com%2Flogout.php%3Fbackurl",
        'vsnf': '1',
        'su': su,
        'service': 'miniblog',
        'servertime': servertime,
        'nonce': nonce,
        'pwencode': 'rsa2',
        'rsakv': rsakv,
        'sp': password_secret,
        'sr': '1366*768',
        'encoding': 'UTF-8',
        'prelt': '115',
        'url': 'http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack',
        'returntype': 'META'
    }
    login_url = 'http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.18)'
    if showpin != 0:
        # Login protection is on: fetch the captcha and ask the user for it
        pcid = server_data["pcid"]
        get_cha(pcid)
        postdata['door'] = input(u"Enter the captcha: ")
    login_page = session.post(login_url, data=postdata, headers=headers)
    login_loop = login_page.content.decode("GBK")
    # Extract the redirect URL from the location.replace(...) call
    pa = r'location\.replace\([\'"](.*?)[\'"]\)'
    loop_url = re.findall(pa, login_loop)[0]
    # A check for whether the login succeeded could be added here;
    # to be written in the next revision
    login_index = session.get(loop_url, headers=headers)
    uuid = login_index.text
    uuid_pa = r'"uniqueid":"(.*?)"'
    uuid_res = re.findall(uuid_pa, uuid, re.S)[0]
    web_weibo_url = "http://weibo.com/%s/profile?topnav=1&wvr=6&is_all=1" % uuid_res
    weibo_page = session.get(web_weibo_url, headers=headers)
    weibo_pa = r'<title>(.*?)</title>'
    user_id = re.findall(weibo_pa, weibo_page.content.decode("utf-8", 'ignore'), re.S)[0]
    print(u"Welcome %s, login succeeded" % user_id)


if __name__ == "__main__":
    username = input(u'Username: ')
    password = input(u'Password: ')
    login(username, password)
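The script leaves the login-success check as a future improvement. One possible shape for it is sketched below, under an assumption the article does not confirm: that the redirect URL carries a retcode query parameter and that 0 means success, mirroring the retcode field in the pre-login response. The helper name login_succeeded and both sample URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def login_succeeded(redirect_url):
    """Return True if the redirect URL reports retcode=0 (assumed to mean success)."""
    query = parse_qs(urlsplit(redirect_url).query)
    # parse_qs maps each key to a list of values; take the first retcode, if any.
    return query.get("retcode", [None])[0] == "0"


# Hypothetical URLs, used only to show the two outcomes.
print(login_succeeded("http://example.com/ajaxlogin.php?framelogin=1&retcode=0"))
print(login_succeeded("http://example.com/ajaxlogin.php?framelogin=1&retcode=101"))
```

If the assumption holds, calling this on loop_url before requesting it would let the script stop with a clear message instead of failing later on a missing uniqueid.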
See also:
http://blog.csdn.net/andrewseu/article/details/48730735
http://www.jianshu.com/p/816594c83c74
https://github.com/ResolveWang