Implementing a simple web server with TCP


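How it works

A web server works by exchanging specifically formatted text with a client over a TCP connection; that format is the HTTP protocol. When the server receives a request, it replies with a response header followed by the data of the page to display, and the browser then renders that data.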
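To make the exchange concrete, here is a minimal sketch that plays both roles in one process. It assumes `socket.socketpair()` as a stand-in for a real browser-to-server connection, and the `<h1>hello</h1>` body is made up for illustration.

```python
import socket

# socketpair() returns two connected sockets; here they stand in for
# the browser side and the server side of one TCP connection.
client, server = socket.socketpair()

# What a browser sends: a request line, headers, then a blank line.
client.sendall(b"GET / HTTP/1.1\r\nHost: 127.0.0.1:7890\r\n\r\n")

# What the server does: read the request, then answer with a status line,
# a blank line that ends the header, and the page data.
request = server.recv(1024).decode("utf-8")
server.sendall(b"HTTP/1.1 200 OK\r\n\r\n<h1>hello</h1>")
server.close()

# The client reads until the server closes its end.
reply = b""
while True:
    chunk = client.recv(1024)
    if not chunk:
        break
    reply += chunk
client.close()

print(request.splitlines()[0])
print(reply)
```

Both directions are plain text over the same connection, which is why an ordinary TCP socket is enough to talk to a browser.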

request headers


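Open any web page, press F12, and select Network to see the headers sent with each request. In other words, to build our own server we need to receive this text whenever the browser sends a request.

Sending data to the client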
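As a rough illustration of what that text contains, the sketch below splits a request into its request line and headers. `parse_request` is a hypothetical helper, not part of the server in this post, and the sample is shortened from a real browser request.

```python
def parse_request(request):
    """Split raw HTTP request text into (method, path, version, headers)."""
    lines = request.splitlines()
    # First line looks like: GET /index_files/soutu.css HTTP/1.1
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:  # a blank line ends the header section
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


sample = (
    "GET /index_files/soutu.css HTTP/1.1\r\n"
    "Host: 127.0.0.1:7890\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)
method, path, version, headers = parse_request(sample)
print(method, path, version)
print(headers["host"])
```

Header names are lowercased here because HTTP header names are case-insensitive.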

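def service_client(new_socket):
    # Receive the request sent by the browser (up to 1024 bytes)
    request = new_socket.recv(1024).decode('utf-8')
    print(request)
    request_list = request.splitlines()
    print("")
    print("*" * 20)
    print(request_list)

    # Response header: status line, then a blank line that ends the header
    response = "HTTP/1.1 200 OK\r\n"
    response += "\r\n"
    # response += "hahahaha"

    # Response body: the page to display (the with block closes the file)
    with open("index.html", 'rb') as fp:
        html_content = fp.read()

    # sendall keeps sending until all the data has gone out
    new_socket.sendall(response.encode('utf-8'))
    new_socket.sendall(html_content)

    new_socket.close()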

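When the server receives a connection, it first replies with the response header, then reads the page data to display and sends it to the client, so the browser can render the page.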
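The response above contains only the status line. A slightly fuller header could be sketched as follows; `build_response` is a hypothetical helper, and the choice of `Content-Type`, `Content-Length`, and `Connection: close` is an assumption about what is useful here, not something the original server sends.

```python
def build_response(body, content_type="text/html; charset=utf-8"):
    """Return status line + headers + blank line + body as bytes."""
    header = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"  # body length in bytes
        "Connection: close\r\n"
        "\r\n"  # blank line separates header from body
    )
    return header.encode("utf-8") + body


response = build_response(b"<h1>hello</h1>")
print(response.decode("utf-8"))
```

With `Content-Length` present the browser knows where the body ends without waiting for the socket to close.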

Code

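import socket
import re  # used later to pull the requested path out of the request line


def service_client(new_socket):
    # Receive the request sent by the browser (up to 1024 bytes)
    request = new_socket.recv(1024).decode('utf-8')
    # print(request)
    request_list = request.splitlines()
    print("")
    print("*" * 20)
    print(request_list)

    # Response header: status line, then a blank line that ends the header
    response = "HTTP/1.1 200 OK\r\n"
    response += "\r\n"
    # response += "hahahaha"

    # Response body: the page to display (the with block closes the file)
    with open("index.html", 'rb') as fp:
        html_content = fp.read()

    # sendall keeps sending until all the data has gone out
    new_socket.sendall(response.encode('utf-8'))
    new_socket.sendall(html_content)

    new_socket.close()


def main():
    tcp_server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow the port to be reused right after the server restarts
    tcp_server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    tcp_server_socket.bind(("", 7890))
    tcp_server_socket.listen(128)

    try:
        while True:
            # Wait for a client, then serve it on the new socket
            new_client_socket, new_client_addr = tcp_server_socket.accept()
            service_client(new_client_socket)
    finally:
        # Runs when the loop is interrupted, e.g. by Ctrl+C
        tcp_server_socket.close()


if __name__ == '__main__':
    main()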

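Improvements

For the demo I saved a copy of the Baidu home page, which contains many images. With the code above you will find that some images and styles do not show up, because the server answers every request with index.html. Let's add support for those files as well.

With the server above running, each client request prints information like the following in the terminal: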

['GET /index_files/soutu.css HTTP/1.1', 'Host: 127.0.0.1:7890', 'Connection: keep-alive', 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36', 'Accept: text/css,*/*;q=0.1', 'Referer: http://127.0.0.1:7890/', 'Accept-Encoding: gzip, deflate, br', 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8', '']
GET /index_files/swfobject_0178953.js.%E4%B8%8B%E8%BD%BD HTTP/1.1
Host: 127.0.0.1:7890
Connection: keep-alive
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36
Accept: */*
Referer: http://127.0.0.1:7890/
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8

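Notice that on the first line, the text between GET and HTTP is the path of the image or stylesheet being requested. So we need a regular expression to extract that field: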

ret = re.match(r"[^/]+(/[^ ]*)", request_list[0])
file_name = ret.group(1)
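The pattern can be tried on its own before wiring it into the server. In this sketch `extract_path` is a hypothetical wrapper added for illustration; it also returns `None` when the line does not match, a case the two lines above do not handle.

```python
import re


def extract_path(request_line):
    """Return the path between the method and the HTTP version, or None."""
    # [^/]+ skips the method and the space; (/[^ ]*) captures from the
    # first "/" up to the next space.
    ret = re.match(r"[^/]+(/[^ ]*)", request_line)
    return ret.group(1) if ret else None


print(extract_path("GET /index_files/soutu.css HTTP/1.1"))
print(extract_path("GET / HTTP/1.1"))
print(extract_path(""))
```

An empty request line can occur when a client connects and closes without sending anything, so checking for `None` avoids an `AttributeError` on `ret.group(1)`.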

We can add a print statement to show file_name for each request:

/
/index_files/swfobject_0178953.js.%E4%B8%8B%E8%BD%BD
/index_files/tu_77547af.js.%E4%B8%8B%E8%BD%BD
/index_files/soutu.css
/index_files/search-sug_b3528ce.js.%E4%B8%8B%E8%BD%BD
/index_files/bd_logo1.png
/index_files/bd_logo1(1).png
/index_files/baidu_jgylogo3.gif
/index_files/baidu_resultlogo@2.png
/index_files/jquery-1.10.2.min_65682a2.js.%E4%B8%8B%E8%BD%BD
/index_files/all_async_search_71ba635.js.%E4%B8%8B%E8%BD%BD

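We can then use this path when opening the file and send that file's data to the client. Because file_name starts with /, it has to be resolved against the directory that holds the page files rather than opened as-is.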
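One possible way to turn file_name into a reply is sketched below. `resolve_path` and `build_reply` are hypothetical helpers, and mapping `/` to `index.html`, decoding percent-encoded names with `urllib.parse.unquote`, rejecting paths that leave the root directory, and answering 404 for missing files are all assumptions about the desired behavior rather than the author's code.

```python
import os
from urllib.parse import unquote


def resolve_path(root, file_name):
    """Map a request path to a file under root, or None if it escapes root."""
    if file_name == "/":
        file_name = "/index.html"  # assumed default page
    # Browsers percent-encode non-ASCII names, e.g. %E4%B8%8B%E8%BD%BD
    relative = unquote(file_name).lstrip("/")
    root = os.path.abspath(root)
    full = os.path.abspath(os.path.join(root, relative))
    # Refuse paths such as /../secret that point outside root
    if os.path.commonpath([root, full]) != root:
        return None
    return full


def build_reply(root, file_name):
    """Return the full response bytes for one requested path."""
    full = resolve_path(root, file_name)
    try:
        if full is None:
            raise FileNotFoundError(file_name)
        with open(full, "rb") as fp:
            body = fp.read()
    except OSError:
        return b"HTTP/1.1 404 NOT FOUND\r\n\r\n" + b"file not found"
    return b"HTTP/1.1 200 OK\r\n\r\n" + body
```

Inside `service_client`, the two send calls could then be replaced by something like `new_socket.sendall(build_reply(".", file_name))`. Query strings such as `?v=1` are not handled in this sketch.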