How does the socket API accept() function work?

networking sockets tcp

The socket API is the de facto standard for TCP/IP and UDP/IP communications (that is, networking code as we know it). However, one of its core functions, accept(), is a bit magical.

To borrow a semi-formal definition:

accept() is used on the server side. It accepts a received incoming attempt to create a new TCP connection from the remote client, and creates a new socket associated with the socket address pair of this connection.

In other words, accept returns a new socket through which the server can communicate with the newly connected client. The old socket (on which accept was called) stays open, on the same port, listening for new connections.
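To make that concrete, here is a minimal sketch using Python's standard socket module on the loopback interface. The variable names are only illustrative, and port 0 simply asks the OS to pick a free port. It shows that the socket returned by accept() is a separate socket that shares the listener's local address and port:

```python
import socket

# Listening socket bound to an OS-chosen loopback port (port 0 = "pick one for me").
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
server_addr = listener.getsockname()

# A client connects; the kernel completes the handshake and queues the connection.
client = socket.create_connection(server_addr)

# accept() hands back a NEW socket object dedicated to this one connection.
conn, peer = listener.accept()

same_local_port = conn.getsockname() == server_addr     # no new port was opened
distinct_sockets = conn.fileno() != listener.fileno()   # but it is a different socket
peer_is_client = peer == client.getsockname()           # remote end is the client

print("listener:", server_addr)
print("accepted:", conn.getsockname(), "<->", peer)

conn.close()
client.close()
listener.close()
```

The listener itself never carries application data; it only produces connected sockets.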

How does accept work? How is it implemented? There's a lot of confusion on this topic. Many people claim accept opens a new port and you communicate with the client through it. But this obviously isn't true, as no new port is opened. You actually can communicate through the same port with different clients, but how? When several threads call recv on the same port, how does the data know where to go?

I guess it's something along the lines of the client's address being associated with a socket descriptor, and whenever data comes through recv it's routed to the correct socket, but I'm not sure.

It'd be great to get a thorough explanation of the inner workings of this mechanism.

So for every client request, a brand-new socket is opened at the server end. The server must always be listening on port 80 for incoming calls. If it receives a call, it immediately creates a new socket identified by the 4-tuple described below, which forms the TCP connection between client and server. Is my understanding correct?

This is a very fundamental question and I was recently tested on this in an interview: stackoverflow.com/questions/24871827/… If you have any comments on this, please post

@brainstorm Only if you completely ignore the existence of HTTP keep-alive.

Stefan van den Akker

Your confusion lies in thinking that a socket is identified by Server IP : Server Port. In actuality, sockets are uniquely identified by a quartet of information:

Client IP : Client Port and Server IP : Server Port

So while the Server IP and Server Port are constant in all accepted connections, the client side information is what allows it to keep track of where everything is going.

Example to clarify things:

Say we have a server at 192.168.1.1:80 and two clients, 10.0.0.1 and 10.0.0.2.

10.0.0.1 opens a connection on local port 1234 and connects to the server. Now the server has one socket identified as follows:

10.0.0.1:1234 - 192.168.1.1:80

Now 10.0.0.2 opens a connection on local port 5678 and connects to the server. Now the server has two sockets identified as follows:

10.0.0.1:1234 - 192.168.1.1:80  
10.0.0.2:5678 - 192.168.1.1:80
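The same situation can be reproduced on a single machine. This is a rough sketch using Python's standard socket module; both clients run on loopback (an assumption made so the example is self-contained), so they differ only in their client port rather than their IP:

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
listener.listen()
server_addr = listener.getsockname()

# Two clients connect to the very same server IP and port.
client_a = socket.create_connection(server_addr)
client_b = socket.create_connection(server_addr)

# Accept both connections and index them by the client's (IP, port).
accepted = {}
for _ in range(2):
    conn, peer = listener.accept()
    accepted[peer] = conn

conn_a = accepted[client_a.getsockname()]
conn_b = accepted[client_b.getsockname()]

# Send in the "wrong" order on purpose; each message still reaches its own socket.
client_b.sendall(b"from B")
client_a.sendall(b"from A")
got_a = conn_a.recv(1024)
got_b = conn_b.recv(1024)

local_ends = {conn_a.getsockname(), conn_b.getsockname()}    # one shared server IP:port
remote_ends = {conn_a.getpeername(), conn_b.getpeername()}   # two distinct client IP:port

for s in (conn_a, conn_b, client_a, client_b, listener):
    s.close()
```

Both accepted sockets report the same local address, and only the remote half of the quartet tells them apart.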

I don't know the implementation details (which probably vary from platform to platform). I just know that conceptually the sockets are identified by the quartet of information I described.

Do you have any reference on this?

Random question: What happens if NAT is being used, and two clients on the same network attempt to use the same local port when connecting to the server? For instance, suppose 10.0.0.1 and 10.0.0.2 are both connected to a router with an external IP of 192.168.0.1, so the server at 192.168.1.1 sees two connections from 192.168.0.1. What happens in that case if, by some fluke of the random-number generator, both 10.0.0.1 and 10.0.0.2 choose the same local port?

The NAT support in the router takes care of the details there. The router rewrites the source address and port of outgoing packets, so the server sees the two connections coming from two different ports, for example 192.168.0.1:1234 and 192.168.0.1:5678. Incoming traffic is then translated back and forwarded by the router to the correct client.
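The router's bookkeeping can be pictured as a translation table. The following is only a toy model in Python: the ToyNat class, its method names, and the starting port 40000 are all made up for illustration, and real NAT implementations also track the protocol, timeouts, and usually the destination. It does show why two clients that pick the same local port still don't collide:

```python
from itertools import count


class ToyNat:
    """Toy port-translating NAT; not how any particular router is implemented."""

    def __init__(self, external_ip, first_port=40000):
        self.external_ip = external_ip
        self._next_port = count(first_port)   # hypothetical pool of external ports
        self._out = {}    # (internal ip, internal port) -> external port
        self._back = {}   # external port -> (internal ip, internal port)

    def translate_outgoing(self, internal_ip, internal_port):
        """Return the (external ip, external port) the server will see."""
        key = (internal_ip, internal_port)
        if key not in self._out:
            external_port = next(self._next_port)
            self._out[key] = external_port
            self._back[external_port] = key
        return (self.external_ip, self._out[key])

    def translate_incoming(self, external_port):
        """Map a reply arriving at an external port back to the internal client."""
        return self._back[external_port]


nat = ToyNat("192.168.0.1")
seen_a = nat.translate_outgoing("10.0.0.1", 1234)
seen_b = nat.translate_outgoing("10.0.0.2", 1234)   # same local port as the first client

print(seen_a, seen_b)
print(nat.translate_incoming(seen_b[1]))
```

From the server's point of view the quartets still differ, because the router hands out distinct external ports.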

If a socket is identified by the quartet, what is the quartet information of a listening socket?

Methos

Just to add to the answer given by user "17 of 26"

The socket is actually identified by a 5-tuple: (source IP, source port, destination IP, destination port, protocol). Here the protocol could be TCP, UDP, or any other transport-layer protocol. The protocol is identified from the 'protocol' field in the IP datagram.

Thus it is possible to have two different applications on the server communicating with the same client on exactly the same 4-tuple but with a different protocol field. For example:

Apache at server side talking on (server1.com:880-client1:1234 on TCP) and World of Warcraft talking on (server1.com:880-client1:1234 on UDP)

Both the client and the server can handle this, because the protocol field in the IP packet differs between the two cases even though the other four fields are the same.
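A quick way to see this for yourself is to bind a TCP socket and a UDP socket to the same port number. This is a small sketch with Python's standard socket module on loopback; the helper name bind_tcp_and_udp_on_same_port is made up, and the retry loop only guards against the unlikely case that some other process already holds the matching UDP port:

```python
import socket


def bind_tcp_and_udp_on_same_port(host="127.0.0.1", attempts=10):
    """Return (tcp_listener, udp_socket) bound to the same host and port number."""
    for _ in range(attempts):
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        tcp.bind((host, 0))                      # OS picks a free TCP port
        port = tcp.getsockname()[1]
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            udp.bind((host, port))               # same number, different protocol
        except OSError:
            tcp.close()
            udp.close()
            continue
        tcp.listen()
        return tcp, udp
    raise RuntimeError("could not find a port free for both TCP and UDP")


tcp, udp = bind_tcp_and_udp_on_same_port()
udp.settimeout(5)
addr = tcp.getsockname()

# A datagram and a stream sent to the same IP:port land on different sockets.
udp_client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_client.sendto(b"datagram", addr)
udp_payload, _ = udp.recvfrom(1024)

tcp_client = socket.create_connection(addr)
conn, _ = tcp.accept()
tcp_client.sendall(b"stream")
tcp_payload = conn.recv(1024)

same_port = tcp.getsockname()[1] == udp.getsockname()[1]

for s in (conn, tcp_client, udp_client, tcp, udp):
    s.close()
```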

a2800276

What confused me when I was learning this was that the terms socket and port suggest something physical, when in fact they're just data structures the kernel uses to abstract the details of networking.

As such, the data structures are implemented to be able to distinguish connections from different clients. As to how they're implemented, the answer is either a.) it doesn't matter, since the purpose of the sockets API is precisely that the implementation shouldn't matter, or b.) just have a look. Apart from the highly recommended Stevens books, which provide a detailed description of one implementation, check out the source of Linux, Solaris, or one of the BSDs.

Yes, most of the networking terminology is just assigning names to certain collections of bits and to decisions taken based on their values ("protocol identifier", "routing", "binding", "socket" etc.). All your network card's hardware is designed to receive is a stream of bits. What happens to them in relation to programs on your computer is decided by the driver and OS. We could get rid of all of that terminology tomorrow if we wanted, but the principle of delivering a stream of bits seems fundamental...

Anonymous

As the other guy said, a socket is uniquely identified by a 4-tuple (Client IP, Client Port, Server IP, Server Port).

The server process running on the Server IP maintains a database (meaning I don't care what kind of table/list/tree/array/magic data structure it uses) of active sockets and listens on the Server Port. When it receives a message (via the server's TCP/IP stack), it checks the Client IP and Port against the database. If the Client IP and Client Port are found in a database entry, the message is handed off to an existing handler, else a new database entry is created and a new handler spawned to handle that socket.

In the early days of the ARPAnet, certain protocols (FTP for one) would listen to a specified port for connection requests, and reply with a handoff port. Further communications for that connection would go over the handoff port. This was done to improve per-packet performance: computers were several orders of magnitude slower in those days.

Can you elaborate on the 'handoff port' part?

This is either a description of some pre-TCP protocol, or overly simplified. A client attempting to connect to a listening socket sends a special packet to establish the connection (SYN bit set). There's a clear distinction between a packet creating a new socket and one using an existing socket.
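That distinction can be sketched as a lookup done by the protocol stack rather than by the server process. The following is a deliberately simplified toy model in Python; ToyTcpDemux and its methods are hypothetical names, and a real stack also handles the rest of the handshake, sequence numbers, wildcard addresses, and so on:

```python
class ToyTcpDemux:
    """Toy model of how a TCP stack could route segments; not a real implementation."""

    def __init__(self):
        self.listeners = {}     # (local ip, local port) -> listening socket name
        self.established = {}   # (remote ip, remote port, local ip, local port) -> connection name

    def listen(self, local_ip, local_port, name):
        self.listeners[(local_ip, local_port)] = name

    def deliver(self, remote, local, syn=False):
        """Return the socket that should receive this segment, or None."""
        key = remote + local
        if key in self.established:
            # Segment belongs to an existing connection: exact 4-tuple match.
            return self.established[key]
        if syn and local in self.listeners:
            # Connection request: matched against the listener, which yields a new
            # connected socket (what accept() would later hand to the application).
            name = f"{self.listeners[local]}/conn{len(self.established) + 1}"
            self.established[key] = name
            return name
        # No match; a real stack would typically reject this, e.g. with a reset.
        return None


demux = ToyTcpDemux()
server = ("192.168.1.1", 80)
demux.listen(*server, name="web")

first = demux.deliver(("10.0.0.1", 1234), server, syn=True)
second = demux.deliver(("10.0.0.2", 5678), server, syn=True)
data = demux.deliver(("10.0.0.1", 1234), server)    # later data from the first client
stray = demux.deliver(("10.0.0.3", 9999), server)   # no SYN and no known connection

print(first, second, data, stray)
```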

"...sends a special packet to establish the connection (SYN bit set)." Which (as I understand it) causes the protocol stack to hand it to 'the' listener (if any), which is why there can be only one listening socket per address/port/protocol combination. I'm not sure whether this is in the spec or merely an implementation convention, though.

The second paragraph does not correctly describe what happens either at the TCP layer or within a server process. Server processes don't need to maintain data structures of sockets of any kind, or to check incoming IP:port pairs against anything whatsoever. That's what sockets are there for. FTP uses a separate port for data, not for all 'further communications', and that's done to simplify the protocol, not for performance reasons. Using a new port will not improve performance in any way whatsoever.

"maintains a database (meaning I don't care what kind of table/list/tree/array/magic data structure it uses)" :) I usually call this a "Table" (or maybe "Graph" or "Decision tree"). "Database" suggests some implementation to me.
