glugt-mgl

 

Web Servers

Page history last edited by ankit 2 yrs ago

What is a Web Server?

A web server is a program that runs on a host computer (also, confusingly enough, called a web server) and serves up web sites. In other words, the web server program waits for requests from visitors' web browsers for objects it has in its possession, and then sends these objects back for the visitor to view. Objects that web servers can serve include HTML documents, plain text, images, sounds, video, and other forms of data. These objects need not exist in static form; they may instead be generated on the fly by programs run by the server, of which CGI scripts are the most common.

or

A web server is a computer program responsible for accepting HTTP requests from clients, usually Web browsers, and serving them HTTP responses along with optional data content, usually Web pages such as HTML documents and linked objects (images, etc.).

Web servers and browsers communicate using HTTP (Hypertext Transfer Protocol), a simple but effective protocol for requesting and transmitting data over a network. This is why web servers are sometimes referred to as HTTP servers.
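The request/response exchange described above can be tried on a single machine. The following is only a minimal sketch using Python's standard library; the handler name HelloHandler, the page content, and the path /index.html are made up for illustration and are not part of any real site.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with one small HTML page."""

    def do_GET(self):
        body = b"<html><body><h1>Hello from a web server</h1></body></html>"
        self.send_response(200)                      # status line
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                       # the object itself

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once():
    """Start the server on a free loopback port, request one page, shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: OS picks a port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/index.html")           # the client's HTTP request
        resp = conn.getresponse()                    # the server's HTTP response
        result = (resp.status, resp.getheader("Content-Type"), resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, content_type, body = fetch_once()
print(status, content_type, len(body))
```

Here the same process plays both roles: the thread is the "web server" and http.client stands in for the browser.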

Features of a Web server

Some common basic features are:

 

  1. HTTP: every Web server program operates by accepting HTTP requests from the network and returning an HTTP response to the requester. The response typically carries an HTML document, but can also be a plain text file, an image, or some other type of document. If something is wrong with the client's request, or a problem occurs while serving it, the Web server must send an error response, which may include custom HTML or text to explain the problem to the end user.
  2. Logging: Web servers can usually also log detailed information about client requests and server responses to log files; this lets the Web administrator collect statistics by running log analyzers on those files.
  3. Authentication: an optional authorization request (a prompt for username and password) before allowing access to some or all resources.
  4. Dynamic content: handling not only static content (files stored in the server's filesystem(s)) but also dynamic content, by supporting one or more related interfaces (SSI, CGI, SCGI, PHP, ASP, Server API, ASP.NET, etc.).
  5. HTTPS support (via SSL or TLS), to allow secure (encrypted) connections to the server on the standard port 443 instead of the usual port 80.
  6. Content compression (e.g. gzip encoding), to reduce the size of responses and so lower bandwidth usage.
  7. Virtual hosting, to serve many web sites from a single IP address.
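As an illustration of the content compression feature, here is a rough sketch of how a server might decide whether to gzip a response. The function name maybe_compress is hypothetical, and the header parsing is deliberately simplified (it ignores quality values such as "gzip;q=0").

```python
import gzip


def maybe_compress(body, accept_encoding):
    """Return (payload, headers), gzip-compressed only if the client accepts it."""
    # Simplified parsing: take each listed encoding, drop any ";q=..." suffix.
    encodings = [e.split(";")[0].strip().lower() for e in accept_encoding.split(",")]
    if "gzip" in encodings:
        payload = gzip.compress(body)
        return payload, {"Content-Encoding": "gzip",
                         "Content-Length": str(len(payload))}
    return body, {"Content-Length": str(len(body))}


# Repetitive markup, as found in many HTML pages, compresses well.
page = b"<p>repetitive markup compresses well</p>\n" * 200
compressed, headers = maybe_compress(page, "gzip, deflate")
print(len(page), "->", len(compressed), headers["Content-Encoding"])
```

A client that does not list gzip in its Accept-Encoding header gets the body back unchanged, without a Content-Encoding header.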

Path translation

Web servers usually translate the path component of a Uniform Resource Locator (URL) into a local file system resource. The URL path specified by the client is relative to the Web server's root directory.

Consider the following URL as it would be requested by a client:

http://www.example.com/path/file.html

The client's Web browser will translate it into a connection to www.example.com with the following HTTP/1.1 request:

GET /path/file.html HTTP/1.1
Host: www.example.com

The Web server on www.example.com will append the given path to the path of its root directory. On Linux machines, this is commonly /var/www/html. The result is the local file system resource:

/var/www/html/path/file.html

The Web server will then read the file, if it exists, and send a response to the client's Web browser. The response will describe the content of the file and contain the file itself.
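The translation step can be sketched in a few lines of Python. This is an illustrative sketch, not how any particular server implements it; the function name translate_path is hypothetical, and the handling of ".." segments is an added assumption (real servers must also stop a request from escaping the root directory).

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOCUMENT_ROOT = "/var/www/html"  # the common Linux default named in the text


def translate_path(url_or_path, root=DOCUMENT_ROOT):
    """Map a URL (or bare request path) to a file path under the root directory."""
    # Keep only the path component and decode %xx escapes.
    path = unquote(urlsplit(url_or_path).path)
    # Collapse "." and ".." while anchored at "/", so ".." cannot climb above it.
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    # Append the cleaned request path to the root directory.
    return posixpath.join(root, normalized.lstrip("/"))


print(translate_path("http://www.example.com/path/file.html"))
print(translate_path("/../../etc/passwd"))
```

The first call gives the resource from the example above; the second shows that a request full of ".." segments still resolves to a path below the root directory.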

 

 

Socket

A socket is fundamentally an end point of communication.

It can be of two types: physical and logical. Logical sockets are created by the operating system through its system calls. For client-server access, a socket needs three things in order to provide or request a service:

 

1) Service name (example: telnet)

2) Protocol (TCP-stream)

3) Port no (23)

 

The service uses a protocol, and the protocol uses a port number, to provide the service at the server end and to obtain it at the client end. Ultimately, the port number is what mainly identifies the two ends of a client-server communication. The protocols supported by Linux are listed in /etc/protocols, and the services in /etc/services.

* The telnet service uses TCP/IP and communicates through port 23.

* The ftp service uses TCP/IP and communicates through ports 20 and 21.

* The www service uses HTTP, which runs over TCP/IP, and communicates through port 80.
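The mapping from service name and protocol to port number can be sketched by parsing text in the /etc/services format. The sample below is a hand-written excerpt in that format, not a copy of any real system's file, and the function name parse_services is hypothetical.

```python
# Hand-written sample in the /etc/services layout: name, port/protocol, aliases.
SAMPLE_SERVICES = """\
# service-name  port/protocol  [aliases]
ftp-data        20/tcp
ftp             21/tcp
telnet          23/tcp
http            80/tcp          www     # WorldWideWeb HTTP
"""


def parse_services(text):
    """Build a {(service name, protocol): port} table from services-format text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        name, port_proto, *aliases = line.split()
        port, proto = port_proto.split("/")
        for label in (name, *aliases):          # aliases share the same port
            table[(label, proto)] = int(port)
    return table


services = parse_services(SAMPLE_SERVICES)
print(services[("telnet", "tcp")], services[("www", "tcp")])
```

On a machine with a populated services database, Python's socket.getservbyname("telnet", "tcp") performs the same kind of lookup against the system's own data.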

Web communication

Web communication involves a client process of the browser type and a server process of the Web server type.

What actually happens when a user types http://www.yahoo.com? The browser extracts the protocol ("http") and the destination host from the URL and hands the request to the local machine's operating system, addressed to the destination machine. The operating system's network layers split the data into packets, and the HTTP header travels with the data inside them. This lets the remote machine hand the request over to its Web server. Why is this needed? Because many servers can run on the same machine, so individual services are distinguished by their protocol.

But what about telnet and ftp, which use the same protocol but have different server processes? They are distinguished by their port numbers: services may share a protocol, but not a port number. Next, the operating system passes the data from memory to the network interface card, which forwards it to the nearest gateway; from there it is sent on to the server machine.

At the server end, the network card signals the operating system that data carrying an HTTP header inside TCP/IP headers has arrived. The operating system recognizes the HTTP traffic by its destination port (80 by default) and looks for the Web server listening on that machine. When it finds the server, it hands over the data and turns its attention to other processes.
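The role of port numbers in picking the right server process can be seen with two tiny TCP services on one machine. This is a sketch under stated assumptions: the service names "telnet-like" and "ftp-like" are made up, and the ports are whatever free ports the operating system assigns, not the real ports 23 and 21.

```python
import socket
import threading


def start_service(name):
    """Listen on a free loopback port; greet one client with the service name."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: the OS assigns a free port
    listener.listen()
    port = listener.getsockname()[1]

    def serve_one():
        conn, _ = listener.accept()      # wait for a single client
        with conn:
            conn.sendall(name.encode())
        listener.close()

    threading.Thread(target=serve_one, daemon=True).start()
    return port


def ask(port):
    """Connect to a loopback port and return what the service sends."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        return client.recv(1024).decode()


# Same machine, same protocol (TCP), different ports: different services answer.
ports = {name: start_service(name) for name in ("telnet-like", "ftp-like")}
replies = {name: ask(port) for name, port in ports.items()}
print(ports, replies)
```

Both services share one IP address and one transport protocol; only the port number tells the operating system which process should receive each connection.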

Before the data is processed, the Web server passes it through a gateway step that filters the raw request. This mechanism is called the Common Gateway Interface (CGI): the Web server stores the pieces of the request in separate environment variables, which the program generating the response can then read. The filtering is needed because a request arrives with headers attached, not all of which the program needs.
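The idea of storing request data in separate variables can be sketched as follows. The variable names REQUEST_METHOD, SCRIPT_NAME, QUERY_STRING, SERVER_PROTOCOL and the HTTP_ prefix follow the CGI convention, but the function cgi_environ and the sample request are hypothetical, and the sketch is simplified: a real server sets more variables and treats some headers (such as Content-Type) specially.

```python
def cgi_environ(raw_request):
    """Split a raw HTTP request into CGI-style environment variables."""
    head = raw_request.split("\r\n\r\n", 1)[0]        # headers end at a blank line
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ")
    path, _, query = target.partition("?")            # separate path from query
    env = {
        "REQUEST_METHOD": method,
        "SCRIPT_NAME": path,       # simplification: whole path treated as the script
        "QUERY_STRING": query,
        "SERVER_PROTOCOL": version,
    }
    for line in header_lines:
        name, _, value = line.partition(":")
        # "User-Agent: demo" becomes HTTP_USER_AGENT = "demo"
        env["HTTP_" + name.strip().upper().replace("-", "_")] = value.strip()
    return env


request = ("GET /cgi-bin/search?q=apache HTTP/1.1\r\n"
           "Host: www.example.com\r\n"
           "User-Agent: demo\r\n"
           "\r\n")
env = cgi_environ(request)
for key in sorted(env):
    print(key, "=", env[key])
```

A CGI program would read these values from its process environment instead of parsing the raw request itself.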

Performance

 

Web servers (programs) are supposed to serve requests quickly from more than one TCP/IP connection at a time.

The key performance parameters (measured under a varying load of clients and requests per client) are:

  • number of requests per second (depending on the type of request, etc.);
  • latency time in milliseconds for each new connection or request;
  • throughput in bytes per second (depending on file size, cached or not cached content, available network bandwidth, etc.).
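The three parameters can be computed from per-request measurements. The sketch below uses invented sample numbers purely to show the arithmetic; the function name summarize and the sample values are hypothetical, not measurements of any real server.

```python
from statistics import mean


def summarize(samples, wall_seconds):
    """Summarize (latency_seconds, bytes_sent) pairs collected over wall_seconds."""
    latencies = [latency for latency, _ in samples]
    total_bytes = sum(size for _, size in samples)
    return {
        "requests_per_second": len(samples) / wall_seconds,
        "mean_latency_ms": mean(latencies) * 1000,
        "throughput_bytes_per_second": total_bytes / wall_seconds,
    }


# Invented measurements: four requests completed during a 2-second window.
samples = [(0.010, 2048), (0.030, 4096), (0.020, 2048), (0.020, 8192)]
report = summarize(samples, wall_seconds=2.0)
for name, value in report.items():
    print(name, round(value, 3))
```

In a real test the samples would come from a load generator run against the server at several different load levels, since all three figures change as the load changes.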

 

 

Software

 

A few of the most common HTTP server programs are: Apache HTTP Server (Apache Software Foundation), Internet Information Services, IIS (Microsoft), and Sun Java System Web Server (Sun Microsystems).

Some Basic Knowledge about Apache

A Web server is meant for hosting Web sites. There are three ways a Web site can be stored:

1) default directory hosting

2) virtual directory hosting

3) virtual domain hosting
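The text does not define these three terms further, so the following sketch rests on an assumed reading: "default directory" means the server's main document directory, "virtual directory" means a URL prefix mapped to a directory elsewhere on disk, and "virtual domain" means a separate directory chosen by the requested host name. All directory names and host names below are hypothetical.

```python
DEFAULT_ROOT = "/var/www/html"                             # default directory
VIRTUAL_DIRECTORIES = {"/docs": "/srv/manuals"}            # hypothetical prefix mapping
VIRTUAL_DOMAINS = {"blog.example.com": "/srv/sites/blog"}  # hypothetical host mapping


def resolve(host, path):
    """Pick the directory that serves a request, by host name and then by prefix."""
    host = host.split(":")[0].lower()          # ignore any ":port" suffix
    if host in VIRTUAL_DOMAINS:                # 3) virtual domain hosting
        return VIRTUAL_DOMAINS[host] + path
    for prefix, directory in VIRTUAL_DIRECTORIES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return directory + path[len(prefix):]   # 2) virtual directory hosting
    return DEFAULT_ROOT + path                 # 1) default directory hosting


print(resolve("www.example.com", "/index.html"))
print(resolve("www.example.com", "/docs/a.html"))
print(resolve("blog.example.com:80", "/index.html"))
```

In Apache itself these choices are made through directives in the configuration file, not in code like this.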

 

We first have to configure DNS (the Domain Name System). Whether Apache runs on Windows or on Linux, its main configuration file is httpd.conf; on Linux it is commonly found at /etc/httpd/conf/httpd.conf.

The root directory of the Web server is /etc/httpd, which is divided into three parts:

1) /etc/httpd/conf (where the configuration files stay)

2) /etc/httpd/logs (where the logs of the Web server and of site access stay)

3) /etc/httpd/modules (where the modules stay, which enable server-side programmers to program in the languages supported by the Web server)

* httpd.conf: the Apache HTTP server configuration file

Comments (4)

ankit said

at 11:50 am on Jun 5, 2007

done by ankit...suren, have a look once, it needs some editing!!

ankit said

at 11:52 am on Jun 5, 2007

and ya i havent concentrated on apache much! just a bit of introduction...if u want i can add it...but then the time constraint! don think will get that much time....bcoz anshu will also be needing some time for perl,python,etc!

Suren said

at 1:02 pm on Jun 6, 2007

ankit : great!! really nice work !!! You are inspiring me to work out of my project !!! Do work only as much as u can and only when you are free and all!! i mean if u feel like adding something will improve the book then please do it!!! and u don have to ask me and all :)

python perl: just basics. how to write a python file, sample hello world, sample prog with i/p and o/p and their advantages of being used in linux..that will itself come to around one hour !!!

msk said

at 12:07 pm on Jun 14, 2007

i feel its too elaborate .. !! .. please cut down on the "introduction to server-client architecture" and put more on apache server .. keep is short and simple ..
