How does the web server locate a file on server through URL? - webserver

Has anyone ever tried to implement a web server? Or know something about the underhood of a working web server program? I am wondering what happens exactly from when a URL is received by the web server to a file on the web server is located and sent back as response.
Does the server just keep an internal table to remember the mapping between the URLs it supports and the corresponding local paths? Or is there anything more tricky?
Thanks!
Update
Thanks for your replies. Here's my understanding for now.
I checked with the Microsoft IIS (Internet Information Service), I noticed that IIS can host multiple sites, and foreach site IIS memorize its root path on the local file system. Different sites on the same host share the same host name or IP, and they are differentiated by separate ports. For example:
http://www.myServer.com:1111/folderA/pageA.htm
The web server will use www.myServer.com:1111 part of the URL string to locate which path on its local file system will be used, and then in that local path, it searches for subfolder folderA and then the file pageA.htm.
The web server only need to memorize the following mapping between 2 plain strings:
"http://www.myServer.com:1111/" <---> "D:\myWebRoot"
I don't know where this kind of mapping info is stored, maybe some config files for the Web Server Program in question.
But the result of this mapping granularity is that we could only access content within that mapped local folder. We couldn't do arbitray mapping.
Update - 2 -
I found where the IIS keep the mapping, here's some quotes from applicationHost.config:
<sites>
<site name="Default Web Site" id="1" serverAutoStart="false">
<application path="/">
<virtualDirectory path="/" physicalPath="%SystemDrive%\inetpub\wwwroot" />
</application>
<bindings>
<binding protocol="http" bindingInformation="*:80:" />
<binding protocol="net.tcp" bindingInformation="808:*" />
<binding protocol="net.pipe" bindingInformation="*" />
<binding protocol="net.msmq" bindingInformation="localhost" />
<binding protocol="msmq.formatname" bindingInformation="localhost" />
</bindings>
</site>
<site name="myIISService" id="2" serverAutoStart="true">
<application path="/" applicationPool="myIISService">
<virtualDirectory path="/" physicalPath="D:\MySites\MyIISService" />
</application>
<bindings>
<binding protocol="http" bindingInformation="*:8022:" />
</bindings>
</site>
<siteDefaults>
<logFile logFormat="W3C" directory="%SystemDrive%\inetpub\logs\LogFiles" />
<traceFailedRequestsLogging directory="%SystemDrive%\inetpub\logs\FailedReqLogFiles" />
</siteDefaults>
<applicationDefaults applicationPool="DefaultAppPool" />
<virtualDirectoryDefaults allowSubDirConfig="true" />
</sites>
Update - 3 -
After I read foo's reply, my undersanding of a "server" is enlarged. I want to make some comment based on my recent learning of WCF.
No matter what kind of server it is, we could always send messages to them by specifying the protocol, URL, port. For example:
[http://www.myserver.com:1111/]page.htm
[net.tcp://www.myserver.com/]someService.svc/someMethod
[net.msmq://www.myserver.com/]someService.svc
[net.pipe://localhost/]
After the messages arrives at the server program using the parts in square bracket of above URLs, the rest part of the url will send to the server program as input for further processing. And the following behaviour could be as simple as static content feeding or as complex as dynamic content generating.

Depends on the webserver and what its focus is.
(For all items, checking access rights, remapping and such steps apply of course.)
General-purpose webservers like Apache start out with files and directories, so they split up the URL into a hierarchical path description, try to find a file at the given location, and serve it if it exists. (This gets more complex with modules and filetypes; some filetypes imply processing the file as a script and returning the script output rather than just piping out the file contents, and so on).
Application servers like Tomcat do a mapping to servlets; if they have found a servlet that will handle the URL, they call it and pass any leftover URL parts/parameters to it for further handling.
Embedded webservers may even use hardcoded lookup tables for available URL patterns, directly mapping to functions to be called.
Special-purpose webservers will do whatever is required; some won't even parse the URL but just the other headers (like some streaming servers do).
It all depends on what you want to achieve. In most cases, you will be best off with nginx or Apache and maybe some modules and/or finetuning.
Be aware that any HTTP header can be used for mapping the request to whatever means of producing output you have. Hostname, port and URL are used most often, but you may as well take language or client IP or other header data and use them in the mapping.
So for your question: Yes, it can be as simple as that; and yes, it can be substantially more tricky (with mapping, rewriting, and complex processing).

For servers that serve "files", a typical approach is to treat the path portion of the URL as a relative path starting at a "web root" directory defined in the server's configuration. However, a URL doesn't have to correspond to a file on disk at all; it could correspond to an object or method in a running web application, or a database record, or anything else.

For static files there's usually no means of a mapping. The only what the webserver need to know is the absolute disk file system path to the public web document root which is usually definied somewhere in some deployment configuration file (httpd.conf for Apache HTTPD, server.xml and/or context.xml for Apache Tomcat, etc). The webserver extracts the relevant part from the URL, converts it to an absolute disk file system path based on the path to the web document root, locates the file on disk and streams it.

Related

Octopus Web.config transforms for client endpoint address

I have a web project that has uses two service endpoints located in the Web.config file under the client --> endpoint --> address portion
I have found the following per Octopus Variables section but cannot seem to find any reference of how to address the changes using and actual variable like you normally would
I am using the webui for Octopus which would be
http://{server-name}/app#/projects/{project-name}/variables
variable-substitution-syntax
I attempted assigning the variables like so, but the values never updated
the original entries look like the following
<endpoint address="http://services-test.example.com/test.svc/soap" binding="basicHttpBinding" bindingConfiguration="soap" contract="test.service" name="soap" />
Name Address Instance
Endpoint[A].Address test-service-a.example.com 1
Endpoint[B].Address test-service-b.example.com 2
Is this something that is ever possible using Octopus Variables? (I know it can be done using regular Web.config Transforms, as we are doing that already).
If it is possible what is would the correct replacement value for the
endpoint address
be and how would I accomplish this for multiple different endpoint addresses?
Sounds like you are most of the way there. If it's already working with your Web.config transforms then all you need to do is replace the value IN the transform with the variable replacement token.
For example: Web.Release.Config
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
<system.serviceModel>
<client>
<endpoint address="http://#{Server1}/test.svc/soap" name="x1"
xdt:Locator="Match(name)" xdt:Transform="SetAttributes(address)" />
<endpoint address="#{Endpoint2}" name="x2"
xdt:Locator="Match(name)" xdt:Transform="SetAttributes(address)" />
</client>
</system.serviceModel>
</configuration>
Of course there are lots of options here.
For filenames you could stick with default conventions 'Web.Release.config' or go with 'Web.[Environment].config' or go with something custom. We use 'Web.Octopus.Config' so that it wont get picked up by any other process.
More on naming transforms here: https://octopus.com/docs/deploying-applications/configuration-files#Configurationfiles-Namingconfigurationtransformfiles
More on custom transforms (Web.Octopus.com) here: https://octopus.com/docs/deploying-applications/configuration-files#Configurationfiles-AdditionalConfigurationTransforms
For variables, you could define a variable for just the server (name=x1) which is simpler or just put the whole address in a variable with gives Octopus a lot more control (name=x2).
The key part is getting the variable replacement tokens into the config. Octopus runs the variable substitution on config files first, and then runs the transforms. What that means is the first pass will replace the tokens in your Web.Release.Config, and then Octopus will run the transforms against Web.config
Hope that helps.

application configuration in biztalk

I developed an BizTalk orchestration where I am calling custom library method.
Since my custom library is consuming a web service and writing data into database therefore it reads various info like database connection string , WCF service endpoint address from appconfig. I put my custom library into GAC and deployed the BizTalk application but I am unable to find a place where I can put the appconfig which is used by custom library.
I Googled and found to append the config file in BTSNTSVc.exe placed under :\Program Files (x86)\Microsoft BizTalk Server 2013, however its not the recommended way.
You can save your configuration in BTSNTSvc.exe.config, but that file contains biztalk host configurations.
Take in mind, that if you'll have a syntax error in that config file - you'll have troubles running biztalk engine.
Best solution is to use a cache layer which you will consume by C# class library from your orchestration.
A better option is add to BTSNTSvc.exe.config a redirection to your config file, for example:
<appSettings>
<add key="myConfigFile" value="C:\MyProject\Config\myConfigFile.config" />
</appSettings>
This allows you to modify the configuration of your application without modify each time the BTSNTSvc.exe.config.

Configure IIS7 HTTP redirect at a site level

When configuring IIS 7 HTTP redirect it appears that there is an option to set this at the server level and also an option to set this for individual sites.
However, when using IIS Manager, if I alter the settings for one site it overwrites the settings for other sites.
for example: if I set up:
gubbin.com >> www.gubbin.com
that works fine
if I then go into monkey.com and add a redirection
monkey.com >> www.monkey.com
when I go back to gubbin.com I find it has been overwritten:
gubbin.com >> www.monkey.com
Is this a limitation of IIS (i.e. can only handle one redirect at a time) or a bug in the Manager application?
Can I get the desired results by editing a config someplace - or do I need to get a URL re-writer or something?
D'oh.
The IIS Manager is just editing the web.config for the site - and hence putting a redirect section into the config:
<configuration>
<system.webServer>
<httpRedirect enabled="true" exactDestination="true" httpResponseStatus="Found">
<add wildcard="*.php" destination="/default.htm" />
</httpRedirect>
</system.webServer>
</configuration>
The issue I had was that I set up the redirect sites to point that the same folder (since there's no content it didn't seem worth having a whole folder structure for each one)
The solution is to have a folder for each site, just to store the web.config for the redirect.

AppSettings values not being picked up in web.config in subdirectory

We have a set of WCF Services being hosted in IIS. In our architecture, we have different instances of the service that vary in configuration hosted in subdirectories under the root.
I have a Web.Config in the root that has some configuration information in the AppSettings section. Specifically, it looks like this:
<appSettings>
<add key="Environment" value="Local" />
</appSettings>
In the different subdirectories, I add another Web.config which adds other settings that are specific to the services in that subdirectory.
<appSettings>
<add key="Subdir" value="ABC" />
</appSettings>
However, when I invoke a service (.svc) file in the subdirectory, there is only one value in the ConfigurationManager.AppSettings collection ("Environment"). The "Subdir" key is nowhere to be found. (BTW, the code is written using ConfigurationManager, but I've also tried using WebConfigurationManager, and the result is the same.)
The documentation on MSDN clearly says that the Web.config files in nested directories are supposed to be cumulative. So why isn't my "Subdir" key showing up in that collection?
Thanks in advance for any help you can give me.
For the benefit of anyone else that encounters this, I found the answer.
Even though this app is a service running under IIS, and is being accessed through IIS, the point where this config element is needed is deep down in a derived class of System.ServiceModel.ServiceHostFactory. At that point, HttpContext.Current is null, so the ConfigurationManager has no access to Web directory structure information, and has to depend simply on the application configuration, which is at the top level.
So our solution is to make each Subdirectory an IIS application, and give it all of its own config information in one place.

httphandler shared hosting deployment

I have the httphandler on shared webhosting.
It works.
The httphandler webapp (virtual) dir of this httphandler does not have web.config
and the whole shared user's website has web.config with only one uncommented statement:
<compilation defaultLanguage="c#" debug="false"/>
Now, I change it to:
<system.web>
<urlMappings enabled="true">
<add url="~/CheckLoad" mappedUrl="~/BackupLicense.ashx?key=CheckLoad"/>
<compilation defaultLanguage="c#" debug="false">
</compilation>
</system.web>
This(*) works locally (on VS2008 internal webserver)
but not on shared hosting.
What do I miss?
(*) means calling [1a], which works only locally but on shared hosting it gives
"The page not found" "HTTP Error 404"
[1a] Calling as:
http://www.MySharedSite.com/CheckLoad
(additionally to always and evertwhere working
[1b] http://www.MySharedSite.com/BackupLicense.ashx?key=CheckLoad
There are some subtle differences in how URL's are handled on the built-in webserver and on IIS6 and 7. You need to know the version of IIS running on your shared host.
Specifically, IIS6 does not support URL's without the extension being mapped to aspnet_isapi.dll - and since you are not using an extension for the URL, this could be the case.
If your host is using IIS7 with integrated pipeline mode, you propably need to configure the system.webServer section with your url mapping, instead of system.web. (Also, see this question for an explanation of the difference).
Edit
I see in the comments your webhost is using IIS6. Then you need to ask your webhost to allow the /Checkload URL to be processed by ASP .NET. Another easy way to make it work would be to just use .ashx on the end of the url; since the .ashx extension is already mapped to ASP .NET in the standard configuration.